Big Data

Track Chairs : Duo Zhang, Junping Du, Lidong Dai

Big Data is changing industries of every kind and has become inseparable from daily life. It is also a very important part of the ASF, which hosts many big data projects, such as Hadoop, Hive, Spark, HBase, Kylin, Ozone, CarbonData, Doris, and Cassandra. In this track, you will learn about the cutting-edge trends in these technologies, along with practical experience, principles, architecture analysis, and other exciting content from front-line users.

2021-08-06 ROOM : A

13:30 GMT+8 Scaling Impala - Common Mistakes and Best Practices _{English Session} Manish Maheshwari

14:10 GMT+8 How Security is implemented in Apache Ozone _{English Session} Bharat Viswanadham, Shashikant Banerjee

14:50 GMT+8 OpenLooKeng heuristic index framework architecture analysis and application practice _{English Session} Zheng Li

15:30 GMT+8 Data Analysis across Disparate Data Sources in CMB _{English Session} Qiumin Wu

16:10 GMT+8 Java-based machine learning solutions for big data _{Chinese Session} Qing Lan

16:50 GMT+8 Apache HUDI on AWS _{Chinese Session} Lianghong Fei

2021-08-07 ROOM : A

13:30 GMT+8 Analyzing transactional data in Apache Druid _{English Session} Vijay Narayanan

14:10 GMT+8 OminiRuntime: A common big data runtime framework _{English Session} Jingfang Zhang

14:50 GMT+8 Apache Ozone: A High Performance Object Store for analytics workloads _{English Session} Rakesh Radhakrishnan, Mukul Kumar Singh

15:30 GMT+8 Apache Atlas meets Apache Flink _{English Session} Josh Yeh, Yan Liu

16:10 GMT+8 Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline _{English Session} Aviem Zur, Assaf Pinhasi

16:50 GMT+8 Bigtop 3.0: Rerising community driven Hadoop distribution _{English Session} Kengo Seki, Masatake Iwasaki

2021-08-08 ROOM : A

13:30 GMT+8 Technical tips for secure Apache Hadoop cluster _{English Session} Akira Ajisaka, Kei KORI

14:10 GMT+8 Data Lake accelerator on Hadoop-COS in Tencent Cloud _{English Session} Li Cheng

14:50 GMT+8 Apache Kylin 4.0 : an architectural upgrade and a new path to tuning _{Chinese Session} 张智超

15:30 GMT+8 Big Data Format at Uber Data Infra _{English Session} Xinli Shang, Pavi Subenderan

16:10 GMT+8 Milvus: A Vector Database for Unstructured Data Processing _{English Session} Xiaofan Luan

16:50 GMT+8 Apache Arrow based DataFrame for Data Processing in Python _{English Session} Supun Kamburugamuve

2021-08-06 ROOM : B

13:30 GMT+8 How a DBS[Development Bank of Singapore] Data Platform Drives Real-time Insights & Analytics using Apache CarbonData _{English Session} Ravindra Pesala, Kumar Vishal

14:10 GMT+8 Challenges of Building a Distributed Fault-Tolerant Scalable Analytics Stack _{English Session} Nishant Bangarwa

14:50 GMT+8 Kyuubi: NetEase's Exploration and Practical Application of Serverless Spark Scenarios _{Chinese Session} Kent Yao （姚琴）

15:30 GMT+8 Inside Apache Druid's Storage and Query Engine _{English Session} Gian Merlino

16:10 GMT+8 Faster Bigdata Analytics by maneuvering Apache CarbonData’s Indexes _{English Session} Akash R Nilugal, Kunal Kapoor

16:50 GMT+8 Insight into the secret of Open Source community: the best practice for data-driven community operations _{Chinese Session} Jun Zhong, Yikun Jiang, Lei Peng

2021-08-07 ROOM : B

13:30 GMT+8 Cassandra powered workflows to automate at scale _{English Session} Maciej Swiderski

14:10 GMT+8 Advanced User Behavior Analysis System Based on Apache Impala & Kudu _{Chinese Session} Qianqiong Zhang

14:50 GMT+8 How Apache Ozone builds up High Availability with Raft protocol _{English Session} Li Cheng, Shashikant Banerjee, Nanda Kumar

15:30 GMT+8 Running Realtime Analytics at scale with Apache Pinot at LinkedIn and Uber _{English Session} Siddharth Teotia, Yupeng Fu

16:10 GMT+8 Past, present and future of Doris _{Chinese Session} Mingyu Chen（陈明雨）

16:50 GMT+8 Apache InLong, a one-stop streaming data integration solution _{Chinese Session} gosonzhang, leobiaoliu

2021-08-08 ROOM : B

13:30 GMT+8 Building an authentication & authorization system using HashiCorp Vault _{Chinese Session} guangning

14:10 GMT+8 Lambda Architecture for Big Data with Apache C*, Spark and Pulsar _{Chinese Session} 孟亚斌

14:50 GMT+8 State of the Union with Apache YuniKorn (Incubating) - Cloud Native Scheduler for Big Data Usecases _{English Session} Sunil Govindan, Julia Kinga Marton

15:30 GMT+8 Flexible Optimizations and Efficient Execution of Data Processing on Apache Nemo _{English Session} Won Wook SONG

16:10 GMT+8 A Change-Data-Capture use-case: designing an evergreen cache _{English Session} Nicolas Fränkel

16:50 GMT+8 New Apache Bigtop 1.5 and Wikimedia: Empower BigData in the real world _{English Session} Yuqi Gu, Luca Toscano