Hi All,

I want to let you know that we have confirmed most of the agenda for the Hadoop Community Meetup. It will be a whole-day event.
Agenda & dial-in info are below. *Please RSVP at https://www.meetup.com/Hadoop-Contributors/events/262055924/*

Huge thanks to Daniel Templeton, Wei-Chiu Chuang, and Christina Vu for helping with organizing and logistics.

*Please help promote the meetup on Twitter, LinkedIn, etc. Appreciated!*

Best,
Wangda

AM:

9:00: Arrival and check-in
--------------------------

9:30 - 10:15:
-------------
Talk: Hadoop storage in cloud-native environments
Abstract: Hadoop is a mature storage system, but it was designed years before the cloud-native movement. Kubernetes and other cloud-native tools are emerging solutions for containerized environments, but sometimes they require different approaches. In this presentation we would like to share our experience running Apache Hadoop Ozone in Kubernetes and its connection points to other elements of the cloud-native ecosystem. We will compare the benefits and drawbacks of using Kubernetes and Hadoop storage together and show our current achievements and future plans.
Speaker: Marton Elek (Cloudera)

10:20 - 11:00:
--------------
Talk: Selective Wire Encryption In HDFS
Abstract: Wire data encryption is a key component of the Hadoop Distributed File System (HDFS). However, such encryption enforcement comes as an all-or-nothing feature. In our use case at LinkedIn, we would like to selectively expose fast unencrypted access to fully managed internal clients, which can be trusted, while exposing only encrypted access to clients outside of the trusted circle, which carry higher security risks. That way we minimize performance overhead for trusted internal clients while still securing data from potential outside threats. Our design extends the HDFS NameNode to run on multiple ports; connecting to different NameNode ports results in different levels of encryption protection. This protection then gets enforced for both NameNode RPC and the subsequent data transfers to/from DataNode. This approach comes with minimal operational and performance overhead.
Speaker: Konstantin Shvachko (LinkedIn), Chen Liang (LinkedIn)

11:10 - 11:55:
-------------
Talk: YuniKorn: Next Generation Scheduling for YARN and K8s
Abstract: We will talk about our open source work, the YuniKorn scheduler project (Y for YARN, K for K8s, uni- for Unified). It brings long-wanted features such as hierarchical queues, fairness between users/jobs/queues, and preemption to Kubernetes, and it brings service scheduling enhancements to YARN. Any improvement to this scheduler can benefit both the Kubernetes and YARN communities.
Speaker: Wangda Tan (Cloudera)

PM:

12:00 - 12:55 Lunch Break (Provided by Cloudera)
------------------------------------------------

1:00 - 1:25
-----------
Talk: Yarn Efficiency at Uber
Abstract: We will present the work done at Uber to improve YARN cluster utilization and job SOA with elastic resource management, low compute workload on passive datacenter, preemption, larger container, etc. We will also go through the YARN upgrade we did in order to adopt new features and talk about the challenges.
Speaker: Aihua Xu (Uber), Prashant Golash (Uber)

1:30 - 2:10 One more talk
---------------------------------

2:20 - 4:00
-----------
BoF sessions & Breakout Sessions & Group discussions: Talk about items like JDK 11 support, next releases (2.10.0, 3.3.0, etc.), Hadoop on Cloud, etc.

4:00: Reception provided by Cloudera.

==============================================
Join Zoom Meeting
https://cloudera.zoom.us/j/116816195