Hi folks,

The next Bay Area Stream Processing meetup will be held on Wednesday, February 5, 2020. This meetup will focus on Apache Kafka, Apache Samza and related streaming technologies.

Where: Unify Conference room, 950 W Maude Ave, Sunnyvale
When: 5:00 - 8:00 PM
RSVP: link <https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/267283444/>

Agenda:

5:00 PM: Doors open and catered food available

5:00 - 6:00 PM: Networking

6:00 - 6:30 PM: High-performance data replication at Salesforce with Mirus
Paul Davidson, Salesforce

At Salesforce we manage high-volume Apache Kafka clusters in a growing number of data centers around the globe. In the past we relied on Kafka's MirrorMaker tool for cross-data center replication but, as the volume and variety of data increased, we needed a new solution to maintain a high standard of service reliability. In this talk, we will describe Mirus, our open-source data replication tool based on Kafka Connect. Mirus was designed for reliable, high-performance data replication at scale. It successfully replaced MirrorMaker at Salesforce and has now been running reliably in production for more than a year. We will give an overview of the Mirus design and discuss the lessons we learned deploying, tuning, and operating Mirus in a high-volume production environment.

6:30 - 7:00 PM: Defending users from Abuse using Stream Processing at LinkedIn
Bhargav Golla, LinkedIn

When there are more than half a billion users, how can one effectively, reliably and scalably classify them as good or bad users? This talk will highlight how the Anti-Abuse team at LinkedIn leverages stream processing technologies like Samza and Brooklin to keep the good users in a trusted environment devoid of bad actors.

7:00 - 7:30 PM: Enabling Mission-critical Stateful Stream Processing with Samza
Ray Manpreet Singh Matharu, LinkedIn

Samza powers a variety of large-scale business-critical stateful stream processing applications at LinkedIn. Their scale necessitates using persistent and replicated local state. Unfortunately, hard failures can cause a loss of this local state, and re-caching it can incur downtime ranging from a few minutes to hours! In this talk, we describe the systems and protocols that we've devised that bound the downtime to a few seconds. We detail the tradeoffs our approach brings and how we tackle them in production at LinkedIn.

7:30 - 8:00 PM: Additional networking and Q&A

If you are interested in attending, please RSVP via this meetup.com link <https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/267283444/>.

Hope to see you there!

Prateek