Great talk, Eugene. Ted, will share more info on Kafka IO for Python soon :)
- Cham

On Thu, Mar 8, 2018 at 4:55 PM Ted Yu <yuzhih...@gmail.com> wrote:

> I see.
>
> I have added myself as watcher on BEAM-3788.
>
> Thanks
>
> On Thu, Mar 8, 2018 at 4:51 PM, Eugene Kirpichov <kirpic...@google.com>
> wrote:
>
>> Hi Ted - KafkaIO is not yet implemented using Splittable DoFn's (it was
>> implemented before SDFs existed and hasn't been rewritten yet), but it will
>> be, once more runners catch up with the support: currently we have Dataflow
>> and Flink. +Chamikara Jayalath <chamik...@google.com> is currently
>> working on implementing it using SDFs in the Python SDK.
>>
>> On Thu, Mar 8, 2018 at 4:34 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Eugene:
>>> Very informative talk.
>>>
>>> I looked at:
>>>
>>> sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTrackerTest.java
>>>
>>> Is there some example showing how OffsetRangeTracker works with Kafka
>>> partition(s) ?
>>>
>>> Thanks
>>>
>>> On Thu, Mar 8, 2018 at 3:58 PM, Eugene Kirpichov <kirpic...@google.com>
>>> wrote:
>>>
>>>> Hi Thomas!
>>>>
>>>> In case of tailing a Kafka partition, the restriction would be
>>>> [start_offset, infinity), and it would keep being split by checkpointing
>>>> into [start_offset, end_offset) and [end_offset, infinity)
>>>>
>>>> On Thu, Mar 8, 2018 at 3:52 PM Thomas Weise <t...@apache.org> wrote:
>>>>
>>>>> Eugene,
>>>>>
>>>>> I actually had one question regarding the application of SDF for the
>>>>> Kafka consumer. Reading through a topic partition can be parallel by
>>>>> splitting a partition into multiple restrictions (for use cases where
>>>>> order does not matter). But how would the tail read be managed? I assume
>>>>> there would not be a new restriction whenever new records arrive (added
>>>>> latency)? The examples on slide 40 show an end offset for Kafka, but for
>>>>> a continuous read there wouldn't be an end offset?
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Thu, Mar 8, 2018 at 2:59 PM, Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> Great, thanks for sharing!
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 8, 2018 at 12:16 PM, Eugene Kirpichov <
>>>>>> kirpic...@google.com> wrote:
>>>>>>
>>>>>>> Oops that's just the template I used. Thanks for noticing, will
>>>>>>> regenerate the PDF and reupload when I get to it.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 8, 2018, 11:59 AM Dan Halperin <dhalp...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Looks like it was a good talk! Why is it Google Confidential &
>>>>>>>> Proprietary, though?
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>> On Thu, Mar 8, 2018 at 11:49 AM, Eugene Kirpichov <
>>>>>>>> kirpic...@google.com> wrote:
>>>>>>>>
>>>>>>>>> Hey all,
>>>>>>>>>
>>>>>>>>> The slides for my yesterday's talk at Strata San Jose
>>>>>>>>> https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63696
>>>>>>>>> have been posted on the talk page. They may be of interest both to
>>>>>>>>> users and IO authors.
>>>>>>>>>
>>>>>>>>> Thanks.
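
For readers following the thread, the checkpoint-based splitting Eugene describes for a tailing Kafka partition could look roughly like the sketch below. It is plain Python and does not use the Beam API: `OffsetRestriction` and `TailingTracker` are hypothetical names made up for illustration, and Beam's actual OffsetRangeTracker may behave differently. It only shows how an unbounded restriction [start_offset, infinity) can be cut into a finished part [start_offset, end_offset) and a residual [end_offset, infinity) each time the runner checkpoints.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetRestriction:
    """Half-open offset range [start, end). Hypothetical type, not a Beam class."""
    start: int
    end: float  # an integer offset, or math.inf for an unbounded tail read


class TailingTracker:
    """Illustrative tracker for one partition; NOT Beam's OffsetRangeTracker."""

    def __init__(self, restriction):
        self.restriction = restriction
        self.last_claimed = None  # highest offset claimed so far, if any

    def try_claim(self, offset):
        """Return True if `offset` is inside the current restriction."""
        if offset >= self.restriction.end:
            return False
        self.last_claimed = offset
        return True

    def checkpoint(self):
        """Split into (primary, residual) right after the last claimed offset."""
        if self.last_claimed is None:
            end_offset = self.restriction.start  # nothing processed yet
        else:
            end_offset = self.last_claimed + 1
        primary = OffsetRestriction(self.restriction.start, end_offset)
        residual = OffsetRestriction(end_offset, self.restriction.end)
        self.restriction = primary  # this tracker now owns only the primary
        return primary, residual


# Tail a partition from offset 100 with no end offset.
tracker = TailingTracker(OffsetRestriction(100, math.inf))
for offset in (100, 101, 102):
    assert tracker.try_claim(offset)

primary, residual = tracker.checkpoint()
print(primary)   # OffsetRestriction(start=100, end=103)
print(residual)  # OffsetRestriction(start=103, end=inf)
```

The residual keeps the infinite end, so the tail read never needs a new restriction per arriving record; it is simply resumed from `end_offset` after each checkpoint.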