Re: Sub-second Beam

2019-12-13 Thread Aaron Dixon
Mehran,
My benchmarking is very ad hoc, and I don't have the resources at the moment to do something formal or publish anything (but I may in the coming year). The short summary from cursory online research, the feedback in this email thread, and my own empirical evidence (even taking KafkaIO out o

Re: Sub-second Beam

2019-12-13 Thread Aaron Dixon
Ah, great point about the Kafka partition skew. I can see how the problem domain is kind of stacked against sub-second latency expectations. Thanks again.
On Fri, Dec 13, 2019 at 3:48 PM Robert Bradshaw wrote:
> On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
> >
> > Thank you Robert, this is

Re: Sub-second Beam

2019-12-13 Thread Mehran Nazir
Hey Aaron,
Do you plan on publishing the results of your test, as well as a description of your testing method? I'd be interested in seeing what you find.
Mehran
On Fri, Dec 13, 2019 at 4:48 PM Robert Bradshaw wrote:
> On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
> >
> > Thank you Rober

Re: Sub-second Beam

2019-12-13 Thread Robert Bradshaw
On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
>
> Thank you Robert, this is really helpful context and confirmation.
>
> > You mention having an "aggressive" watermark--could you clarify what you
> > mean by this?
>
> I'm using KafkaIO and I customize the watermark to follow my message event

Re: Sub-second Beam

2019-12-13 Thread Aaron Dixon
Thank you Robert, this is really helpful context and confirmation.
> You mention having an "aggressive" watermark--could you clarify what you mean by this?
I'm using KafkaIO and I customize the watermark to follow my message event timestamps as they come (instead of having it lag behind the stream
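To make the idea of a watermark that follows event timestamps concrete, here is a minimal, library-free sketch. It is not the KafkaIO API or the configuration used in this thread; the class `EventTimeWatermark` and its `max_delay_ms` parameter are hypothetical names chosen for illustration. It contrasts an "aggressive" policy (zero delay) with one that deliberately lags behind the newest timestamp seen.

```python
from typing import Optional


class EventTimeWatermark:
    """Hypothetical per-partition policy: watermark = newest event time seen - max_delay_ms."""

    def __init__(self, max_delay_ms: int = 0) -> None:
        self.max_delay_ms = max_delay_ms
        self._max_seen_ms: Optional[int] = None

    def current(self) -> Optional[int]:
        # No watermark until at least one element has been observed.
        if self._max_seen_ms is None:
            return None
        return self._max_seen_ms - self.max_delay_ms

    def observe(self, event_ts_ms: int) -> bool:
        """Record an element; return True if it arrived behind the watermark (late)."""
        watermark = self.current()
        late = watermark is not None and event_ts_ms < watermark
        # The newest timestamp only moves forward, so the watermark is monotonic.
        if self._max_seen_ms is None or event_ts_ms > self._max_seen_ms:
            self._max_seen_ms = event_ts_ms
        return late


# One slightly out-of-order element (1150 arrives after 1200).
events_ms = [1_000, 1_200, 1_150, 1_400]

aggressive = EventTimeWatermark(max_delay_ms=0)   # follows event time exactly
lagging = EventTimeWatermark(max_delay_ms=500)    # trails the newest element by 500 ms

aggressive_late = [aggressive.observe(t) for t in events_ms]
lagging_late = [lagging.observe(t) for t in events_ms]

print(aggressive_late, aggressive.current())
print(lagging_late, lagging.current())
```

The trade-off the sketch shows: with zero delay the watermark advances as fast as the data, but any out-of-order element is immediately late; with a delay, nothing is late here, but windows can only fire 500 ms later.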

Re: Sub-second Beam

2019-12-13 Thread Robert Bradshaw
In general, sub-second latencies are difficult because one must wait for the watermark to catch up before actually firing. This would require the oldest item in flight across all machines to be almost exactly the same timestamp as the newest. Furthermore, most sources cannot provide sub-second water
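The constraint described above can be sketched in a few lines of plain Python. This is an illustrative model only, not Beam code: the `Partition` type, the helper names, and the timestamps are assumptions made up for the example. It treats the pipeline watermark as the minimum over the oldest in-flight element on each partition, so one skewed partition holds back firing for everyone.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    name: str
    oldest_in_flight_ms: int  # event time of the oldest unprocessed element
    newest_seen_ms: int       # event time of the newest element read so far


def global_watermark(partitions: List[Partition]) -> int:
    # The watermark cannot pass the oldest in-flight element on any machine.
    return min(p.oldest_in_flight_ms for p in partitions)


def can_fire(window_end_ms: int, partitions: List[Partition]) -> bool:
    # A window may fire only once the watermark has reached its end.
    return global_watermark(partitions) >= window_end_ms


def watermark_lag_ms(partitions: List[Partition]) -> int:
    # Gap between the newest element seen anywhere and the global watermark.
    newest = max(p.newest_seen_ms for p in partitions)
    return newest - global_watermark(partitions)


partitions = [
    Partition("p0", oldest_in_flight_ms=10_000, newest_seen_ms=10_100),
    Partition("p1", oldest_in_flight_ms=10_050, newest_seen_ms=10_100),
    Partition("p2", oldest_in_flight_ms=9_200, newest_seen_ms=9_300),  # skewed
]
window_end_ms = 10_000

print(global_watermark(partitions))         # 9200
print(can_fire(window_end_ms, partitions))  # False
print(watermark_lag_ms(partitions))         # 900
```

In this made-up scenario two partitions are already past the window end, yet the window cannot fire because the third is 800 ms behind it. Sub-second output would need the 900 ms gap between oldest and newest in-flight elements to shrink to nearly zero.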