Mehran,
My benchmarking is very ad hoc and I don't have the resources at the moment
to do something formal or publish anything (but I may in the coming year).
The short summary from cursory online research, the feedback in this email
thread, and my own empirical evidence (even taking KafkaIO out o
Ah great point about the Kafka partition skew. I can see how the problem
domain is kind of stacked against sub-second latency expectations. Thanks
again.
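
To make the partition-skew point concrete, here is a minimal Python sketch (not Beam code). It assumes a simplified model: each partition's watermark tracks the newest event timestamp seen on that partition (the "aggressive" policy discussed in this thread), and the source watermark is the minimum across partitions. The function names and sample timestamps are hypothetical, chosen only for illustration.

```python
from typing import Dict


def source_watermark(newest_ts_by_partition: Dict[int, float]) -> float:
    """Assumed model: the overall watermark is held back by the slowest partition."""
    return min(newest_ts_by_partition.values())


def can_fire(window_end: float, newest_ts_by_partition: Dict[int, float]) -> bool:
    """A window may fire only once the watermark has reached its end."""
    return source_watermark(newest_ts_by_partition) >= window_end


def remaining_lag(window_end: float, newest_ts_by_partition: Dict[int, float]) -> float:
    """Seconds of event time the watermark still has to advance before firing."""
    return max(0.0, window_end - source_watermark(newest_ts_by_partition))


# Hypothetical event timestamps (seconds) of the newest record per partition.
balanced = {0: 10.2, 1: 10.1, 2: 10.3}  # all partitions close together
skewed = {0: 10.2, 1: 10.1, 2: 7.5}     # partition 2 is 2.5 s behind

window_end = 10.0
print(can_fire(window_end, balanced), remaining_lag(window_end, balanced))
print(can_fire(window_end, skewed), remaining_lag(window_end, skewed))
```

In this model a single lagging partition delays firing for every key, no matter how current the other partitions are, which is why skew works against sub-second latency.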
On Fri, Dec 13, 2019 at 3:48 PM Robert Bradshaw wrote:
> On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
> >
> > Thank you Robert, this is
Hey Aaron,
Do you plan on publishing the results of your test, as well as a
description of your testing method? I'd be interested in seeing what you
find.
Mehran
On Fri, Dec 13, 2019 at 4:48 PM Robert Bradshaw wrote:
> On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
> >
> > Thank you Rober
On Fri, Dec 13, 2019 at 1:20 PM Aaron Dixon wrote:
>
> Thank you Robert, this is really helpful context and confirmation.
>
> > You mention having an "aggressive" watermark--could you clarify what you
> > mean by this?
>
> I'm using KafkaIO and I customize the watermark to follow my message event
Thank you Robert, this is really helpful context and confirmation.
> You mention having an "aggressive" watermark--could you clarify what you
> mean by this?
I'm using KafkaIO and I customize the watermark to follow my message event
timestamps as they come (instead of having it lag behind the stream
In general, sub-second latencies are difficult because one must wait
for the watermark to catch up before actually firing. This would
require the oldest item in flight across all machines to be almost
exactly the same timestamp as the newest. Furthermore, most sources
cannot provide sub-second water