Hey Piotr,
Hopefully we are broadly in agreement.
Of the three scenarios you describe, the code is simulating scenario 2.
The only comment I would add is that the additional load on a node could
come from an independent service or job.
I am guessing we can agree,
Hi Owen,
Thanks for the quick response. No, I hadn’t seen the previous blog post; yes,
it clears things up a bit.
> To clarify, the code is attempting to simulate a straggler node due to high
> load, which therefore processes data at a slower rate - not a failing node.
> Some degree of
Hi Piotr,
Thanks for getting back to me and for the info. I try to describe the
motivation around the scenarios in the original post in the series - see
the 'Backpressure - why you might care' section on
http://owenrh.me.uk/blog/2019/09/30/. Maybe it could have been clearer.
As you note, this
Hi,
I’m not entirely sure what you are testing. I have looked at your code (only
the constant straggler scenario) and, please correct me if I’m wrong, in your
job you are basically measuring the throughput of `Thread.sleep(straggler.waitMillis)`.
In the first RichMap task (`subTaskId == 0`), per
Hi,
I am having a few issues with Flink's (v1.8.1) default backpressure
settings, which lead to poor throughput in a comparison I am doing between
Storm, Spark and Flink.
I have a setup that simulates a progressively worse straggling task, which
Storm and Spark cope with relatively well.
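
For illustration only, a progressively worse straggler of this kind could be
sketched in plain Java, with no Flink dependency. This is not the actual test
code: the class name, the delay schedule (`recordsPerStep`, `stepMillis`) and
the record counts are all hypothetical, and treating subtask 0 as the
straggler is an assumption based on the `subTaskId == 0` check mentioned
elsewhere in the thread.

```java
import java.util.concurrent.TimeUnit;

public class StragglerSketch {

    // Hypothetical schedule: only subtask 0 straggles, and its per-record
    // delay grows by stepMillis after every recordsPerStep records.
    static long delayMillis(int subTaskId, long recordIndex,
                            long recordsPerStep, long stepMillis) {
        if (subTaskId != 0) {
            return 0L; // healthy subtasks add no artificial delay
        }
        return (recordIndex / recordsPerStep) * stepMillis;
    }

    // Processes `records` dummy records and returns the elapsed wall-clock
    // time in milliseconds. The "work" per record is just the sleep.
    static long process(int subTaskId, long records,
                        long recordsPerStep, long stepMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        for (long i = 0; i < records; i++) {
            long delay = delayMillis(subTaskId, i, recordsPerStep, stepMillis);
            if (delay > 0) {
                Thread.sleep(delay); // simulate a node slowed by extra load
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        long healthyMs = process(1, 30, 10, 2);
        long stragglerMs = process(0, 30, 10, 2);
        System.out.println("healthy subtask:   " + healthyMs + " ms");
        System.out.println("straggler subtask: " + stragglerMs + " ms");
    }
}
```

In a sketch like this the straggler's throughput is bounded entirely by the
sleep, so what differs between frameworks is how the slow subtask affects the
rest of the pipeline.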