Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Robert Bradshaw
I agree. Borrowing the mutation detection from the direct runner as an intermediate point sounds like a good idea. On Mon, Dec 21, 2020 at 8:57 AM Kenneth Knowles wrote: > I really think we should make a plan to make this the default. If you test > with the DirectRunner it will do mutation

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Kenneth Knowles
I really think we should make a plan to make this the default. If you test with the DirectRunner it will do mutation checking and catch pipelines that depend on the runner cloning every element. (also the DirectRunner doesn't clone). Since the cloning is similar in cost to the mutation detection,

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Teodor Spæren
Hey! My option is not the default as of now, since it can break pipelines which rely on the faulty Flink implementation. I'm creating my own benchmarks locally and will run against those, but the idea of adding it to the official benchmark runs sounds interesting, thanks for bringing it up!

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-15 Thread Ahmet Altay
Hi Teodor, Thank you for working on this. If I remember correctly, there were some opportunities to improve on the previous paper (e.g. not focusing on deprecated runners, long-running benchmarks, varying data sizes). And I am excited that you are keeping the community as part of your research

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-15 Thread Teodor Spæren
Hey! Yeah, that paper was what prompted my master's thesis! I will definitely post here once I get more data :) Teodor On Mon, Dec 14, 2020 at 06:56:30AM -0600, Rion Williams wrote: Hi Teodor, Although I’m sure you’ve come across it, this might have some valuable resources or

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-14 Thread Rion Williams
Hi Teodor, Although I’m sure you’ve come across it, this might have some valuable resources or methodologies to consider as you explore this a bit more: https://arxiv.org/pdf/1907.08302.pdf I’m looking forward to reading about your findings, especially using a more recent iteration of Beam!

Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-14 Thread Teodor Spæren
Just bumping this so people see it now that 2.26.0 is out :) On Wed, Nov 25, 2020 at 11:09:52AM +0100, Teodor Spæren wrote: Hey! My name is Teodor Spæren and I'm writing a master's thesis investigating the performance overhead of using Beam instead of using the underlying systems directly. My