Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Robert Bradshaw
I agree. Borrowing the mutation detection from the direct runner as an
intermediate point sounds like a good idea.



Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Kenneth Knowles
I really think we should make a plan to make this the default. If you test
with the DirectRunner it will do mutation checking and catch pipelines that
depend on the runner cloning every element (the DirectRunner itself doesn't
clone). Since the cloning is similar in cost to the mutation detection,
could we actually add some mutation detection to FlinkRunner pipelines and
also directly warn if a pipeline depends on the cloning?

Kenn
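
For readers unfamiliar with the idea, here is a minimal sketch of mutation
detection in plain Python. It is not Beam's or the DirectRunner's actual
implementation, and the names run_with_clone, run_with_mutation_check and
MutationError are hypothetical. The check snapshots the input before calling
a user function and compares afterwards, which is why it costs roughly as
much as a clone.

```python
import copy


class MutationError(Exception):
    """Raised when a user function modifies its input element in place."""


def run_with_clone(fn, element):
    # What a cloning runner effectively does: the user function only ever
    # sees a private copy, so in-place edits never reach the caller's element.
    return fn(copy.deepcopy(element))


def run_with_mutation_check(fn, element):
    # Take a snapshot, hand the ORIGINAL element to the user function,
    # then compare. The snapshot is itself a deep copy, so the check costs
    # about as much as cloning would.
    snapshot = copy.deepcopy(element)
    result = fn(element)
    if element != snapshot:
        raise MutationError(f"{fn.__name__} mutated its input element")
    return result


def pure_upper(words):
    # Builds a new list; the input is left untouched.
    return [w.upper() for w in words]


def mutating_upper(words):
    # Overwrites the input list in place; only safe if the runner clones.
    for i, w in enumerate(words):
        words[i] = w.upper()
    return words


if __name__ == "__main__":
    element = ["flink", "beam"]
    # With cloning, the mutating function appears to work and the bug is hidden.
    print(run_with_clone(mutating_upper, element), element)
    # With the check instead of the clone, the pure function passes...
    print(run_with_mutation_check(pure_upper, element))
    # ...and the mutating one is reported.
    try:
        run_with_mutation_check(mutating_upper, element)
    except MutationError as err:
        print("detected:", err)
```

A runner that wanted to warn rather than fail could log the error instead of
raising it; that choice is outside what this thread specifies.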



Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-21 Thread Teodor Spæren
Hey! My option is not the default as of now, since it can break pipelines
which rely on the faulty Flink implementation. I'm creating my own
benchmarks locally and will run against those, but the idea of adding it
to the official benchmark runs sounds interesting, thanks for bringing
it up!


Teodor




Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-15 Thread Ahmet Altay
Hi Teodor,

Thank you for working on this. If I remember correctly, there were some
opportunities to improve in the previous paper (e.g. not focusing on
deprecated runners, long-running benchmarks, varying data sizes). And I am
excited that you are keeping the community as part of your research process
and we will be happy to help you where we can.

Related to your question: is the new option used by default? If that
is the case you will probably see its impact on the metrics dashboard [1].
And if it is not on by default, you can add your variant as a new benchmark
and compare the difference across many runs in a controlled benchmarking
environment. Would that help?

Ahmet
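
One way to read "compare the difference across many runs": collect wall-clock
durations for the baseline and for the variant with the option enabled, then
compare the means alongside the run-to-run spread. Below is a minimal sketch
in plain Python, assuming you have exported the durations yourself.
compare_runs is a hypothetical helper, not part of Beam's benchmark tooling,
and the numbers in the usage example are placeholders, not measurements.

```python
import statistics


def compare_runs(baseline_secs, variant_secs):
    """Summarize two lists of run durations, given in seconds."""
    if len(baseline_secs) < 2 or len(variant_secs) < 2:
        raise ValueError("need at least two runs per variant to estimate spread")
    base_mean = statistics.mean(baseline_secs)
    var_mean = statistics.mean(variant_secs)
    return {
        "baseline_mean": base_mean,
        "variant_mean": var_mean,
        # Sample standard deviation: how noisy each set of runs is.
        "baseline_stdev": statistics.stdev(baseline_secs),
        "variant_stdev": statistics.stdev(variant_secs),
        # Negative means the variant finished faster on average.
        "relative_change": (var_mean - base_mean) / base_mean,
    }


if __name__ == "__main__":
    # Placeholder durations, invented purely to show the call shape.
    baseline = [120.0, 118.5, 121.2, 119.8]
    variant = [101.3, 99.7, 102.4, 100.9]
    for key, value in compare_runs(baseline, variant).items():
        print(f"{key}: {value:.3f}")
```

If the difference in means is small compared with the standard deviations,
more runs (or a quieter environment) are needed before drawing a conclusion.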

[1] http://metrics.beam.apache.org/d/1/getting-started?orgId=1




Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-15 Thread Teodor Spæren

Hey!

Yeah, that paper was what prompted my master's thesis! I will definitely
post here once I get more data :)


Teodor



Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-14 Thread Rion Williams
Hi Teodor,

Although I’m sure you’ve come across it, this might have some valuable 
resources or methodologies to consider as you explore this a bit more:

https://arxiv.org/pdf/1907.08302.pdf

I’m looking forward to reading about your findings, especially using a more
recent iteration of Beam!

Rion



Re: Help measuring upcoming performance increase in flink runner on production systems

2020-12-14 Thread Teodor Spæren

Just bumping this so people see it now that 2.26.0 is out :)

On Wed, Nov 25, 2020 at 11:09:52AM +0100, Teodor Spæren wrote:

Hey!

My name is Teodor Spæren and I'm writing a master's thesis investigating
the performance overhead of using Beam instead of using the underlying
systems directly. My focus has been on Flink and I've made a discovery
about some unnecessary copying between operators in the Flink
runner[1][2]. I wrote a fix for this and it got accepted and merged,
and will be in the upcoming 2.26.0 release[3].

I'm writing this email to ask if anyone on these mailing lists would
be willing to send me some results of applying this option when the new
version of Beam is released. Anything will be very much appreciated:
stories, screenshots of performance monitoring before and after, hard
numbers, anything! If you include the cluster size and the workload
that would be awesome too! My master's thesis is set to be complete the
coming summer, so there is no real hurry :)


The thesis will be freely accessible[4] and I hope that these findings
will be of help to the Beam community. If anyone wishes to submit
stories but remain anonymous, that is also ok :)


The best way to contact me would be to send an email my way here, or 
on teod...@mail.uio.no.


Any help is appreciated, thanks for your attention!

Best regards,
Teodor Spæren


[1]: https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E
[2]: https://issues.apache.org/jira/browse/BEAM-11146
[3]: https://github.com/apache/beam/pull/13240
[4]: https://www.duo.uio.no/