Re: Monitoring performance for releases

2020-08-06 Thread Maximilian Michels
Robert, this is not too far off what I'm proposing. We can always create 
JIRA issues for performance regressions and mark them with a Fix 
Version. The time of the release is, in particular, a good moment to 
re-evaluate whether any gross performance regressions can be detected. 
Of course, if a regression is more gradual and less noticeable, we might 
miss it.


I agree that good performance is a continuous effort, but often there 
are concrete problems which can be dealt with at the time of the 
release. Those might be very hard to fix a couple of releases down the 
line because they get buried under new problems and become much harder 
to find.
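
As a rough illustration of what marking such issues could look like, here is 
a minimal sketch using the Python jira client; the label name, version 
string, and JQL filter are assumptions for illustration, not an established 
Beam convention:

    from jira import JIRA

    # Minimal sketch: list unresolved performance-regression issues that are
    # tagged with the upcoming release as Fix Version. The label
    # "performance-regression" and the version "2.24.0" are placeholders.
    jira = JIRA("https://issues.apache.org/jira")
    issues = jira.search_issues(
        'project = BEAM AND labels = "performance-regression" '
        'AND fixVersion = "2.24.0" AND resolution = Unresolved'
    )
    for issue in issues:
        print(issue.key, issue.fields.summary)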


-Max

On 03.08.20 22:17, Robert Bradshaw wrote:
I have to admit I still have some qualms about tying the detection and 
fixing of performance regressions to the release process (which is 
onerous enough as it is). Instead, I think we'd be better off with a 
separate process to detect and triage performance issues, which, when 
they occur, may merit filing a blocker that must be fixed before the 
release just like any other blocker. Hopefully this would result in 
issues being detected (and resolved) sooner.


That being said, if a release is known to have performance regressions, 
that should be called out when the RCs are cut, and if not resolved, 
probably as part of the release notes as well.


On Mon, Aug 3, 2020 at 9:40 AM Maximilian Michels wrote:


Here is a first version of the updated release guide:
https://github.com/apache/beam/pull/12455

Feel free to comment.

-Max


Re: Monitoring performance for releases

2020-08-03 Thread Robert Bradshaw
I have to admit I still have some qualms about tying the detection and fixing
of performance regressions to the release process (which is onerous
enough as it is). Instead, I think we'd be better off with a
separate process to detect and triage performance issues, which, when they
occur, may merit filing a blocker that must be fixed before the
release just like any other blocker. Hopefully this would result in
issues being detected (and resolved) sooner.
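
Such a separate process could be as simple as a scheduled check that compares
the latest load-test numbers against a trailing baseline and flags anything
that regressed beyond a threshold; a minimal sketch (the metric names, data
model, and threshold are placeholders, not Beam's actual schema):

    import statistics

    def find_regressions(history, latest, threshold=0.10):
        """Flag metrics whose latest runtime is more than `threshold` worse
        than the median of recent runs. `history` maps a metric name to a
        list of past runtimes in seconds, `latest` maps it to the newest."""
        regressions = {}
        for name, runs in history.items():
            if name not in latest or not runs:
                continue
            baseline = statistics.median(runs)
            if latest[name] > baseline * (1 + threshold):
                regressions[name] = (baseline, latest[name])
        return regressions

    # Example: a ~25% slowdown in a hypothetical ParDo streaming test.
    history = {"pardo_streaming_flink": [102.0, 98.5, 101.2]}
    latest = {"pardo_streaming_flink": 127.4}
    for name, (base, new) in find_regressions(history, latest).items():
        print(f"REGRESSION {name}: {base:.1f}s -> {new:.1f}s - file a blocker?")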

That being said, if a release is known to have performance regressions,
that should be called out when the RCs are cut, and if not resolved,
probably as part of the release notes as well.

On Mon, Aug 3, 2020 at 9:40 AM Maximilian Michels  wrote:

> Here is a first version of the updated release guide:
> https://github.com/apache/beam/pull/12455
>
> Feel free to comment.
>
> -Max
>

Re: Monitoring performance for releases

2020-08-03 Thread Maximilian Michels
Here is a first version of the updated release guide: 
https://github.com/apache/beam/pull/12455


Feel free to comment.

-Max

On 29.07.20 17:27, Maximilian Michels wrote:
Thanks! I'm following up with this PR to display the Flink Pardo 
streaming data: https://github.com/apache/beam/pull/12408


Streaming data appears to be missing for Dataflow. We can revise the 
Jenkins jobs to add those.


-Max

Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels
Thanks! I'm following up with this PR to display the Flink ParDo 
streaming data: https://github.com/apache/beam/pull/12408


Streaming data appears to be missing for Dataflow. We can revise the 
Jenkins jobs to add those.


-Max

On 29.07.20 17:01, Tyson Hamilton wrote:

Max,

The runner dimension is shown when hovering over a particular 
graph. For some more info, the load test configurations can be found 
here [1]. I didn't get a chance to look into them, but there are tests 
for all the runners there, though possibly not for every load test.


[1]: https://github.com/apache/beam/tree/master/.test-infra/jenkins

-Tyson

Re: Monitoring performance for releases

2020-07-29 Thread Tyson Hamilton
Max,

The runner dimension is shown when hovering over a particular graph. For
some more info, the load test configurations can be found here [1]. I
didn't get a chance to look into them, but there are tests for all the
runners there, though possibly not for every load test.

[1]: https://github.com/apache/beam/tree/master/.test-infra/jenkins

-Tyson

On Wed, Jul 29, 2020 at 3:46 AM Maximilian Michels  wrote:

> Looks like the permissions won't be necessary because backup data gets
> loaded into the local InfluxDb instance which makes writing queries
> locally possible.
>

Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels
Looks like the permissions won't be necessary because the backup data gets 
loaded into a local InfluxDB instance, which makes writing queries 
locally possible.
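
For anyone who wants to try this, a minimal sketch of querying the locally 
restored data with the Python InfluxDB client; the database, measurement, 
and field names are guesses and would need to match whatever the backup 
actually contains:

    from influxdb import InfluxDBClient

    # Assumes the metrics backup has been restored into a local InfluxDB
    # listening on the default port; names below are placeholders.
    client = InfluxDBClient(host="localhost", port=8086,
                            database="beam_test_metrics")
    result = client.query(
        'SELECT mean("value") FROM "python_batch_pardo_1" '
        "WHERE time > now() - 30d GROUP BY time(1d)"
    )
    for point in result.get_points():
        print(point["time"], point["mean"])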


On 29.07.20 12:21, Maximilian Michels wrote:

Thanks Michał!

It is a bit tricky to verify the exported query works if I don't have 
access to the data stored in InfluxDb.


==> Could somebody give me permissions to max.mich...@gmail.com for 
apache-beam-testing such that I can setup a ssh port-forwarding from the 
InfluxDb pod to my machine? I do have access to see the pods but that is 
not enough.


I think that the only test data is from Python streaming tests, which 
are not implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 
)


Additionally, there is an entire dimension missing: Runners. I'm 
assuming this data is for Dataflow?


-Max


Re: Monitoring performance for releases

2020-07-29 Thread Maximilian Michels

Thanks Michał!

It is a bit tricky to verify that the exported query works if I don't have 
access to the data stored in InfluxDB.


==> Could somebody give me permissions to max.mich...@gmail.com for 
apache-beam-testing such that I can set up SSH port-forwarding from the 
InfluxDB pod to my machine? I do have access to see the pods but that is 
not enough.
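
For reference, one way to get such a forwarding once the permissions are in 
place is kubectl's port-forward (rather than plain ssh); a minimal sketch, 
with the pod name and namespace as placeholders:

    import subprocess

    # Forward local port 8086 to the InfluxDB pod in the apache-beam-testing
    # cluster. Assumes kubectl is already authenticated against the cluster;
    # the pod name and namespace are placeholders.
    subprocess.run(
        ["kubectl", "port-forward", "--namespace", "default",
         "pod/influxdb-0", "8086:8086"],
        check=True,
    )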



I think that the only test data is from Python streaming tests, which are not 
implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python
 )


Additionally, there is an entire dimension missing: Runners. I'm 
assuming this data is for Dataflow?


-Max

On 29.07.20 11:55, Michał Walenia wrote:

Hi there,

 > Indeed the Python load test data appears to be missing:
 > 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python


I think that the only test data is from Python streaming tests, which 
are not implemented right now (check out 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python 
)


As for updating the dashboards, the manual for doing this is here: 
https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards


I hope this helps,

Michal

Re: Monitoring performance for releases

2020-07-29 Thread Michał Walenia
Hi there,

> Indeed the Python load test data appears to be missing:
>
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python

I think that the only test data is from Python streaming tests, which are
not implemented right now (check out
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=batch=python
)

As for updating the dashboards, the manual for doing this is here:
https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards

I hope this helps,

Michal

On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels  wrote:

> Indeed the Python load test data appears to be missing:
>
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python
>
> How do we typically modify the dashboards?
>
> It looks like we need to edit this json file:
>
> https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81
>
> I found some documentation on the deployment:
> https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring
>
>
> +1 for alerting or weekly emails including performance numbers for fixed
> intervals (1d, 1w, 1m, previous release).
>
> +1 for linking the dashboards in the release guide to allow for a
> comparison as part of the release process.
>
> As a first step, consolidating all the data seems like the most pressing
> problem to solve.
>
> @Kamil I could need some advice regarding how to proceed updating the
> dashboards.
>
> -Max
>

Re: Monitoring performance for releases

2020-07-27 Thread Maximilian Michels
Indeed the Python load test data appears to be missing: 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1=streaming=python


How do we typically modify the dashboards?

It looks like we need to edit this json file: 
https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81
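
If the missing runner dimension is what needs adding, one way to script such 
an edit is to load the dashboard JSON, append a templating variable, and 
write it back; a minimal sketch (the datasource name and tag query are 
assumptions about the Grafana/InfluxDB setup):

    import json

    path = (".test-infra/metrics/grafana/dashboards/"
            "perftests_metrics/ParDo_Load_Tests.json")
    with open(path) as f:
        dashboard = json.load(f)

    # Add a "runner" template variable; the datasource name and the tag key
    # used for runners are placeholders.
    dashboard.setdefault("templating", {}).setdefault("list", []).append({
        "name": "runner",
        "type": "query",
        "datasource": "BeamInfluxDB",
        "query": 'SHOW TAG VALUES WITH KEY = "runner"',
        "refresh": 1,
    })

    with open(path, "w") as f:
        json.dump(dashboard, f, indent=2)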


I found some documentation on the deployment: 
https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring



+1 for alerting or weekly emails including performance numbers for fixed 
intervals (1d, 1w, 1m, previous release).
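
A weekly report along these lines could boil down to aggregating each metric 
over the fixed intervals and mailing the summary; a minimal sketch of the 
aggregation step (the data model and numbers are made up for illustration):

    import statistics

    def interval_summary(runs, intervals=((1, "1d"), (7, "1w"), (30, "1m"))):
        """Summarize a metric over fixed trailing intervals. `runs` is a
        list of (age_in_days, runtime_seconds) tuples, newest first."""
        lines = []
        for days, label in intervals:
            window = [value for age, value in runs if age <= days]
            if window:
                lines.append("%s: median %.1fs over %d runs"
                             % (label, statistics.median(window), len(window)))
        return "\n".join(lines)

    # Example with made-up runtimes for a hypothetical GroupByKey test.
    runs = [(0, 95.0), (1, 96.2), (3, 94.8), (8, 90.1), (20, 88.7)]
    print(interval_summary(runs))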


+1 for linking the dashboards in the release guide to allow for a 
comparison as part of the release process.


As a first step, consolidating all the data seems like the most pressing 
problem to solve.


@Kamil I could use some advice on how to proceed with updating the 
dashboards.


-Max

On 22.07.20 20:20, Robert Bradshaw wrote:
On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise wrote:


It appears that there is coverage missing in the Grafana dashboards
(it could also be that I just don't find it).

For example:
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

The GBK and ParDo tests have a selection for {batch, streaming} and
SDK. No coverage for streaming and python? There is also no runner
option currently.

We have seen repeated regressions with streaming, Python, Flink. The
test has been contributed. It would be great if the results can be
covered as part of release verification.


Even better would be if we can use these dashboards (plus alerting or 
similar?) to find issues before release verification. It's much easier 
to fix things earlier.




Re: Monitoring performance for releases

2020-07-22 Thread Robert Bradshaw
On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise  wrote:

> It appears that there is coverage missing in the Grafana dashboards (it
> could also be that I just don't find it).
>
> For example:
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>
> The GBK and ParDo tests have a selection for {batch, streaming} and SDK.
> No coverage for streaming and python? There is also no runner option
> currently.
>
> We have seen repeated regressions with streaming, Python, Flink. The test
> has been contributed. It would be great if the results can be covered as
> part of release verification.
>

Even better would be if we can use these dashboards (plus alerting or
similar?) to find issues before release verification. It's much easier to
fix things earlier.



Re: Monitoring performance for releases

2020-07-21 Thread Thomas Weise
It appears that there is coverage missing in the Grafana dashboards (it
could also be that I just don't find it).

For example:
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

The GBK and ParDo tests have a selection for {batch, streaming} and SDK. No
coverage for streaming and python? There is also no runner option currently.

We have seen repeated regressions with streaming, Python, Flink. The test
has been contributed. It would be great if the results can be covered as
part of release verification.

Thomas



On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski <
kamil.wasilew...@polidea.com> wrote:

> The prerequisite is that we have all the stats in one place. They seem
>> to be scattered across http://metrics.beam.apache.org and
>> https://apache-beam-testing.appspot.com.
>>
>> Would it be possible to consolidate the two, i.e. use the Grafana-based
>> dashboard to load the legacy stats?
>
>
> I'm pretty sure that all dashboards have been moved to
> http://metrics.beam.apache.org. Let me know if I missed something during
> the migration.
>
> I think we should turn off https://apache-beam-testing.appspot.com in the
> near future. New Grafana-based dashboards have been working seamlessly for
> some time now and there's no point in maintaining the older solution. We'd
> also avoid ambiguity in where the stats should be looked for.
>
> Kamil
>

Re: Monitoring performance for releases

2020-07-21 Thread Kamil Wasilewski
>
> The prerequisite is that we have all the stats in one place. They seem
> to be scattered across http://metrics.beam.apache.org and
> https://apache-beam-testing.appspot.com.
>
> Would it be possible to consolidate the two, i.e. use the Grafana-based
> dashboard to load the legacy stats?


I'm pretty sure that all dashboards have been moved to
http://metrics.beam.apache.org. Let me know if I missed something during
the migration.

I think we should turn off https://apache-beam-testing.appspot.com in the
near future. New Grafana-based dashboards have been working seamlessly for
some time now and there's no point in maintaining the older solution. We'd
also avoid ambiguity about where to look for the stats.

Kamil

On Tue, Jul 21, 2020 at 4:17 PM Maximilian Michels  wrote:

> > It doesn't support https. I had to add an exception to the HTTPS
> Everywhere extension for "metrics.beam.apache.org".
>
> *facepalm* Thanks Udi! It would always hang on me because I use HTTPS
> Everywhere.
>
> > To be explicit, I am supporting the idea of reviewing the release guide
> but not changing the release process for the already in-progress release.
>
> I consider the release guide immutable for the process of a release.
> Thus, a change to the release guide can only affect new upcoming
> releases, not an in-process release.
>
> > +1 and I think we can also evaluate whether flaky tests should be
> reviewed as release blockers or not. Some flaky tests would be hiding real
> issues our users could face.
>
> Flaky tests are also worth to take into account when releasing, but a
> little harder to find because may just happen to pass during building
> the release. It is possible though if we strictly capture flaky tests
> via JIRA and mark them with the Fix Version for the release.
>
> > We keep accumulating dashboards and
> > tests that few people care about, so it is probably worth that we use
> > them or get a way to alert us of regressions during the release cycle
> > to catch this even before the RCs.
>
> +1 The release guide should be explicit about which performance test
> results to evaluate.
>
> The prerequisite is that we have all the stats in one place. They seem
> to be scattered across http://metrics.beam.apache.org and
> https://apache-beam-testing.appspot.com.
>
> Would it be possible to consolidate the two, i.e. use the Grafana-based
> dashboard to load the legacy stats?
>
> For the evaluation during the release process, I suggest to use a
> standardized set of performance tests for all runners, e.g.:
>
> - Nexmark
> - ParDo (Classic/Portable)
> - GroupByKey
> - IO
>
>
> -Max
>

Re: Monitoring performance for releases

2020-07-21 Thread Maximilian Michels

It doesn't support https. I had to add an exception to the HTTPS Everywhere extension for 
"metrics.beam.apache.org".


*facepalm* Thanks Udi! It would always hang on me because I use HTTPS 
Everywhere.



To be explicit, I am supporting the idea of reviewing the release guide but not 
changing the release process for the already in-progress release.


I consider the release guide immutable for the process of a release. 
Thus, a change to the release guide can only affect new upcoming 
releases, not an in-process release.



+1 and I think we can also evaluate whether flaky tests should be reviewed as 
release blockers or not. Some flaky tests would be hiding real issues our users 
could face.


Flaky tests are also worth taking into account when releasing, but they are 
a little harder to find because they may just happen to pass while building 
the release. It is possible though if we strictly capture flaky tests 
via JIRA and mark them with the Fix Version for the release.



We keep accumulating dashboards and
tests that few people care about, so it is probably worth that we use
them or get a way to alert us of regressions during the release cycle
to catch this even before the RCs.


+1 The release guide should be explicit about which performance test 
results to evaluate.


The prerequisite is that we have all the stats in one place. They seem 
to be scattered across http://metrics.beam.apache.org and 
https://apache-beam-testing.appspot.com.


Would it be possible to consolidate the two, i.e. use the Grafana-based 
dashboard to load the legacy stats?
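
One possible consolidation path would be to copy the legacy results out of 
BigQuery into InfluxDB so the Grafana dashboards can display them; a minimal 
sketch, where the table, column, and measurement names are pure placeholders:

    from google.cloud import bigquery
    from influxdb import InfluxDBClient

    # Read legacy results from BigQuery (placeholder table/columns) ...
    bq = bigquery.Client(project="apache-beam-testing")
    rows = bq.query(
        "SELECT timestamp, runtime_sec "
        "FROM `apache-beam-testing.legacy.pardo_results`"
    ).result()

    # ... and write them into the InfluxDB that backs the Grafana dashboards.
    influx = InfluxDBClient(host="localhost", port=8086,
                            database="beam_test_metrics")
    influx.write_points([
        {
            "measurement": "legacy_pardo",
            "time": row.timestamp.isoformat(),
            "fields": {"runtime_sec": float(row.runtime_sec)},
        }
        for row in rows
    ])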


For the evaluation during the release process, I suggest using a 
standardized set of performance tests for all runners, e.g.:


- Nexmark
- ParDo (Classic/Portable)
- GroupByKey
- IO


-Max

On 21.07.20 01:23, Ahmet Altay wrote:


On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía wrote:


+1

This is not in the release guide and we should probably re evaluate if
this should be a release blocking reason.
Of course exceptionally a performance regression could be motivated by
a correctness fix or a worth refactor, so we should consider this.


+1 and I think we can also evaluate whether flaky tests should be 
reviewed as release blockers or not. Some flaky tests would be hiding 
real issues our users could face.


To be explicit, I am supporting the idea of reviewing the release guide 
but not changing the release process for the already in-progress release.



We have been tracking and fixing performance regressions multiple
times found simply by checking the nexmark tests including on the
ongoing 2.23.0 release so value is there. Nexmark does not cover yet
python and portable runners so we are probably still missing many
issues and it is worth to work on this. In any case we should probably
decide what validations matter. We keep accumulating dashboards and
tests that few people care about, so it is probably worth that we use
them or get a way to alert us of regressions during the release cycle
to catch this even before the RCs.


I agree. And if we cannot use dashboards/tests in a meaningful way, IMO 
we can remove them. There is not much value to maintain them if they do 
not provide important signals.



On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri <eh...@google.com> wrote:
 >
 > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels <m...@apache.org> wrote:
 >>
 >> Not yet, I just learned about the migration to a new frontend,
including
 >> a new backend (InfluxDB instead of BigQuery).
 >>
 >> >  - Are the metrics available on metrics.beam.apache.org?
 >>
 >> Is http://metrics.beam.apache.org online? I was never able to
access it.
 >
 >
 > It doesn't support https. I had to add an exception to the HTTPS
Everywhere extension for "metrics.beam.apache.org
".
 >
 >>
 >>
 >> >  - What is the feature delta between using metrics.beam.apache.org
 >> > (much better UI) and using apache-beam-testing.appspot.com?
 >>
 >> AFAIK it is an ongoing migration and the delta appears to be high.
 >>
 >> >  - Can we notice regressions faster than release cadence?
 >>
 >> Absolutely! A report with the latest numbers including
statistics about
 >> the growth of metrics would be useful.
 >>
 >> >  - Can we get automated alerts?
 >>
 >> I think we could setup a Jenkins job to do this.
 >>
 >> -Max
 >>
 >> On 09.07.20 20:26, Kenneth Knowles wrote:
 >> > Questions:
 >> >
 >> >   - Are the metrics available on metrics.beam.apache.org?
 >> >   - What is the feature delta between using

Re: Monitoring performance for releases

2020-07-20 Thread Ahmet Altay
On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía  wrote:

> +1
>
> This is not in the release guide and we should probably re evaluate if
> this should be a release blocking reason.
> Of course exceptionally a performance regression could be motivated by
> a correctness fix or a worth refactor, so we should consider this.
>

+1 and I think we can also evaluate whether flaky tests should be reviewed
as release blockers or not. Some flaky tests would be hiding real issues
our users could face.

To be explicit, I am supporting the idea of reviewing the release guide but
not changing the release process for the already in-progress release.


>
> We have been tracking and fixing performance regressions multiple
> times found simply by checking the nexmark tests including on the
> ongoing 2.23.0 release so value is there. Nexmark does not cover yet
> python and portable runners so we are probably still missing many
> issues and it is worth to work on this. In any case we should probably
> decide what validations matter. We keep accumulating dashboards and
> tests that few people care about, so it is probably worth that we use
> them or get a way to alert us of regressions during the release cycle
> to catch this even before the RCs.
>

I agree. And if we cannot use dashboards/tests in a meaningful way, IMO we
can remove them. There is not much value to maintain them if they do not
provide important signals.


> On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri  wrote:
> >
> > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels 
> wrote:
> >>
> >> Not yet, I just learned about the migration to a new frontend, including
> >> a new backend (InfluxDB instead of BigQuery).
> >>
> >> >  - Are the metrics available on metrics.beam.apache.org?
> >>
> >> Is http://metrics.beam.apache.org online? I was never able to access
> it.
> >
> >
> > It doesn't support https. I had to add an exception to the HTTPS
> Everywhere extension for "metrics.beam.apache.org".
> >
> >>
> >>
> >> >  - What is the feature delta between usinig metrics.beam.apache.org
> (much better UI) and using apache-beam-testing.appspot.com?
> >>
> >> AFAIK it is an ongoing migration and the delta appears to be high.
> >>
> >> >  - Can we notice regressions faster than release cadence?
> >>
> >> Absolutely! A report with the latest numbers including statistics about
> >> the growth of metrics would be useful.
> >>
> >> >  - Can we get automated alerts?
> >>
> >> I think we could setup a Jenkins job to do this.
> >>
> >> -Max
> >>
> >> On 09.07.20 20:26, Kenneth Knowles wrote:
> >> > Questions:
> >> >
> >> >   - Are the metrics available on metrics.beam.apache.org?
> >> >   - What is the feature delta between using metrics.beam.apache.org
> >> > (much better UI) and using apache-beam-testing.appspot.com?
> >> >   - Can we notice regressions faster than release cadence?
> >> >   - Can we get automated alerts?
> >> >
> >> > Kenn
> >> >
> >> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels wrote:
> >> >
> >> > Hi,
> >> >
> >> > We recently saw an increase in latency migrating from Beam 2.18.0
> to
> >> > 2.21.0 (Python SDK with Flink Runner). This proofed very hard to
> debug
> >> > and it looks like each version in between the two versions let to
> >> > increased latency.
> >> >
> >> > This is not the first time we saw issues when migrating, another
> >> > time we
> >> > had a decline in checkpointing performance and thus added a
> >> > checkpointing test [1] and dashboard [2] (see checkpointing
> widget).
> >> >
> >> > That makes me wonder if we should monitor performance (throughput
> /
> >> > latency) for basic use cases as part of the release testing.
> Currently,
> >> > our release guide [3] mentions running examples but not
> evaluating the
> >> > performance. I think it would be good practice to check relevant
> charts
> >> > with performance measurements as part of of the release process.
> The
> >> > release guide should reflect that.
> >> >
> >> > WDYT?
> >> >
> >> > -Max
> >> >
> >> > PS: Of course, this requires tests and metrics to be available.
> This PR
> >> > adds latency measurements to the load tests [4].
> >> >
> >> >
> >> > [1] https://github.com/apache/beam/pull/11558
> >> > [2]
> >> >
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> >> > [3] https://beam.apache.org/contribute/release-guide/
> >> > [4] https://github.com/apache/beam/pull/12065
> >> >
>


Re: Monitoring performance for releases

2020-07-20 Thread Ismaël Mejía
+1

This is not in the release guide, and we should probably re-evaluate
whether this should be a release-blocking reason.
Of course, exceptionally, a performance regression could be motivated by
a correctness fix or a worthwhile refactoring, so we should consider this.

We have been tracking and fixing performance regressions multiple
times, found simply by checking the Nexmark tests, including on the
ongoing 2.23.0 release, so the value is there. Nexmark does not yet cover
Python and the portable runners, so we are probably still missing many
issues, and it is worth working on this. In any case, we should probably
decide which validations matter. We keep accumulating dashboards and
tests that few people care about, so we should either use them or get
a way to alert us of regressions during the release cycle,
to catch issues even before the RCs.

On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri  wrote:
>
> On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels  wrote:
>>
>> Not yet, I just learned about the migration to a new frontend, including
>> a new backend (InfluxDB instead of BigQuery).
>>
>> >  - Are the metrics available on metrics.beam.apache.org?
>>
>> Is http://metrics.beam.apache.org online? I was never able to access it.
>
>
> It doesn't support https. I had to add an exception to the HTTPS Everywhere 
> extension for "metrics.beam.apache.org".
>
>>
>>
>> >  - What is the feature delta between usinig metrics.beam.apache.org (much 
>> > better UI) and using apache-beam-testing.appspot.com?
>>
>> AFAIK it is an ongoing migration and the delta appears to be high.
>>
>> >  - Can we notice regressions faster than release cadence?
>>
>> Absolutely! A report with the latest numbers including statistics about
>> the growth of metrics would be useful.
>>
>> >  - Can we get automated alerts?
>>
>> I think we could setup a Jenkins job to do this.
>>
>> -Max
>>
>> On 09.07.20 20:26, Kenneth Knowles wrote:
>> > Questions:
>> >
>> >   - Are the metrics available on metrics.beam.apache.org?
>> >   - What is the feature delta between using metrics.beam.apache.org
>> > (much better UI) and using apache-beam-testing.appspot.com?
>> >   - Can we notice regressions faster than release cadence?
>> >   - Can we get automated alerts?
>> >
>> > Kenn
>> >
>> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels wrote:
>> >
>> > Hi,
>> >
>> > We recently saw an increase in latency migrating from Beam 2.18.0 to
>> > 2.21.0 (Python SDK with Flink Runner). This proofed very hard to debug
>> > and it looks like each version in between the two versions let to
>> > increased latency.
>> >
>> > This is not the first time we saw issues when migrating, another
>> > time we
>> > had a decline in checkpointing performance and thus added a
>> > checkpointing test [1] and dashboard [2] (see checkpointing widget).
>> >
>> > That makes me wonder if we should monitor performance (throughput /
>> > latency) for basic use cases as part of the release testing. Currently,
>> > our release guide [3] mentions running examples but not evaluating the
>> > performance. I think it would be good practice to check relevant charts
>> > with performance measurements as part of of the release process. The
>> > release guide should reflect that.
>> >
>> > WDYT?
>> >
>> > -Max
>> >
>> > PS: Of course, this requires tests and metrics to be available. This PR
>> > adds latency measurements to the load tests [4].
>> >
>> >
>> > [1] https://github.com/apache/beam/pull/11558
>> > [2]
>> > 
>> > https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>> > [3] https://beam.apache.org/contribute/release-guide/
>> > [4] https://github.com/apache/beam/pull/12065
>> >


Re: Monitoring performance for releases

2020-07-10 Thread Udi Meiri
On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels  wrote:

> Not yet, I just learned about the migration to a new frontend, including
> a new backend (InfluxDB instead of BigQuery).
>
> >  - Are the metrics available on metrics.beam.apache.org?
>
> Is http://metrics.beam.apache.org online? I was never able to access it.
>

It doesn't support https. I had to add an exception to the HTTPS Everywhere
extension for "metrics.beam.apache.org".


>
> >  - What is the feature delta between usinig metrics.beam.apache.org
> (much better UI) and using apache-beam-testing.appspot.com?
>
> AFAIK it is an ongoing migration and the delta appears to be high.
>
> >  - Can we notice regressions faster than release cadence?
>
> Absolutely! A report with the latest numbers including statistics about
> the growth of metrics would be useful.
>
> >  - Can we get automated alerts?
>
> I think we could setup a Jenkins job to do this.
>
> -Max
>
> On 09.07.20 20:26, Kenneth Knowles wrote:
> > Questions:
> >
> >   - Are the metrics available on metrics.beam.apache.org?
> >   - What is the feature delta between using metrics.beam.apache.org
> > (much better UI) and using apache-beam-testing.appspot.com?
> >   - Can we notice regressions faster than release cadence?
> >   - Can we get automated alerts?
> >
> > Kenn
> >
> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels wrote:
> >
> > Hi,
> >
> > We recently saw an increase in latency migrating from Beam 2.18.0 to
> > 2.21.0 (Python SDK with Flink Runner). This proofed very hard to
> debug
> > and it looks like each version in between the two versions let to
> > increased latency.
> >
> > This is not the first time we saw issues when migrating, another
> > time we
> > had a decline in checkpointing performance and thus added a
> > checkpointing test [1] and dashboard [2] (see checkpointing widget).
> >
> > That makes me wonder if we should monitor performance (throughput /
> > latency) for basic use cases as part of the release testing.
> Currently,
> > our release guide [3] mentions running examples but not evaluating
> the
> > performance. I think it would be good practice to check relevant
> charts
> > with performance measurements as part of of the release process. The
> > release guide should reflect that.
> >
> > WDYT?
> >
> > -Max
> >
> > PS: Of course, this requires tests and metrics to be available. This
> PR
> > adds latency measurements to the load tests [4].
> >
> >
> > [1] https://github.com/apache/beam/pull/11558
> > [2]
> >
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> > [3] https://beam.apache.org/contribute/release-guide/
> > [4] https://github.com/apache/beam/pull/12065
> >
>




Re: Monitoring performance for releases

2020-07-09 Thread Maximilian Michels
Not yet, I just learned about the migration to a new frontend, including 
a new backend (InfluxDB instead of BigQuery).



 - Are the metrics available on metrics.beam.apache.org?


Is http://metrics.beam.apache.org online? I was never able to access it.


 - What is the feature delta between using metrics.beam.apache.org (much 
better UI) and using apache-beam-testing.appspot.com?


AFAIK it is an ongoing migration and the delta appears to be high.


 - Can we notice regressions faster than release cadence?


Absolutely! A report with the latest numbers including statistics about 
the growth of metrics would be useful.



 - Can we get automated alerts?


I think we could set up a Jenkins job to do this.
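
To sketch what such a job could do (purely illustrative; the endpoint, 
database, measurement name, and threshold below are assumptions, not the 
actual schema of our InfluxDB instance), it might compare the latest 
benchmark result against a trailing baseline and fail the build on a 
large relative slowdown:

# Illustrative regression check (not an existing job). Endpoint, database,
# measurement name, and threshold are assumptions for the sake of example.
import sys
import requests

INFLUX_URL = "http://localhost:8086/query"  # assumed InfluxDB 1.x endpoint
DB = "beam_test_metrics"                    # assumed database name
MEASUREMENT = "nexmark_query_runtime_ms"    # assumed measurement name
THRESHOLD = 1.20                            # flag a >20% slowdown

def mean_runtime(influxql):
    """Runs an InfluxQL query and returns the single MEAN value, or None."""
    resp = requests.get(INFLUX_URL, params={"db": DB, "q": influxql})
    resp.raise_for_status()
    series = resp.json()["results"][0].get("series")
    return series[0]["values"][0][1] if series else None

latest = mean_runtime(
    'SELECT MEAN("value") FROM "%s" WHERE time > now() - 1d' % MEASUREMENT)
baseline = mean_runtime(
    'SELECT MEAN("value") FROM "%s" '
    'WHERE time > now() - 8d AND time < now() - 1d' % MEASUREMENT)

if latest is None or baseline is None:
    print("Not enough data to compare, skipping regression check.")
elif latest > baseline * THRESHOLD:
    print("Possible regression: %.0f ms vs. baseline %.0f ms"
          % (latest, baseline))
    sys.exit(1)  # a non-zero exit marks the Jenkins build as failed
else:
    print("OK: %.0f ms vs. baseline %.0f ms" % (latest, baseline))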

-Max

On 09.07.20 20:26, Kenneth Knowles wrote:

Questions:

  - Are the metrics available on metrics.beam.apache.org?
  - What is the feature delta between using metrics.beam.apache.org
    (much better UI) and using apache-beam-testing.appspot.com?

  - Can we notice regressions faster than release cadence?
  - Can we get automated alerts?

Kenn

On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels wrote:


Hi,

We recently saw an increase in latency migrating from Beam 2.18.0 to
2.21.0 (Python SDK with Flink Runner). This proofed very hard to debug
and it looks like each version in between the two versions let to
increased latency.

This is not the first time we saw issues when migrating, another
time we
had a decline in checkpointing performance and thus added a
checkpointing test [1] and dashboard [2] (see checkpointing widget).

That makes me wonder if we should monitor performance (throughput /
latency) for basic use cases as part of the release testing. Currently,
our release guide [3] mentions running examples but not evaluating the
performance. I think it would be good practice to check relevant charts
with performance measurements as part of of the release process. The
release guide should reflect that.

WDYT?

-Max

PS: Of course, this requires tests and metrics to be available. This PR
adds latency measurements to the load tests [4].


[1] https://github.com/apache/beam/pull/11558
[2]
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
[3] https://beam.apache.org/contribute/release-guide/
[4] https://github.com/apache/beam/pull/12065



Re: Monitoring performance for releases

2020-07-09 Thread Kenneth Knowles
Questions:

 - Are the metrics available on metrics.beam.apache.org?
 - What is the feature delta between using metrics.beam.apache.org (much
better UI) and using apache-beam-testing.appspot.com?
 - Can we notice regressions faster than release cadence?
 - Can we get automated alerts?

Kenn

On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels  wrote:

> Hi,
>
> We recently saw an increase in latency migrating from Beam 2.18.0 to
> 2.21.0 (Python SDK with Flink Runner). This proofed very hard to debug
> and it looks like each version in between the two versions let to
> increased latency.
>
> This is not the first time we saw issues when migrating, another time we
> had a decline in checkpointing performance and thus added a
> checkpointing test [1] and dashboard [2] (see checkpointing widget).
>
> That makes me wonder if we should monitor performance (throughput /
> latency) for basic use cases as part of the release testing. Currently,
> our release guide [3] mentions running examples but not evaluating the
> performance. I think it would be good practice to check relevant charts
> with performance measurements as part of of the release process. The
> release guide should reflect that.
>
> WDYT?
>
> -Max
>
> PS: Of course, this requires tests and metrics to be available. This PR
> adds latency measurements to the load tests [4].
>
>
> [1] https://github.com/apache/beam/pull/11558
> [2]
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> [3] https://beam.apache.org/contribute/release-guide/
> [4] https://github.com/apache/beam/pull/12065
>


Monitoring performance for releases

2020-07-09 Thread Maximilian Michels

Hi,

We recently saw an increase in latency migrating from Beam 2.18.0 to 
2.21.0 (Python SDK with Flink Runner). This proved very hard to debug, 
and it looks like each version in between the two led to increased 
latency.


This is not the first time we saw issues when migrating, another time we 
had a decline in checkpointing performance and thus added a 
checkpointing test [1] and dashboard [2] (see checkpointing widget).


That makes me wonder if we should monitor performance (throughput / 
latency) for basic use cases as part of the release testing. Currently, 
our release guide [3] mentions running examples but not evaluating 
performance. I think it would be good practice to check relevant charts 
with performance measurements as part of the release process. The 
release guide should reflect that.


WDYT?

-Max

PS: Of course, this requires tests and metrics to be available. This PR 
adds latency measurements to the load tests [4].



[1] https://github.com/apache/beam/pull/11558
[2] 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

[3] https://beam.apache.org/contribute/release-guide/
[4] https://github.com/apache/beam/pull/12065
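
For illustration, a rough sketch of what such a latency measurement could 
look like in a Python load test (assuming elements carry an event-time 
timestamp; the metric namespace and name are made up here, and [4] may 
implement this differently):

# Rough sketch only: records per-element latency as a Beam distribution metric.
import time

import apache_beam as beam
from apache_beam.metrics.metric import Metrics

class MeasureLatency(beam.DoFn):
  """Records the delta between processing time and each element's event time."""

  def __init__(self):
    # Namespace and name are illustrative; the load tests may use different ones.
    self.latency_ms = Metrics.distribution('loadtest', 'latency_ms')

  def process(self, element, timestamp=beam.DoFn.TimestampParam):
    now_ms = time.time() * 1000.0
    event_ms = timestamp.micros / 1000.0
    self.latency_ms.update(int(now_ms - event_ms))
    yield element

Applied right after the source, e.g. "... | beam.ParDo(MeasureLatency())", 
the resulting distribution (min/mean/max) could then be exported to the 
metrics backend by the load-test harness.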