Thanks Michał!

It is a bit tricky to verify that the exported query works if I don't have access to the data stored in InfluxDB.

==> Could somebody grant permissions to [email protected] for apache-beam-testing so that I can set up SSH port forwarding from the InfluxDB pod to my machine? I do have access to view the pods, but that is not enough.
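
Once the forwarding is in place, I'd verify the query with something like
this (a rough sketch: the influxdb Python client, the local port 8086, and
the database/measurement names are all assumptions on my part):

    from influxdb import InfluxDBClient

    # Assumes a tunnel to the InfluxDB pod is already forwarding local
    # port 8086; database and measurement names below are hypothetical.
    client = InfluxDBClient(host="localhost", port=8086,
                            database="beam_test_metrics")
    result = client.query(
        'SELECT mean("value") FROM "python_batch_pardo_1" '
        'WHERE time > now() - 30d GROUP BY time(1d)')
    for point in result.get_points():
        print(point)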

> I think that the only missing test data is from the Python streaming
> tests, which are not implemented right now (check out
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=batch&var-sdk=python)

Additionally, there is an entire dimension missing: Runners. I'm assuming this data is for Dataflow?

-Max

On 29.07.20 11:55, Michał Walenia wrote:
Hi there,

> Indeed the Python load test data appears to be missing:
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python

I think that the only missing test data is from the Python streaming tests, which are not implemented right now (check out http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=batch&var-sdk=python).

As for updating the dashboards, the manual for doing this is here: https://cwiki.apache.org/confluence/display/BEAM/Community+Metrics#CommunityMetrics-UpdatingDashboards

I hope this helps,

Michal

On Mon, Jul 27, 2020 at 4:31 PM Maximilian Michels <[email protected]> wrote:

    Indeed the Python load test data appears to be missing:
    http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python

    How do we typically modify the dashboards?

    It looks like we need to edit this JSON file:
    https://github.com/apache/beam/blob/8d460db620d2ff1257b0e092218294df15b409a1/.test-infra/metrics/grafana/dashboards/perftests_metrics/ParDo_Load_Tests.json#L81

    I found some documentation on the deployment:
    https://cwiki.apache.org/confluence/display/BEAM/Test+Results+Monitoring
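
    To sanity-check such an edit locally, something along these lines could
    work (a rough sketch; the key names follow the generic Grafana dashboard
    JSON schema, which I'm assuming applies to this file):

        import json

        # Path from the repo link above; "templating"/"list" are standard
        # Grafana dashboard JSON keys (assumed, not verified for this file).
        path = (".test-infra/metrics/grafana/dashboards/"
                "perftests_metrics/ParDo_Load_Tests.json")
        with open(path) as f:
            dashboard = json.load(f)

        # Print the template variables, e.g. to check whether a "runner"
        # variable exists yet.
        for var in dashboard.get("templating", {}).get("list", []):
            print(var.get("name"), var.get("query"))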


    +1 for alerting or weekly emails including performance numbers for fixed
    intervals (1d, 1w, 1m, previous release).
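
    As a sketch of what such a report could compute (connection details, the
    measurement name, and the 20% threshold are all placeholders):

        from influxdb import InfluxDBClient

        client = InfluxDBClient(host="localhost", port=8086,
                                database="beam_test_metrics")

        def mean_runtime(interval):
            # Mean runtime over a trailing interval, e.g. "1d", "7d", "30d".
            rs = client.query(
                'SELECT mean("value") FROM "python_batch_pardo_1" '
                'WHERE time > now() - ' + interval)
            points = list(rs.get_points())
            return points[0]["mean"] if points else None

        daily, weekly = mean_runtime("1d"), mean_runtime("7d")
        if daily and weekly and daily > 1.2 * weekly:
            print("Possible regression: 1d mean %.1f vs 7d mean %.1f"
                  % (daily, weekly))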

    +1 for linking the dashboards in the release guide to allow for a
    comparison as part of the release process.

    As a first step, consolidating all the data seems like the most pressing
    problem to solve.

    @Kamil I could use some advice on how to proceed with updating the
    dashboards.

    -Max

    On 22.07.20 20:20, Robert Bradshaw wrote:
     > On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise <[email protected]> wrote:
     >
     >     It appears that there is coverage missing in the Grafana
     >     dashboards (it could also be that I just can't find it).
     >
     >     For example:
     >     https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
     >
     >     The GBK and ParDo tests have a selection for {batch, streaming}
     >     and SDK. No coverage for streaming and Python? There is also no
     >     runner option currently.
     >
     >     We have seen repeated regressions with streaming, Python, and
     >     Flink. The test has been contributed. It would be great if the
     >     results could be covered as part of release verification.
     >
     >
     > Even better would be if we could use these dashboards (plus alerting
     > or similar?) to find issues before release verification. It's much
     > easier to fix things earlier.
     >
     >
     >     Thomas
     >
     >
     >
     >     On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski
     >     <[email protected]> wrote:
     >
     >             The prerequisite is that we have all the stats in one
     >             place. They seem to be scattered across
     >             http://metrics.beam.apache.org and
     >             https://apache-beam-testing.appspot.com.
     >
     >             Would it be possible to consolidate the two, i.e. use
     >             the Grafana-based dashboard to load the legacy stats?
     >
     >
     >         I'm pretty sure that all dashboards have been moved to
     >         http://metrics.beam.apache.org. Let me know if I missed
     >         something during the migration.
     >
     >         I think we should turn off
     >         https://apache-beam-testing.appspot.com in the near future.
     >         The new Grafana-based dashboards have been working seamlessly
     >         for some time now, and there's no point in maintaining the
     >         older solution. We'd also avoid ambiguity about where the
     >         stats should be looked for.
     >
     >         Kamil
     >
     >         On Tue, Jul 21, 2020 at 4:17 PM Maximilian Michels
     >         <[email protected]> wrote:
     >
     >              > It doesn't support HTTPS. I had to add an exception to
     >              > the HTTPS Everywhere extension for
     >              > "metrics.beam.apache.org".
     >
     >             *facepalm* Thanks Udi! It would always hang on me
     >             because I use HTTPS Everywhere.
     >
     >              > To be explicit, I am supporting the idea of reviewing
     >              > the release guide but not changing the release process
     >              > for the already in-progress release.
     >
     >             I consider the release guide immutable for the process
     >             of a release. Thus, a change to the release guide can
     >             only affect new upcoming releases, not an in-progress
     >             release.
     >
     >              > +1 and I think we can also evaluate whether flaky tests
     >              > should be reviewed as release blockers or not. Some
     >              > flaky tests could be hiding real issues our users
     >              > might face.
     >
     >             Flaky tests are also worth taking into account when
     >             releasing, but they are a little harder to find because
     >             they may just happen to pass while building the release.
     >             It is possible though if we strictly capture flaky tests
     >             via JIRA and mark them with the Fix Version for the
     >             release.
     >
     >              > We keep accumulating dashboards and tests that few
     >              > people care about, so it is probably worth that we use
     >              > them or get a way to alert us of regressions during
     >              > the release cycle to catch this even before the RCs.
     >
     >             +1 The release guide should be explicit about which
     >             performance test results to evaluate.
     >
     >             The prerequisite is that we have all the stats in one
     >             place. They seem to be scattered across
     >             http://metrics.beam.apache.org and
     >             https://apache-beam-testing.appspot.com.
     >
     >             Would it be possible to consolidate the two, i.e. use
     >             the Grafana-based dashboard to load the legacy stats?
     >
     >             For the evaluation during the release process, I suggest
     >             using a standardized set of performance tests for all
     >             runners, e.g.:
     >
     >             - Nexmark
     >             - ParDo (Classic/Portable)
     >             - GroupByKey
     >             - IO
     >
     >
     >             -Max
     >
     >             On 21.07.20 01:23, Ahmet Altay wrote:
     >              >
     >              > On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía
     >              > <[email protected]> wrote:
     >              >
     >              >     +1
     >              >
     >              >     This is not in the release guide and we should
     >              >     probably re-evaluate whether this should be a
     >              >     release-blocking reason. Of course, exceptionally,
     >              >     a performance regression could be motivated by a
     >              >     correctness fix or a worthwhile refactor, so we
     >              >     should consider this.
     >              >
     >              >
     >              > +1 and I think we can also evaluate whether flaky tests
     >              > should be reviewed as release blockers or not. Some
     >              > flaky tests could be hiding real issues our users
     >              > might face.
     >              >
     >              > To be explicit, I am supporting the idea of reviewing
     >              > the release guide but not changing the release process
     >              > for the already in-progress release.
     >              >
     >              >
     >              >     We have been tracking and fixing performance
     >              >     regressions multiple times, found simply by
     >              >     checking the Nexmark tests, including on the
     >              >     ongoing 2.23.0 release, so the value is there.
     >              >     Nexmark does not yet cover Python and portable
     >              >     runners, so we are probably still missing many
     >              >     issues, and it is worth working on this. In any
     >              >     case, we should probably decide which validations
     >              >     matter. We keep accumulating dashboards and tests
     >              >     that few people care about, so it is probably
     >              >     worth making sure that we use them or get a way to
     >              >     alert us of regressions during the release cycle
     >              >     to catch this even before the RCs.
     >              >
     >              >
     >              > I agree. And if we cannot use dashboards/tests in a
     >              > meaningful way, IMO we can remove them. There is not
     >              > much value in maintaining them if they do not provide
     >              > important signals.
     >              >
     >              >
     >              >     On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri
     >              >     <[email protected]> wrote:
     >              >      >
     >              >      > On Thu, Jul 9, 2020 at 12:48 PM Maximilian
     >              >      > Michels <[email protected]> wrote:
     >              >      >>
     >              >      >> Not yet, I just learned about the migration to
     >              >      >> a new frontend, including a new backend
     >              >      >> (InfluxDB instead of BigQuery).
     >              >      >>
     >              >      >> >  - Are the metrics available on
     >              >      >> >    metrics.beam.apache.org?
     >              >      >>
     >              >      >> Is http://metrics.beam.apache.org online? I was
     >              >      >> never able to access it.
     >              >      >
     >              >      >
     >              >      > It doesn't support HTTPS. I had to add an
     >              >      > exception to the HTTPS Everywhere extension for
     >              >      > "metrics.beam.apache.org".
     >              >      >
     >              >      >>
     >              >      >>
     >              >      >> >  - What is the feature delta between using
     >              >      >> >    metrics.beam.apache.org (much better UI)
     >              >      >> >    and using apache-beam-testing.appspot.com?
     >              >      >>
     >              >      >> AFAIK it is an ongoing migration and the delta
     >              >      >> appears to be high.
     >              >      >>
     >              >      >> >  - Can we notice regressions faster than
     >              >      >> >    release cadence?
     >              >      >>
     >              >      >> Absolutely! A report with the latest numbers,
     >              >      >> including statistics about the growth of the
     >              >      >> metrics, would be useful.
     >              >      >>
     >              >      >> >  - Can we get automated alerts?
     >              >      >>
     >              >      >> I think we could set up a Jenkins job to do
     >              >      >> this.
     >              >      >>
     >              >      >> -Max
     >              >      >>
     >              >      >> On 09.07.20 20:26, Kenneth Knowles wrote:
     >              >      >> > Questions:
     >              >      >> >
     >              >      >> >   - Are the metrics available on
     >              >      >> >     metrics.beam.apache.org?
     >              >      >> >   - What is the feature delta between using
     >              >      >> >     metrics.beam.apache.org (much better UI)
     >              >      >> >     and using
     >              >      >> >     apache-beam-testing.appspot.com?
     >              >      >> >   - Can we notice regressions faster than
     >              >      >> >     release cadence?
     >              >      >> >   - Can we get automated alerts?
     >              >      >> >
     >              >      >> > Kenn
     >              >      >> >
     >              >      >> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian
     >              >      >> > Michels <[email protected]> wrote:
     >              >      >> >
     >              >      >> >     Hi,
     >              >      >> >
     >              >      >> >     We recently saw an increase in latency
     >              >      >> >     migrating from Beam 2.18.0 to 2.21.0
     >              >      >> >     (Python SDK with Flink Runner). This
     >              >      >> >     proved very hard to debug, and it looks
     >              >      >> >     like each version between the two led to
     >              >      >> >     increased latency.
     >              >      >> >
     >              >      >> >     This is not the first time we saw issues
     >              >      >> >     when migrating; another time we had a
     >              >      >> >     decline in checkpointing performance and
     >              >      >> >     thus added a checkpointing test [1] and
     >              >      >> >     dashboard [2] (see the checkpointing
     >              >      >> >     widget).
     >              >      >> >
     >              >      >> >     That makes me wonder if we should monitor
     >              >      >> >     performance (throughput / latency) for
     >              >      >> >     basic use cases as part of the release
     >              >      >> >     testing. Currently, our release guide [3]
     >              >      >> >     mentions running examples but not
     >              >      >> >     evaluating the performance. I think it
     >              >      >> >     would be good practice to check relevant
     >              >      >> >     charts with performance measurements as
     >              >      >> >     part of the release process. The release
     >              >      >> >     guide should reflect that.
     >              >      >> >
     >              >      >> >     WDYT?
     >              >      >> >
     >              >      >> >     -Max
     >              >      >> >
     >              >      >> >     PS: Of course, this requires tests and
     >              >      >> >     metrics to be available. This PR adds
     >              >      >> >     latency measurements to the load tests
     >              >      >> >     [4].
     >              >      >> >
     >              >      >> >
     >              >      >> >     [1] https://github.com/apache/beam/pull/11558
     >              >      >> >     [2] https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
     >              >      >> >     [3] https://beam.apache.org/contribute/release-guide/
     >              >      >> >     [4] https://github.com/apache/beam/pull/12065
     >              >      >> >
     >              >
     >



--

Michał Walenia
Polidea <https://www.polidea.com/> | Software Engineer

M: +48 791 432 002
E: [email protected]

Unique Tech
Check out our projects! <https://www.polidea.com/our-work>
