On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise <[email protected]> wrote:
> It appears that there is coverage missing in the Grafana dashboards (it could also be that I just don't find it).
>
> For example:
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>
> The GBK and ParDo tests have a selection for {batch, streaming} and SDK. No coverage for streaming and Python? There is also no runner option currently.
>
> We have seen repeated regressions with streaming, Python, Flink. The test has been contributed. It would be great if the results could be covered as part of release verification.
>
> Even better would be if we could use these dashboards (plus alerting or similar?) to find issues before release verification. It's much easier to fix things earlier.
>
> Thomas
>
>
> On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski <[email protected]> wrote:
>
>>> The prerequisite is that we have all the stats in one place. They seem
>>> to be scattered across http://metrics.beam.apache.org and
>>> https://apache-beam-testing.appspot.com.
>>>
>>> Would it be possible to consolidate the two, i.e. use the Grafana-based
>>> dashboard to load the legacy stats?
>>
>> I'm pretty sure that all dashboards have been moved to http://metrics.beam.apache.org. Let me know if I missed something during the migration.
>>
>> I think we should turn off https://apache-beam-testing.appspot.com in the near future. The new Grafana-based dashboards have been working seamlessly for some time now and there's no point in maintaining the older solution. We'd also avoid ambiguity about where to look for the stats.
>>
>> Kamil
>>
>> On Tue, Jul 21, 2020 at 4:17 PM Maximilian Michels <[email protected]> wrote:
>>
>>> > It doesn't support https. I had to add an exception to the HTTPS Everywhere extension for "metrics.beam.apache.org".
>>>
>>> *facepalm* Thanks Udi! It would always hang on me because I use HTTPS Everywhere.
>>>
>>> > To be explicit, I am supporting the idea of reviewing the release guide but not changing the release process for the already in-progress release.
>>>
>>> I consider the release guide immutable for the process of a release. Thus, a change to the release guide can only affect new upcoming releases, not an in-progress release.
>>>
>>> > +1 and I think we can also evaluate whether flaky tests should be reviewed as release blockers or not. Some flaky tests would be hiding real issues our users could face.
>>>
>>> Flaky tests are also worth taking into account when releasing, but they are a little harder to find because they may just happen to pass while building the release. It is possible, though, if we strictly capture flaky tests via JIRA and mark them with the Fix Version for the release.
>>>
>>> > We keep accumulating dashboards and tests that few people care about, so it is probably worthwhile to use them or get a way to alert us of regressions during the release cycle to catch this even before the RCs.
>>>
>>> +1 The release guide should be explicit about which performance test results to evaluate.
>>>
>>> The prerequisite is that we have all the stats in one place. They seem to be scattered across http://metrics.beam.apache.org and https://apache-beam-testing.appspot.com.
>>>
>>> Would it be possible to consolidate the two, i.e. use the Grafana-based dashboard to load the legacy stats?
>>>
>>> For the evaluation during the release process, I suggest using a standardized set of performance tests for all runners, e.g.:
>>>
>>> - Nexmark
>>> - ParDo (Classic/Portable)
>>> - GroupByKey
>>> - IO
>>>
>>> -Max
>>>
>>> On 21.07.20 01:23, Ahmet Altay wrote:
>>> >
>>> > On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía <[email protected]> wrote:
>>> >
>>> >     +1
>>> >
>>> >     This is not in the release guide and we should probably re-evaluate
>>> >     whether this should be a release-blocking reason.
>>> >     Of course, exceptionally, a performance regression could be motivated
>>> >     by a correctness fix or a worthwhile refactor, so we should consider this.
>>> >
>>> > +1 and I think we can also evaluate whether flaky tests should be
>>> > reviewed as release blockers or not. Some flaky tests would be hiding
>>> > real issues our users could face.
>>> >
>>> > To be explicit, I am supporting the idea of reviewing the release guide
>>> > but not changing the release process for the already in-progress release.
>>> >
>>> >     We have been tracking and fixing performance regressions multiple
>>> >     times, found simply by checking the Nexmark tests, including on the
>>> >     ongoing 2.23.0 release, so the value is there. Nexmark does not yet
>>> >     cover Python and portable runners, so we are probably still missing
>>> >     many issues and it is worth working on this. In any case we should
>>> >     probably decide which validations matter. We keep accumulating
>>> >     dashboards and tests that few people care about, so it is probably
>>> >     worthwhile to use them or get a way to alert us of regressions during
>>> >     the release cycle to catch this even before the RCs.
>>> >
>>> > I agree. And if we cannot use dashboards/tests in a meaningful way, IMO
>>> > we can remove them. There is not much value in maintaining them if they
>>> > do not provide important signals.
>>> >
>>> > On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri <[email protected]> wrote:
>>> > >
>>> > > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels <[email protected]> wrote:
>>> > >>
>>> > >> Not yet, I just learned about the migration to a new frontend,
>>> > >> including a new backend (InfluxDB instead of BigQuery).
>>> > >>
>>> > >> > - Are the metrics available on metrics.beam.apache.org?
>>> > >>
>>> > >> Is http://metrics.beam.apache.org online? I was never able to access it.
>>> > >
>>> > > It doesn't support https. I had to add an exception to the HTTPS
>>> > > Everywhere extension for "metrics.beam.apache.org".
>>> > >
>>> > >> > - What is the feature delta between using metrics.beam.apache.org
>>> > >> >   (much better UI) and using apache-beam-testing.appspot.com?
>>> > >>
>>> > >> AFAIK it is an ongoing migration and the delta appears to be high.
>>> > >>
>>> > >> > - Can we notice regressions faster than release cadence?
>>> > >>
>>> > >> Absolutely! A report with the latest numbers, including statistics
>>> > >> about the growth of metrics, would be useful.
>>> > >>
>>> > >> > - Can we get automated alerts?
>>> > >>
>>> > >> I think we could set up a Jenkins job to do this.
>>> > >>
>>> > >> -Max
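As a rough sketch of the automated-alerts idea discussed above, the following Python script shows the kind of check such a Jenkins job could run against the InfluxDB backend behind the Grafana dashboards, failing the build when the newest run is noticeably slower than recent history. The connection details, measurement name, field name and the 20% threshold are placeholders, not the actual Beam metrics schema.

    """Toy regression check that a nightly Jenkins job could run.

    Fails (non-zero exit) if the newest data point of a load-test metric is
    noticeably slower than the trailing median of recent runs. All connection
    details, the measurement name and the field name below are placeholders,
    not the real Beam metrics schema.
    """
    import statistics
    import sys

    from influxdb import InfluxDBClient  # pip install influxdb

    client = InfluxDBClient(host="influxdb.example.org", port=8086,
                            database="beam_test_metrics")

    # Hypothetical series for a Python streaming ParDo load test on Flink.
    query = ('SELECT "runtime_ms" FROM "python_flink_streaming_pardo" '
             'ORDER BY time DESC LIMIT 20')
    points = [p["runtime_ms"] for p in client.query(query).get_points()]

    if len(points) < 5:
        sys.exit("Not enough data points to evaluate a trend.")

    latest, history = points[0], points[1:]
    baseline = statistics.median(history)

    if latest > 1.2 * baseline:  # flag anything >20% slower than the baseline
        # A non-zero exit fails the Jenkins build, which can then notify the list.
        sys.exit("Possible regression: latest=%.0f ms, median of previous runs=%.0f ms"
                 % (latest, baseline))

    print("OK: latest=%.0f ms vs. median %.0f ms" % (latest, baseline))

Hooked into a nightly or post-commit job, a failure like this would produce the usual build notification well before a release branch is cut.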
>>> > >>
>>> > >> On 09.07.20 20:26, Kenneth Knowles wrote:
>>> > >> > Questions:
>>> > >> >
>>> > >> > - Are the metrics available on metrics.beam.apache.org?
>>> > >> > - What is the feature delta between using metrics.beam.apache.org
>>> > >> >   (much better UI) and using apache-beam-testing.appspot.com?
>>> > >> > - Can we notice regressions faster than release cadence?
>>> > >> > - Can we get automated alerts?
>>> > >> >
>>> > >> > Kenn
>>> > >> >
>>> > >> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels <[email protected]> wrote:
>>> > >> >
>>> > >> >     Hi,
>>> > >> >
>>> > >> >     We recently saw an increase in latency when migrating from Beam
>>> > >> >     2.18.0 to 2.21.0 (Python SDK with the Flink Runner). This proved
>>> > >> >     very hard to debug, and it looks like each version in between the
>>> > >> >     two led to increased latency.
>>> > >> >
>>> > >> >     This is not the first time we saw issues when migrating; another
>>> > >> >     time we had a decline in checkpointing performance and thus added
>>> > >> >     a checkpointing test [1] and dashboard [2] (see the checkpointing
>>> > >> >     widget).
>>> > >> >
>>> > >> >     That makes me wonder if we should monitor performance
>>> > >> >     (throughput / latency) for basic use cases as part of the release
>>> > >> >     testing. Currently, our release guide [3] mentions running
>>> > >> >     examples but not evaluating performance. I think it would be good
>>> > >> >     practice to check the relevant charts with performance
>>> > >> >     measurements as part of the release process. The release guide
>>> > >> >     should reflect that.
>>> > >> >
>>> > >> >     WDYT?
>>> > >> >
>>> > >> >     -Max
>>> > >> >
>>> > >> >     PS: Of course, this requires tests and metrics to be available.
>>> > >> >     This PR adds latency measurements to the load tests [4].
>>> > >> >
>>> > >> >
>>> > >> >     [1] https://github.com/apache/beam/pull/11558
>>> > >> >     [2] https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>>> > >> >     [3] https://beam.apache.org/contribute/release-guide/
>>> > >> >     [4] https://github.com/apache/beam/pull/12065
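Regarding the latency measurements mentioned in the PS: purely as an illustration (not necessarily how PR [4] implements it), a DoFn in the Python SDK can report the event-time-to-processing-time lag through Beam's metrics API, which the runners then export to the metrics backend. The namespace, metric name and the MyParDoUnderTest stand-in below are made up for the example.

    import time

    import apache_beam as beam
    from apache_beam.metrics import Metrics


    class MeasureLatency(beam.DoFn):
        """Records event-time-to-processing-time lag as a metric distribution."""

        def __init__(self, namespace="loadtest"):
            # Namespace and metric name are illustrative, not the ones used by the load tests.
            self._latency_ms = Metrics.distribution(namespace, "latency_ms")

        def process(self, element, timestamp=beam.DoFn.TimestampParam):
            # Gap between the wall clock now and the element's event timestamp.
            lag_ms = time.time() * 1000.0 - timestamp.micros / 1000.0
            self._latency_ms.update(int(lag_ms))
            yield element


    # Example wiring (MyParDoUnderTest is a stand-in for the transform being measured):
    #   events | "ParDo under test" >> beam.ParDo(MyParDoUnderTest())
    #          | "Measure latency"  >> beam.ParDo(MeasureLatency())

Reporting the lag as a distribution keeps min/max/mean per run, which is the kind of series a dashboard panel or a release-time regression check would plot.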
