On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía <[email protected]> wrote:

> +1
>
> This is not in the release guide and we should probably re-evaluate
> whether this should be a release-blocking reason.
> Of course, exceptionally, a performance regression could be motivated by
> a correctness fix or a worthwhile refactor, so we should consider this.
>

+1 and I think we can also evaluate whether flaky tests should be reviewed
as release blockers or not. Some flaky tests would be hiding real issues
our users could face.

To be explicit, I am supporting the idea of reviewing the release guide but
not changing the release process for the already in-progress release.


>
> We have tracked and fixed performance regressions multiple times,
> found simply by checking the Nexmark tests, including on the
> ongoing 2.23.0 release, so the value is there. Nexmark does not yet
> cover Python and portable runners, so we are probably still missing many
> issues and it is worth working on this. In any case we should probably
> decide what validations matter. We keep accumulating dashboards and
> tests that few people care about, so it is probably worth either using
> them or finding a way to be alerted of regressions during the release
> cycle, to catch these even before the RCs.
>

I agree. And if we cannot use dashboards/tests in a meaningful way, IMO we
can remove them. There is not much value in maintaining them if they do not
provide important signals.


> On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri <[email protected]> wrote:
> >
> > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels <[email protected]> wrote:
> >>
> >> Not yet, I just learned about the migration to a new frontend, including
> >> a new backend (InfluxDB instead of BigQuery).
> >>
> >> >  - Are the metrics available on metrics.beam.apache.org?
> >>
> >> Is http://metrics.beam.apache.org online? I was never able to access it.
> >
> >
> > It doesn't support https. I had to add an exception to the HTTPS Everywhere extension for "metrics.beam.apache.org".
> >
> >>
> >>
> >> >  - What is the feature delta between using metrics.beam.apache.org (much better UI) and using apache-beam-testing.appspot.com?
> >>
> >> AFAIK it is an ongoing migration and the delta appears to be high.
> >>
> >> >  - Can we notice regressions faster than release cadence?
> >>
> >> Absolutely! A report with the latest numbers including statistics about
> >> the growth of metrics would be useful.
> >>
> >> >  - Can we get automated alerts?
> >>
> >> I think we could set up a Jenkins job to do this.
> >>
> >> -Max
> >>
> >> On 09.07.20 20:26, Kenneth Knowles wrote:
> >> > Questions:
> >> >
> >> >   - Are the metrics available on metrics.beam.apache.org
> >> > <http://metrics.beam.apache.org>?
> >> >   - What is the feature delta between using metrics.beam.apache.org
> >> > <http://metrics.beam.apache.org> (much better UI) and using
> >> > apache-beam-testing.appspot.com <http://apache-beam-testing.appspot.com>?
> >> >   - Can we notice regressions faster than release cadence?
> >> >   - Can we get automated alerts?
> >> >
> >> > Kenn
> >> >
> >> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels <[email protected]
> >> > <mailto:[email protected]>> wrote:
> >> >
> >> >     Hi,
> >> >
> >> >     We recently saw an increase in latency migrating from Beam 2.18.0 to
> >> >     2.21.0 (Python SDK with Flink Runner). This proved very hard to debug
> >> >     and it looks like each version in between the two versions led to
> >> >     increased latency.
> >> >
> >> >     This is not the first time we saw issues when migrating. Another
> >> >     time we had a decline in checkpointing performance and thus added a
> >> >     checkpointing test [1] and dashboard [2] (see checkpointing widget).
> >> >
> >> >     That makes me wonder if we should monitor performance (throughput /
> >> >     latency) for basic use cases as part of the release testing. Currently,
> >> >     our release guide [3] mentions running examples but not evaluating the
> >> >     performance. I think it would be good practice to check relevant charts
> >> >     with performance measurements as part of the release process. The
> >> >     release guide should reflect that.
> >> >
> >> >     WDYT?
> >> >
> >> >     -Max
> >> >
> >> >     PS: Of course, this requires tests and metrics to be available. This PR
> >> >     adds latency measurements to the load tests [4].
> >> >
> >> >
> >> >     [1] https://github.com/apache/beam/pull/11558
> >> >     [2] https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> >> >     [3] https://beam.apache.org/contribute/release-guide/
> >> >     [4] https://github.com/apache/beam/pull/12065
> >> >
>
