On Tue, Jul 21, 2020 at 9:58 AM Thomas Weise <[email protected]> wrote:
> It appears that there is coverage missing in the Grafana dashboards (it could also be that I just don't find it).
>
> For example:
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>
> The GBK and ParDo tests have a selection for {batch, streaming} and SDK. No coverage for streaming and Python? There is also no runner option currently.
>
> We have seen repeated regressions with streaming, Python, Flink. The test has been contributed. It would be great if the results could be covered as part of release verification.
>
> Even better would be if we could use these dashboards (plus alerting or similar?) to find issues before release verification. It's much easier to fix things earlier.
>
> Thomas
>
>
> On Tue, Jul 21, 2020 at 7:55 AM Kamil Wasilewski <[email protected]> wrote:
>
>>> The prerequisite is that we have all the stats in one place. They seem
>>> to be scattered across http://metrics.beam.apache.org and
>>> https://apache-beam-testing.appspot.com.
>>>
>>> Would it be possible to consolidate the two, i.e. use the Grafana-based
>>> dashboard to load the legacy stats?
>>
>> I'm pretty sure that all dashboards have been moved to http://metrics.beam.apache.org. Let me know if I missed something during the migration.
>>
>> I think we should turn off https://apache-beam-testing.appspot.com in the near future. The new Grafana-based dashboards have been working seamlessly for some time now and there's no point in maintaining the older solution. We'd also avoid ambiguity about where to look for the stats.
>>
>> Kamil
>>
>> On Tue, Jul 21, 2020 at 4:17 PM Maximilian Michels <[email protected]> wrote:
>>
>>> > It doesn't support https. I had to add an exception to the HTTPS Everywhere extension for "metrics.beam.apache.org".
>>>
>>> *facepalm* Thanks Udi! It would always hang on me because I use HTTPS Everywhere.
>>>
>>> > To be explicit, I am supporting the idea of reviewing the release guide but not changing the release process for the already in-progress release.
>>>
>>> I consider the release guide immutable for the process of a release. Thus, a change to the release guide can only affect new upcoming releases, not an in-progress release.
>>>
>>> > +1 and I think we can also evaluate whether flaky tests should be reviewed as release blockers or not. Some flaky tests would be hiding real issues our users could face.
>>>
>>> Flaky tests are also worth taking into account when releasing, but they are a little harder to find because they may just happen to pass while building the release. It is possible, though, if we strictly capture flaky tests via JIRA and mark them with the Fix Version for the release.
>>>
>>> > We keep accumulating dashboards and tests that few people care about, so it is probably worthwhile to use them or get a way to alert us of regressions during the release cycle to catch this even before the RCs.
>>>
>>> +1 The release guide should be explicit about which performance test results to evaluate.
>>>
>>> The prerequisite is that we have all the stats in one place. They seem to be scattered across http://metrics.beam.apache.org and https://apache-beam-testing.appspot.com.
>>>
>>> Would it be possible to consolidate the two, i.e. use the Grafana-based dashboard to load the legacy stats?
>>>
>>> For the evaluation during the release process, I suggest using a standardized set of performance tests for all runners, e.g.:
>>>
>>> - Nexmark
>>> - ParDo (Classic/Portable)
>>> - GroupByKey
>>> - IO
>>>
>>> -Max
>>>
>>> On 21.07.20 01:23, Ahmet Altay wrote:
>>> >
>>> > On Mon, Jul 20, 2020 at 3:07 PM Ismaël Mejía <[email protected]> wrote:
>>> >
>>> >     +1
>>> >
>>> >     This is not in the release guide and we should probably re-evaluate
>>> >     whether this should be a release-blocking reason.
>>> >     Of course, exceptionally, a performance regression could be motivated
>>> >     by a correctness fix or a worthwhile refactor, so we should consider this.
>>> >
>>> > +1 and I think we can also evaluate whether flaky tests should be
>>> > reviewed as release blockers or not. Some flaky tests would be hiding
>>> > real issues our users could face.
>>> >
>>> > To be explicit, I am supporting the idea of reviewing the release guide
>>> > but not changing the release process for the already in-progress release.
>>> >
>>> >     We have been tracking and fixing performance regressions multiple
>>> >     times, found simply by checking the Nexmark tests, including on the
>>> >     ongoing 2.23.0 release, so the value is there. Nexmark does not yet
>>> >     cover Python and portable runners, so we are probably still missing
>>> >     many issues and it is worth working on this. In any case we should
>>> >     probably decide which validations matter. We keep accumulating
>>> >     dashboards and tests that few people care about, so it is probably
>>> >     worthwhile to use them or get a way to alert us of regressions during
>>> >     the release cycle to catch this even before the RCs.
>>> >
>>> > I agree. And if we cannot use dashboards/tests in a meaningful way, IMO
>>> > we can remove them. There is not much value in maintaining them if they
>>> > do not provide important signals.
>>> >
>>> > On Fri, Jul 10, 2020 at 9:30 PM Udi Meiri <[email protected]> wrote:
>>> > >
>>> > > On Thu, Jul 9, 2020 at 12:48 PM Maximilian Michels <[email protected]> wrote:
>>> > >>
>>> > >> Not yet, I just learned about the migration to a new frontend,
>>> > >> including a new backend (InfluxDB instead of BigQuery).
>>> > >>
>>> > >> > - Are the metrics available on metrics.beam.apache.org?
>>> > >>
>>> > >> Is http://metrics.beam.apache.org online? I was never able to access it.
>>> > >
>>> > > It doesn't support https. I had to add an exception to the HTTPS
>>> > > Everywhere extension for "metrics.beam.apache.org".
>>> > >
>>> > >> > - What is the feature delta between using metrics.beam.apache.org
>>> > >> >   (much better UI) and using apache-beam-testing.appspot.com?
>>> > >>
>>> > >> AFAIK it is an ongoing migration and the delta appears to be high.
>>> > >>
>>> > >> > - Can we notice regressions faster than release cadence?
>>> > >>
>>> > >> Absolutely! A report with the latest numbers, including statistics
>>> > >> about the growth of metrics, would be useful.
>>> > >>
>>> > >> > - Can we get automated alerts?
>>> > >>
>>> > >> I think we could set up a Jenkins job to do this.
>>> > >>
>>> > >> -Max
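As a rough sketch of the automated-alerts idea discussed above, the following Python script shows the kind of check such a Jenkins job could run against the InfluxDB backend behind the Grafana dashboards, failing the build when the newest run is noticeably slower than recent history. The connection details, measurement name, field name and the 20% threshold are placeholders, not the actual Beam metrics schema.

    """Toy regression check that a nightly Jenkins job could run.

    Fails (non-zero exit) if the newest data point of a load-test metric is
    noticeably slower than the trailing median of recent runs. All connection
    details, the measurement name and the field name below are placeholders,
    not the real Beam metrics schema.
    """
    import statistics
    import sys

    from influxdb import InfluxDBClient  # pip install influxdb

    client = InfluxDBClient(host="influxdb.example.org", port=8086,
                            database="beam_test_metrics")

    # Hypothetical series for a Python streaming ParDo load test on Flink.
    query = ('SELECT "runtime_ms" FROM "python_flink_streaming_pardo" '
             'ORDER BY time DESC LIMIT 20')
    points = [p["runtime_ms"] for p in client.query(query).get_points()]

    if len(points) < 5:
        sys.exit("Not enough data points to evaluate a trend.")

    latest, history = points[0], points[1:]
    baseline = statistics.median(history)

    if latest > 1.2 * baseline:  # flag anything >20% slower than the baseline
        # A non-zero exit fails the Jenkins build, which can then notify the list.
        sys.exit("Possible regression: latest=%.0f ms, median of previous runs=%.0f ms"
                 % (latest, baseline))

    print("OK: latest=%.0f ms vs. median %.0f ms" % (latest, baseline))

Hooked into a nightly or post-commit job, a failure like this would produce the usual build notification well before a release branch is cut.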
>>> > >>
>>> > >> On 09.07.20 20:26, Kenneth Knowles wrote:
>>> > >> > Questions:
>>> > >> >
>>> > >> > - Are the metrics available on metrics.beam.apache.org?
>>> > >> > - What is the feature delta between using metrics.beam.apache.org
>>> > >> >   (much better UI) and using apache-beam-testing.appspot.com?
>>> > >> > - Can we notice regressions faster than release cadence?
>>> > >> > - Can we get automated alerts?
>>> > >> >
>>> > >> > Kenn
>>> > >> >
>>> > >> > On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels <[email protected]> wrote:
>>> > >> >
>>> > >> >     Hi,
>>> > >> >
>>> > >> >     We recently saw an increase in latency when migrating from Beam
>>> > >> >     2.18.0 to 2.21.0 (Python SDK with the Flink Runner). This proved
>>> > >> >     very hard to debug, and it looks like each version in between the
>>> > >> >     two led to increased latency.
>>> > >> >
>>> > >> >     This is not the first time we saw issues when migrating; another
>>> > >> >     time we had a decline in checkpointing performance and thus added
>>> > >> >     a checkpointing test [1] and dashboard [2] (see the checkpointing
>>> > >> >     widget).
>>> > >> >
>>> > >> >     That makes me wonder if we should monitor performance
>>> > >> >     (throughput / latency) for basic use cases as part of the release
>>> > >> >     testing. Currently, our release guide [3] mentions running
>>> > >> >     examples but not evaluating performance. I think it would be good
>>> > >> >     practice to check the relevant charts with performance
>>> > >> >     measurements as part of the release process. The release guide
>>> > >> >     should reflect that.
>>> > >> >
>>> > >> >     WDYT?
>>> > >> >
>>> > >> >     -Max
>>> > >> >
>>> > >> >     PS: Of course, this requires tests and metrics to be available.
>>> > >> >     This PR adds latency measurements to the load tests [4].
>>> > >> >
>>> > >> >
>>> > >> >     [1] https://github.com/apache/beam/pull/11558
>>> > >> >     [2] https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
>>> > >> >     [3] https://beam.apache.org/contribute/release-guide/
>>> > >> >     [4] https://github.com/apache/beam/pull/12065
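Regarding the latency measurements mentioned in the PS: purely as an illustration (not necessarily how PR [4] implements it), a DoFn in the Python SDK can report the event-time-to-processing-time lag through Beam's metrics API, which the runners then export to the metrics backend. The namespace, metric name and the MyParDoUnderTest stand-in below are made up for the example.

    import time

    import apache_beam as beam
    from apache_beam.metrics import Metrics


    class MeasureLatency(beam.DoFn):
        """Records event-time-to-processing-time lag as a metric distribution."""

        def __init__(self, namespace="loadtest"):
            # Namespace and metric name are illustrative, not the ones used by the load tests.
            self._latency_ms = Metrics.distribution(namespace, "latency_ms")

        def process(self, element, timestamp=beam.DoFn.TimestampParam):
            # Gap between the wall clock now and the element's event timestamp.
            lag_ms = time.time() * 1000.0 - timestamp.micros / 1000.0
            self._latency_ms.update(int(lag_ms))
            yield element


    # Example wiring (MyParDoUnderTest is a stand-in for the transform being measured):
    #   events | "ParDo under test" >> beam.ParDo(MyParDoUnderTest())
    #          | "Measure latency"  >> beam.ParDo(MeasureLatency())

Reporting the lag as a distribution keeps min/max/mean per run, which is the kind of series a dashboard panel or a release-time regression check would plot.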
