Hi everyone,

As discussed earlier, we planned to create a benchmark channel in the Apache Flink Slack[1], but the plan was shelved for a while[2]. I have now gone ahead with this work and created the #flink-dev-benchmarks channel for performance-regression notifications.
We have a regression report script[3] that runs daily; a notification is sent to the Slack channel when the last few benchmark results are significantly worse than the baseline. Note that regressions are detected by a simple script, which may produce false positives and false negatives. Also, all benchmarks are executed on a single physical machine[4] provided by Ververica (Alibaba)[5], so hardware issues can affect the results, as in "[FLINK-18614 <https://issues.apache.org/jira/browse/FLINK-18614>] Performance regression 2020.07.13"[6].

After the migration, we need a procedure for watching over the overall performance of the Flink code together: for example, when a regression occurs, someone needs to investigate the cause and resolve the problem. In the past, this procedure was maintained internally within Ververica, but we think making it public would benefit everyone. I volunteer to serve as one of the initial maintainers, and would be glad if more contributors joined me. I will also prepare some guidelines to help others get familiar with the workflow, and will start a new thread to discuss the workflow soon.

[1] https://www.mail-archive.com/dev@flink.apache.org/msg58666.html
[2] https://issues.apache.org/jira/browse/FLINK-28468
[3] https://github.com/apache/flink-benchmarks/blob/master/regression_report.py
[4] http://codespeed.dak8s.net:8080
[5] https://lists.apache.org/thread/jzljp4233799vwwqnr0vc9wgqs0xj1ro
[6] https://issues.apache.org/jira/browse/FLINK-18614
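P.S. For anyone curious what "significantly worse than the baseline" means in practice, here is a rough sketch of that kind of check. This is only an illustration, not the actual logic of regression_report.py, and the 5% threshold is an assumption for the example:

```python
from statistics import mean

def is_regression(baseline, recent, threshold=0.05):
    """Flag a regression when the mean of the last few benchmark
    scores drops more than `threshold` below the baseline.
    Assumes higher scores are better (e.g. throughput in ops/s)."""
    return mean(recent) < baseline * (1 - threshold)

# Baseline throughput 1000 ops/s; recent runs dipped ~9.5% -> flagged.
print(is_regression(1000, [900, 910, 905]))   # -> True
# Recent runs within the noise band -> not flagged.
print(is_regression(1000, [990, 1005, 998]))  # -> False
```

A threshold-based check like this is simple and fast, which is also why it can misfire in both directions, as noted above.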