[ https://issues.apache.org/jira/browse/FLINK-29825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685367#comment-17685367 ]
Piotr Nowojski commented on FLINK-29825:
----------------------------------------

Thanks for the investigation [~Yanfei Lei].

As I said, I'm pretty sure we should be able to find a better, more sophisticated solution, but at the same time I cannot dive deeper into this myself. I would encourage one of you to take a look at the Hunter tool that I mentioned above, and maybe include it in the comparison. But if you are strongly inclined towards [~lindong]'s idea, I wouldn't block it, as it is most likely an improvement over what we have right now.

> Improve benchmark stability
> ---------------------------
>
>                 Key: FLINK-29825
>                 URL: https://issues.apache.org/jira/browse/FLINK-29825
>             Project: Flink
>          Issue Type: Improvement
>          Components: Benchmarks
>    Affects Versions: 1.17.0
>            Reporter: Yanfei Lei
>            Assignee: Yanfei Lei
>            Priority: Minor
>
> Currently, regressions are detected by a simple script that may produce false
> positives and false negatives, especially for benchmarks with small absolute
> values, where small value changes cause large percentage changes. See
> [here|https://github.com/apache/flink-benchmarks/blob/master/regression_report.py#L132-L136]
> for details.
> Also, all benchmarks are executed on one physical machine, so hardware issues
> may affect performance, as in "[FLINK-18614] Performance
> regression 2020.07.13".
>
> This ticket aims to improve the precision and recall of the regression-check
> script.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)