[jira] [Commented] (FLINK-29825) Improve benchmark stability

Dong Lin (Jira) Tue, 07 Feb 2023 17:49:06 -0800


    [ https://issues.apache.org/jira/browse/FLINK-29825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685641#comment-17685641 ]


Dong Lin commented on FLINK-29825:
----------------------------------

Thanks [~Yanfei Lei] for implementing and evaluating the algorithm!

[~pnowojski] Cool, I think we have agreed to make incremental improvements and
to use the algorithm proposed in the above doc to detect regressions in the
Flink benchmarks.

We probably still have different understandings of the pros and cons of these
alternative choices. It would be great if you or someone else could help
implement an alternative and show that it does better than the one we are
going to use. I probably won't have time to try the Hunter algorithm myself
in the near future.




> Improve benchmark stability
> ---------------------------
>
>                 Key: FLINK-29825
>                 URL: https://issues.apache.org/jira/browse/FLINK-29825
>             Project: Flink
>          Issue Type: Improvement
>          Components: Benchmarks
>    Affects Versions: 1.17.0
>            Reporter: Yanfei Lei
>            Assignee: Yanfei Lei
>            Priority: Minor
>
> Currently, regressions are detected by a simple script that may produce
> false positives and false negatives. This is especially true for benchmarks
> with small absolute values, where small value changes cause large percentage
> changes. See
> [here|https://github.com/apache/flink-benchmarks/blob/master/regression_report.py#L132-L136]
>  for details.
> In addition, all benchmarks are executed on one physical machine, so
> hardware issues may affect performance, as in "[FLINK-18614] Performance
> regression 2020.07.13".
>  
> This ticket aims to improve the precision and recall of the regression-check 
> script.
>  
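
To make the false positive / false negative concern in the description
concrete, here is a minimal sketch in Python. It is not the logic of
regression_report.py, not the algorithm from the doc mentioned above, and not
Hunter. The function names, the 10% threshold, the 3-sigma rule, and the
sample scores are all hypothetical, and it assumes that a higher score is
better.

```python
from statistics import median, stdev


def percent_change(baseline, current):
    """Relative change of `current` against `baseline`, in percent."""
    return (current - baseline) / baseline * 100.0


def fixed_threshold_regression(baseline, current, threshold_pct=10.0):
    """Hypothetical check: flag any drop larger than a fixed percentage."""
    return percent_change(baseline, current) < -threshold_pct


def noise_aware_regression(history, current, sigmas=3.0):
    """Hypothetical check: flag a drop only if it exceeds the benchmark's
    own run-to-run noise, measured as `sigmas` sample standard deviations
    below the median of recent results."""
    center = median(history)
    spread = stdev(history)
    return current < center - sigmas * spread


# Made-up scores for a benchmark with small absolute values and high noise.
small_history = [0.50, 0.55, 0.45, 0.52, 0.48]
small_current = 0.44

# Made-up scores for a benchmark with large absolute values and low noise.
large_history = [1000.0, 1002.0, 998.0, 1001.0, 999.0]
large_current = 950.0

for name, history, current in [
    ("small", small_history, small_current),
    ("large", large_history, large_current),
]:
    baseline = median(history)
    print(
        name,
        "fixed:", fixed_threshold_regression(baseline, current),
        "noise-aware:", noise_aware_regression(history, current),
    )
```

With these made-up numbers, the fixed percentage check flags the noisy small
benchmark, whose drop is within its usual variation, and misses the 5% drop
on the stable large benchmark. The noise-aware check does the opposite in
both cases.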



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
