[ https://issues.apache.org/jira/browse/FLINK-23593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392423#comment-17392423 ]
Piotr Nowojski edited comment on FLINK-23593 at 8/3/21, 4:59 PM:
-----------------------------------------------------------------

Because of FLINK-23392 and FLINK-23560, you cannot compare the results from July 15th to the current results. Also, because of various breaking changes like FLINK-23464, you cannot use the benchmarking code from current `flink-benchmarks` master to run old `flink` code. You have to use both the Flink and the flink-benchmarks code from the time of the regression.

I was able to reproduce the regression of FLINK-23392 quite easily using flink-benchmarks commit: d816a18

http://codespeed.dak8s.net:8080/job/flink-benchmark-request/345/ (last good, flink commit: 4a78097d038)
{noformat}
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedMultiInput","thrpt",1,30,1996.460479,28.904057,"ops/ms"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedOneInput","thrpt",1,30,2337.385239,43.234577,"ops/ms"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedTwoInput","thrpt",1,30,1946.457665,28.919437,"ops/ms"
{noformat}

http://codespeed.dak8s.net:8080/job/flink-benchmark-request/347/ (first bad, flink commit: d8b1a6fd368)
{noformat}
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedMultiInput","thrpt",1,30,1837.391829,23.495855,"ops/ms"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedOneInput","thrpt",1,30,2370.271382,37.804557,"ops/ms"
"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.sortedTwoInput","thrpt",1,30,1788.425393,22.619503,"ops/ms"
{noformat}

(Those numbers align perfectly with the performance regression visible in the web UI on 15.07.)

> Performance regression on 15.07.2021
> ------------------------------------
>
>                 Key: FLINK-23593
>                 URL: https://issues.apache.org/jira/browse/FLINK-23593
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream, Benchmarks
>    Affects Versions: 1.14.0
>            Reporter: Piotr Nowojski
>            Assignee: Timo Walther
>            Priority: Blocker
>             Fix For: 1.14.0
>
> http://codespeed.dak8s.net:8000/timeline/?ben=sortedMultiInput&env=2
> http://codespeed.dak8s.net:8000/timeline/?ben=sortedTwoInput&env=2
> {noformat}
> pnowojski@piotr-mbp: [~/flink - ((no branch, bisect started on pr/16589))] $ git ls f4afbf3e7de..eb8100f7afe
> eb8100f7afe [4 weeks ago] (pn/bad, bad, refs/bisect/bad) [FLINK-22017][coordination] Allow BLOCKING result partition to be individually consumable [Thesharing]
> d2005268b1e [4 weeks ago] (HEAD, pn/bisect-4, bisect-4) [FLINK-22017][coordination] Get the ConsumedPartitionGroup that IntermediateResultPartition and DefaultResultPartition belong to [Thesharing]
> d8b1a6fd368 [3 weeks ago] [FLINK-23372][streaming-java] Disable AllVerticesInSameSlotSharingGroupByDefault in batch mode [Timo Walther]
> 4a78097d038 [3 weeks ago] (pn/bisect-3, bisect-3, refs/bisect/good-4a78097d0385749daceafd8326930c8cc5f26f1a) [FLINK-21928][clients][runtime] Introduce static method constructors of DuplicateJobSubmissionException for better readability. [David Moravek]
> 172b9e32215 [3 weeks ago] [FLINK-21928][clients] JobManager failover should succeed, when trying to resubmit already terminated job in application mode. [David Moravek]
> f483008db86 [3 weeks ago] [FLINK-21928][core] Introduce org.apache.flink.util.concurrent.FutureUtils#handleException method, that allows future to recover from the specied exception. [David Moravek]
> d7ac08c2ac0 [3 weeks ago] (pn/bisect-2, bisect-2, refs/bisect/good-d7ac08c2ac06b9ff31707f3b8f43c07817814d4f) [FLINK-22843][docs-zh] Document and code are inconsistent [ZhiJie Yang]
> 16c3ea427df [3 weeks ago] [hotfix] Split the final checkpoint related tests to a separate test class. [Yun Gao]
> 31b3d37a22c [7 weeks ago] [FLINK-21089][runtime] Skip the execution of new sources if finished on restore [Yun Gao]
> 20fe062e1b5 [3 weeks ago] [FLINK-21089][runtime] Skip execution for the legacy source task if finished on restore [Yun Gao]
> 874c627114b [3 weeks ago] [FLINK-21089][runtime] Skip the lifecycle method of operators if finished on restore [Yun Gao]
> ceaf24b1d88 [3 weeks ago] (pn/bisect-1, bisect-1, refs/bisect/good-ceaf24b1d881c2345a43f305d40435519a09cec9) [hotfix] Fix isClosed() for operator wrapper and proxy operator close to the operator chain [Yun Gao]
> 41ea591a6db [3 weeks ago] [FLINK-22627][runtime] Remove unused slot request protocol [Yangze Guo]
> 489346b60f8 [3 months ago] [FLINK-22627][runtime] Remove PendingSlotRequest [Yangze Guo]
> 8ffb4d2af36 [3 months ago] [FLINK-22627][runtime] Remove TaskManagerSlot [Yangze Guo]
> 72073741588 [3 months ago] [FLINK-22627][runtime] Remove SlotManagerImpl and its related tests [Yangze Guo]
> bdb3b7541b3 [3 months ago] [hotfix][yarn] Remove unused internal options in YarnConfigOptionsInternal [Yangze Guo]
> a6a9b192eac [3 weeks ago] [FLINK-23201][streaming] Reset alignment only for the currently processed checkpoint [Anton Kalashnikov]
> b35701a35c7 [3 weeks ago] [FLINK-23201][streaming] Calculate checkpoint alignment time only for last started checkpoint [Anton Kalashnikov]
> 3abec22c536 [3 weeks ago] [FLINK-23107][table-runtime] Separate implementation of deduplicate rank from other rank functions [Shuo Cheng]
> 1a195f5cc59 [3 weeks ago] [FLINK-16093][docs-zh] Translate "System Functions" page of "Functions" into Chinese (#16348) [ZhiJie Yang]
> {noformat}
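
To put a number on the regression reported in the comment, the two result tables (last good 4a78097d038, first bad d8b1a6fd368) can be compared row by row. The following is only a rough sketch, not part of the flink-benchmarks tooling: the helper names `parse` and `compare` are made up for illustration, and treating a change as "within noise" when it is smaller than the sum of the two 99.9% score errors is a simplifying assumption, not a proper statistical test.

```python
import csv
import io

HEADER = '"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"\n'
PREFIX = '"org.apache.flink.benchmark.SortingBoundedInputBenchmarks.'

# Rows copied from the "last good" table (flink commit 4a78097d038).
LAST_GOOD = HEADER + (
    PREFIX + 'sortedMultiInput","thrpt",1,30,1996.460479,28.904057,"ops/ms"\n'
    + PREFIX + 'sortedOneInput","thrpt",1,30,2337.385239,43.234577,"ops/ms"\n'
    + PREFIX + 'sortedTwoInput","thrpt",1,30,1946.457665,28.919437,"ops/ms"\n'
)

# Rows copied from the "first bad" table (flink commit d8b1a6fd368).
FIRST_BAD = HEADER + (
    PREFIX + 'sortedMultiInput","thrpt",1,30,1837.391829,23.495855,"ops/ms"\n'
    + PREFIX + 'sortedOneInput","thrpt",1,30,2370.271382,37.804557,"ops/ms"\n'
    + PREFIX + 'sortedTwoInput","thrpt",1,30,1788.425393,22.619503,"ops/ms"\n'
)


def parse(text):
    """Map short benchmark name -> (score, score_error) from CSV text."""
    results = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row["Benchmark"].rsplit(".", 1)[-1]
        results[name] = (float(row["Score"]), float(row["Score Error (99.9%)"]))
    return results


def compare(good, bad):
    """Map benchmark name -> (percent change, within_noise flag).

    within_noise is True when the absolute score change does not exceed the
    sum of both score errors (an assumed, conservative threshold).
    """
    report = {}
    for name, (good_score, good_err) in good.items():
        bad_score, bad_err = bad[name]
        pct = (bad_score - good_score) / good_score * 100.0
        within_noise = abs(bad_score - good_score) <= good_err + bad_err
        report[name] = (pct, within_noise)
    return report


report = compare(parse(LAST_GOOD), parse(FIRST_BAD))
for name, (pct, within_noise) in sorted(report.items()):
    note = "within error margins" if within_noise else "outside error margins"
    print(f"{name}: {pct:+.2f}% ({note})")
```

With the numbers above, this gives a throughput drop of roughly 8% for sortedMultiInput and sortedTwoInput, while the change for sortedOneInput (about +1.4%) stays inside the reported error margins.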