[jira] [Comment Edited] (FLINK-32306) Multiple batch scheduler performance regressions

Zhu Zhu (Jira) Sun, 11 Jun 2023 07:02:04 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-32306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731351#comment-17731351
 ]


Zhu Zhu edited comment on FLINK-32306 at 6/11/23 2:01 PM:
----------------------------------------------------------

Thanks for reporting this! [~martijnvisser]
This happens because we changed these benchmarks to measure 
AdaptiveBatchScheduler (previously DefaultScheduler), which is currently the 
recommended and default scheduler for batch jobs.
So this is not a blocker issue. Some existing problems are just newly 
exposed.

Here are some more details:
1. The regression of InitScheduling.BATCH is small (several milliseconds) and 
it is a one-time operation, so it can be ignored.
2. We are aware of the performance regression of 
schedulingDownstreamTasks.BATCH. FLINK-32288 was opened for it and a fix is 
in development.
3. The regression of startScheduling.BATCH was not expected. We will take a 
look. However, it may be due to some implementation differences between the two 
schedulers and is acceptable (the regression is less than 100 ms, and it is a 
one-time operation in the job lifecycle). cc [~xiasun]


was (Author: zhuzh):
Thanks for reporting this! [~martijnvisser]
This happens because we changed these benchmarks to measure 
AdaptiveBatchScheduler (previously DefaultScheduler), which is currently the 
recommended and default scheduler for batch jobs.
So this is not a blocker issue. Some existing problems are just newly 
exposed.

Here are some more details:
1. The regression of InitScheduling.BATCH is small (several milliseconds) and 
can be ignored.
2. We are aware of the performance regression of 
schedulingDownstreamTasks.BATCH. FLINK-32288 was opened for it and a fix is 
in development.
3. The regression of startScheduling.BATCH was not expected. We will take a 
look. However, it may be due to some implementation differences between the two 
schedulers and is acceptable (the regression is less than 100 ms, and it is a 
one-time operation in the job lifecycle). cc [~xiasun]

> Multiple batch scheduler performance regressions
> ------------------------------------------------
>
>                 Key: FLINK-32306
>                 URL: https://issues.apache.org/jira/browse/FLINK-32306
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>            Reporter: Martijn Visser
>            Priority: Blocker
>
> InitScheduling.BATCH
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=initSchedulingStrategy.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
> schedulingDownstreamTasks.BATCH 
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=schedulingDownstreamTasks.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
> startScheduling.BATCH
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
