[ https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085555#comment-16085555 ]

Sihua Zhou commented on FLINK-5747:
-----------------------------------

Hi [~StephanEwen], I found some problems with eager scheduling in Flink 1.3.x. I would appreciate it if you could find time to review what I posted in FLINK-7153 (https://issues.apache.org/jira/browse/FLINK-7153). I will close that issue if I am wrong.

Thanks,
Sihua Zhou

> Eager Scheduling should deploy all Tasks together
> -------------------------------------------------
>
>                 Key: FLINK-5747
>                 URL: https://issues.apache.org/jira/browse/FLINK-5747
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all
> vertices and their subtasks in topological order.
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures
> which may complete out of order. This results in out-of-order (not in
> topological order) scheduling of tasks, which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that
> the other tasks have resources as well leads to situations where many
> deploy/recovery cycles happen before enough resources are available to get
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then
> deploy once we know that all are available.
> As a follow-up, the same should be done per pipelined component in lazy batch
> scheduling as well. That way we get lazy scheduling across blocking
> boundaries, and bulk (gang) scheduling in pipelined subgroups.
> Note that this does not apply to fine-grained recovery efforts, where
> individual tasks request replacement resources.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
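
For readers following the issue, the "allocate all resources in one chunk, then deploy" idea from the description can be illustrated with plain Java futures. This is only a sketch, not Flink's scheduler code: the class GangSchedulingSketch, the method deployAllTogether, the slotProvider function, and the string-based tasks and slots are all hypothetical stand-ins, and task names are assumed to be unique. It requests a slot for every task up front, waits on CompletableFuture.allOf, and only then deploys in topological order, so futures that complete out of order do not affect the deployment order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class GangSchedulingSketch {

    /**
     * Requests a slot for every task, waits until ALL requests have completed,
     * and only then records a "deployment" per task in topological order.
     * Each entry of the returned list has the form "task@slot".
     */
    public static CompletableFuture<List<String>> deployAllTogether(
            List<String> tasksInTopologicalOrder,
            Function<String, CompletableFuture<String>> slotProvider) {

        // Issue every slot request up front (hypothetical provider).
        // LinkedHashMap preserves the topological order of the tasks.
        Map<String, CompletableFuture<String>> slotFutures = new LinkedHashMap<>();
        for (String task : tasksInTopologicalOrder) {
            slotFutures.put(task, slotProvider.apply(task));
        }

        // Completes only when every slot future has completed.
        CompletableFuture<Void> allSlots = CompletableFuture.allOf(
                slotFutures.values().toArray(new CompletableFuture<?>[0]));

        // Deploy in topological order, regardless of completion order.
        return allSlots.thenApply(ignored -> {
            List<String> deployed = new ArrayList<>();
            for (Map.Entry<String, CompletableFuture<String>> e : slotFutures.entrySet()) {
                // join() does not block here: the future is already complete.
                deployed.add(e.getKey() + "@" + e.getValue().join());
            }
            return deployed;
        });
    }

    public static void main(String[] args) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        CompletableFuture<List<String>> result = deployAllTogether(
                Arrays.asList("source", "map", "sink"),
                task -> pending.computeIfAbsent(task, t -> new CompletableFuture<>()));

        // Slots arrive out of topological order.
        pending.get("sink").complete("slot-3");
        pending.get("source").complete("slot-1");
        System.out.println(result.isDone()); // false: "map" has no slot yet

        pending.get("map").complete("slot-2");
        System.out.println(result.join()); // [source@slot-1, map@slot-2, sink@slot-3]
    }
}
```

If any slot request fails, the combined future completes exceptionally and nothing is deployed, which is the behavior that avoids the repeated deploy/recovery cycles described in the issue. A real scheduler would presumably also have to release the slots it had already acquired in that case; the sketch leaves that out.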