[ 
https://issues.apache.org/jira/browse/FLINK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091338#comment-17091338
 ] 

Zhu Zhu commented on FLINK-17330:
---------------------------------

Thanks for the explanation an d suggestion! This fix is not a blocker of other 
work items of FLIP-119. So I think we can focus on the required work items 
first and decide how we do this fix based on the time we have after the other 
items are done. 
If there is not enough time, we can just add cyclic dependency detection and 
throw exception to ask users to set edges to BLOCKING. 
If we have sufficient time, we can directly go to the solution to "merge 
regions with cyclic dependencies". I think it's possible since the feature 
freeze is highly possibly to be extended for 2 weeks.
WDYT?

> Avoid scheduling deadlocks caused by cyclic input dependencies between regions
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-17330
>                 URL: https://issues.apache.org/jira/browse/FLINK-17330
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.0
>            Reporter: Zhu Zhu
>            Priority: Major
>             Fix For: 1.11.0
>
>
> Imagine a job like this:
> A -- (pipelined FORWARD) --> B -- (blocking ALL-to-ALL) --> D
> A -- (pipelined FORWARD) --> C -- (pipelined FORWARD) --> D
> parallelism=2 for all vertices.
> We will have 2 execution pipelined regions:
> R1 = {A1, B1, C1, D1}
> R2 = {A2, B2, C2, D2}
> R1 has a cross-region input edge (B2->D1).
> R2 has a cross-region input edge (B1->D2).
> Scheduling deadlock will happen since we schedule a region only when all its 
> inputs are consumable (i.e. blocking partitions to be finished). This is 
> because R1 can be scheduled only if R2 finishes, while R2 can be scheduled 
> only if R1 finishes.
> To avoid this, one solution is to force a logical pipelined region with 
> intra-region ALL-to-ALL blocking edges to form one only execution pipelined 
> region, so that there would not be cyclic input dependency between regions.
> Besides that, we should also pay attention to avoid cyclic cross-region 
> POINTWISE blocking edges. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to