[ https://issues.apache.org/jira/browse/FLINK-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392042#comment-17392042 ]
Zhilong Hong edited comment on FLINK-21920 at 8/3/21, 7:11 AM: --------------------------------------------------------------- The proposed solution will make {{ExecutionGraphToInputsLocationsRetrieverAdapter}} stateful, which is hard to maintain. However, the improvement of this optimization seems not obvious. For now we just close it. If later we got a better idea, we'd like to reopen it. was (Author: thesharing): The proposed solution will make {{ExecutionGraphToInputsLocationsRetrieverAdapter}} stateful, which is hard to maintain. However, the improvement of this optimization seems not obvious. > Optimize ExecutionGraphToInputsLocationsRetrieverAdapter > -------------------------------------------------------- > > Key: FLINK-21920 > URL: https://issues.apache.org/jira/browse/FLINK-21920 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination > Affects Versions: 1.13.0 > Reporter: Zhilong Hong > Priority: Major > Labels: auto-unassigned, pull-request-available > > Based on the scheduler benchmark introduced in FLINK-21731, we find that > there's a procedure related to {{DefaultScheduler#allocateSlots}} that has > O(N^2) complexity, which is: > {{ExecutionGraphToInputsLocationsRetrieverAdapter#getConsumedResultPartitionsProducers}}. > The original implementation is: > {code:java} > for all SchedulingExecutionVertex in DefaultScheduler: > for all ConsumedPartitionGroup of the SchedulingExecutionVertex: > for all IntermediateResultPartition in the ConsumedPartitionGroup: > get producer of the IntermediateResultPartition {code} > This procedure has O(N^2) complexity. > We can see that for each SchedulingExecutionVertex, the producers of its > ConsumedPartitionGroup is calculated separately. For the > SchedulingExecutionVertices in the same ConsumerVertexGroup, they have the > same ConsumedPartitionGroup. Therefore, we don't need to calculate the > producers over and over again. We can use a local cache to cache the > producers. This will decrease the complexity from O(N^2) to O(N). > -- This message was sent by Atlassian Jira (v8.3.4#803005)