[ https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296087#comment-17296087 ]
Till Rohrmann commented on FLINK-19142:
---------------------------------------

How would the system behave if we lose two slots (a and b) which have been used for bulk_1 and bulk_2? Now, with the decreased resources, bulk_1 and bulk_2 cannot be run concurrently, and we need to use slot b for bulk_1 to make it work. Would this also work with your proposal?

> Investigate slot hijacking from preceding pipelined regions after failover
> --------------------------------------------------------------------------
>
>                 Key: FLINK-19142
>                 URL: https://issues.apache.org/jira/browse/FLINK-19142
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Andrey Zagrebin
>            Assignee: Zhu Zhu
>            Priority: Major
>             Fix For: 1.13.0
>
>
> The ticket originates from [this PR
> discussion|https://github.com/apache/flink/pull/13181#discussion_r481087221].
> The previous AllocationIDs are used by
> PreviousAllocationSlotSelectionStrategy to schedule subtasks into the slots
> where they were executed before a failover. If the previous slot
> (AllocationID) is not available, we do not want subtasks to take the previous
> slots (AllocationIDs) of other subtasks.
> The MergingSharedSlotProfileRetriever gets the previous AllocationIDs from
> SlotSharingExecutionSlotAllocator, but only those of the current bulk.
> The previous AllocationIDs of other bulks stay unknown. Therefore, the
> current bulk can potentially hijack the previous slots of the preceding
> bulks. On the other hand, the previous AllocationIDs of other tasks should be
> taken if those tasks are not going to run at the same time, e.g. when there
> are not enough resources after failover or the other bulks are done.
> One way to do it may be to give MergingSharedSlotProfileRetriever all
> previous AllocationIDs of the bulks which are going to run at the same time.
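
To make the two cases in the description concrete, here is a minimal sketch of the selection rule it suggests. This is not Flink's API: the class `SlotSelectionSketch`, the method `selectSlot`, and the plain strings standing in for AllocationIDs are all hypothetical. The only assumption is the one stated in the ticket: a subtask prefers its own previous slot and otherwise avoids the previous slots of bulks that will run at the same time.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Flink.
final class SlotSelectionSketch {

    /**
     * Picks a slot for one subtask.
     *
     * @param freeSlots slots currently available (strings stand in for AllocationIDs)
     * @param previousAllocation slot the subtask used before the failover, or null
     * @param reservedByConcurrentBulks previous slots of bulks that will run at the same time
     */
    static Optional<String> selectSlot(
            Set<String> freeSlots,
            String previousAllocation,
            Set<String> reservedByConcurrentBulks) {
        // Prefer the slot the subtask ran in before the failover.
        if (previousAllocation != null && freeSlots.contains(previousAllocation)) {
            return Optional.of(previousAllocation);
        }
        // Otherwise take any free slot that a concurrently running bulk does not want back.
        // Sorting only makes the choice deterministic for this example.
        return freeSlots.stream()
                .filter(slot -> !reservedByConcurrentBulks.contains(slot))
                .sorted()
                .findFirst();
    }

    public static void main(String[] args) {
        Set<String> free = Set.of("b", "c");
        // bulk_2 runs concurrently and previously used b: bulk_1 must not hijack it.
        System.out.println(selectSlot(free, "a", Set.of("b")));
        // bulk_2 will not run concurrently: b is fair game for bulk_1.
        System.out.println(selectSlot(free, "a", Set.of()));
    }
}
```

In this sketch the caller decides which bulks count as concurrent. If resources shrink so that bulk_2 can no longer run alongside bulk_1, the caller would pass an empty reserved set and bulk_1 could take slot b, which is the case the question above asks about.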