[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-09-09 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-19142:

Labels: pull-request-available  (was: auto-unassigned pull-request-available)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The ticket originates from [this PR 
> discussion|https://github.com/apache/flink/pull/13181#discussion_r481087221].
> PreviousAllocationSlotSelectionStrategy uses the previous AllocationIDs to 
> schedule subtasks into the slots where they were executed before a failover. 
> If a subtask's previous slot (AllocationID) is not available, we do not want 
> it to take the previous slots (AllocationIDs) of other subtasks.
> The MergingSharedSlotProfileRetriever gets the previous AllocationIDs from 
> SlotSharingExecutionSlotAllocator, but only those of the current bulk. 
> The previous AllocationIDs of other bulks stay unknown. Therefore, the 
> current bulk can potentially hijack the previous slots of the preceding 
> bulks. On the other hand, the previous AllocationIDs of other tasks should be 
> taken if those tasks are not going to run at the same time, e.g. when there 
> are not enough resources after failover or the other bulks are done.
> One way to do it may be to give MergingSharedSlotProfileRetriever all 
> previous AllocationIDs of the bulks which are going to run at the same time.
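
To make the hijacking scenario above concrete, here is a minimal sketch in plain Java. It is not Flink code: the class SlotSelectionSketch, the method selectSlot, the string slot ids and the "reserved by others" set are all hypothetical names chosen for illustration, and returning an empty result when only reserved slots remain is an assumption of the sketch, not behaviour described in the ticket.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration only; none of these names are Flink's actual API. */
final class SlotSelectionSketch {

    /**
     * Picks a slot for one subtask after a failover.
     *
     * @param available free slots, identified by their allocation id
     * @param ownPrevious the slot this subtask used before the failover, or null
     * @param reservedByOthers previous slots of subtasks expected to run at the same time
     */
    static Optional<String> selectSlot(
            Set<String> available, String ownPrevious, Set<String> reservedByOthers) {
        // 1. Prefer the subtask's own previous slot if it is still free.
        if (ownPrevious != null && available.contains(ownPrevious)) {
            return Optional.of(ownPrevious);
        }
        // 2. Otherwise take a free slot that no concurrently running subtask wants back.
        for (String slot : available) {
            if (!reservedByOthers.contains(slot)) {
                return Optional.of(slot);
            }
        }
        // 3. Assumption of this sketch: give up instead of hijacking a reserved slot.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order, so the outcome is deterministic.
        Set<String> available = new LinkedHashSet<>(List.of("slot-A", "slot-B"));

        // The other bulk's previous slot is unknown: slot-A gets hijacked.
        System.out.println(selectSlot(available, "slot-C", Set.of()));

        // The other bulk's previous slot is known: slot-A is left alone.
        System.out.println(selectSlot(available, "slot-C", Set.of("slot-A")));
    }
}
```

The second call corresponds to the proposal in the ticket: once the previous AllocationIDs of the bulks that run at the same time are passed in, the subtask falls back to a slot nobody else is waiting for.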



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-09-09 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-19142:

Fix Version/s: (was: 1.14.0)
   1.15.0

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-06-23 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19142:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 14 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it. If the "stale-assigned" label is not removed in 7 days, the 
issue will be automatically unassigned.


> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-06-08 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-19142:

Labels: pull-request-available  (was: pull-request-available stale-assigned)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-06-08 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19142:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 14 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it. If the "stale-assigned" label is not removed in 7 days, the 
issue will be automatically unassigned.


> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-25 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-19142:
--
Labels: pull-request-available  (was: pull-request-available stale-assigned)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-22 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19142:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 14 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it. If the "stale-assigned" label is not removed in 7 days, the 
issue will be automatically unassigned.


> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-05-06 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-19142:

Labels: pull-request-available  (was: auto-unassigned pull-request-available)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-04-29 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-19142:
-
Fix Version/s: (was: 1.13.0)
   1.14.0

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Major
> Labels: auto-unassigned, pull-request-available
> Fix For: 1.14.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-04-27 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19142:
---
Labels: auto-unassigned pull-request-available  (was: pull-request-available stale-assigned)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: auto-unassigned, pull-request-available
> Fix For: 1.13.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-04-16 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19142:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.13.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2021-03-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19142:
---
Labels: pull-request-available  (was: )

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2020-12-07 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-19142:
---
Fix Version/s: (was: 1.12.0)
   1.13.0

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Major
> Fix For: 1.13.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2020-09-04 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-19142:
--
Fix Version/s: 1.12.0

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Major
> Fix For: 1.12.0


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2020-09-04 Thread Andrey Zagrebin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Zagrebin updated FLINK-19142:

Affects Version/s: 1.12.0

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Major


[jira] [Updated] (FLINK-19142) Investigate slot hijacking from preceding pipelined regions after failover

2020-09-04 Thread Andrey Zagrebin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Zagrebin updated FLINK-19142:

Component/s: Runtime / Coordination

> Investigate slot hijacking from preceding pipelined regions after failover
> --
>
> Key: FLINK-19142
> URL: https://issues.apache.org/jira/browse/FLINK-19142
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Major