Till Rohrmann created FLINK-9908:
------------------------------------

             Summary: Inconsistent state of SlotPool after ExecutionGraph cancellation
                 Key: FLINK-9908
                 URL: https://issues.apache.org/jira/browse/FLINK-9908
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.5.1, 1.6.0, 1.7.0
            Reporter: Till Rohrmann
            Assignee: Till Rohrmann
             Fix For: 1.5.2, 1.6.0, 1.7.0

If the {{ExecutionGraph}} is scheduled and cancelled concurrently, requested {{Slots}} may not be properly returned to the {{SlotPool}}. This leaves the {{SlotPool}} in an inconsistent state: it considers some of its slots still occupied even though the respective {{Execution}} has already been cancelled.

The problem seems to be caused by propagating the cancellation of the overall scheduling future to the individual scheduling futures. Once an individual scheduling future has been cancelled, the callback that produces its value, and that also handles the failure case, is never called.
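
The suspected mechanism can be reproduced with plain {{java.util.concurrent.CompletableFuture}}, without any Flink classes. The sketch below is only an illustration under stated assumptions: the class name, the {{occupiedSlots}} counter (a stand-in for the {{SlotPool}} bookkeeping) and the string "slot-1" are hypothetical and are not taken from the Flink code base. It shows that a function registered via {{handle}} is skipped if the future returned by {{handle}} was cancelled before the source future completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class CancelledSchedulingDemo {

    /**
     * Toy model (hypothetical, not Flink code): returns how many slots the
     * "pool" still counts as occupied after the slot request was fulfilled.
     */
    public static int occupiedSlotsAfterRun(boolean cancelSchedulingFuture) {
        // Stand-in for the SlotPool's view of occupied slots.
        AtomicInteger occupiedSlots = new AtomicInteger();

        // Pending slot request; the "pool" completes it later.
        CompletableFuture<String> slotRequest = new CompletableFuture<>();

        // Individual scheduling future. In this toy model the execution is
        // already cancelled, so the callback's only job is to return the slot.
        CompletableFuture<Void> scheduling = slotRequest.handle((slot, failure) -> {
            if (slot != null) {
                occupiedSlots.decrementAndGet(); // hand the slot back
            }
            return null;
        });

        if (cancelSchedulingFuture) {
            // Cancellation propagated from the overall scheduling future.
            scheduling.cancel(false);
        }

        // The pool fulfils the request and marks the slot as occupied.
        occupiedSlots.incrementAndGet();
        slotRequest.complete("slot-1");

        return occupiedSlots.get();
    }

    public static void main(String[] args) {
        System.out.println(occupiedSlotsAfterRun(true));  // prints 1: callback skipped, slot leaked
        System.out.println(occupiedSlotsAfterRun(false)); // prints 0: callback ran, slot returned
    }
}
```

If this is indeed what happens in the scheduling path, one possible direction would be to keep the slot-releasing logic on a future that is never cancelled from the outside, so that it always observes the outcome of the slot request.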