azagrebin commented on a change in pull request #9250: 
[FLINK-13371][coordination] Prevent leaks of blocking partitions 
URL: https://github.com/apache/flink/pull/9250#discussion_r308678007
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java
 ##########
 @@ -1066,6 +1066,8 @@ void markFinished(Map<String, Accumulator<?, ?>> userAccumulators, IOMetrics met
                        else if (current == CANCELING) {
                                // we sent a cancel call, and the task manager finished before it arrived. We
                                // will never get a CANCELED call back from the job manager
+                               // release all partitions because partitions should only be kept if the execution reaches FINISHED
+                               sendReleaseIntermediateResultPartitionsRpcCall();
 
 Review comment:
   What about the other "not properly finished" branches that follow? Do they
not need release calls as well?
   
   Also, this is a bit misleading:
   ```
   At this point the PartitionTracker is not yet tracking these partitions 
(since we never officially reached a state FINISHED in the EG), hence the 
execution is sending these through separate RPC logic.
   ```
   From what I see, tracking already starts while the execution is being
deployed, in `Execution#registerProducedPartitions`, which is why we do:
   ```
   Additionally, the execution no longer issues release calls through the 
PartitionTracker if it reached a terminal state, but just removes the 
partitions from the tracker.
   ```
   However, no release is needed when the cancellation is confirmed normally by
the Task, since the Task performs the release internally (maybe simplify and
always send the release, as before?).
   
   At the same time, this change partially addresses:
   ```
   Note that a similar issue can occur for pipelined partitions that are 
buffered in the producers side before a consumer was actually scheduled.
   ```
   because the RPCs are sent for all partitions. Since this will no longer be
needed once the task state is coupled with consumer confirmation for pipelined
partitions, I would call `sendReleaseIntermediateResultPartitionsRpcCall` only
for pipelined partitions and use the partition tracker's "removeWithRelease"
for blocking ones.
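
To make the suggested split concrete, here is a minimal, self-contained sketch. It is not Flink code: `PartitionReleaseSketch`, `Partition`, `ReleasePlan` and `planRelease` are hypothetical names invented for illustration, and the two lists only stand in for "release via the execution's RPC call" (pipelined) and "release via the partition tracker" (blocking).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are actual Flink classes.
class PartitionReleaseSketch {

    enum PartitionType { BLOCKING, PIPELINED }

    // Minimal stand-in for a produced result partition.
    static final class Partition {
        final String id;
        final PartitionType type;

        Partition(String id, PartitionType type) {
            this.id = id;
            this.type = type;
        }
    }

    // Records which release path each partition would take.
    static final class ReleasePlan {
        // Would be released through the execution's separate RPC call.
        final List<String> viaRpc = new ArrayList<>();
        // Would be handed to the tracker, which releases and stops tracking.
        final List<String> viaTracker = new ArrayList<>();
    }

    // Routes pipelined partitions to the RPC path and blocking ones to the tracker.
    static ReleasePlan planRelease(List<Partition> produced) {
        ReleasePlan plan = new ReleasePlan();
        for (Partition p : produced) {
            if (p.type == PartitionType.PIPELINED) {
                plan.viaRpc.add(p.id);
            } else {
                plan.viaTracker.add(p.id);
            }
        }
        return plan;
    }
}
```

With this shape, the RPC path could later be dropped without touching how blocking partitions are released.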
   
   Also, the Jira issue title/description should then be adjusted if we do not
address the previous cancel/suspend of finished partitions here.
