azagrebin opened a new pull request #8789: [FLINK-12890] Add partition 
lifecycle related Shuffle API
URL: https://github.com/apache/flink/pull/8789
 
 
   ## What is the purpose of the change
   
   At the moment we have `ShuffleEnvironment.releasePartitions`, which releases 
the resources a partition occupies locally. The JM can also trigger it by 
calling `TaskExecutorGateway.releasePartitions`.
   
   To support lifecycle management of partitions 
([FLINK-12069](https://issues.apache.org/jira/browse/FLINK-12069), relevant 
mostly for batch and blocking partitions), we need to extend the Shuffle API:
   
     - `ShuffleDescriptor.hasLocalResources` indicates that the partition 
occupies local resources on a TM and requires that TM to be running in order to 
consume the produced data (e.g. true for the default `NettyShuffleEnvironment` 
and false for externally stored partitions). If a partition needs external 
lifecycle management and is not released after the first consumption is done 
(`ResultPartitionDeploymentDescriptor.isReleasedOnConsumption`), then the RM/JM 
should keep the TMs which produce these partitions running for as long as the 
partitions still need to be consumed. The connection to these TMs should also 
be kept in order to issue the RPC call `TaskExecutorGateway.releasePartitions` 
once a partition is no longer needed.
   
     - `ShuffleMaster.removePartitionExternally`: the JM should call this 
whenever a partition no longer needs to be consumed. The call releases any 
partition resources occupied outside of the TM and should not depend on 
`ShuffleDescriptor.hasLocalResources`.
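
   The intended interplay of the two methods can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code of this PR: the nested interfaces are simplified stand-ins that reuse the names from the description above, their signatures are assumptions (in particular the parameter of `releasePartitions`), and `mustKeepProducerRunning` and `releasePartition` are hypothetical helpers.

```java
import java.util.Collections;
import java.util.List;

public class PartitionLifecycleSketch {

    /** Simplified stand-in for ShuffleDescriptor; only the new method is modelled. */
    public interface ShuffleDescriptor {
        boolean hasLocalResources();
    }

    /** Simplified stand-in for ShuffleMaster; only the new method is modelled. */
    public interface ShuffleMaster {
        void removePartitionExternally(ShuffleDescriptor descriptor);
    }

    /** Simplified stand-in for the TM gateway; the parameter type is an assumption. */
    public interface TaskExecutorGateway {
        void releasePartitions(List<ShuffleDescriptor> descriptors);
    }

    /**
     * Hypothetical JM-side check: the producing TM has to stay alive only if the
     * partition occupies local resources and is not released on first consumption.
     */
    public static boolean mustKeepProducerRunning(
            ShuffleDescriptor descriptor, boolean releasedOnConsumption) {
        return descriptor.hasLocalResources() && !releasedOnConsumption;
    }

    /**
     * Hypothetical JM-side release of a partition that is no longer needed.
     * Local resources are released via the TM; the external release is issued
     * regardless of hasLocalResources.
     */
    public static void releasePartition(
            ShuffleDescriptor descriptor,
            ShuffleMaster shuffleMaster,
            TaskExecutorGateway producer) {
        if (descriptor.hasLocalResources()) {
            producer.releasePartitions(Collections.singletonList(descriptor));
        }
        shuffleMaster.removePartitionExternally(descriptor);
    }

    public static void main(String[] args) {
        ShuffleDescriptor nettyLike = () -> true;   // occupies TM-local resources
        ShuffleDescriptor external = () -> false;   // stored outside of the TM

        System.out.println(mustKeepProducerRunning(nettyLike, false));
        System.out.println(mustKeepProducerRunning(external, false));

        releasePartition(
                nettyLike,
                d -> System.out.println("external release requested"),
                ds -> System.out.println("TM release requested"));
    }
}
```

   In this sketch the external release is deliberately unconditional, mirroring the statement that `removePartitionExternally` should not depend on `hasLocalResources`.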
   
   ## Brief change log
   
     - Introduce `ShuffleDescriptor.hasLocalResources` and its default Netty 
shuffle implementation
     - Introduce `ShuffleMaster.removePartitionExternally` and its default Netty 
shuffle implementation
   
   ## Verifying this change
   
   Trivial shuffle interface extension.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not applicable)
   
