zentol opened a new pull request #11443: [FLINK-14791][coordination] 
ResourceManager tracks ClusterPartitions
URL: https://github.com/apache/flink/pull/11443
 
 
   Based on #10362 (FLINK-14792).
   
   With this PR the ResourceManager tracks cluster partitions hosted on task 
executors, based on the `ClusterPartitionReports` that are already submitted 
via TE heartbeats.
   
   The entire partition handling logic is encapsulated in a 
`ResourceManagerPartitionTracker`; the RM contains no partition logic of its 
own and only informs the tracker about certain events (arrival of a new report, 
shutdown of a TE, requested release of a partition via the REST API).
   
   The tracker maintains the following mappings:
   a) data set -> TE
   b) TE -> data set
   c) TE -> partitions
   
   Mapping a) is required for the release of partitions via the REST API.
   Mapping b) is required for figuring out which data sets may be affected by a 
TE shutdown.
   Mapping c) is required for detecting when all partitions of a data set are 
being tracked and whether a TE has lost a single partition.
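
   As an illustration only, the three mappings could be sketched as below. This is not the code in this PR: the class name, the method names, and the use of plain `String` ids in place of Flink's id types are assumptions, and mapping c) is additionally keyed by data set here so that partitions can be counted per data set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the tracker's bookkeeping; all names are illustrative. */
public class PartitionMappingsSketch {
    // a) data set -> task executors hosting at least one of its partitions
    private final Map<String, Set<String>> dataSetToTaskExecutors = new HashMap<>();
    // b) task executor -> data sets it hosts partitions for
    private final Map<String, Set<String>> taskExecutorToDataSets = new HashMap<>();
    // c) task executor -> (data set -> partitions hosted on that executor)
    private final Map<String, Map<String, Set<String>>> taskExecutorToPartitions = new HashMap<>();

    /** Records that the given task executor hosts the given partition of the data set. */
    public void track(String taskExecutor, String dataSet, String partition) {
        dataSetToTaskExecutors.computeIfAbsent(dataSet, k -> new HashSet<>()).add(taskExecutor);
        taskExecutorToDataSets.computeIfAbsent(taskExecutor, k -> new HashSet<>()).add(dataSet);
        taskExecutorToPartitions
                .computeIfAbsent(taskExecutor, k -> new HashMap<>())
                .computeIfAbsent(dataSet, k -> new HashSet<>())
                .add(partition);
    }

    /** Mapping a): which executors must be contacted to release a data set. */
    public Set<String> hostsOf(String dataSet) {
        return dataSetToTaskExecutors.getOrDefault(dataSet, Collections.emptySet());
    }

    /** Mapping b): which data sets may be affected if this executor shuts down. */
    public Set<String> dataSetsOn(String taskExecutor) {
        return taskExecutorToDataSets.getOrDefault(taskExecutor, Collections.emptySet());
    }

    /** Mapping c): how many partitions of a data set are tracked across all executors. */
    public int trackedPartitionCount(String dataSet) {
        int count = 0;
        for (Map<String, Set<String>> perDataSet : taskExecutorToPartitions.values()) {
            count += perDataSet.getOrDefault(dataSet, Collections.emptySet()).size();
        }
        return count;
    }
}
```

   Comparing `trackedPartitionCount` against the expected number of partitions of a data set would be one way to tell whether the data set is fully tracked.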
   
   The tracker not only tracks partitions but also searches for data sets that 
have been corrupted by the loss of partitions (e.g., because of a TE 
shutdown). In this case the tracker will issue release calls for the remaining 
partitions to the hosting task executors.
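
   A minimal sketch of this corruption handling follows, under stated assumptions: the class and method names are hypothetical, ids are plain strings, a report is modeled as a map from data set to the partitions a TE currently hosts, and a "release call" is simply recorded as a `"<TE>:<data set>"` string rather than sent anywhere. The actual tracker in this PR may be structured differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: detect data sets corrupted by partition loss and release the rest. */
public class DataSetCorruptionSketch {
    // data set -> (task executor -> partitions of that data set hosted there)
    private final Map<String, Map<String, Set<String>>> hosted = new HashMap<>();
    // recorded release calls, formatted as "<task executor>:<data set>"
    private final List<String> issuedReleaseCalls = new ArrayList<>();

    /** Processes a full report of the partitions a task executor currently hosts. */
    public void processReport(String taskExecutor, Map<String, Set<String>> report) {
        // A data set is corrupted if the executor no longer reports a partition it hosted before.
        Set<String> corrupted = new HashSet<>();
        for (Map.Entry<String, Map<String, Set<String>>> entry : hosted.entrySet()) {
            Set<String> previous =
                    entry.getValue().getOrDefault(taskExecutor, Collections.emptySet());
            Set<String> current = report.getOrDefault(entry.getKey(), Collections.emptySet());
            if (!current.containsAll(previous)) {
                corrupted.add(entry.getKey());
            }
        }
        // Replace the executor's previous state with the reported one.
        for (Map<String, Set<String>> perExecutor : hosted.values()) {
            perExecutor.remove(taskExecutor);
        }
        report.forEach((dataSet, partitions) -> {
            if (!partitions.isEmpty()) {
                hosted.computeIfAbsent(dataSet, k -> new HashMap<>())
                        .put(taskExecutor, new HashSet<>(partitions));
            }
        });
        // Stop tracking corrupted data sets and release their remaining partitions.
        for (String dataSet : corrupted) {
            Map<String, Set<String>> remainingHosts = hosted.remove(dataSet);
            if (remainingHosts != null) {
                for (String host : new TreeSet<>(remainingHosts.keySet())) {
                    issuedReleaseCalls.add(host + ":" + dataSet);
                }
            }
        }
    }

    /** A shutdown is treated like a report that contains no partitions at all. */
    public void processTaskExecutorShutdown(String taskExecutor) {
        processReport(taskExecutor, Collections.emptyMap());
    }

    public boolean isTracked(String dataSet) {
        return hosted.containsKey(dataSet);
    }

    public List<String> getIssuedReleaseCalls() {
        return issuedReleaseCalls;
    }
}
```

   Treating a shutdown as an empty report keeps a single code path for both kinds of partition loss; this is a design choice of the sketch, not a claim about the PR.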
   
   Note that `ResourceManagerPartitionTracker#listDataSets` and 
`#releasePartitions` are unused at this time, but are required for the upcoming 
REST API.
   
   This PR does _NOT_ handle cases where a partition of a data set is never 
tracked, for example because the TE died just after the job finished. In this 
case we currently never release the partitions of the corresponding data set.
   This will be tackled in a follow-up, and requires setting up a timeout for 
the maximum duration between the first and last registration of a partition.

