[ https://issues.apache.org/jira/browse/FLINK-23354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-23354:
-------------------------------

    Assignee: Zhilong Hong

> Limit the size of ShuffleDescriptors in PermanentBlobCache on TaskExecutor
> --------------------------------------------------------------------------
>
>                 Key: FLINK-23354
>                 URL: https://issues.apache.org/jira/browse/FLINK-23354
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Zhilong Hong
>            Assignee: Zhilong Hong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.14.0
>
>
> _This is part 3 of the optimization related to task deployments. For the
> overall description and part 1, please see FLINK-23005. For part 2, please
> see FLINK-23218._
>
> Currently a TaskExecutor uses BlobCache to cache the blobs transported from
> the JobManager. The cached blobs are local files stored on the TaskExecutor.
> The blob cache is not cleaned up until one hour after the related job has
> finished. In FLINK-23218, we are going to distribute the cached
> ShuffleDescriptors via blobs. When a large number of failovers happen, many
> cached blobs will accumulate on the local disk and occupy a large amount of
> disk space. In extreme cases, the blobs could exhaust the disk space.
>
> So we need to add a size limit for the ShuffleDescriptors stored in
> PermanentBlobCache on TaskExecutor, as described in the comments of
> FLINK-23218. The main idea is to add a size limit and delete blobs in LRU
> order if the size limit is exceeded. Before a blob item is cached, the
> TaskExecutor will first check the overall size of the cache. If the overall
> size exceeds the limit, blobs will be deleted in LRU order until the limit
> is no longer exceeded. If a deleted blob is needed again later, it will be
> downloaded again from the HA storage or the blob server.
>
> The default value of the size limit for the ShuffleDescriptors in
> PermanentBlobCache on TaskExecutor will be 100 MiB.

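The eviction idea described in the issue can be illustrated with a minimal Java sketch. This is not Flink's implementation: the class name SizeLimitedLruTracker and its methods are hypothetical, blob keys are plain strings, and the sketch only tracks sizes and reports which keys should be deleted. It also assumes that the check before caching counts the size of the incoming blob, which is one possible reading of the description.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of size-limited LRU tracking; not Flink's actual class. */
class SizeLimitedLruTracker {
    private final long sizeLimitBytes;
    private long totalBytes = 0;

    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Long> blobSizes =
            new LinkedHashMap<>(16, 0.75f, true);

    SizeLimitedLruTracker(long sizeLimitBytes) {
        this.sizeLimitBytes = sizeLimitBytes;
    }

    /** Marks a blob as recently used; returns false if it is not tracked. */
    boolean touch(String key) {
        return blobSizes.get(key) != null;
    }

    /**
     * Evicts least recently used blobs until the new blob fits under the
     * limit, then tracks the new blob. Returns the evicted keys so that the
     * caller can delete the corresponding local files.
     * Assumption: a blob larger than the limit is still tracked after
     * everything else has been evicted.
     */
    List<String> track(String key, long sizeBytes) {
        List<String> evicted = new ArrayList<>();

        // Re-tracking an existing key replaces its old size.
        Long previous = blobSizes.remove(key);
        if (previous != null) {
            totalBytes -= previous;
        }

        Iterator<Map.Entry<String, Long>> it = blobSizes.entrySet().iterator();
        while (totalBytes + sizeBytes > sizeLimitBytes && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            totalBytes -= eldest.getValue();
            evicted.add(eldest.getKey());
            it.remove();
        }

        blobSizes.put(key, sizeBytes);
        totalBytes += sizeBytes;
        return evicted;
    }

    long totalBytes() {
        return totalBytes;
    }
}
```

With the default proposed in the issue, such a tracker would be created with a limit of 100L * 1024 * 1024 bytes. A blob whose key is returned by track() would have to be downloaded again if it is requested later.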
-- This message was sent by Atlassian Jira (v8.3.4#803005)