[ https://issues.apache.org/jira/browse/FLINK-30461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685023#comment-17685023 ]
Yuan Mei commented on FLINK-30461:
----------------------------------

Thanks [~fanrui] for the fix and [~yanfei] for the discussion and review!

merged commit [{{7326add}}|https://github.com/apache/flink/commit/7326addbb3f2b4867689755a04512412bcf69657] into apache:master (for the left-over rocksdb shared ssts)
merged commit [{{d761f51}}|https://github.com/apache/flink/commit/d761f51ab5a76331eb9b8f423e0ffaf1d04f97f5] into apache:master (for the concurrent thread race-condition in the unit test)
merged commit [{{e2c3d61}}|https://github.com/apache/flink/commit/e2c3d6152d95074561d062ce0b61b6428f6dd6e1] into apache:release-1.16 (back port the above two to 1.16 branch)
merged commit [{{a63f298}}|https://github.com/apache/flink/commit/a63f2983703f438989d069ababcf4cc173441646] into apache:release-1.15 (back port the above two to 1.15 branch)

> Some rocksdb sst files will remain forever
> ------------------------------------------
>
>                 Key: FLINK-30461
>                 URL: https://issues.apache.org/jira/browse/FLINK-30461
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / State Backends
>    Affects Versions: 1.16.0, 1.17.0, 1.15.3
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.17.0, 1.15.4, 1.16.2
>
>         Attachments: image-2022-12-20-18-45-32-948.png, image-2022-12-20-18-47-42-385.png, screenshot-1.png
>
>
> In rocksdb incremental checkpoint mode, if a checkpoint is canceled due to
> checkpoint timeout while files are being uploaded, with some files already
> uploaded and others not yet uploaded, the uploaded files will remain.
>
> h2. Impact:
> The shared directory of a flink job has more than 1 million files. It
> exceeded the hdfs upper limit, so new files could not be written.
> However, only 50k of these files are still needed; the other 950k files
> should be cleaned up.
> !https://user-images.githubusercontent.com/38427477/207588272-dda7ba69-c84c-4372-aeb4-c54657b9b956.png|width=1962,height=364!
> h2. Root cause:
> If an exception is thrown during the checkpoint async phase, flink will clean
> up metaStateHandle, miscFiles and sstFiles.
> However, the sst files are added to sstFiles together, only after all of them
> have been uploaded. If the checkpoint is canceled due to checkpoint timeout
> while some sst files have been uploaded and others are still being uploaded,
> none of the sst files are added to sstFiles, so the already uploaded sst
> files remain on hdfs.
> [code link|https://github.com/apache/flink/blob/49146cdec41467445de5fc81f100585142728bdf/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L328]
> h2. Solution:
> Use the CloseableRegistry as the tmpResourcesRegistry. If the async phase
> fails, the tmpResourcesRegistry will clean up these temporary resources.
>
> POC code:
> [https://github.com/1996fanrui/flink/commit/86a456b2bbdad6c032bf8e0bff71c4824abb3ce1]
>
>
> !image-2022-12-20-18-45-32-948.png|width=1114,height=442!
> !image-2022-12-20-18-47-42-385.png|width=1332,height=552!
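
The idea in the Solution section can be illustrated with a minimal, self-contained sketch. This is not the code from the merged commits or the POC: `TmpRegistrySketch`, `TmpResourcesRegistry`, `uploadAll`, `failAtIndex` and the in-memory `remoteFiles` set are hypothetical names used only for illustration, and the registry here is a simplified stand-in rather than Flink's `CloseableRegistry`. It assumes that each uploaded file is registered for cleanup as soon as its upload completes, and that the registrations are dropped only once every upload has succeeded.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TmpRegistrySketch {

    /** Hypothetical in-memory stand-in for the remote shared directory (e.g. on hdfs). */
    static final Set<String> remoteFiles = new LinkedHashSet<>();

    /** Simplified stand-in for a closeable registry: close() runs every cleanup still registered. */
    static final class TmpResourcesRegistry implements Closeable {
        private final List<Closeable> resources = new ArrayList<>();

        void register(Closeable cleanup) {
            resources.add(cleanup);
        }

        /** Called once the whole upload succeeded, so that close() deletes nothing. */
        void unregisterAll() {
            resources.clear();
        }

        @Override
        public void close() {
            for (Closeable cleanup : resources) {
                try {
                    cleanup.close();
                } catch (Exception ignored) {
                    // Best-effort cleanup in this sketch.
                }
            }
            resources.clear();
        }
    }

    /**
     * Uploads the files one by one. A cancellation (e.g. checkpoint timeout) is
     * simulated by throwing just before the file at failAtIndex is uploaded;
     * pass -1 to simulate a successful async phase.
     */
    static boolean uploadAll(List<String> sstFiles, int failAtIndex) {
        TmpResourcesRegistry tmpResourcesRegistry = new TmpResourcesRegistry();
        try {
            for (int i = 0; i < sstFiles.size(); i++) {
                if (i == failAtIndex) {
                    throw new IllegalStateException("checkpoint canceled");
                }
                String file = sstFiles.get(i);
                remoteFiles.add(file); // the "upload"
                // Register the cleanup immediately, not after all uploads finish.
                tmpResourcesRegistry.register(() -> remoteFiles.remove(file));
            }
            // Every file is uploaded: keep them all.
            tmpResourcesRegistry.unregisterAll();
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            // On failure this removes the files uploaded so far; on success it is a no-op.
            tmpResourcesRegistry.close();
        }
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("000001.sst", "000002.sst", "000003.sst");
        System.out.println("canceled run succeeded: " + uploadAll(files, 2));
        System.out.println("files left behind: " + remoteFiles);
    }
}
```

In this sketch the two files uploaded before the simulated cancellation are removed again when the registry is closed, which is the behavior the issue asks for. Registering each file right after its own upload is the design point: the cleanup no longer depends on all uploads having finished.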