Feifan Wang created FLINK-28172:
-----------------------------------

             Summary: Scatter dstl files into separate directories by job id
                 Key: FLINK-28172
                 URL: https://issues.apache.org/jira/browse/FLINK-28172
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
    Affects Versions: 1.15.0
            Reporter: Feifan Wang

In the current implementation of _FsStateChangelogStorage_, dstl files from all jobs are written to the same directory (configured via _dstl.dfs.base-path_). This works fine on a store like S3, but it causes problems on a file system like Hadoop.

First, there may be an upper limit on the number of files in a single directory, and raising that limit greatly reduces the performance of the distributed file system.

Second, managing dstl files becomes difficult because the user cannot tell which job a dstl file belongs to, especially when retained checkpoints are enabled.

h3. Proposal
 # When the job starts, create a subdirectory named after the job id under the _dstl.dfs.base-path_ directory.
 # Upload all dstl files of that job to this subdirectory.

(Going a step further, we could even create two levels of subdirectories under the _dstl.dfs.base-path_ directory, like _base-path/\{jobId}/dstl_. This way, if the user configures the same path for _dstl.dfs.base-path_ and _state.checkpoints.dir_, all files needed for job recovery will be in the same directory and well organized.)
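
To make the two proposed layouts concrete, here is a minimal sketch of how the per-job directory could be derived. It is only an illustration: the class {{DstlPathSketch}}, its methods, the example base path, and the example job id are all hypothetical and are not part of Flink. It uses {{java.nio.file.Path}} rather than Flink's own file system classes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not a Flink class): derives per-job dstl directories. */
public final class DstlPathSketch {

    private DstlPathSketch() {}

    /** One-level layout: base-path/{jobId}. */
    public static Path jobDirectory(Path basePath, String jobId) {
        // Assumption: a job id is a plain token, so it maps to exactly one path segment.
        if (jobId == null || jobId.isEmpty() || jobId.contains("/") || jobId.contains("\\")) {
            throw new IllegalArgumentException("job id must be a single path segment");
        }
        return basePath.resolve(jobId);
    }

    /** Two-level layout: base-path/{jobId}/dstl. */
    public static Path jobDstlDirectory(Path basePath, String jobId) {
        return jobDirectory(basePath, jobId).resolve("dstl");
    }

    public static void main(String[] args) {
        // Both values below are made up for the example.
        Path base = Paths.get("flink", "checkpoints");
        String jobId = "a1b2c3d4e5f60718293a4b5c6d7e8f90";

        System.out.println(jobDirectory(base, jobId));
        System.out.println(jobDstlDirectory(base, jobId));
    }
}
```

With the two-level variant, the dstl files of a job end up below the same _base-path/\{jobId}_ directory as the rest of that job's files whenever both options point to the same base path.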