Zakelly commented on code in PR #24766: URL: https://github.com/apache/flink/pull/24766#discussion_r1601510494
##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.
+The unified fie merging mechanism can be enabled by setting the property `state.checkpoints.file-merging.enabled` to `true`.
+**Note** that enabling this mechanism may lead to space amplification, that is, the actual occupation on the file system
+will be larger than actual state size. `state.checkpoints.file-merging.max-space-amplification`
+can be used to limit the upper bound of space amplification.
+
+This mechanism is applicable to keyed state, operator state and channel state in Flink. Subtask level granular merging is
+provided for shared scope state; TaskManager-level granular merging is provided for private scope state. The maximum number of subtasks
+allowed to be written to a single file can be configured through the `state.checkpoints.file-merging.max-subtasks-per-file` option.
+
+The unified fie merging mechanism also supports file merging across checkpoints, which can be enabled by setting
+`state.checkpoints.file-merging.across-checkpoint-boundary` to `true`.
+
+This mechanism introduces a file pool to handle concurrent writing scenarios. The blocking mode can be

Review Comment:
```suggestion
This mechanism introduces a file pool to handle concurrent writing scenarios. There are two modes....... The blocking mode......
while the non-blocking modes...... . This can be configured via ``.
```
Could you add some description of the modes, instead of only talking about enabling the option?

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.

Review Comment:
```suggestion
and file deletions, which alleviates the pressure of file system metadata management raised by the file flooding problem during checkpoints.
```

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.
+The unified fie merging mechanism can be enabled by setting the property `state.checkpoints.file-merging.enabled` to `true`.
Review Comment:
```suggestion
The mechanism can be enabled by setting `state.checkpoints.file-merging.enabled` to `true`.
```

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints

Review Comment:
How about adding `(Experimental)` to the title?

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations

Review Comment:
```suggestion
which allows scattered small checkpoint files to be written into larger files, reducing the number of file creations
```

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.
+The unified fie merging mechanism can be enabled by setting the property `state.checkpoints.file-merging.enabled` to `true`.
+**Note** that enabling this mechanism may lead to space amplification, that is, the actual occupation on the file system
+will be larger than actual state size. `state.checkpoints.file-merging.max-space-amplification`
+can be used to limit the upper bound of space amplification.
+
+This mechanism is applicable to keyed state, operator state and channel state in Flink. Subtask level granular merging is

Review Comment:
```suggestion
This mechanism is applicable to keyed state, operator state and channel state in Flink. Merging at subtask level is
```

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.
+The unified fie merging mechanism can be enabled by setting the property `state.checkpoints.file-merging.enabled` to `true`.
+**Note** that enabling this mechanism may lead to space amplification, that is, the actual occupation on the file system

Review Comment:
```suggestion
**Note** that as a trade-off, enabling this mechanism may lead to space amplification, that is, the actual occupation on the file system
```

##########
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##########
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after all operators have rea
 without waiting for periodic triggering, but the job will need to wait for this final checkpoint to be completed.
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 1.20 as an MVP ("minimum viable product") feature,
+which allows scattered small checkpoint files to be written into a single file, reducing the number of file creations
+and file deletions, helping to alleviate the pressure of file system metadata management and file flooding problem.
+The unified fie merging mechanism can be enabled by setting the property `state.checkpoints.file-merging.enabled` to `true`.
+**Note** that enabling this mechanism may lead to space amplification, that is, the actual occupation on the file system
+will be larger than actual state size. `state.checkpoints.file-merging.max-space-amplification`
+can be used to limit the upper bound of space amplification.
+
+This mechanism is applicable to keyed state, operator state and channel state in Flink. Subtask level granular merging is
+provided for shared scope state; TaskManager-level granular merging is provided for private scope state. The maximum number of subtasks
+allowed to be written to a single file can be configured through the `state.checkpoints.file-merging.max-subtasks-per-file` option.
+
+The unified fie merging mechanism also supports file merging across checkpoints, which can be enabled by setting
+`state.checkpoints.file-merging.across-checkpoint-boundary` to `true`.

Review Comment:
```suggestion
This feature also supports merging files across checkpoints. To enable this, set `state.checkpoints.file-merging.across-checkpoint-boundary` to `true`.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org