[ 
https://issues.apache.org/jira/browse/FLINK-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104131#comment-17104131
 ] 

Congxian Qiu(klion26) commented on FLINK-17571:
-----------------------------------------------

Sorry for the late reply; it seems I missed some email notifications :(

[~pnowojski] previously, I just wanted to show all the files (including 
EXCLUSIVE, SHARED and TASKOWNED) that are being used by one checkpoint, 
because:
 * If we don't restore from an external checkpoint, there are never any 
orphaned checkpoint files left (the files should be deleted when the 
checkpoint is discarded).
 * If we restore from an external checkpoint, then we just need to know the 
files used in the "restored" checkpoint.
 * If we retain more than one checkpoint and want to remove some of them 
(always the older checkpoints) because of storage pressure, then in most 
cases we just need to keep the most recent checkpoint.

For the implementation, we just need to leverage 
{{Checkpoints#loadCheckpointMetaData}} for this. 
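
As a rough sketch of the idea (plain Java, not Flink code): assuming the set 
of files referenced by a checkpoint has already been read from its metadata, 
the unused files are simply the ones in storage that are not in that set. The 
class and method names below are hypothetical, and the file paths are passed 
in as plain strings rather than loaded through any Flink API.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of Flink.
final class CheckpointFileUsage {

    /**
     * Returns the files present in storage that the given checkpoint does
     * not reference. Both inputs are assumed to use the same path format.
     */
    static Set<String> unreferenced(Set<String> filesInStorage,
                                    Set<String> filesUsedByCheckpoint) {
        // Copy first so the caller's set is not modified.
        Set<String> result = new TreeSet<>(filesInStorage);
        result.removeAll(filesUsedByCheckpoint);
        return result;
    }
}
```

The result is only meaningful if no other retained checkpoint or job still 
references those files.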

From my side, another command that {{removes files in specified checkpoints}} 
may be trickier, because we may not know who references the files in the 
given checkpoint. If we retain more than one checkpoint and they reference 
the same shared file, and two separate jobs then restore from the different 
checkpoints, we can't simply delete the files in each checkpoint.
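
To make the problem concrete, here is a minimal sketch (plain Java, not Flink 
code) of the check such a command would need: a file in the checkpoint being 
removed is deletable only if no other known checkpoint references it. The 
class name, method name and the map of checkpoint name to file paths are all 
hypothetical, and the check only covers the checkpoints it is told about, so 
a job restored elsewhere from one of them would not be seen.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of Flink.
final class SharedFileCheck {

    /**
     * Returns the files of the target checkpoint that no other checkpoint
     * in the map references. An unknown target yields an empty set.
     */
    static Set<String> deletableFiles(String target,
                                      Map<String, Set<String>> filesByCheckpoint) {
        Set<String> candidates = new TreeSet<>(
                filesByCheckpoint.getOrDefault(target, Collections.emptySet()));
        for (Map.Entry<String, Set<String>> entry : filesByCheckpoint.entrySet()) {
            if (!entry.getKey().equals(target)) {
                // Anything another checkpoint still uses must be kept.
                candidates.removeAll(entry.getValue());
            }
        }
        return candidates;
    }
}
```

With two checkpoints that both reference the same shared file, removing 
either one leaves that shared file in place and only its own exclusive files 
become deletable.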

[~trystan] good to know that you're interested in helping with this. I'm fine 
if you want to implement it, and I can do the initial review of your 
contribution. Please let me know if you have time to contribute. No matter 
who ends up contributing this, we first need to agree on the implementation 
here on the issue.

> A better way to show the files used in currently checkpoints
> ------------------------------------------------------------
>
>                 Key: FLINK-17571
>                 URL: https://issues.apache.org/jira/browse/FLINK-17571
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Checkpointing
>            Reporter: Congxian Qiu(klion26)
>            Priority: Major
>
> Inspired by the 
> [userMail|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Shared-Checkpoint-Cleanup-and-S3-Lifecycle-Policy-tt34965.html]
> Currently, there are [three types of 
> directory|https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html#directory-structure]
>  for a checkpoint, the files in TASKOWNED and EXCLUSIVE directory can be 
> deleted safely, but users can't delete the files in the SHARED directory 
> safely(the files may be created a long time ago).
> I think it's better to give users a better way to know which files are 
> currently used(so the others are not used)
> maybe a command-line command such as below is ok enough to support such a 
> feature.
> {{./bin/flink checkpoint list $checkpointDir  # list all the files used in 
> checkpoint}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
