[ 
https://issues.apache.org/jira/browse/FLINK-27132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17520497#comment-17520497
 ] 

Matthias Pohl commented on FLINK-27132:
---------------------------------------

We have some general issues with checkpoint cleanup, which we're currently 
addressing in FLINK-26606. Your observation is correct: a subsumed checkpoint 
is removed from the {{StateHandleStore}} (i.e. what you're referring to as ZK; 
I'm mentioning it because the k8s setup should have the same problem). Any 
checkpoint that's not listed in the {{StateHandleStore}} is not subject to 
removal after the job has terminated globally.

AFAIU, the {{SharedStateRegistry}} only loads data from the checkpoints in the 
{{CompletedCheckpointStore}}, i.e. shared state that is not referenced by any 
of these checkpoints won't be discarded.
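
To make the mechanism concrete, here is a minimal toy model in Java. It is 
only a sketch: {{CleanupSketch}}, its methods and the file names are 
hypothetical and are not Flink classes or APIs. It assumes, as the issue 
description does, that one of the retained checkpoints still references a 
file written by the initial checkpoint. The registry is populated only from 
the checkpoints still listed in the store, and everything registered is 
discarded on shutdown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types are Flink classes.
final class CleanupSketch {

    // Hypothetical files written by the initial checkpoint, which the job
    // does not own in NO_CLAIM mode.
    static final Set<String> INITIAL_CHECKPOINT_FILES = Set.of("sst-0", "sst-1");

    // Checkpoints still listed in the store when the cleaner starts.
    // Key: checkpoint ID, value: shared files the checkpoint references.
    static Map<Long, List<String>> retainedCheckpoints() {
        Map<Long, List<String>> store = new LinkedHashMap<>();
        // Assumption: checkpoint 2 still reuses sst-1 from the initial checkpoint.
        store.put(2L, List.of("sst-1", "sst-2"));
        store.put(3L, List.of("sst-2", "sst-3"));
        return store;
    }

    // Registers every shared file referenced by a retained checkpoint and
    // returns the registered set, i.e. what a shutdown that discards all
    // registered shared state would delete.
    static Set<String> discardedOnShutdown(Map<Long, List<String>> store) {
        Set<String> registered = new TreeSet<>();
        for (List<String> files : store.values()) {
            registered.addAll(files);
        }
        return registered;
    }

    public static void main(String[] args) {
        Set<String> discarded = discardedOnShutdown(retainedCheckpoints());
        for (String file : new TreeSet<>(INITIAL_CHECKPOINT_FILES)) {
            System.out.println(file + " discarded: " + discarded.contains(file));
        }
    }
}
```

In this model, {{sst-0}} is never registered because no retained checkpoint 
references it, so it survives. {{sst-1}} is registered through checkpoint 2 
and is discarded, although it belongs to the initial checkpoint.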

> CheckpointResourcesCleanupRunner might discard shared state of the initial 
> checkpoint
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-27132
>                 URL: https://issues.apache.org/jira/browse/FLINK-27132
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.15.0, 1.16.0
>            Reporter: Roman Khachatryan
>            Priority: Major
>
> Consider the following case:
>  # A job starts from a checkpoint in NO_CLAIM mode, with incremental 
> checkpoints enabled
>  # It produces some new checkpoints and subsumes the original one (not 
> discarding shared state - before FLINK-24611 or after FLINK-26985)
>  # Job terminates abruptly
>  # The cleaner is started for that job
>  # ZK doesn't have the initial checkpoint, so the store will load only the 
> new checkpoints (created in 2). Shared state is registered
>  # The store is shut down - discarding all the checkpoints and also any 
> shared state
>  
> In step 6, if some checkpoint uses state from the initial checkpoint, that 
> state will also be discarded
>  
> [~mapohl] could you please confirm this?
>  
> cc: [~yunta]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
