Roman Khachatryan created FLINK-28597:
-----------------------------------------

             Summary: Empty checkpoint folders not deleted on job cancellation 
if their shared state is still in use
                 Key: FLINK-28597
                 URL: https://issues.apache.org/jira/browse/FLINK-28597
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.16.0
            Reporter: Roman Khachatryan
            Assignee: Roman Khachatryan
             Fix For: 1.16.0


After FLINK-25872, SharedStateRegistry registers all state handles, including 
private ones.
Once the state is no longer in use AND the checkpoint is subsumed, it is actually 
discarded.
This is done to prevent premature deletion when recovering in CLAIM mode:
1. RocksDB native savepoint folder (shared state is stored in the chk-xx folder, 
so the deletion might fail)
2. Initial non-changelog checkpoint when switching to changelog-based 
checkpoints (private state of the initial checkpoint might be included in 
later checkpoints, and deleting it would invalidate them)
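
The discard rule described above can be illustrated with a minimal sketch. 
This is a toy model and not Flink's SharedStateRegistry: the class 
ToyStateRegistry, its methods, and the string state keys are hypothetical 
names introduced only for illustration. It assumes a state handle is "in use" 
while at least one retained (not yet subsumed) checkpoint references it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the discard rule (hypothetical; not Flink's actual API). */
class ToyStateRegistry {
    // state key -> IDs of retained checkpoints that still reference it
    private final Map<String, Set<Long>> usages = new HashMap<>();
    // keys of state handles that have been discarded
    private final Set<String> discarded = new HashSet<>();

    /** Registers a state handle (shared or private) as used by a checkpoint. */
    void register(String stateKey, long checkpointId) {
        usages.computeIfAbsent(stateKey, k -> new HashSet<>()).add(checkpointId);
    }

    /**
     * Subsumes every checkpoint with an ID below lowestRetained.
     * A state handle is discarded only when no retained checkpoint uses it.
     */
    void subsumeBelow(long lowestRetained) {
        for (Map.Entry<String, Set<Long>> entry : usages.entrySet()) {
            entry.getValue().removeIf(id -> id < lowestRetained);
            if (entry.getValue().isEmpty()) {
                discarded.add(entry.getKey());
            }
        }
    }

    boolean isDiscarded(String stateKey) {
        return discarded.contains(stateKey);
    }
}
```

In this model, state that checkpoint 1 created and checkpoint 2 reuses 
survives the subsumption of checkpoint 1, so under the same assumption the 
folder of checkpoint 1 cannot be removed at that point.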

Additionally, checkpoint folders remain undeleted for longer, which might 
be confusing.
In case of a crash, more folders will remain.

cc: [~Yanfei Lei], [~ym]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
