[ https://issues.apache.org/jira/browse/FLINK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Khachatryan updated FLINK-27114:
--------------------------------------
Description:

Scenario (1.14):
1. A job starts from an existing checkpoint 1, with incremental checkpoints enabled.
2. Checkpoint 1 is loaded with discardOnSubsume=false by CheckpointCoordinator.restoreSavepoint.
3. A new checkpoint 2 completes; it reuses some state from the initial checkpoint.
4. At some point, checkpoint 1 is subsumed, but the state is not discarded (thanks to discardOnSubsume=false, ref counts stay 1).
5. JM crashes.
6. JM restarts and loads the checkpoints 2..x from ZK (or another store) with discardOnSubsume=true (as deserialized from handles).
7. At some point, checkpoint 2 is subsumed and the initial shared state is not used anymore; because checkpoint 2 has discardOnSubsume=true, the shared state will be erroneously discarded.

In 1.15, there were the following changes:
1. RestoreMode was added; only LEGACY mode is affected (in NO_CLAIM mode, checkpoint 2 can't reuse any initial state; in CLAIM mode, it's fine to discard the initial state).
2. SharedStateRegistry was changed from refCounts to highest checkpoint ID.
3. In step (7), state will not be discarded; however, because it's impossible to distinguish the initial state from the state created by this job, the latter will not be discarded either, leading to left-over state artifacts.

The proposed solution is to store the initial checkpoint ID (in a store such as ZK or in checkpoints) and adjust steps 6 or 7.
was:

Scenario (1.14):
1. A job starts from an existing checkpoint 1, with incremental checkpoints enabled
2. Checkpoint 1 is loaded with discardOnSubsume=false by CheckpointCoordinator.restoreSavepoint
3. A new checkpoint 2 completes, it reuses some state from the initial checkpoint
4. At some point, checkpoint 1 is subsumed, but the state is not discarded (thanks to discardOnSubsume=false, ref counts stay 1)
5. JM crashes
6. JM restarts, loads the checkpoints 2..x from ZK (or other store) - discardOnSubsume=true (as deserialized from handles)
7. At some point, checkpoint 2 is subsumed and the initial shared state is not used anymore; because checkpoint 2 has discardOnSubsume=true, shared state will be erroneously discarded

In 1.15, there were the following changes:
1. RestoreMode was added; only NO_CLAIM and LEGACY modes are affected
2. SharedStateRegistry was changed from refCounts to highest checkpoint ID
3. In step (7), state will not be discarded; however, because it's impossible to distinguish initial state from the state created by this job, the latter will not be discarded as well, leading to left-over state artifacts.

The proposed solution is to store the initial checkpoint ID (in store such as ZK or in checkpoints) and adjust steps 6 or 7.
> On JM restart, the information about the initial checkpoints can be lost
> ------------------------------------------------------------------------
>
> Key: FLINK-27114
> URL: https://issues.apache.org/jira/browse/FLINK-27114
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.15.0, 1.14.4, 1.16.0
> Reporter: Roman Khachatryan
> Priority: Major
> Fix For: 1.16.0, 1.14.5, 1.15.1
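
Steps 6 and 7 of the 1.14 scenario in the description can be sketched with a toy ref-counting registry in Python. None of the names below (`Checkpoint`, `RefCountingRegistry`, `created_by`, `initial_checkpoint_id`) are Flink APIs; they are hypothetical and only model the behavior described in the issue. The `initial_checkpoint_id` parameter illustrates one possible reading of the proposed solution, under the assumption that each shared state entry records which checkpoint created it. It is not the actual fix.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    checkpoint_id: int
    # Maps shared-state key -> ID of the checkpoint that first created it
    # (the created-by ID is an assumption made for this sketch).
    shared_state: dict
    discard_on_subsume: bool


class RefCountingRegistry:
    """Toy stand-in for a ref-counting shared state registry (not Flink's API)."""

    def __init__(self, initial_checkpoint_id=None):
        self.ref_counts = {}
        self.deleted = set()
        # Hypothetical fix: remember which checkpoint the job was started from.
        self.initial_checkpoint_id = initial_checkpoint_id

    def register(self, checkpoint):
        for key in checkpoint.shared_state:
            self.ref_counts[key] = self.ref_counts.get(key, 0) + 1

    def subsume(self, checkpoint):
        if not checkpoint.discard_on_subsume:
            return  # ref counts are left untouched, nothing is deleted
        for key, created_by in checkpoint.shared_state.items():
            self.ref_counts[key] -= 1
            if self.ref_counts[key] > 0:
                continue
            del self.ref_counts[key]
            # Without a known initial checkpoint ID, everything looks owned.
            owned = (self.initial_checkpoint_id is None
                     or created_by > self.initial_checkpoint_id)
            if owned:
                self.deleted.add(key)


def run_before_crash():
    # Steps 1-4: checkpoint 1 is restored with discard_on_subsume=False,
    # checkpoint 2 reuses "s1", then checkpoint 1 is subsumed.
    cp1 = Checkpoint(1, {"s1": 1}, discard_on_subsume=False)
    cp2 = Checkpoint(2, {"s1": 1, "s2": 2}, discard_on_subsume=True)
    registry = RefCountingRegistry()
    registry.register(cp1)
    registry.register(cp2)
    registry.subsume(cp1)
    return registry


def run_after_jm_restart(initial_checkpoint_id=None):
    # Step 6: checkpoints 2..x are reloaded, all with discard_on_subsume=True.
    cp2 = Checkpoint(2, {"s1": 1, "s2": 2}, discard_on_subsume=True)
    cp3 = Checkpoint(3, {"s2": 2, "s3": 3}, discard_on_subsume=True)
    registry = RefCountingRegistry(initial_checkpoint_id)
    registry.register(cp2)
    registry.register(cp3)
    # Step 7: checkpoint 2 is subsumed; "s1" is no longer referenced.
    registry.subsume(cp2)
    return registry.deleted
```

In this model, `run_after_jm_restart()` returns `{"s1"}`: the state of the initial checkpoint is deleted even though the job does not own it. With `run_after_jm_restart(initial_checkpoint_id=1)` the returned set is empty, because entries created at or before the initial checkpoint are skipped.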