Yanfei Lei created FLINK-28614:
----------------------------------
Summary: Empty local state folders not cleanup on retrieving local
state
Key: FLINK-28614
URL: https://issues.apache.org/jira/browse/FLINK-28614
Project: Flink
Issue Type: Bug
Components: Runtime / Coordination
Affects Versions: 1.15.1, 1.15.0, 1.16.0
Reporter: Yanfei Lei
Fix For: 1.16.0
Flink creates a local checkpoint directory when it tries to load a
{{TaskStateSnapshot}} from disk. That directory is not deleted when
{{tryLoadTaskStateSnapshotFromDisk()}} returns, even if no
{{TaskStateSnapshot}} exists in it.
{code:java}
File getTaskStateSnapshotFile(long checkpointId) {
    final File checkpointDirectory =
            localRecoveryConfig
                    .getLocalStateDirectoryProvider()
                    .orElseThrow(
                            () -> new IllegalStateException("Local recovery must be enabled."))
                    .subtaskSpecificCheckpointDirectory(checkpointId);

    if (!checkpointDirectory.exists() && !checkpointDirectory.mkdirs()) {
        throw new FlinkRuntimeException(
                String.format(
                        "Could not create the checkpoint directory '%s'",
                        checkpointDirectory));
    }

    return new File(checkpointDirectory, TASK_STATE_SNAPSHOT_FILENAME);
}
{code}
As a result, the empty checkpoint folder under {{localState}} remains after failover.
Here is an example in which {{chk_2}} is left behind empty after the job is restored from checkpoint 2:
{code:java}
41854 [flink-akka.actor.default-dispatcher-8] INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job
35644df535ca04613d6a6116dcfcfd59 from Checkpoint 2 @ 1658292943408 for
35644df535ca04613d6a6116dcfcfd59 located at
file:/var/folders/4n/q3r37vws2f910rt_f469kwg00000gn/T/junit1426665332205293555/junit63847204117629783/35644df535ca04613d6a6116dcfcfd59/chk-2.
_______________________________________
directory of localState
_______________________________________
tm_2
│   ├── blobStorage
│   ├── localState
│   │   └── aid_6df21e53ca06ea69ee0643d25d27dbee
│   │       └── jid_35644df535ca04613d6a6116dcfcfd59
│   │           └── vtx_0a448493b4782967b150582570326227_sti_1
│   │               ├── chk_2
│   │               └── chk_5
│   │                   ├── _task_state_snapshot
│   │                   ├── edab98058083464a9ca29b6d7a950c68
│   │                   │   ├── 000014.sst
│   │                   │   ├── 000015.sst
│   │                   │   ├── 000022.sst
│   │                   │   ├── 000023.sst
│   │                   │   ├── CURRENT
│   │                   │   ├── MANIFEST-000018
│   │                   │   └── OPTIONS-000021
│   │                   └── f3724ae6-fd24-4e9a-80a8-02aa34bca0f0
{code}
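One possible direction for a fix, given only as a rough sketch and not as the actual Flink change, is to keep the read path free of side effects: resolve the snapshot file without calling {{mkdirs()}}, and create the directory only on the write path. The class {{LocalSnapshotFiles}} and its methods are hypothetical names invented for this illustration. The file name {{_task_state_snapshot}} is taken from the directory listing above and is assumed to match {{TASK_STATE_SNAPSHOT_FILENAME}}.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Optional;

/** Hypothetical helper (not part of Flink): separates path resolution from directory creation. */
final class LocalSnapshotFiles {
    // Assumption: same name as the file shown under chk_5 in the listing above.
    static final String SNAPSHOT_FILENAME = "_task_state_snapshot";

    private LocalSnapshotFiles() {}

    /** Read path: builds the snapshot file path without touching the file system. */
    static File resolveForRead(File checkpointDirectory) {
        return new File(checkpointDirectory, SNAPSHOT_FILENAME);
    }

    /** Write path: creates the checkpoint directory if it is missing, then builds the path. */
    static File resolveForWrite(File checkpointDirectory) throws IOException {
        Files.createDirectories(checkpointDirectory.toPath());
        return new File(checkpointDirectory, SNAPSHOT_FILENAME);
    }

    /** Returns the snapshot bytes if the file exists; never creates the directory. */
    static Optional<byte[]> tryLoad(File checkpointDirectory) throws IOException {
        File snapshotFile = resolveForRead(checkpointDirectory);
        if (!snapshotFile.isFile()) {
            // Nothing stored locally for this checkpoint: leave the file system untouched.
            return Optional.empty();
        }
        return Optional.of(Files.readAllBytes(snapshotFile.toPath()));
    }
}
```

With this split, a failed lookup for a checkpoint such as {{chk_2}} would leave no empty folder behind. An alternative would be to keep the current method and delete the directory again when the snapshot file turns out to be missing.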
cc: [~trohrmann] , [~masteryhx]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)