Stephan Ewen created FLINK-8531:
-----------------------------------
Summary: Support separation of "Exclusive", "Shared" and "Task
owned" state
Key: FLINK-8531
URL: https://issues.apache.org/jira/browse/FLINK-8531
Project: Flink
Issue Type: Sub-task
Components: State Backends, Checkpointing
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Fix For: 1.5.0
Currently, all state created at a certain checkpoint goes into the directory
{{chk-id}}.
With incremental checkpointing, some state is shared across checkpoint and is
referenced by newer checkpoints. That way, old {{chk-id}} directories stay
around, containing some shared chunks. That makes it both for users and cleanup
hooks hard to determine when a {{chk-x}} directory could be deleted.
The same holds for state that can only every be dropped by certain operators on
the TaskManager, never by the JobManager / CheckpointCoordinator. Examples of
that state are write ahead logs, which need to be retained until the move to
the target system is complete, which may in some cases be later then when the
checkpoint that created them is disposed.
I propose to introduce different scopes for tasks:
- **EXCLUSIVE** is for state that belongs to one checkpoint only
- **SHARED** is for state that is possibly part of multiple checkpoints
- **TASKOWNED** is for state that must never by dropped by the JobManager.
For file based checkpoint targets, I propose that we have the following
directory layout:
{code}
/user-defined-checkpoint-dir
|
+ --shared/
+ --taskowned/
+ --chk-00001/
+ --chk-00002/
+ --chk-00003/
...
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)