[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Congxian Qiu(klion26) updated FLINK-5763:
-----------------------------------------
    Release Note: After FLINK-5763, savepoints are self-contained and relocatable, so users can migrate a savepoint from one place to another without any additional manual processing. This feature is currently not supported when entropy injection is enabled.

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: Ufuk Celebi
>            Assignee: Congxian Qiu(klion26)
>            Priority: Critical
>              Labels: pull-request-available, usability
>             Fix For: 1.11.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After a user has triggered a savepoint, a single savepoint file will be
> returned as a handle to the savepoint. A savepoint to {{}} creates a
> savepoint file like {{/savepoint-}}.
> This file contains the metadata of the corresponding checkpoint, but not the
> actual program state. While this works well for short-term management
> (pause-and-resume a job), it makes it hard to manage savepoints over longer
> periods of time.
> h4. Problems
> h5. Scattered Checkpoint Files
> For file system based checkpoints (FsStateBackend, RocksDBStateBackend) this
> results in the savepoint referencing files from the checkpoint directory
> (usually different than ). For users, it is virtually impossible to
> tell which checkpoint files belong to a savepoint and which are lingering
> around. This can easily lead to accidentally invalidating a savepoint by
> deleting checkpoint files.
> h5. Savepoints Not Relocatable
> Even if a user is able to figure out which checkpoint files belong to a
> savepoint, moving these files will invalidate the savepoint as well, because
> the metadata file references absolute file paths.
> h5. Forced to Use CLI for Disposal
> Because of the scattered files, the user is in practice forced to use Flink’s
> CLI to dispose of a savepoint. This should be possible to handle in the scope of
> the user’s environment via a file system delete operation.
> h4. Proposal
> In order to solve the described problems, savepoints should contain all their
> state, both metadata and program state, inside a single directory.
> Furthermore, the metadata must only hold relative references to the checkpoint
> files. This makes it obvious which files make up the state of a savepoint, and
> it becomes possible to move savepoints around by moving the savepoint directory.
> h5. Desired File Layout
> Triggering a savepoint to {{}} creates a directory as follows:
> {code}
> /savepoint--
>   +-- _metadata
>   +-- data- [1 or more]
> {code}
> We include the JobID in the savepoint directory name in order to give some
> hints about which job a savepoint belongs to.
> h5. CLI
> - Trigger: When triggering a savepoint to {{}} the savepoint
> directory will be returned as the handle to the savepoint.
> - Restore: Users can restore by pointing to the directory or the _metadata
> file. The data files should be required to be in the same directory as the
> _metadata file.
> - Dispose: The disposal command should be deprecated and eventually removed.
> While deprecated, disposal can happen by specifying the directory or the
> _metadata file (same as restore).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
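
The proposal in the issue above (all state in one directory, with the metadata holding only relative references) can be illustrated with a small toy model. This is a sketch under stated assumptions, not Flink's implementation: the JSON layout of `_metadata`, the `data-<n>` file names, the directory name `savepoint-demo`, and the helper functions are all hypothetical and chosen only for illustration. It shows that a savepoint written this way survives a directory move, can be addressed either by the directory or by its `_metadata` file, and can be disposed of with a plain recursive delete.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import List


def write_savepoint(savepoint_dir: Path, state_chunks: List[bytes]) -> Path:
    """Write data files plus a _metadata file that lists them by relative name."""
    savepoint_dir.mkdir(parents=True)
    names = []
    for i, chunk in enumerate(state_chunks):
        name = f"data-{i}"  # hypothetical naming; always relative to savepoint_dir
        (savepoint_dir / name).write_bytes(chunk)
        names.append(name)
    # Hypothetical metadata format: only relative file names, no absolute paths.
    (savepoint_dir / "_metadata").write_text(json.dumps({"files": names}))
    return savepoint_dir


def resolve_metadata(handle: Path) -> Path:
    """Accept either the savepoint directory or its _metadata file as the handle."""
    return handle / "_metadata" if handle.is_dir() else handle


def read_savepoint(handle: Path) -> List[bytes]:
    """Load all state, resolving data files relative to the _metadata location."""
    metadata = resolve_metadata(handle)
    base = metadata.parent  # data files must sit next to _metadata
    files = json.loads(metadata.read_text())["files"]
    return [(base / name).read_bytes() for name in files]


def demo() -> List[bytes]:
    root = Path(tempfile.mkdtemp())
    try:
        original = write_savepoint(root / "a" / "savepoint-demo", [b"s0", b"s1"])
        # Relocate the savepoint by moving the whole directory.
        (root / "b").mkdir()
        moved = Path(shutil.move(str(original), str(root / "b" / "savepoint-demo")))
        restored = read_savepoint(moved)
        # Pointing at the _metadata file gives the same result as the directory.
        assert read_savepoint(moved / "_metadata") == restored
        # Disposal is an ordinary file system delete of the directory.
        shutil.rmtree(moved)
        assert not moved.exists()
        return restored
    finally:
        shutil.rmtree(root, ignore_errors=True)
```

Calling `demo()` returns the state chunks read back from the relocated directory. With absolute paths in the metadata, the read after the move would instead point at files that no longer exist, which is the relocation problem the issue describes.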
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-5763:
----------------------------------
    Labels: pull-request-available usability  (was: usability)

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: Ufuk Celebi
>            Priority: Critical
>              Labels: pull-request-available, usability
>             Fix For: 1.11.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-5763:
--------------------------------
    Fix Version/s: 1.11.0

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: Ufuk Celebi
>            Priority: Critical
>             Fix For: 1.11.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-5763:
--------------------------------
    Labels: usability  (was: )

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: Ufuk Celebi
>            Priority: Critical
>              Labels: usability
>             Fix For: 1.11.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-5763:
------------------------------------
    Priority: Critical  (was: Blocker)

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Critical
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-5763:
------------------------------------
    Fix Version/s:     (was: 1.6.0)

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Critical
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-5763:
------------------------------------
    Fix Version/s:     (was: 1.5.0)
                   1.6.0

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.6.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-5763:
------------------------------------
    Fix Version/s:     (was: 1.4.0)
                   1.5.0

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.5.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aljoscha Krettek updated FLINK-5763:
------------------------------------
    Priority: Blocker  (was: Major)

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.4.0
[jira] [Updated] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-5763:
----------------------------------
    Fix Version/s:     (was: 1.3.0)
                   1.4.0

> Make savepoints self-contained and relocatable
> ----------------------------------------------
>
>                 Key: FLINK-5763
>                 URL: https://issues.apache.org/jira/browse/FLINK-5763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>             Fix For: 1.4.0