[ 
https://issues.apache.org/jira/browse/FLINK-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864068#comment-16864068
 ] 

Yu Li edited comment on FLINK-12619 at 6/14/19 1:32 PM:
--------------------------------------------------------

I wrote a document to organize and better share my thoughts; below is a brief
summary:

# Conceptually, it is worth a second thought whether to introduce an optimized
snapshot format for now (i.e. use the checkpoint format in savepoints), just as
it is not recommended to use snapshots for backup in a database (although in
practice it could be implemented).
# The stop-with-checkpoint mechanism is like stopping a database instance with a
data flush, and is thus (IMHO) a different story from the checkpoint/savepoint
(db snapshot/backup) distinction.
# In the long run we may improve checkpointing to allow an interval short enough
that it becomes a kind of transactional log; then we could enable
checkpoint-based savepoints (like transactional-log-based backup). So I agree
to still call the new format in FLIP-41 a "Unified Format", although in the
short term it only unifies savepoints.

Please take a look and let me know your thoughts, everyone. Thanks!


was (Author: carp84):
I wrote a document to organize and better share my thoughts; below is a brief
summary:

1. Conceptually, it is worth a second thought whether to introduce an optimized
snapshot format for now (i.e. use the checkpoint format in savepoints), just as
it is not recommended to use snapshots for backup in a database (although in
practice it could be implemented).
2. The stop-with-checkpoint mechanism is like stopping a database instance with
a data flush, and is thus (IMHO) a different story from the
checkpoint/savepoint (db snapshot/backup) distinction.
3. In the long run we may improve checkpointing to allow an interval short
enough that it becomes a kind of transactional log; then we could enable
checkpoint-based savepoints (like transactional-log-based backup). So I agree
to still call the new format in FLIP-41 a "Unified Format", although in the
short term it only unifies savepoints.

Please take a look and let me know your thoughts, everyone. Thanks!

> Support TERMINATE/SUSPEND Job with Checkpoint
> ---------------------------------------------
>
>                 Key: FLINK-12619
>                 URL: https://issues.apache.org/jira/browse/FLINK-12619
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / State Backends
>            Reporter: Congxian Qiu(klion26)
>            Assignee: Congxian Qiu(klion26)
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by the idea of FLINK-11458, we propose to support terminating or
> suspending a job with a checkpoint. This improvement works together with the
> incremental and external checkpoint features: if checkpoints are retained and
> this feature is configured, we will trigger a checkpoint before the job stops.
> It could accelerate job recovery a lot since:
> 1. No source rewinding is required any more.
> 2. It's much faster than taking a savepoint since incremental checkpointing is
> enabled.
> Please note that conceptually savepoints are different from checkpoints in a
> similar way that backups are different from recovery logs in traditional
> database systems. So we suggest using this feature only for job recovery,
> while sticking with FLINK-11458 for the
> upgrading/cross-cluster-job-migration/state-backend-switch cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
