[ https://issues.apache.org/jira/browse/FLINK-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849589#comment-16849589 ]
Congxian Qiu(klion26) commented on FLINK-12619:
-----------------------------------------------

[~aljoscha] thanks for your reply.

For this issue, I want to take the following steps (most of them will reuse the code of FLINK-11458):
* add the required functions and RPCs in JM and TM for this issue
** {{CheckpointCoordinator#triggerSynchronousCheckpoint}} (aligned with {{triggerSynchronousSavepoint}})
** {{SchedulerNG#stopWithCheckpoint}} (aligned with {{stopWithSavepoint}})
** a new {{CheckpointType}} named {{SYNC_CHECKPOINT}} (aligned with {{SYNC_SAVEPOINT}})
** {{JobMaster#stopWithCheckpoint}} (aligned with {{stopWithSavepoint}})
** make alignment allow sync checkpoints (currently only sync savepoints are supported)
** the tests needed for this
* expose this in the CLI
** will add an option that takes no parameter (it will reuse the preconfigured checkpoint directory), much like {{CliFrontendParser.STOP_WITH_SAVEPOINT}}
* add a REST API for this
** will add the endpoint, restful gateway, trigger handler, request body and so on (like FLINK-11458)

What do you think?

> Support TERMINATE/SUSPEND Job with Checkpoint
> ---------------------------------------------
>
>                 Key: FLINK-12619
>                 URL: https://issues.apache.org/jira/browse/FLINK-12619
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / State Backends
>            Reporter: Congxian Qiu(klion26)
>            Assignee: Congxian Qiu(klion26)
>            Priority: Major
>
> Inspired by the idea of FLINK-11458, we propose to support terminating/suspending
> a job with a checkpoint. This improvement works together with the incremental and
> external checkpoint features: if checkpoints are retained and this feature
> is configured, we will trigger a checkpoint before the job stops. It could
> accelerate job recovery a lot since:
> 1. No source rewinding is required any more.
> 2. It's much faster than taking a savepoint since incremental checkpointing is
> enabled.
> Please note that conceptually savepoints are different from checkpoints in a
> similar way that backups are different from recovery logs in traditional
> database systems. So we suggest using this feature only for job recovery,
> while sticking with FLINK-11458 for the
> upgrading/cross-cluster-job-migration/state-backend-switch cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)