[ https://issues.apache.org/jira/browse/FLINK-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426559#comment-17426559 ]
Feifan Wang edited comment on FLINK-9465 at 10/10/21, 2:29 AM:
---------------------------------------------------------------

Hi [~trohrmann], I have opened a [pull request|https://github.com/apache/flink/pull/17443] to resolve this, but I think some of the unit tests still need to be completed. Could you take a look at the PR and give me some guidance on the unit tests?


was (Author: feifan wang):
Hi [~trohrmann], I have opened a pull request to resolve this, but I think some of the unit tests still need to be completed. Could you take a look at the PR and give me some guidance on the unit tests?

> Specify a separate savepoint timeout option via CLI
> ---------------------------------------------------
>
>                 Key: FLINK-9465
>                 URL: https://issues.apache.org/jira/browse/FLINK-9465
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Truong Duc Kien
>            Assignee: Feifan Wang
>            Priority: Minor
>              Labels: auto-deprioritized-major, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> A savepoint can take much longer to complete than a checkpoint, especially
> with incremental checkpoints enabled. This causes a couple of problems:
>  * For our job, we currently have to set the checkpoint timeout much larger
> than necessary; otherwise we would be unable to take savepoints.
>  * During rush hour, our cluster encounters a high rate of checkpoint
> timeouts due to backpressure. However, we are unable to migrate to a larger
> configuration, because savepoints also time out.
> In my opinion, the savepoint timeout should be configurable separately,
> both in the config file and as a parameter to the savepoint command.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)