[ https://issues.apache.org/jira/browse/FLINK-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315371#comment-17315371 ]
Till Rohrmann commented on FLINK-21999:
---------------------------------------

I think the current behaviour is the following: the {{CheckpointCoordinator}} is not started only if {{CheckpointConfig == null}}. This is currently only the case if the user uses the {{ExecutionEnvironment}} (e.g. the {{DataSet}} API).

When using the {{StreamExecutionEnvironment}}, the {{CheckpointConfig}} is always set. This will always instantiate the {{CheckpointCoordinator}} and also configure the state backends. Whether the {{CheckpointCoordinator}} triggers periodic checkpoints or not is controlled by the interval. That is also what happens if you call {{CheckpointConfig.disableCheckpointing}}: it does not disable the instantiation of the {{CheckpointCoordinator}}, only the periodic checkpoint triggering.

With the introduction of the BATCH execution mode for streaming jobs, the separation has been blurred further. Here we want to instantiate the state backends but don't want to enable periodic checkpointing. That's why the {{StreamGraphGenerator}} disables periodic checkpointing.

I think the underlying problem is that the state backend configuration is coupled with the {{CheckpointCoordinator}} configuration, which should no longer be the case due to the BATCH execution mode. But also keep in mind that the {{CheckpointCoordinator}} is needed for executing savepoint requests from the user. Hence, the checkpoint interval alone is not sufficient to decide whether to instantiate the {{CheckpointCoordinator}} or not.

Given that this seems to be a bit more involved, I would suggest not making this change at the moment, or at least starting with a proper proposal for how to configure which component is enabled and when.

> The logic about whether Checkpoint is enabled.
> ----------------------------------------------
>
>                 Key: FLINK-21999
>                 URL: https://issues.apache.org/jira/browse/FLINK-21999
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>            Reporter: ZhangWei
>            Assignee: ZhangWei
>            Priority: Major
>              Labels: pull-request-available
>
> org.apache.flink.runtime.executiongraph.DefaultExecutionGraphBuilder#isCheckpointingEnabled
> assumes checkpointing is enabled when JobCheckpointingSettings is not null.
> This is not enough: we must also guarantee that the checkpoint interval is in
> [MINIMAL_CHECKPOINT_TIME, Long.MaxValue), as JobGraph#isCheckpointingEnabled does.
> In the current implementation, when we do not set the checkpoint interval and
> leave it at the default value -1, the interval is changed to Long.MaxValue. Thus
> DefaultExecutionGraphBuilder#isCheckpointingEnabled returns true, which is
> not correct.
> In addition, different classes assume checkpointing is enabled for
> different interval ranges:
> 1. CheckpointConfig -> (0, Long.MaxValue]
> 2. JobGraph -> (0, Long.MaxValue)
> This is not consistent. The correct range is [MINIMAL_CHECKPOINT_TIME,
> Long.MaxValue).
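
To make the distinction in this thread concrete, here is a minimal Java sketch that separates the two questions: whether the interval enables periodic triggering (the range check proposed in the issue), and whether a coordinator should exist at all (which, per the comment above, also depends on savepoints and therefore only on whether checkpoint settings are present). This is not Flink code. The class name, the {{Settings}} type, both method names, and the concrete value of MINIMAL_CHECKPOINT_TIME are assumptions made for illustration.

```java
/** Hypothetical sketch; this is not Flink's actual implementation. */
final class CheckpointIntervalSketch {

    // Assumed lower bound in milliseconds. The issue names the constant
    // MINIMAL_CHECKPOINT_TIME; the value used here is an assumption.
    static final long MINIMAL_CHECKPOINT_TIME = 10L;

    /** Hypothetical stand-in for the job's checkpointing settings. */
    static final class Settings {
        final long intervalMillis;

        Settings(long intervalMillis) {
            this.intervalMillis = intervalMillis;
        }
    }

    /**
     * Range check proposed in the issue: periodic checkpointing counts as
     * enabled only for intervals in [MINIMAL_CHECKPOINT_TIME, Long.MAX_VALUE).
     */
    static boolean isPeriodicCheckpointingEnabled(long intervalMillis) {
        return intervalMillis >= MINIMAL_CHECKPOINT_TIME
                && intervalMillis < Long.MAX_VALUE;
    }

    /**
     * Point raised in the comment: the coordinator also serves savepoint
     * requests, so its creation depends on the settings being present,
     * not on the interval.
     */
    static boolean shouldCreateCoordinator(Settings settingsOrNull) {
        return settingsOrNull != null;
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE models "periodic triggering disabled", as described in the issue.
        Settings disabledInterval = new Settings(Long.MAX_VALUE);
        System.out.println("coordinator: " + shouldCreateCoordinator(disabledInterval));
        System.out.println("periodic: "
                + isPeriodicCheckpointingEnabled(disabledInterval.intervalMillis));
    }
}
```

Keeping the two predicates separate reflects the concern that the interval alone cannot decide whether the coordinator is instantiated; how the state backend configuration should be decoupled is left open here, as it is in the comment.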