[ https://issues.apache.org/jira/browse/FLINK-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhijiang closed FLINK-18336.
----------------------------
    Resolution: Fixed

Merged in master: 06af98a6e0247dbf1b08f5caf6ca67f2a56dd8e5

> CheckpointFailureManager forgets failed checkpoints after a successful one
> --------------------------------------------------------------------------
>
>                 Key: FLINK-18336
>                 URL: https://issues.apache.org/jira/browse/FLINK-18336
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>            Reporter: Roman Khachatryan
>            Assignee: Roman Khachatryan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>
> To my understanding, failure shouldn't be counted more than once for a single checkpoint.
> However, after a successful checkpoint, all previous failures are cleared.
> So this test will currently fail:
>
> {code:java}
> TestFailJobCallback callback = new TestFailJobCallback();
> CheckpointFailureManager failureManager = new CheckpointFailureManager(2, callback);
> failureManager.handleJobLevelCheckpointException(new CheckpointException(CHECKPOINT_EXPIRED), 1L);
> failureManager.handleJobLevelCheckpointException(new CheckpointException(CHECKPOINT_EXPIRED), 2L);
> failureManager.handleCheckpointSuccess(2L);
> failureManager.handleJobLevelCheckpointException(new CheckpointException(CHECKPOINT_EXPIRED), 3L);
> failureManager.handleJobLevelCheckpointException(new CheckpointException(CHECKPOINT_EXPIRED), 4L);
> // shouldn't be counted because 1L has already failed:
> failureManager.handleJobLevelCheckpointException(new CheckpointException(CHECKPOINT_EXPIRED), 1L);
> assertEquals(0, callback.getInvokeCounter());{code}
>
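
For illustration only, here is a minimal sketch of one way to get the behaviour the report asks for. It is not the merged change and does not use Flink's classes. DedupFailureCounter and all of its members are hypothetical names. The sketch assumes two rules: each checkpoint ID is counted at most once, and a failure reported for an ID at or below the last successful checkpoint is ignored as stale. It also assumes the job is failed only when the count exceeds the tolerance, as the quoted test implies.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified stand-in for a checkpoint failure counter (not Flink's class). */
public class DedupFailureCounter {
    private final int tolerableFailures;
    // IDs that have already contributed to the current failure count.
    private final Set<Long> countedCheckpointIds = new HashSet<>();
    private long lastSucceededCheckpointId = Long.MIN_VALUE;
    private int continuousFailures = 0;
    private int failJobInvocations = 0;

    public DedupFailureCounter(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    public void handleFailure(long checkpointId) {
        // Assumption: a late failure for a checkpoint that a newer success
        // has already superseded says nothing about the current state.
        if (checkpointId <= lastSucceededCheckpointId) {
            return;
        }
        // Set.add returns false if the ID was already present, so a repeated
        // report for the same checkpoint does not increase the count.
        if (countedCheckpointIds.add(checkpointId)) {
            continuousFailures++;
        }
        if (continuousFailures > tolerableFailures) {
            failJobInvocations++; // stands in for the "fail job" callback
            continuousFailures = 0;
            countedCheckpointIds.clear();
        }
    }

    public void handleSuccess(long checkpointId) {
        // Only a newer success resets the count; remembering its ID is what
        // lets handleFailure discard stale reports afterwards.
        if (checkpointId > lastSucceededCheckpointId) {
            lastSucceededCheckpointId = checkpointId;
            continuousFailures = 0;
            countedCheckpointIds.clear();
        }
    }

    public int getFailJobInvocations() {
        return failJobInvocations;
    }
}
```

Replaying the sequence from the report against this sketch (failures for 1 and 2, success for 2, failures for 3 and 4, then a late failure for 1) leaves the count at two with a tolerance of two, so the stand-in callback is never invoked.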