Hi All,

We are using Flink 1.7.2 with checkpointing enabled, RocksDB configured as the state backend, and checkpoints retained on job cancellation. In our scenario we cancel the job and, when resubmitting it, try to restore it from the latest available checkpoint or savepoint. We are seeing inconsistent behavior depending on how the job is cancelled. The observations we captured are listed below.
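
For reference, a setup of this kind can be sketched roughly as follows. This is a minimal illustration and not our actual job: the class name, the checkpoint URI, the 60-second interval, and the trivial pipeline are all placeholders, and it assumes the Flink 1.7 streaming and RocksDB state backend dependencies are on the classpath. The limit of one retained checkpoint is not set in code; it corresponds to `state.checkpoints.num-retained: 1` in flink-conf.yaml.

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.StateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical job class, used only to illustrate the checkpoint settings.
public class RetainedCheckpointJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder interval: trigger a checkpoint every 60 seconds.
        env.enableCheckpointing(60_000L);

        // Keep the externalized checkpoint when the job is cancelled.
        env.getCheckpointConfig()
           .enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // RocksDB state backend; the URI is a placeholder, and the second
        // argument (incremental checkpoints) is an assumption.
        StateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints", true);
        env.setStateBackend(backend);

        // Number of retained checkpoints comes from flink-conf.yaml:
        //   state.checkpoints.num-retained: 1

        // Trivial placeholder pipeline so the sketch is a complete program.
        env.fromElements(1, 2, 3).print();
        env.execute("retained-checkpoint-sketch");
    }
}
```

With such a setup, "cancel with savepoint" refers to `flink cancel -s [targetDirectory] <jobID>`, and "cancel without savepoint" to plain `flink cancel <jobID>`.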
Observations:

1. When we cancel the job with the savepoint option, a savepoint is created as expected, but Flink deletes the latest checkpoint directory of the running job. Is this expected behavior even though the configuration asks to retain checkpoints on job cancellation?
2. When we cancel the job without the savepoint option, that latest checkpoint is retained by Flink, unlike in the first case, where it was deleted.

Since we have configured Flink to retain only a single checkpoint at any point in time, could there be a problem if we cancel the job with a savepoint and the savepoint is triggered but fails midway? We would then end up with an incomplete savepoint and no trace of a checkpoint for the job, because it would already have been erased.

Thanks,
Parth Sarathy