Well, that's not that easy to do, because checkpoints must be coordinated and triggered by the JobManager. Also, the checkpointing mechanism with flowing checkpoint barriers (to ensure checkpoint consistency) won't work once a task has failed, because the failed task cannot continue processing and forwarding barriers. If the task failed with an OutOfMemoryError (OOME), the whole JVM is gone anyway. I don't think it is possible to take something like a consistent rescue checkpoint in case of a failure.
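To make the barrier argument a bit more concrete, here is a toy model in Python. It is not Flink code: the `Task` class and the `trigger_checkpoint` function are hypothetical names made up for this illustration, and the pipeline is simplified to a single linear chain. The only point it tries to show is that a checkpoint is complete only if every task has seen the barrier, and that a failed task neither snapshots its state nor forwards the barrier.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for one operator in a linear pipeline."""
    name: str
    state: int = 0
    failed: bool = False

    def on_barrier(self, checkpoint_id):
        """Snapshot local state, or return None if the task has failed."""
        if self.failed:
            # A failed task can neither snapshot nor forward the barrier.
            return None
        return (self.name, self.state)


def trigger_checkpoint(pipeline, checkpoint_id):
    """Toy coordinator: inject a barrier at the source and collect snapshots.

    Returns (complete, snapshots). The checkpoint is complete only if
    every task in the pipeline received the barrier and took a snapshot.
    """
    snapshots = {}
    for task in pipeline:  # the barrier flows from source to sink
        snap = task.on_barrier(checkpoint_id)
        if snap is None:
            break  # the barrier is lost here; downstream tasks never see it
        snapshots[snap[0]] = snap[1]
    complete = len(snapshots) == len(pipeline)
    return complete, snapshots


if __name__ == "__main__":
    pipeline = [Task("source", 3), Task("map", 5), Task("sink", 7)]
    print(trigger_checkpoint(pipeline, 1))  # all tasks healthy
    pipeline[1].failed = True
    print(trigger_checkpoint(pipeline, 2))  # barrier stops at the failed task
```

In this sketch the second call only captures the state of the tasks upstream of the failure, which is why a "rescue checkpoint" taken after a failure would not be consistent.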
It might be possible to checkpoint the application state of the non-failed tasks, but this would result in data loss for the failed task, and we would need to weigh the use cases for such a feature against the implementation effort.
Maybe there are better ways to address such use cases.

Best, Fabian

2018-03-20 6:43 GMT+01:00 makeyang <riverbuild...@hotmail.com>:

> currently there is only time based way to trigger a checkpoint. based on this
> discussion, I think flink need to introduce event based way to trigger
> checkpoint such as restart a task manager should be count as a event.
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/