[ https://issues.apache.org/jira/browse/FLINK-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528078#comment-14528078 ]
ASF GitHub Bot commented on FLINK-1953:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/651#issuecomment-98984867

    Great! I'll integrate it with my new Kafka Source and test everything on a cluster.

> Rework Checkpoint Coordinator
> -----------------------------
>
>                 Key: FLINK-1953
>                 URL: https://issues.apache.org/jira/browse/FLINK-1953
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.9
>
>
> The checkpoint coordinator currently contains no tests and is vulnerable to a
> variety of situations. In particular, I propose to add:
>   - Better configurability of which tasks receive the trigger checkpoint
>     messages, which tasks need to acknowledge the checkpoint, and which tasks
>     need to receive confirmation messages.
>   - Checkpoint timeouts, such that incomplete checkpoints are guaranteed to be
>     cleaned up after a while, regardless of successful checkpoints.
>   - Better sanity checking of messages and fields, to properly handle/ignore
>     messages for old/expired checkpoints, or invalidly routed messages.
>   - Better handling of checkpoint attempts at points where the execution has
>     just failed or is currently being canceled.
>   - A good set of tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)