Github user senorcarbone commented on the issue:

https://github.com/apache/flink/pull/1668

Ok, so I am progressing this a bit independently from the termination stuff, and then we rebase onto whichever PR gets merged first. I just changed everything and rebased to the current master. Some notable changes:

- The `StreamIterationCheckpointingITCase` is now deterministic: it fails once, after the first successful checkpoint, and the job stops after everything has been recovered appropriately.
- I am now using `ListState`, which is supposed to work like a charm with the RocksDB file backend. Note that with the default in-memory backend there is a high chance of running into issues, given the low memory capacity it gets by default.
- One tricky part that could potentially be done better is the way I set the logger in the `StreamIterationHead` (I had to change the head operator field access to `protected` in the `OperatorChain`).

Whenever you find time, go ahead and check it out. It passes my super-strict test, which is a good thing. :)