[ https://issues.apache.org/jira/browse/FLINK-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561122#comment-14561122 ]
ASF GitHub Bot commented on FLINK-2004:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/674#issuecomment-105951860

    I have those unit-test-style tests all in the KafkaITCase because they depend on the testing clusters started for the test. They all need at least a running ZooKeeper instance.
    I can of course put them into a different class, but this will further slow down our tests because we spend more time starting and stopping ZooKeeper.

    I've reworked the restore to reflect the open() / restoreState() order. The PR has been updated.


> Memory leak in presence of failed checkpoints in KafkaSource
> ------------------------------------------------------------
>
>                 Key: FLINK-2004
>                 URL: https://issues.apache.org/jira/browse/FLINK-2004
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Robert Metzger
>            Priority: Critical
>             Fix For: 0.9
>
>
> Checkpoints that fail never send a commit message to the tasks.
> Maintaining a map of all pending checkpoints introduces a memory leak, as
> entries for failed checkpoints will never be removed.
> Approaches to fix this:
>   - The source cleans up entries from older checkpoints once a checkpoint is
> committed (simple implementation in a linked hash map)
>   - The commit message could include the optional state handle (source need
> not maintain the map)
>   - The checkpoint coordinator could send messages for failed checkpoints?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
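
For illustration, the first approach in the issue description (dropping entries for older checkpoints once a later checkpoint is committed) could look roughly like the sketch below. This is not the code from the pull request. The class name PendingCheckpoints and its methods are hypothetical, and the sketch assumes that checkpoints are registered in ascending ID order, so that the insertion order of the linked hash map matches checkpoint age.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: tracks state for checkpoints that
// have been taken but not yet committed.
final class PendingCheckpoints<S> {

    // LinkedHashMap iterates in insertion order. Assumption: checkpoints are
    // added in ascending ID order, so iteration goes from oldest to newest.
    private final LinkedHashMap<Long, S> pending = new LinkedHashMap<>();

    void add(long checkpointId, S state) {
        pending.put(checkpointId, state);
    }

    // Returns the state stored for the committed checkpoint and removes it
    // together with every older entry. Older entries belong to checkpoints
    // that failed or were superseded and would otherwise never be removed.
    // Returns null (and removes nothing) if the ID is unknown.
    S commit(long checkpointId) {
        if (!pending.containsKey(checkpointId)) {
            return null;
        }
        Iterator<Map.Entry<Long, S>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, S> entry = it.next();
            long id = entry.getKey();
            S state = entry.getValue();
            it.remove();
            if (id == checkpointId) {
                return state;
            }
        }
        return null; // not reached, because the key was present
    }

    int size() {
        return pending.size();
    }
}
```

With checkpoints 1, 2 and 3 pending, committing checkpoint 2 returns its state and also discards the entry for checkpoint 1, so a checkpoint that never receives a commit message no longer stays in the map indefinitely.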