[ https://issues.apache.org/jira/browse/FLINK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305468#comment-14305468 ]
ASF GitHub Bot commented on FLINK-1442:
---------------------------------------

Github user hsaputra commented on a diff in the pull request:

    https://github.com/apache/flink/pull/344#discussion_r24097257

--- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingMemoryArchivist.scala ---
@@ -31,10 +31,11 @@ trait TestingMemoryArchivist extends ActorLogMessages {

   def receiveTestingMessages: Receive = {
     case RequestExecutionGraph(jobID) =>
-      graphs.get(jobID) match {
-        case Some(executionGraph) => sender ! ExecutionGraphFound(jobID, executionGraph)
-        case None => sender ! ExecutionGraphNotFound(jobID)
+      val executionGraph = getGraph(jobID)
+      if (executionGraph != null) {
--- End diff --

I would like @tillrohrmann to use Option as an alternative to null. In Java land, Guava's Optional could be used to do a similar thing (and I think Optional will be part of Java 8).

> Archived Execution Graph consumes too much memory
> -------------------------------------------------
>
>                 Key: FLINK-1442
>                 URL: https://issues.apache.org/jira/browse/FLINK-1442
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Max Michels
>
> The JobManager archives the execution graphs for later analysis of jobs. The
> graphs may consume a lot of memory.
> The execution edges in all2all connection patterns especially are extremely
> numerous and add up in memory consumption.
> The execution edges connect all parallel tasks. So for an all2all pattern
> between n and m tasks, there are n*m edges. For a parallelism of several
> hundred tasks, this can easily reach 100k objects and more, each with a set
> of metadata.
> I propose the following to solve that:
> 1. Clear all execution edges from the graph (the majority of the memory
> consumers) when it is given to the archiver.
> 2. Keep the map/list of the archived graphs behind a soft reference, so it
> will be removed under memory pressure before the JVM crashes.
> That may remove graphs from the history early, but that is much preferable
> to the JVM crashing, in which case the graphs are lost as well...
> 3. Long term: The graph should be archived somewhere else. Something like
> the History Server used by Hadoop and Hive would be a good idea.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
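The reviewer's suggestion (return an Option instead of null from the lookup) can be sketched as follows. This is a minimal illustration, not the actual Flink archivist code: the names `getGraph` and the `ExecutionGraphFound`/`ExecutionGraphNotFound` strings mirror the diff, but the types are stand-ins.

```scala
// Sketch of the Option-based lookup the review suggests (hypothetical names;
// String stands in for the real ExecutionGraph type).
object ArchivistSketch {
  // Option makes the "graph not found" case explicit in the return type,
  // so callers must handle it; there is no null to forget to check.
  def getGraph(graphs: Map[String, String], jobID: String): Option[String] =
    graphs.get(jobID)

  // Pattern matching replaces the null check from the diff.
  def describe(graphs: Map[String, String], jobID: String): String =
    getGraph(graphs, jobID) match {
      case Some(graph) => s"ExecutionGraphFound($jobID, $graph)"
      case None        => s"ExecutionGraphNotFound($jobID)"
    }
}
```

With Option, the compiler forces the caller to consider the missing-graph case, whereas a null return compiles fine even when the check is forgotten.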
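Proposal 2 in the quoted issue (holding archived graphs behind soft references so they are reclaimed under memory pressure rather than crashing the JVM) could look roughly like this. The class name and the use of String for the graph type are illustrative assumptions, not the Flink implementation:

```scala
import java.lang.ref.SoftReference
import scala.collection.mutable

// Sketch of proposal 2: archived graphs held via SoftReference so the GC
// may clear them when memory runs low. "String" stands in for the real
// ExecutionGraph type; this is not the actual Flink archiver code.
class SoftArchive {
  private val graphs = mutable.Map.empty[String, SoftReference[String]]

  def archive(jobID: String, graph: String): Unit =
    graphs(jobID) = new SoftReference(graph)

  // A cleared SoftReference returns null from get(), so wrap it in Option
  // to surface "already reclaimed" the same way as "never archived".
  def lookup(jobID: String): Option[String] =
    graphs.get(jobID).flatMap(ref => Option(ref.get()))
}
```

The trade-off is exactly the one the issue describes: a graph may disappear from the history early, but only when the alternative would be an OutOfMemoryError that loses everything.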