[ https://issues.apache.org/jira/browse/SPARK-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankur Dave resolved SPARK-2025.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 1.1.0
                   1.0.1

> EdgeRDD persists after pregel iteration
> ---------------------------------------
>
>                 Key: SPARK-2025
>                 URL: https://issues.apache.org/jira/browse/SPARK-2025
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.0.0, 1.0.1
>         Environment: RHEL6 on local and on Spark cluster
>            Reporter: Tim Weninger
>            Assignee: Ankur Dave
>              Labels: Pregel
>             Fix For: 1.0.1, 1.1.0
>
>
> Symptoms: During execution of a Pregel script/function, a copy of an
> intermediate EdgeRDD object persists after each iteration, as shown by the
> Storage tab of the Spark WebUI.
> This behaves like a memory leak in the Pregel function.
> For example, after the first iteration I will have an EdgeRDD in addition to
> the EdgeRDD and VertexRDD that are kept for the next iteration. After 15
> iterations I will have 15 EdgeRDDs in addition to the current/correct state
> represented by a single set of 1 EdgeRDD and 1 VertexRDD.
> At the end of a Pregel loop the old EdgeRDD and VertexRDD are unpersisted,
> but there seems to be another EdgeRDD that is created somewhere and does not
> get unpersisted.
> I _think_ this is from the replicateVertex function, but I cannot be sure.
> Update - Ankur Dave says, in comments on SPARK-2011 -
> {quote}
> ... is a bug introduced by https://github.com/apache/spark/pull/497.
> It occurs because unpersistVertices used to unpersist both the vertices and
> the replicated vertices, but after unifying replicated vertices with edges,
> there was no way to unpersist only one of them. I think the solution is just
> to unpersist both the vertices and the edges in Pregel.{quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
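
The bookkeeping behind the symptom described in this issue can be illustrated with a toy model. The sketch below is not Spark or GraphX code: `ToyStorage`, `run_pregel_like_loop`, and the `VertexRDD-i` / `EdgeRDD-i` labels are hypothetical names invented for illustration, and the model assumes that every iteration caches one new VertexRDD and one new EdgeRDD. It only shows why cleaning up the previous vertices alone leaves one stale EdgeRDD per iteration, and why unpersisting both the vertices and the edges leaves just the current pair.

```python
class ToyStorage:
    """Hypothetical stand-in for the Spark WebUI storage tab; not a Spark API."""

    def __init__(self):
        self.persisted = set()

    def persist(self, name):
        self.persisted.add(name)

    def unpersist(self, name):
        # discard() is a no-op if the name was never persisted.
        self.persisted.discard(name)


def run_pregel_like_loop(iterations, unpersist_edges):
    """Return the set of names still persisted after the loop finishes."""
    storage = ToyStorage()
    # Initial graph: one VertexRDD and one EdgeRDD are cached.
    storage.persist("VertexRDD-0")
    storage.persist("EdgeRDD-0")
    for i in range(1, iterations + 1):
        # Assumption: each iteration caches a new graph, and because the
        # replicated vertices live with the edges, a new EdgeRDD as well.
        storage.persist(f"VertexRDD-{i}")
        storage.persist(f"EdgeRDD-{i}")
        # Cleanup of the previous iteration's graph.
        storage.unpersist(f"VertexRDD-{i - 1}")
        if unpersist_edges:
            storage.unpersist(f"EdgeRDD-{i - 1}")
    return storage.persisted


leaky = run_pregel_like_loop(15, unpersist_edges=False)
fixed = run_pregel_like_loop(15, unpersist_edges=True)
# Everything beyond the current VertexRDD/EdgeRDD pair counts as stale.
stale_edges = len(leaky) - 2
print(stale_edges, len(fixed))  # prints "15 2"
```

With `unpersist_edges=False` the model reproduces the reported count of 15 extra EdgeRDDs after 15 iterations; with `unpersist_edges=True` only the current VertexRDD and EdgeRDD remain.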