I see the following code toward the end of the method: // Unpersist the RDDs hidden by newly-materialized RDDs oldMessages.unpersist(blocking = false) prevG.unpersistVertices(blocking = false) prevG.edges.unpersist(blocking = false)
Wouldn't the above achieve same effect ? On Sun, Apr 10, 2016 at 9:08 AM, zhang juntao <juntao.zhang...@gmail.com> wrote: > hi experts, > > I’m reporting a problem about spark graphx, I use zeppelin submit spark > jobs, > note that scala environment shares the same SparkContext, SQLContext > instance, > and I call Connected components algorithm to do some Business, > found that every time when the job finished, some graph storage RDDs > weren’t bean released, > after several times there would be a lot of storage RDDs existing even > through all the jobs have finished . > > > So I check the code of connectedComponents and find that may be a problem > in *Pregel.scala* . > when param graph has been cached, there isn’t any way to unpersist, > so I add red font code to solve the problem > > > > > > > > > > > *def apply[VD: ClassTag, ED: ClassTag, A: ClassTag] (graph: Graph[VD, ED], > initialMsg: A, maxIterations: Int = Int.MaxValue, activeDirection: > EdgeDirection = EdgeDirection.Either) (vprog: (VertexId, VD, A) => VD, > sendMsg: EdgeTriplet[VD, ED] => Iterator[(VertexId, A)], mergeMsg: (A, A) > => A) : Graph[VD, ED] ={* > > > * ......* > > > > * var g = graph.mapVertices((vid, vdata) => vprog(vid, vdata, > initialMsg)).cache() graph.unpersistVertices(blocking = false) > graph.edges.unpersist(blocking = false)* > > * ......* > > > *} // end of apply* > > > I'm not sure if this is a bug, and thank you for your time, juntao > > >