Re: Caching in graphX

ankurdave Tue, 13 May 2014 05:00:11 -0700

Unfortunately, it's very difficult to get uncaching right with GraphX because
of the complicated internal dependency structure that it creates. You need to
know exactly which operations you're performing on the graph in order to
unpersist correctly (i.e., in a way that avoids recomputation).
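
To make the problem concrete, here is a minimal sketch of a hand-written iterative loop. The object name, the incrementVertices helper, and the toy graph are made up for illustration. It assumes the conservative approach of materializing the new graph first and then releasing only the previous iteration's vertices, leaving the edges cached. Whether unpersisting more than that is safe depends on the operations in your loop.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object ManualUnpersistSketch {
  // Hypothetical helper: adds 1 to every vertex attribute, `rounds` times.
  // Note that it also unpersists the vertices of the graph passed in.
  def incrementVertices(graph: Graph[Int, Int], rounds: Int): Graph[Int, Int] = {
    var g = graph.cache()
    for (_ <- 1 to rounds) {
      val prev = g
      g = prev.mapVertices((_, attr) => attr + 1).cache()
      // Materialize the new graph while its parent is still cached,
      // so that dropping the parent does not trigger recomputation.
      g.vertices.count()
      g.edges.count()
      // Release only the previous vertices; edges are left cached on purpose.
      prev.unpersistVertices(blocking = false)
    }
    g
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("manual-unpersist-sketch")
    val sc = new SparkContext(conf)
    try {
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
      val graph = Graph.fromEdges(edges, defaultValue = 0)
      incrementVertices(graph, 5).vertices.collect().foreach(println)
    } finally sc.stop()
  }
}
```

The ordering is the important part: the new graph is cached and forced before anything from the previous iteration is dropped.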


I have a pull request (https://github.com/apache/spark/pull/497) that may
make this a bit easier, but your best option is to use the Pregel API for
iterative algorithms if possible.
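
For reference, a minimal sketch of an iterative algorithm written against the Pregel API, here single-source shortest paths along the lines of the example in the GraphX programming guide. The object name, the shortestPaths helper, and the toy graph are illustrative assumptions. The point is that the loop, including its intermediate caching, is driven by pregel rather than by user code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object PregelSketch {
  // Hypothetical helper: distance from sourceId to every vertex.
  // VD = Double and ED = Double are both primitive types.
  def shortestPaths(graph: Graph[Double, Double], sourceId: VertexId): Map[VertexId, Double] = {
    val init = graph.mapVertices((id, _) =>
      if (id == sourceId) 0.0 else Double.PositiveInfinity)
    val result = init.pregel(Double.PositiveInfinity)(
      // Vertex program: keep the smaller of the current and incoming distance.
      (_, dist, newDist) => math.min(dist, newDist),
      // Send a message only when it would shorten the destination's distance.
      triplet =>
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else Iterator.empty,
      // Merge messages bound for the same vertex.
      (a, b) => math.min(a, b)
    )
    result.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("pregel-sketch")
    val sc = new SparkContext(conf)
    try {
      val edges = sc.parallelize(
        Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 2.0), Edge(1L, 3L, 5.0)))
      val graph = Graph.fromEdges(edges, defaultValue = 0.0)
      shortestPaths(graph, 1L).toSeq.sortBy(_._1).foreach(println)
    } finally sc.stop()
  }
}
```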

If that's not possible, leaving things cached has not been very costly in my
experience, at least as long as VD and ED (the vertex and edge attribute
types) are primitive types, which reduces the load on the garbage collector.

Ankur



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Caching-in-graphX-tp5482p5514.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
