Some questions in using Graphx

2014-04-22 Thread wu zeming
Hi all, I am using GraphX in spark-0.9.0-incubating. Our graph can have 100 million vertices and 1 billion edges, so I must use my limited memory carefully. I have some questions about the GraphX module. Why do some transformations like partitionBy,

Re: Some questions in using Graphx

2014-04-22 Thread Ankur Dave
These are excellent questions. Answers below:

On Tue, Apr 22, 2014 at 8:20 AM, wu zeming zemin...@gmail.com wrote:
1. Why do some transformations, like partitionBy and mapVertices, cache the new graph while others, like outerJoinVertices, do not?

In general, we cache RDDs that are used more than once to

Re: Some questions in using Graphx

2014-04-22 Thread Wu Zeming
I changed the StorageLevel.

- Wu Zeming

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Some-questions-in-using-Graphx-tp4604p4634.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.