Hi,

We are running some experiments with GraphX in order to compare it with other 
systems. There are multiple settings which significantly affect performance, 
and we experimented a lot in order to tune them well. I'll share the best 
configuration we have found so far and the results we got with it, and would 
really appreciate it if anyone who has used GraphX before could advise on what 
else might improve performance, or confirm that these results are as good as 
it gets.

The algorithms we used are PageRank and connected components. We used the 
Twitter and UK graphs from the GraphX paper 
(https://amplab.cs.berkeley.edu/wp-content/uploads/2014/09/graphx.pdf), and 
also generated graphs with properties similar to the Facebook social graph, 
with varying numbers of edges. Apart from performance, we also tried to 
determine the minimum amount of resources GraphX needs to handle a graph of a 
given size.
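
For context, each run is a plain GraphX invocation, roughly like the sketch 
below (the file path, partition count, and generator parameters are 
placeholders rather than our exact values):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}
    import org.apache.spark.graphx.util.GraphGenerators

    val sc = new SparkContext(new SparkConf().setAppName("graphx-bench"))

    // Twitter/UK graphs are loaded from edge-list files; the Facebook-like
    // graphs come from a generator with a skewed degree distribution, e.g.
    // GraphGenerators.logNormalGraph(sc, numVertices = 100000000)
    val graph = GraphLoader
      .edgeListFile(sc, "hdfs:///graphs/twitter-2010.txt",
        numEdgePartitions = 64)
      .partitionBy(PartitionStrategy.EdgePartition2D)

    val ranks = graph.staticPageRank(20)          // 20 pagerank iterations
    val components = graph.connectedComponents()  // connected components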

We ran the experiments using Spark 1.6.1, on machines with 20 cores and 2-way 
SMT, always fixing the number of executors (min = max = initial), giving 40GB 
or 80GB per executor, and making sure we run only a single executor per 
machine. Additionally we used the following (a code sketch with these settings 
follows the list):

  *   spark.shuffle.manager=hash, spark.shuffle.service.enabled=false
  *   Parallel GC
  *   PartitionStrategy.EdgePartition2D
  *   8*numberOfExecutors partitions
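
In code, these settings translate to roughly the following (a sketch; the 
executor count and memory are example values, which we varied per experiment 
as described below):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("graphx-bench")
      // hash-based shuffle, external shuffle service disabled
      .set("spark.shuffle.manager", "hash")
      .set("spark.shuffle.service.enabled", "false")
      // one executor per machine: 20 cores, 40g or 80g heap
      .set("spark.executor.cores", "20")
      .set("spark.executor.memory", "80g")
      .set("spark.executor.instances", "4")  // fixed: min = max = initial
      // parallel GC on the executors
      .set("spark.executor.extraJavaOptions", "-XX:+UseParallelGC")
    val sc = new SparkContext(conf)

    // 8 * numberOfExecutors edge partitions, EdgePartition2D strategy
    val numEdgePartitions = 8 * conf.get("spark.executor.instances").toInt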

Here are some of the data points we got:

  *   On a Facebook-like graph with 2 billion edges, using 4 executors with 
80GB each, it took 451 seconds to run 20 iterations of PageRank and 236 
seconds to find connected components. It failed when we tried 2 executors, or 
4 executors with 40GB each.
  *   For a graph with 10 billion edges we needed 16 executors with 80GB each 
(it failed with 8): 1041 seconds for 20 iterations of PageRank and 716 seconds 
for connected components.
  *   Twitter-2010 graph (1.5 billion edges), 8 executors with 40GB each: 
PageRank 473s, connected components 264s. With 4 executors at 80GB each it 
worked but struggled (pr 2475s, cc 4499s); with 8 executors at 80GB each: pr 
362s, cc 255s.

One more thing: we were not able to reproduce what the paper says about fault 
tolerance (section 5.2). If we kill an executor during the first few 
iterations, the job recovers successfully, but if an executor is killed in 
later iterations, reconstructing each iteration takes exponentially longer, 
and the job does not finish even after running for a few hours. Are there 
additional parameters we need to set in order for this to work?
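
For example, should we be checkpointing intermediate graphs periodically, so 
that a lost partition is rebuilt from the last checkpoint rather than by 
replaying the whole lineage? Something like the sketch below is what we have 
in mind (initialGraph and oneIteration are hypothetical stand-ins for the 
algorithm's state and update step, not real GraphX API):

    // Hypothetical sketch: checkpoint every few iterations so recovery
    // replays a bounded lineage instead of the full history.
    sc.setCheckpointDir("hdfs:///tmp/graphx-ckpt")  // placeholder path

    var g = initialGraph              // hypothetical starting graph
    for (i <- 1 to 20) {
      g = oneIteration(g)             // hypothetical per-iteration step
      g.cache()
      if (i % 5 == 0) g.checkpoint()  // Graph.checkpoint truncates lineage
    }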

Any feedback would be highly appreciated!

Thank you,
Maja
