Hi All, 

I started exploring Spark two months ago. I'm looking for concrete
performance numbers for both Spark and GraphX so that I can decide which one
to use, based on which gives the better performance.

According to the documentation, GraphX runs 10x faster than plain Spark. So I
ran the PageRank algorithm with both:
For Spark I used:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkPageRank.scala
For GraphX I used:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/graphx/LiveJournalPageRank.scala
  

Input data: http://snap.stanford.edu/data/soc-LiveJournal1.html (1 GB in
size)
Number of iterations: 2

Time taken:

Local mode (machine: 8 cores; 16 GB memory; 2.80 GHz Intel i7; executor
memory: 4 GB; number of partitions: 50; number of iterations: 2):

Spark PageRank took -> 21.29 mins
GraphX PageRank took -> 42.01 mins
 
Cluster mode (Ubuntu 12.04; Spark 1.1 / Hadoop 2.4 cluster; 3 workers, 1
driver; 8 cores, 30 GB memory; executor memory: 4 GB; number of edge
partitions: 50, random vertex cut; number of iterations: 2):

Spark PageRank took -> 10.54 mins
GraphX PageRank took -> 7.54 mins
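
To put the timings above in relative terms, here is a minimal sketch of the speedup arithmetic. It assumes the reported values are decimal minutes (not minutes.seconds); the names `timings` and `graphx_speedup` are only illustrative and are not part of Spark or GraphX.

```python
# Reported wall-clock times, copied from the results above.
# Assumption: the values are decimal minutes, not minutes.seconds.
timings = {
    "local": {"spark": 21.29, "graphx": 42.01},
    "cluster": {"spark": 10.54, "graphx": 7.54},
}


def graphx_speedup(spark_minutes, graphx_minutes):
    """Return how many times faster GraphX was than Spark (< 1 means slower)."""
    return spark_minutes / graphx_minutes


# One ratio per run mode.
speedups = {
    mode: graphx_speedup(t["spark"], t["graphx"]) for mode, t in timings.items()
}

for mode, ratio in speedups.items():
    print(f"{mode}: GraphX speedup over Spark = {ratio:.2f}x")
```

Under that assumption, GraphX comes out at roughly half the speed of Spark in local mode and about 1.4x faster on the cluster, so neither run is close to the 10x figure.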


Could you please help me determine when to use Spark and when to use GraphX?
If GraphX takes about the same amount of time as Spark, then it seems better
to use Spark, because Spark has a variety of operators for dealing with any
type of RDD.

Any suggestions, feedback, or pointers would be highly appreciated.

Thanks,

-----
--Harihar