Elias Bassani created SPARK-19124:
--------------------------------------

             Summary: GraphX PageRank execution time
                 Key: SPARK-19124
                 URL: https://issues.apache.org/jira/browse/SPARK-19124
             Project: Spark
          Issue Type: Question
          Components: GraphX, Spark Shell
    Affects Versions: 2.0.2
            Reporter: Elias Bassani 


Hello, I am not sure this is the right place to ask, but any help would be 
appreciated.

I need to run PageRank on a very large graph (Wikipedia's graph: 400 million 
edges, 12 million vertices), but I am hitting an execution-time problem: after 
10+ iterations of the algorithm, the time per iteration rises abnormally, from 
about 10 minutes to dozens of hours: https://d.pr/svBR.

My code is very simple and is taken directly from the GraphX documentation.
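
For reference, the PageRank example in the GraphX documentation has roughly the 
shape below. This is only a sketch, not the exact code that was run: the file 
path is hypothetical, the tolerance value is an assumption carried over from 
the documentation example, and it is meant to be pasted into spark-shell, where 
`sc` is already defined.

```scala
// Sketch only: paste into spark-shell, where `sc` (the SparkContext) is predefined.
import org.apache.spark.graphx.{Graph, GraphLoader}

// Hypothetical path; the post does not say where the edge list is stored.
val edgeListPath = "data/wikipedia-edges.txt"

// edgeListFile expects one "srcId dstId" pair per line; lines starting with '#' are skipped.
val graph: Graph[Int, Int] = GraphLoader.edgeListFile(sc, edgeListPath)

// Run PageRank until convergence. The tolerance 0.0001 is an assumed value
// from the documentation example, not a setting reported in this post.
val ranks = graph.pageRank(0.0001).vertices

// Print the ten highest-ranked vertex ids with their scores.
ranks.sortBy(_._2, ascending = false).take(10).foreach(println)
```

GraphX also offers `graph.staticPageRank(numIter)`, which runs a fixed number 
of iterations instead of iterating to a tolerance; the post does not say which 
of the two variants was used.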

The machine has two Intel Xeon E5-2697 v3 CPUs, 64 GB of RAM, and a 500 GB hard 
disk, and it runs Windows Server 2012 R2 Standard.

I allocated 8 cores and 50 GB of RAM to Spark when invoking spark-shell from 
the command line.

What could the problem be? 

Thanks for any help!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

