Elias Bassani created SPARK-19124:
--------------------------------------

             Summary: GraphX PageRank execution time
                 Key: SPARK-19124
                 URL: https://issues.apache.org/jira/browse/SPARK-19124
             Project: Spark
          Issue Type: Question
          Components: GraphX, Spark Shell
    Affects Versions: 2.0.2
            Reporter: Elias Bassani

Hello,

I don't know if this is the right place to ask, but any help would be great. I have to run PageRank on a really big graph (Wikipedia's graph: 400 million edges, 12 million vertices), but I'm hitting an execution time problem: after 10+ iterations of the algorithm, the time per iteration rises abnormally from 10 minutes to dozens of hours: https://d.pr/svBR.

My code is really simple and is taken directly from the GraphX documentation. The machine has two Intel Xeon E5-2697 v3 CPUs, 64 GB of RAM, and a 500 GB hard disk, and it runs Windows Server 2012 R2 Standard. I allocated 8 cores and 50 GB of RAM to Spark when invoking the Spark shell from the command line.

What could the problem be? Thanks for any help!
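
For context on why a growing per-iteration time looks abnormal: each PageRank iteration scans every edge once, so on a fixed graph the work per iteration should stay roughly constant. The following is a minimal pure-Python sketch of that idea. It is not GraphX code and not the code from this report; the function name `pagerank`, the toy edge list, and the unnormalized update rule are assumptions chosen only for illustration.

```python
import time


def pagerank(edges, num_iter=10, reset_prob=0.15):
    """Run a fixed number of PageRank iterations over (src, dst) pairs.

    Returns the final ranks and the wall-clock seconds of each iteration.
    Assumes every vertex that appears as a source has at least one out-edge,
    which holds by construction for an edge list.
    """
    vertices = {v for edge in edges for v in edge}
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    ranks = {v: 1.0 for v in vertices}
    timings = []
    for _ in range(num_iter):
        start = time.perf_counter()
        # Each iteration scans every edge exactly once, so its cost
        # depends on the edge count, not on the iteration number.
        incoming = {v: 0.0 for v in vertices}
        for src, dst in edges:
            incoming[dst] += ranks[src] / out_degree[src]
        ranks = {
            v: reset_prob + (1.0 - reset_prob) * incoming[v] for v in vertices
        }
        timings.append(time.perf_counter() - start)
    return ranks, timings


# Tiny hypothetical graph: a 3-cycle plus one extra edge.
edges = [(1, 2), (2, 3), (3, 1), (1, 3)]
ranks, timings = pagerank(edges, num_iter=20)
for i, seconds in enumerate(timings, start=1):
    print(f"iteration {i}: {seconds:.6f}s")
```

Timing each iteration separately, as the `timings` list does here, is one way to see whether the slowdown comes from the algorithm's own work or from something that accumulates across iterations.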