I'm working with GraphX to calculate the PageRank of an extremely large social
network with a billion vertices.
As the iteration count increases, each iteration becomes slower, to the point
of being unacceptable. Is there a reason for this?
How can I accelerate the iteration process?

This might be because partitions are getting dropped from memory and then
need to be recomputed. How much memory is in the cluster, and how large
are the partitions? This information should be in the Executors and Storage
pages in the web UI.
Ankur http://www.ankurdave.com/
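
If dropped partitions do turn out to be the cause, one thing that could be tried is loading the graph with a storage level that spills to disk instead of discarding partitions. The following is only a minimal sketch under that assumption, not something taken from the reply above: the object name, the input path argument, the iteration count, and the cachedFraction helper are all made up for illustration. It also prints the same per-RDD numbers that the Storage page shows.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphLoader
import org.apache.spark.storage.StorageLevel

// Hypothetical driver; meant to be launched with spark-submit, which supplies the master URL.
object PageRankStorageSketch {

  // Illustrative helper: fraction of an RDD's partitions currently held in the cache.
  def cachedFraction(numCached: Int, numPartitions: Int): Double =
    if (numPartitions == 0) 0.0 else numCached.toDouble / numPartitions

  def main(args: Array[String]): Unit = {
    val edgeListPath = args(0) // assumed: path to a "srcId dstId" edge list file
    val numIter = 10           // assumed iteration count, chosen only for the example

    val sc = new SparkContext(new SparkConf().setAppName("PageRankStorageSketch"))

    // Assumption: MEMORY_AND_DISK lets partitions that do not fit in memory
    // be written to local disk rather than dropped and recomputed later.
    val graph = GraphLoader.edgeListFile(
      sc,
      edgeListPath,
      edgeStorageLevel = StorageLevel.MEMORY_AND_DISK,
      vertexStorageLevel = StorageLevel.MEMORY_AND_DISK)

    // Run a fixed number of PageRank iterations and force evaluation.
    val ranks = graph.staticPageRank(numIter).vertices
    println(s"Ranked ${ranks.count()} vertices")

    // Report how much of each persisted RDD is actually cached,
    // mirroring what the Storage page of the web UI displays.
    sc.getRDDStorageInfo.foreach { info =>
      val frac = cachedFraction(info.numCachedPartitions, info.numPartitions)
      println(f"${info.name}: ${frac * 100}%.1f%% of partitions cached, " +
        s"mem=${info.memSize} bytes, disk=${info.diskSize} bytes")
    }

    sc.stop()
  }
}
```

A cached fraction well below 100% for the graph's vertex or edge RDDs would be consistent with the explanation given in the reply; whether spilling to disk is actually faster than recomputation depends on the cluster and would need to be measured.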
On Tue, Mar 24, 2015 at