I'm having problems running the Twitter graph on a cluster of 4 nodes, each
with over 100 GB of RAM and 32 virtual cores.
I do have a pre-installed Spark version (built against Hadoop 2.3, because
it didn't compile on my system), but I'm loading my graph file from disk
without HDFS.
Hi,
I wonder whether the PageRank implementation is correct. More specifically, I
am looking at the following function from PageRank.scala
(https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/lib/PageRank.scala),
which is passed to Pregel:
def vertexProgram(id: