WikipediaPageRank Data Set

2014-03-27 Thread Niko Stahl
Hello, I would like to run the WikipediaPageRank example (https://github.com/amplab/graphx/blob/f8544981a6d05687fa950639cb1eb3c31e9b6bf5/examples/src/main/scala/org/apache/spark/examples/bagel/WikipediaPageRank.scala), but the Wikipedia dump XML files are no longer available on Freebase. Does anyone

ClassCastException when using saveAsTextFile

2014-03-25 Thread Niko Stahl
Hi, I'm trying to save an RDD to HDFS with the saveAsTextFile method on my EC2 cluster and am encountering the following exception (the app is called GraphTest): Exception failure: java.lang.ClassCastException: cannot assign instance of GraphTest$$anonfun$3 to field

Comparing GraphX and GraphLab

2014-03-24 Thread Niko Stahl
Hello, I'm interested in extending the comparison between GraphX and GraphLab presented in Xin et al. (2013). The evaluation presented there is rather limited, as it only compares the frameworks for one algorithm (PageRank) on a cluster with a fixed number of nodes. Are there any graph algorithms