I want to know the default allocation of computing resources, as
run-example may not allow me to allocate them myself.
Regards~
Qi Song
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Comparing-GraphX-and-GraphLab-tp3112p4265.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi Ankur, hi Deb,
Thanks for the information and for the reference to the recent paper. I
understand that GraphLab is highly optimized for graph algorithms and
consistently outperforms GraphX for graph-related tasks. I'd like to
further evaluate the cost of moving data between Spark and some other
Hi Ankur,
Given enough memory and proper caching, I don't understand why this is the
case:
> GraphX may actually be slower when Spark is configured to launch many tasks
> per machine, because shuffle communication between Spark tasks on the same
> machine still occurs by reading and writing from disk,
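One way to reduce that effect is to run fewer, larger executors, so that more of the graph's communication stays inside a single JVM rather than going through the shuffle. A sketch of the relevant spark-defaults.conf settings (the values are illustrative assumptions, not recommendations):

```properties
# Fewer, larger executors keep more communication within one JVM
# (values below are assumptions; tune per cluster)
spark.executor.memory           16g
spark.executor.cores            8
# Merge the many small shuffle files written per map task (Spark 1.x)
spark.shuffle.consolidateFiles  true
```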
Hi Niko,
The GraphX team recently wrote a longer paper with more benchmarks and
optimizations: http://arxiv.org/abs/1402.2394
Regarding the performance of GraphX vs. GraphLab, I believe GraphX
currently outperforms GraphLab only in end-to-end benchmarks of pipelines
involving both graph-parallel
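For reference, the PageRank workload used in these benchmarks can be expressed in a few lines of the GraphX API. This is a sketch; the input path and the convergence tolerance are illustrative assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.GraphLoader

object PageRankBench {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PageRankBench"))
    // Load a graph from an edge list (one "srcId dstId" pair per line);
    // the path is a placeholder.
    val graph = GraphLoader.edgeListFile(sc, "hdfs:///data/edges.txt").cache()
    // Run PageRank until vertex ranks change by less than the tolerance.
    val ranks = graph.pageRank(0.0001).vertices
    println(ranks.take(5).mkString("\n"))
    sc.stop()
  }
}
```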
Niko,
Comparing some other components will be very useful as well: svd++ from
graphx vs the same algorithm in graphlab; also mllib.recommendation.als
implicit/explicit compared to the collaborative filtering toolkit in
graphlab...
To stress test, what's the biggest sparse dataset that you have
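The explicit/implicit distinction above maps to two entry points on mllib.recommendation.ALS. A sketch (the rank, iteration, lambda, and alpha values are illustrative assumptions, and `ratings` is assumed to be prepared elsewhere):

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// `ratings` is an RDD[Rating] of (user, product, rating) triples.
// Explicit feedback: ratings are actual scores (e.g. 1-5 stars).
val explicitModel =
  ALS.train(ratings, /* rank = */ 10, /* iterations = */ 10, /* lambda = */ 0.01)

// Implicit feedback: ratings are interaction strengths (clicks, plays);
// alpha controls confidence in the observed interactions.
val implicitModel =
  ALS.trainImplicit(ratings, 10, 10, 0.01, /* alpha = */ 1.0)
```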
Hello,
I'm interested in extending the comparison between GraphX and GraphLab
presented in Xin et al. (2013). The evaluation presented there is rather
limited as it only compares the frameworks for one algorithm (PageRank) on
a cluster with a fixed number of nodes. Are there any graph algorithms
w