Re: Long running time for GraphX pagerank in dataset com-Friendster

2014-04-21 Thread Qi Song
Thanks, Ankurdave. The cause was actually running out of memory. Best, -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Long-running-time-for-GraphX-pagerank-in-dataset-com-Friendster-tp4511p4533.html Sent from the Apache Spark User List mailing list archive at

Re: Are there any plans to develop Graphx Streaming?

2014-04-20 Thread Qi Song
Hi Ankurdave, I have another question. I noticed that GraphX provides four different graph partitioning methods: RandomVertexCut, CanonicalRandomVertexCut, EdgePartition1D and EdgePartition2D. I've tested the running time of these four methods using PageRank on several different datasets and found th

Long running time for GraphX pagerank in dataset com-Friendster

2014-04-20 Thread Qi Song
Hello, I was running some PageRank tests of GraphX on my 8-node cluster. I allocated each worker 32 GB of memory and 8 CPU cores. The LiveJournal dataset took 370 s, which seems reasonable to me. But when I tried the com-Friendster data ( http://snap.stanford.edu/data/com-Friendster.html ) with 656083

Re: Comparing GraphX and GraphLab

2014-04-15 Thread Qi Song
want to know the default allocation of computing resources, as run-example may not allow me to allocate them myself. Regards, Qi Song -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Comparing-GraphX-and-GraphLab-tp3112p4265.html Sent from the Apache Spark

Are there any plans to develop Graphx Streaming?

2014-03-14 Thread Qi Song
know if there is a plan to develop GraphX Streaming? If not, are there any difficulties in developing such a system, or is there simply not enough demand for it? Best regards, Qi Song -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Are-there-any-pla