What is GraphX:
- It can be viewed as a kind of distributed, parallel graph database.
- It can be viewed as a graph data structure (Data Structures 101 from your CS course).
- It features some off-the-shelf algorithms for graph processing and navigation (Algorithms and Data Structures 101), and their implementation takes advantage of the distributed, parallel nature of GraphX.

Any of the MLlib algorithms can be applied to any data structure, from time series to graph to matrix/tabular data; it is up to your needs and imagination.

As an example, take clustering: you can apply it to a graph data structure, but you may also leverage the graph's inherent connection/clustering properties and the graph algorithms that take advantage of them, instead of, e.g., the run-of-the-mill K-Means, which is fine for time series, matrix, and similar data structures.

From: Timothée Rebours [mailto:t.rebo...@gmail.com]
Sent: Thursday, June 18, 2015 10:44 AM
To: Akhil Das
Cc: user@spark.apache.org
Subject: Re: Machine Learning on GraphX

Thanks for the quick answer. I've already followed this tutorial, but it doesn't use GraphX at all. My goal is to work directly on the graph, rather than extracting the edges and vertices from the graph as standard RDDs and then running MLlib's standard ALS on them, which is of no interest to me. That's why I tried the other implementation, but it's not optimized at all. I might have gone in the wrong direction with ALS, but I'd like to see what can be done with MLlib on GraphX.

Any ideas?

2015-06-18 11:19 GMT+02:00 Akhil Das <ak...@sigmoidanalytics.com>:

This might give you a good start: http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
It's a bit old, though.

Thanks
Best Regards

On Thu, Jun 18, 2015 at 2:33 PM, texol <t.rebo...@gmail.com> wrote:

Hi,

I'm new to GraphX and I'd like to use machine learning algorithms on top of it. I wanted to write a simple program implementing MLlib's ALS on a bipartite graph (a simple movie recommendation), but didn't succeed.
I found an implementation of ALS on GraphX for Spark 1.1.x (https://github.com/ankurdave/spark/blob/GraphXALS/graphx/src/main/scala/org/apache/spark/graphx/lib/ALS.scala), but it is painfully slow compared to the standard implementation, and it uses the PregelVertex class, which is deprecated in the current version.

Can we expect a new implementation? Is there a smarter way to do this?

Thanks,
Regards,
Timothée Rebours.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Machine-Learning-on-GraphX-tp23388.html

--
Timothée Rebours
13, rue Georges Bizet
78380 BOUGIVAL