Clustering of Words

2015-11-09 Thread pradhandeep
Hi,
I am trying to cluster the words in a set of articles. I used TF-IDF and
Word2Vec in Spark to get a vector for each word, and then used KMeans to
cluster those vectors. Is there any way to get the words back from the
vectors? I want to know which words are in each cluster.
I am aware that TF-IDF does not have an inverse. Does anyone know how to get
the words back from the clusters?
 
Thank You
Regards,
Deep
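
One possible approach, sketched here in plain Python rather than Spark: since
Word2Vec produces one vector per word, the word can be kept next to its vector
instead of being recovered afterwards. Each vector is assigned to a cluster
and the words are then grouped by cluster id. In Spark MLlib the pairing would
presumably come from Word2VecModel.getVectors and the assignment from
KMeansModel.predict. The data, the centres, and the helper names
(nearest_center, words_by_cluster) below are all made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical word vectors, standing in for the word -> vector mapping
# that a trained Word2Vec model exposes.
word_vectors = {
    "spark": [0.9, 0.1],
    "hadoop": [0.8, 0.2],
    "apple": [0.1, 0.9],
    "banana": [0.2, 0.8],
}

# Hypothetical cluster centres, standing in for a fitted KMeans model.
centers = [[1.0, 0.0], [0.0, 1.0]]


def nearest_center(vector, centers):
    """Return the index of the centre closest to `vector` (Euclidean distance)."""
    distances = [math.dist(vector, center) for center in centers]
    return distances.index(min(distances))


def words_by_cluster(word_vectors, centers):
    """Group words by the cluster their vector is assigned to."""
    clusters = defaultdict(list)
    for word, vector in word_vectors.items():
        clusters[nearest_center(vector, centers)].append(word)
    return {cluster_id: sorted(words) for cluster_id, words in clusters.items()}


clusters = words_by_cluster(word_vectors, centers)
for cluster_id, words in sorted(clusters.items()):
    print(cluster_id, words)
```

This only works for the Word2Vec vectors. TF-IDF vectors describe whole
documents, so they do not map back to a single word.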



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Clustering-of-Words-tp25328.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Joins in Spark

2014-12-22 Thread pradhandeep
Hi,
I have two RDDs: vertices, which is a plain RDD, and edges, which is a pair
RDD. I have to do a three-way join of these two. Joins work only when both
RDDs are pair RDDs, so how can we perform a three-way join of these RDDs?

Thank You
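
A minimal sketch of one way this could be done, written in plain Python lists
rather than Spark: key the vertices by their id so that they form (key, value)
pairs, join the edges with them on the source id, re-key the result by the
destination id, and join with the vertices again. In Spark the keying step
would be a map or keyBy on the vertices RDD. It is assumed here that each
vertex record carries an id and that edges are (src, dst) pairs; the sample
data and the join helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: vertices as (id, attribute), edges as (src, dst).
vertices = [(1, "a"), (2, "b"), (3, "c")]
edges = [(1, 2), (2, 3)]


def join(left, right):
    """Inner join of two lists of (key, value) pairs, like a pair-RDD join."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, (lv, rv)) for key, lv in left for rv in index.get(key, [])]


# Step 1: edges are keyed by src, so this attaches the source attribute.
by_src = join(edges, vertices)  # (src, (dst, src_attr))

# Step 2: re-key by dst so the second join can attach the destination attribute.
by_dst = [(dst, (src, src_attr)) for src, (dst, src_attr) in by_src]

# Step 3: join with the vertices a second time.
joined = join(by_dst, vertices)  # (dst, ((src, src_attr), dst_attr))

# Flatten into (src, src_attr, dst, dst_attr) records.
triples = [
    (src, src_attr, dst, dst_attr)
    for dst, ((src, src_attr), dst_attr) in joined
]
print(triples)
```

The three-way join is therefore two ordinary pair joins with a re-keying step
in between.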



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Joins-in-Spark-tp20819.html



Re: How to get list of edges between two Vertex ?

2014-12-22 Thread pradhandeep
Do you need the multiple edges, or can you get the work done with a single
edge between two vertices?
In my view, you can use groupEdges, which merges parallel edges (edges with
the same source and destination) into one. It may work because the messages
passed between the vertices along the same (replicated) edges will not be
different.
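
To illustrate what merging parallel edges means, here is a rough sketch in
plain Python, not GraphX itself: edges with the same (src, dst) pair are
collapsed into one, and a merge function decides what happens to their
attributes. The edge list and the group_edges helper are hypothetical. In
GraphX, groupEdges likewise takes a merge function, and its documentation
notes that the graph must be partitioned with partitionBy first for correct
results.

```python
# Hypothetical edge list: (src, dst, attribute), with two parallel edges 1 -> 2.
edges = [(1, 2, 5), (1, 2, 7), (2, 3, 1)]


def group_edges(edges, merge):
    """Collapse edges sharing (src, dst) into one, combining attributes with `merge`."""
    merged = {}
    for src, dst, attr in edges:
        key = (src, dst)
        merged[key] = merge(merged[key], attr) if key in merged else attr
    return [(src, dst, attr) for (src, dst), attr in merged.items()]


# Sum the attributes of parallel edges.
summed = group_edges(edges, lambda a, b: a + b)

# Keep only the first attribute, if the parallel edges carry the same information.
first = group_edges(edges, lambda a, b: a)

print(summed)
print(first)
```

If the original poster needs the individual edges afterwards, merging is not
an option, since the separate attributes are lost.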



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-get-list-of-edges-between-two-Vertex-tp19309p20809.html



Re: Is Spark? or GraphX runs fast? a performance comparison on Page Rank

2014-12-22 Thread pradhandeep
Did you try running PageRank.scala instead of LiveJournalPageRank.scala?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-Spark-or-GraphX-runs-fast-a-performance-comparison-on-Page-Rank-tp19710p20808.html