Re: Spectral clustering

Shannon Quinn Thu, 07 May 2015 15:51:56 -0700

Hi Sugam,

To clarify, the "RBF" I mentioned for computing affinities is the radial basis function, linked in Mahout's spectral clustering documentation: http://en.wikipedia.org/wiki/RBF_kernel

The basic layout is to compare documents pairwise, use RBF to compute their similarity, and set the entries in the affinity matrix corresponding to the two documents to the output of the RBF.
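
As a rough illustration of that layout, here is a minimal Python sketch. It assumes each document has already been turned into a numeric vector of equal length (for example, term weights), and the names rbf_affinity, affinity_matrix, and the bandwidth parameter sigma are hypothetical, not part of Mahout. It uses the standard RBF form exp(-||x - y||^2 / (2 * sigma^2)); a suitable sigma depends on your data.

```python
import math


def rbf_affinity(x, y, sigma=1.0):
    """RBF (Gaussian) kernel between two equal-length numeric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def affinity_matrix(docs, sigma=1.0):
    """Compare documents pairwise and fill a symmetric affinity matrix.

    docs is a list of numeric vectors (an assumption about how the
    documents are represented); sigma is a hypothetical bandwidth.
    """
    n = len(docs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = rbf_affinity(docs[i], docs[j], sigma)
            # The affinity is symmetric, so set both entries for the pair.
            matrix[i][j] = value
            matrix[j][i] = value
    return matrix
```

Identical vectors get an affinity of 1, and the value falls toward 0 as the vectors move apart, which matches the 0-to-1 range described in the earlier message.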


On 5/7/15 4:59 PM, Shannon Quinn wrote:

Hi Sugam,
This is in response to your original thread: http://mail-archives.apache.org/mod_mbox/mahout-user/201505.mbox/%3C1412053714.1387791.1431020729309.JavaMail.yahoo%40mail.yahoo.com%3E
The first thing you need to do is build the graph affinity matrix yourself. That's the input to the map-reduce spectral clustering algorithm, and what is described in the documentation (the "i, j, value" part). Basically you'll consider each document as a single node in a graph, and weight the connections between nodes. "i" and "j" are the pair of nodes you're considering, and "value" is the similarity / affinity, usually between 0 (completely dissimilar) and 1 (identical). Typically you use RBF to compute affinities.
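
To make the "i, j, value" idea concrete, here is a small Python sketch that flattens an already computed affinity matrix into one line per node pair. The function name to_triples is hypothetical, and the comma delimiter and zero-based node indices are assumptions; check the Mahout documentation for the exact layout the job expects.

```python
def to_triples(affinity):
    """Flatten a square affinity matrix into "i,j,value" lines.

    affinity is a list of rows; the delimiter and zero-based indexing
    are assumptions, not confirmed Mahout input requirements.
    """
    lines = []
    for i, row in enumerate(affinity):
        for j, value in enumerate(row):
            lines.append(f"{i},{j},{value}")
    return lines


if __name__ == "__main__":
    # Two documents with an affinity of 0.5 between them.
    for line in to_triples([[1.0, 0.5], [0.5, 1.0]]):
        print(line)
```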
Once you have the data in this format, then you can feed it to the spectral clustering algorithm. Having the Mahout package compute the affinities is at the top of my to-do list for the next version (though there are still some questions that have to be addressed), so in theory you could just submit the documents as you would to any other algorithm in Mahout, but for now you have to compute the affinities yourself.
Let me know if anything still isn't clear.

Shannon
