Preference to vectors for clustering

2013-09-17 Thread Martin, Nick
Hi all, I'm looking for the best way to get user clusters from my recommendation output. The idea is that I have my recommended items for users (user, item, score) based on their preferences, but I want to see how the users were clustered together (and their similarity) so I can run some other analyti
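
For readers following this thread, here is a minimal sketch of the general idea of turning (user, item, score) triples into one sparse vector per user and comparing users with cosine similarity. It is plain Python, not Mahout's API; the function names and the sample data are hypothetical and only meant to illustrate the conversion step.

```python
from collections import defaultdict
from math import sqrt


def to_user_vectors(triples):
    """Group (user, item, score) triples into one sparse vector per user."""
    vectors = defaultdict(dict)
    for user, item, score in triples:
        vectors[user][item] = float(score)
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample data, not taken from the thread.
triples = [
    ("u1", "i1", 5.0), ("u1", "i2", 3.0),
    ("u2", "i1", 4.0), ("u2", "i2", 2.0),
    ("u3", "i3", 5.0),
]
vectors = to_user_vectors(triples)
sim_12 = cosine(vectors["u1"], vectors["u2"])  # users with overlapping items
sim_13 = cosine(vectors["u1"], vectors["u3"])  # users with no items in common
```

Once each user is a vector like this, any vector-based clustering method can be applied to the users; which one fits best depends on the data size and is not settled by this sketch.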

Re: Clustering algorithms

2013-09-17 Thread Ted Dunning
Right now the best in terms of speed without losing quality in Mahout is the streaming k-means implementation. One exciting possibility is that you probably can combine a streaming k-means pre-pass with a regularized k-means algorithm in order to get results more like Lingo. You could also follow
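
To make the "pre-pass" idea concrete, here is a simplified single-pass sketch in plain Python. It is not Mahout's streaming k-means implementation: the function names, the default `cutoff` and `growth` values, and the rule of starting a new centroid with probability proportional to squared distance over the cutoff are all assumptions chosen for illustration. The output is a small set of weighted centroids that a second, conventional clustering step (not shown) could then refine.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def absorb(centroids, point, weight, cutoff, rng):
    """Add a weighted point: start a new centroid or merge into the nearest one."""
    if not centroids:
        centroids.append((list(point), weight))
        return
    i = min(range(len(centroids)), key=lambda j: sq_dist(centroids[j][0], point))
    c, w = centroids[i]
    # Far-away points are more likely to start their own centroid (assumed rule).
    if rng.random() < sq_dist(c, point) / cutoff:
        centroids.append((list(point), weight))
    else:
        total = w + weight
        merged = [(x * w + y * weight) / total for x, y in zip(c, point)]
        centroids[i] = (merged, total)


def streaming_sketch(points, max_centroids, cutoff=1.0, growth=1.5, seed=0):
    """One pass over points; returns a list of (centroid, weight) pairs."""
    rng = random.Random(seed)
    centroids = []
    for p in points:
        absorb(centroids, p, 1.0, cutoff, rng)
        # Too many centroids: loosen the cutoff and re-absorb the centroids themselves.
        while len(centroids) > max_centroids:
            cutoff *= growth
            old, centroids = centroids, []
            for c, w in old:
                absorb(centroids, c, w, cutoff, rng)
    return centroids


# Hypothetical data: two well-separated 2-D blobs.
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
points += [(data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(200)]
sketch = streaming_sketch(points, max_centroids=20)
```

The total weight of the sketch equals the number of input points, so a weighted k-means run over `sketch` sees the same mass as the original data while touching far fewer vectors.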

Re: Clustering algorithms

2013-09-17 Thread Mike Hugo
Thanks Ted!

On Tue, Sep 17, 2013 at 2:59 PM, Ted Dunning wrote:
> Right now the best in terms of speed without losing quality in Mahout is
> the streaming k-means implementation.
>
> One exciting possibility is that you probably can combine a streaming
> k-means pre-pass with a regularized k-me

Clustering algorithms

2013-09-17 Thread Mike Hugo
Hello, I'm new to Mahout but have been working with Solr and Carrot2, clustering documents with the Lingo algorithm. This has worked well for us for clustering small sets of search results, but we are now branching out to cluster larger sets of documents (millions of documents to 10s

user@mahout.apache.org

2013-09-17 Thread Darius Miliauskas
I guess there are some problems with the paths in Cygwin, since I get this output:

DARIUS@DARIUS-PC ~ cd ..
DARIUS@DARIUS-PC ~ cd
DARIUS@DARIUS-PC ~ $ cd /usr/local/mahout/examples/bin
DARIUS@DARIUS-PC /usr/local/mahout/examples/bin $ ./build-reuters.sh
Please call cluster-reuters.sh directly nex

Re: Using SparseVectorsFromSequenceFiles () in Java

2013-09-17 Thread Darius Miliauskas
That worked like a charm, Gokhan; your suggestion was on point again. However... Although the build is successful, the file is still empty, and I get the same exception as always on Windows: java.io.IOException: Failed to set permissions of path: \tmp\hadoop-DARIUS\mapred\staging\DARIUS3311507