Using EnglishAnalyzer in KMeans

2014-02-05 Thread Sznajder ForMailingList
Hi, I am using Mahout 0.5 and I would like to use the EnglishAnalyzer when running KMeans. However, when running the following command, I get an exception: bin/mahout seq2sparse -i logs-seqFiles/ -o log-vectors-monogram-englishanalyzer -ow -s 1 -a org.apache.lucene.analysis.en.EnglishAnalyzer

Mapping from docId to clusters in the clusterdump

2014-02-02 Thread Sznajder ForMailingList
Hi, I have a directory containing thousands of text files. I ran the KMeans clustering algorithm following the tutorial in the Mahout in Action book. However, I need to know which text file was mapped to which cluster, and I did not find an easy way to do that. I ran the clusterdump algorithm, but I

Re: Mapping from docId to clusters in the clusterdump

2014-02-02 Thread Sznajder ForMailingList
suneel_mar...@yahoo.com wrote: This is an issue that was fixed very recently (in fact, last week). Please work off the present trunk; you should see the names of the text files that are part of the clusters. On Sunday, February 2, 2014 5:09 AM, Sznajder ForMailingList bs4mailingl

Meaning of seqdump output on a cluster file

2014-02-02 Thread Sznajder ForMailingList
Hi, I am using Mahout 0.5 (the version corresponding to the Mahout in Action book). I ran a K-means clustering and then ran seqdump on the clusters file. Here is an output sample: Input Path: log-kmeans-clusters-monogram-sim_0_1/clusters-9/part-r-0 Key class: class org.apache.hadoop.io.Text

Running Mahout Example

2014-01-22 Thread Sznajder ForMailingList
Hi, I would like to run the Mahout example for the KMeans algorithm. I suppose that it is: org.apache.mahout.clustering.syntheticcontrol.kmeans.Job (1) Is that right? It looks for a /testdata/ directory, which I did not find. (2) Where is it, please? I thought of using the Reuters data set described in

Re: Running Mahout Example

2014-01-22 Thread Sznajder ForMailingList
/shared_home/hadoop/IHC-0.20.2/bin/hadoop and HADOOP_CONF_DIR=/mnt/hdgpfs/shared_home/hadoop/IHC-0.20.2/conf Benjamin On Wed, Jan 22, 2014 at 4:59 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Try examples/bin/cluster-reuters.sh Sent from my iPhone On Jan 22, 2014, at 9:56 AM, Sznajder

Re: Running Mahout Example

2014-01-22 Thread Sznajder ForMailingList
Hi, I extracted the trunk/ code. Benjamin On Wed, Jan 22, 2014 at 5:50 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: What's your Mahout version? On Wednesday, January 22, 2014 10:27 AM, Sznajder ForMailingList bs4mailingl...@gmail.com wrote: Strangely, I get the following

Installation steps

2014-01-19 Thread Sznajder ForMailingList
Hi, could you please point me to instructions for installing Mahout on Linux? Thanks a lot, Benjamin

Re: Installation steps

2014-01-19 Thread Sznajder ForMailingList
-for-beginners/ Thanks, Chameera On Sun, Jan 19, 2014 at 8:00 PM, Sznajder ForMailingList bs4mailingl...@gmail.com wrote: Hi, could you please point me to instructions for installing Mahout on Linux? Thanks a lot, Benjamin