Re: mahout failing with -c as required option

2015-03-09 Thread Raghuveer
I see the error below: Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR= MAHOUT-JOB: /home/raghuveer/trunk/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar 15/03/10 11:50:20 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[hdfs://maste

Re: mahout failing with -c as required option

2015-03-09 Thread Raghuveer
I see the error below: On Tuesday, March 10, 2015 11:45 AM, Suneel Marthi wrote: Try ./mahout kmeans -i http://master:50070/explorer.html#/user/netlog/upload/output4/tfidf-vectors/part-r-0 -o /usr/netlog/upload/output4/tfidf-vectors-kmeans-clusters -c -dm org.apache.mahout.c

Re: mahout failing with -c as required option

2015-03-09 Thread Suneel Marthi
Try ./mahout kmeans -i http://master:50070/explorer.html#/user/netlog/upload/output4/tfidf-vectors/part-r-0 -o /usr/netlog/upload/output4/tfidf-vectors-kmeans-clusters -c -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure -x 5 -ow -cl -k 25 I don't have a machine before me

Re: mahout failing with -c as required option

2015-03-09 Thread Raghuveer
OK, so if -c is required, how can I provide it, or at least is there a way to remove -k itself? ./mahout kmeans -i http://master:50070/explorer.html#/user/netlog/upload/output4/tfidf-vectors/part-r-0 -o /usr/netlog/upload/output4/tfidf-vectors-kmeans-clusters -dm org.apache.mahout.common.dist

Re: mahout failing with -c as required option

2015-03-09 Thread Suneel Marthi
Oops! I meant to say that -c is required for the random centroid initialization if -k is specified. It initializes k random centroids in the folder specified by -c. So yes, -c is required. On Tue, Mar 10, 2015 at 1:42 AM, Raghuveer wrote: > No, I have removed the -c option now, so I get the mention
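
A rough sketch of what that combination might look like, using only the flags that already appear in this thread (-i, -o, -c, -dm, -x, -ow, -cl, -k). The hdfs:// paths are adapted from the original post and are placeholders to substitute; build_kmeans_cmd is a hypothetical helper, not part of Mahout. The script only prints the command, it does not run it.

```shell
#!/bin/sh
# Sketch only: prints a k-means command line instead of running Mahout.
# Paths are adapted from the original post in this thread; substitute your own.
INPUT="hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-0"
OUTPUT="hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer"
# Per the reply above: with -k set, the k random centroids go in the -c folder.
CLUSTERS="hdfs://master:54310/user/netlog/upload/mahoutoutput"
K=25

# build_kmeans_cmd is a hypothetical helper used only for this illustration.
build_kmeans_cmd() {
  printf '%s' "./mahout kmeans -i $INPUT -o $OUTPUT -c $CLUSTERS"
  printf '%s\n' " -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure -x 5 -ow -cl -k $K"
}

build_kmeans_cmd
```

Once the printed line looks right, it can be pasted into a shell in the Mahout bin directory. Both -c and -k are passed together here, as the reply above describes.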

Re: mahout failing with -c as required option

2015-03-09 Thread Raghuveer
No, I have removed the -c option now, so I get the mentioned exception that -c is mandatory. On Tuesday, March 10, 2015 11:06 AM, Suneel Marthi wrote: Are you still specifying the -c option? It's only needed if you have initial centroids to launch KMeans from; otherwise KMeans picks rand

Re: mahout failing with -c as required option

2015-03-09 Thread Suneel Marthi
Are you still specifying the -c option? It's only needed if you have initial centroids to launch KMeans from; otherwise KMeans picks random centroids. Also, CosineDistanceMeasure doesn't make sense with KMeans, which operates in Euclidean space; try using SquaredEuclidean or Euclidean distances. On Tue, Mar

mahout failing with -c as required option

2015-03-09 Thread Raghuveer
Hi All, I am trying to run the command: ./mahout kmeans -i hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-0 -o hdfs://master:54310//user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer -c hdfs://master:54310/user/netlog/upload/mahoutoutput -dm org.apache