Are you still specifying the -c option? It's only needed if you have initial
centroids to launch KMeans from; otherwise KMeans picks random centroids.

Also, CosineDistanceMeasure doesn't make sense with KMeans, which operates in
Euclidean space. Try using the SquaredEuclidean or Euclidean distance instead.
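
For reference, here is a rough sketch of how the quoted command might look with
the distance measure swapped. The flags and paths are copied from the quoted
message (with the doubled slash in the output path collapsed); the
SquaredEuclideanDistanceMeasure class name is assumed to live in the same
package as the cosine one. It keeps -c alongside -k because the usage text in
the quoted error says that, when k is given, a random set of vectors is
written to the -c path first. The RUN switch is only a hypothetical
convenience for inspecting the command before launching it.

```shell
#!/bin/sh
# Sketch only: adjust paths for your cluster before running.
set -eu

# Base HDFS directory, taken from the quoted command.
BASE="hdfs://master:54310/user/netlog/upload"

# Assumed class name: Euclidean-family measure in place of CosineDistanceMeasure.
DM="org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure"

# -c points at the path where the k random seed vectors will be written,
# per the usage text in the quoted error output.
KMEANS_CMD="./mahout kmeans \
-i $BASE/output4/tfidf-vectors/part-r-00000 \
-o $BASE/output4/tfidf-vectors-kmeans-clusters-raghuveer \
-c $BASE/mahoutoutput \
-dm $DM \
-x 5 -ow -cl -k 25 -xm mapreduce"

# Dry run by default: print the command. Set RUN=1 to actually invoke mahout.
if [ "${RUN:-0}" = "1" ]; then
  $KMEANS_CMD
else
  echo "$KMEANS_CMD"
fi
```

Running it without RUN=1 just prints the assembled command, so you can check
the paths before submitting the job.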

On Tue, Mar 10, 2015 at 1:27 AM, Raghuveer <alwaysra...@yahoo.com.invalid>
wrote:

> Hi All,
> I am trying to run the command:
> ./mahout kmeans -i
> hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-00000
> -o
> hdfs://master:54310//user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer
> -c  hdfs://master:54310/user/netlog/upload/mahoutoutput -dm
> org.apache.mahout.common.distance.CosineDistanceMeasure -x 5 -ow -cl -k 25
> -xm mapreduce
> Since I don't have any clusters yet to give it as input, the forums
> suggested I can remove that option. But now I get the error:
>
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB:
> /home/raghuveer/trunk/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar
> 15/03/10 10:52:53 ERROR common.AbstractJob: Missing required option
> --clusters
> Missing required option
> --clusters
>
> Usage:
>  [--input <input> --output <output> --distanceMeasure
> <distanceMeasure>
> --clusters <clusters> --numClusters <k> --randomSeed
> <randomSeed1>
> [<randomSeed2> ...] --convergenceDelta <convergenceDelta> --maxIter
> <maxIter>
> --overwrite --clustering --method <method>
> --outlierThreshold
> <outlierThreshold> --help --tempDir <tempDir> --startPhase
> <startPhase>
> --endPhase
> <endPhase>]
> --clusters (-c) clusters    The input centroids, as Vectors.  Must be
> a
>                             SequenceFile of Writable, Cluster/Canopy.  If
> k is
>                             also specified, then a random set of vectors
> will
>                             be selected and written out to this path
> first
> 15/03/10 10:52:53 INFO driver.MahoutDriver: Program took 370 ms (Minutes:
> 0.006166666666666667)
> Kindly help me out.
> Thanks
>
>
>
