K-means is trying to load your initial clusters and is not finding any.
Have you checked your -c path? You can also add -xm sequential to run the
sequential algorithm instead of MapReduce. That lets you step through the
run in a debugger and verify your paths.
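
If you want a quick check before reaching for the debugger, something like
the sketch below may help. It is plain Java, not Mahout API: the class and
method names are made up for illustration. It assumes the job runs in local
mode (your trace shows LocalJobRunner), so the -c path is an ordinary
directory on disk, and it assumes the clusters were written with the usual
Hadoop "part-*" file naming. It only tells you whether the directory holds
any non-empty part files; a file can still contain zero clusters, so treat
a positive result as necessary rather than sufficient.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Mahout): lists the non-empty "part-*"
// files in a local directory that is about to be used as the -c path.
class ClusterPathCheck {

    // Returns the non-empty part files directly under clustersDir.
    // A missing directory yields an empty list rather than an exception.
    static List<Path> nonEmptyPartFiles(Path clustersDir) throws IOException {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(clustersDir)) {
            return found; // nothing there for k-means to load
        }
        // The "part-*" glob is an assumption about Hadoop's output naming;
        // it also skips hidden checksum files such as ".part-r-00000.crc".
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(clustersDir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    found.add(p);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Default taken from the path named in the exception message.
        Path dir = Paths.get(args.length > 0
            ? args[0]
            : "reutersClusters/canopy-centroids/clusters-0");
        List<Path> parts = nonEmptyPartFiles(dir);
        if (parts.isEmpty()) {
            System.out.println("No non-empty part files under " + dir.toAbsolutePath());
        } else {
            System.out.println("Found " + parts.size() + " part file(s) under "
                + dir.toAbsolutePath());
        }
    }
}
```

Run it from the same working directory as the failing job, since the path
in your exception is relative. If it reports no part files, the canopy step
most likely wrote nothing there, or wrote it somewhere else.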

-----Original Message-----
From: Ahmad Ammari [mailto:ammari...@gmail.com] 
Sent: Wednesday, November 16, 2011 7:19 AM
To: user@mahout.apache.org
Subject: NewsKMeansClustering does not find any clusters!

Hello,

I am working through the Mahout examples in the clustering part of the book
"Mahout in Action", particularly chapter 9. In Section 9.1.4, I am trying
to run the class NewsKMeansClustering, whose source code I got from the
companion source code files. As I understand it, the input directory
"inputDir" should contain the input documents in SequenceFile format.
Therefore, I tried using the "reuters-seqfiles" directory that we
generated with the seqdirectory program, run from the mahout launcher
in chapter 8 (page 139). I then ran NewsKMeansClustering, which started
out fine until I got a java.lang.IllegalStateException saying that no
clusters were found, as follows:

java.lang.IllegalStateException: No clusters found. Check your -c path.
at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0010
16-Nov-2011 00:49:14 org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Exception in thread "main" java.lang.InterruptedException: K-Means Iteration failed processing reutersClusters/canopy-centroids/clusters-0
at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:363)
at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:310)
at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:237)
at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:152)
at clusterer.NewsKMeansClustering.main(NewsKMeansClustering.java:81)
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 15.391s
Finished at: Wed Nov 16 00:49:14 GMT 2011
Final Memory: 10M/150M
------------------------------------------------------------------------
Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec
(default-cli) on project mahout-examples: Command execution failed. Process
exited with an error: 1(Exit value: 1) -> [Help 1]

To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.

For more information about the errors and possible solutions, please read
the following articles:
[Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

What does it mean that no clusters were found?

Is the input directory wrong? If so, what input should I give the class?

I tried changing the canopy thresholds (250, 120) to some other numbers,
and also tried changing the EuclideanDistanceMeasure for the canopy
clustering to CosineDistanceMeasure, to no avail.

Many thanks in advance,
Ahmad
