Hi Grant,

I am running the NewsKMeansClustering class from NetBeans (Run -> Run
File). I did not change anything in the class code except the name of the
input directory, so that the class can find the dataset I want to cluster.
That is, I changed the statement:

String inputDir = "inputDir";

to:

String inputDir = "reuters-seqfiles";

The directory (reuters-seqfiles) contains the dataset in SequenceFile
format. I generated the directory and its data by running the
seqdirectory program through the mahout launcher (bin/mahout seqdirectory).
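
For reference, here is a minimal sketch of how the directory could be sanity-checked before the job runs. It assumes everything lives on the local filesystem (the trace shows LocalJobRunner), and the class name InputDirCheck and the helper countDataFiles are hypothetical names of my own, not part of Mahout or the book's code. The same check could be pointed at reutersClusters/canopy-centroids/clusters-0, the path named in the exception.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class InputDirCheck {

    // Hypothetical helper (not a Mahout API): counts non-empty regular files
    // directly under dir, skipping names that start with '.' or '_'
    // (assumed here to be checksum or marker files rather than data).
    static int countDataFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return 0; // a missing path or a plain file has no data files
        }
        int count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                boolean skipped = name.startsWith(".") || name.startsWith("_");
                if (!skipped && Files.isRegularFile(entry) && Files.size(entry) > 0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory name used in this thread.
        String inputDir = args.length > 0 ? args[0] : "reuters-seqfiles";
        int n = countDataFiles(Paths.get(inputDir));
        System.out.println(inputDir + ": " + n + " non-empty data file(s)");
    }
}
```

A count of zero for either directory would suggest that the step that was supposed to write it produced nothing, which may be worth ruling out before looking at the k-means parameters.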

Would you like me to post the code of the NewsKMeansClustering class
from the book, or do you already have it?

Thanks,
Ahmad

On Thu, Nov 17, 2011 at 4:57 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> What command did you run?
>
> On Nov 16, 2011, at 4:47 AM, Ahmad Ammari wrote:
>
> > Hello,
> >
> > I am working through the Mahout examples in the clustering part of the
> > book "Mahout in Action", particularly chapter 9. In Section 9.1.4, I am
> > trying to run the class NewsKMeansClustering, whose source code I got
> > from the companion source code files. As I understand it, the input
> > directory "inputDir" should contain the input documents in SequenceFile
> > format. Therefore, I tried to use the "reuters-seqfiles" directory that
> > we generated in chapter 8 (page 139) using the seqdirectory program run
> > from the mahout launcher. I then ran NewsKMeansClustering, which started
> > fine until I got a java.lang.IllegalStateException saying that no
> > clusters were found, as follows:
> >
> > java.lang.IllegalStateException: No clusters found. Check your -c path.
> > at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
> > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> > INFO: map 0% reduce 0%
> > 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
> > INFO: Job complete: job_local_0010
> > 16-Nov-2011 00:49:14 org.apache.hadoop.mapred.Counters log
> > INFO: Counters: 0
> > Exception in thread "main" java.lang.InterruptedException: K-Means Iteration failed processing reutersClusters/canopy-centroids/clusters-0
> > at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:363)
> > at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:310)
> > at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:237)
> > at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:152)
> > at clusterer.NewsKMeansClustering.main(NewsKMeansClustering.java:81)
> > ------------------------------------------------------------------------
> > BUILD FAILURE
> > ------------------------------------------------------------------------
> > Total time: 15.391s
> > Finished at: Wed Nov 16 00:49:14 GMT 2011
> > Final Memory: 10M/150M
> > ------------------------------------------------------------------------
> > Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec
> > (default-cli) on project mahout-examples: Command execution failed. Process
> > exited with an error: 1(Exit value: 1) -> [Help 1]
> >
> > To see the full stack trace of the errors, re-run Maven with the -e switch.
> > Re-run Maven using the -X switch to enable full debug logging.
> >
> > For more information about the errors and possible solutions, please read
> > the following articles:
> > [Help 1]
> > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> >
> > What does it mean that no clusters were found? Is the input directory
> > wrong? If so, what input should I give the class? I tried changing the
> > canopy thresholds (250, 120) to other values, and also tried changing
> > the distance measure for the canopy clustering from
> > EuclideanDistanceMeasure to CosineDistanceMeasure, to no avail.
> >
> > Many thanks in advance,
> > Ahmad
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
>
>
>
