Thanks Pat and David, I tried what you suggested, but unfortunately it is not working. I get the following error when running the command:
./mahout clusterdump -i /user/Data-output/clusters-1-final -o analyze.txt --evaluate true

"ERROR common.AbstractJob: Unexpected true while processing Job-Specific Options: Unexpected true while processing Job-Specific Options."

According to the clusterdump help, the --evaluate (-e) parameter is not supposed to take a value, but if I leave it without one I get the Java NullPointerException.

These are 2 of the 23 clusters written to my analyze.txt file; maybe they can help explain whether something unexpected is going on:

CL-0{n=113525 c=[10.821, 48.382, 66.019, 0.004, 0.000, 0.001, 0.000, 0.001, 0.001, 0.000, 0.000, 0.000, 0.000, 4.921, 8.565, 0.068, 0.068, 0.207, 0.205, 0.951, 0.052, 0.139, 209.864, 175.184, 0.731, 0.079, 0.119, 0.025, 0.069, 0.067, 0.191, 0.196] r=[91.194, 45.425, 78.914, 0.110, 0.008, 0.035, 0.028, 0.037, 0.038, 0.013, 0.008, 0.016, 0.011, 10.173, 23.152, 0.252, 0.252, 0.405, 0.403, 0.164, 0.195, 0.292, 80.182, 102.034, 0.395, 0.223, 0.290, 0.072, 0.251, 0.250, 0.381, 0.388]}

VL-1{n=17 c=[1.133, 0.669, 1.874, 1.460, 1.688, 1.818, 1.939, 1.255, 1.484, 1.697, 0.554, 1.042, 1.774, 0.818, 1.901, 1.522, 1.518, 1.098, 1.637, 1.611, 1.615, 1.212, 1.088, 1.133, 1.483, 0.761, 0.757, 0.953, 1.559, 1.696, 0.548, 0.975] r=[0.000, 0.000, 0.000, 0.000, NaN, NaN, NaN, NaN, 0.000, 0.000, NaN, 0.000, 0.000, 0.000, NaN, 0.000, NaN, 0.000, NaN, 0.000, 0.000, 0.000, 0.000, 0.000, NaN, 0.000, NaN, 0.000, 0.000]}

Thanks!

> Subject: Re: Mahout K-Means - Quality of the clusters
> From: pat.fer...@gmail.com
> Date: Mon, 19 May 2014 14:50:47 -0700
> To: user@mahout.apache.org; david.i.n...@gmail.com
>
> Yep, the clue is "--evaluate=null" in the console. Try "-e true". I think I
> ran into that a long time ago; it should really be fixed.
>
> Try looking here for more explanation of cluster dump:
> https://mahout.apache.org/users/clustering/cluster-dumper.html
>
> The docs are being greatly improved, so there's a chance you’ll find answers
> there.
>
> On May 19, 2014, at 2:34 PM, David Noel <david.i.n...@gmail.com> wrote:
>
> It works for me with just -e. Maybe try that or --evaluate true?
>
> On 5/19/14, hiroshi leon <hiroshi_8...@hotmail.com> wrote:
> > Thanks Pat,
> >
> > But how exactly can I run clusterdump using the -evaluate (-e) parameter?
> > When i try to run it for example:
> >
> > ./mahout clusterdump -i /user/Data-output/clusters-1-final -o analyze.txt
> > --evaluate
> >
> > I get a Java null pointer Exception
> >
> > 14/05/19 15:02:03 INFO common.AbstractJob: Command line arguments:
> > {--dictionaryType=[text],
> > --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure],
> > --endPhase=[2147483647], --evaluate=null,
> > --input=[/user/Data-output/clusters-1-final], --output=[analyze.txt],
> > --outputFormat=[TEXT], --startPhase=[0], --tempDir=[temp]}
> > Exception in thread "main" java.lang.NullPointerException
> >
> > Do I have to put a parameter to evaluate? As input for clusterdump I am
> > using the output with the clusters after running mahout K-Means.
> >
> >> Subject: Re: Mahout K-Means - Quality of the clusters
> >> From: pat.fer...@gmail.com
> >> Date: Sat, 17 May 2014 09:43:59 -0700
> >> To: user@mahout.apache.org
> >>
> >> mahout clusterdump --evaluate …
> >>
> >> provides some stats
> >>
> >> On May 15, 2014, at 10:23 PM, hiroshi leon <hiroshi_8...@hotmail.com>
> >> wrote:
> >>
> >> Hello everybody,
> >>
> >> Do you know how can I get the MSE of the clusters in mahout K-Means?
> >> I would like to check the quality of the clusters. Thanks!
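
While --evaluate is not cooperating, the text dump itself can be sanity-checked outside Mahout. The following is only a minimal sketch in Python, assuming analyze.txt contains entries shaped like the two clusters pasted in this thread (ID{n=... c=[...] r=[...]}); other clusterdump output formats may differ. The names parse_clusters and summarize and the short SAMPLE string are hypothetical and not part of Mahout. It reports, per cluster, the number of points, the lengths of the c and r vectors, and which r components are NaN.

```python
import math
import re

# Matches entries shaped like the samples in this thread, e.g.
#   CL-0{n=113525 c=[10.821, 48.382] r=[91.194, 45.425]}
# Assumption: this is the only entry shape present in the dump.
CLUSTER_RE = re.compile(
    r"(?P<id>[A-Z]+-\d+)\{n=(?P<n>\d+)\s+"
    r"c=\[(?P<c>[^\]]*)\]\s+r=\[(?P<r>[^\]]*)\]\}"
)


def parse_vector(text):
    """Turn '0.1, NaN, 2.0' into a list of floats (NaN is kept as float nan)."""
    return [float(tok) for tok in text.split(",") if tok.strip()]


def parse_clusters(text):
    """Return one dict per cluster entry found in the dump text."""
    clusters = []
    for m in CLUSTER_RE.finditer(text):
        clusters.append({
            "id": m.group("id"),
            "n": int(m.group("n")),
            "c": parse_vector(m.group("c")),
            "r": parse_vector(m.group("r")),
        })
    return clusters


def summarize(cluster):
    """Report vector lengths and the indices of NaN entries in r."""
    nan_dims = [i for i, v in enumerate(cluster["r"]) if math.isnan(v)]
    return {
        "id": cluster["id"],
        "n": cluster["n"],
        "dims_c": len(cluster["c"]),
        "dims_r": len(cluster["r"]),
        "nan_radius_dims": nan_dims,
    }


# Hypothetical, shortened sample in the same shape as the real dump.
SAMPLE = (
    "CL-0{n=5 c=[1.0, 2.0, 3.0] r=[0.5, 0.1, 0.2]} "
    "VL-1{n=2 c=[1.0, 2.0, 3.0] r=[0.000, NaN]}"
)

if __name__ == "__main__":
    for cluster in parse_clusters(SAMPLE):
        print(summarize(cluster))
```

To run it against the real file, replace SAMPLE with the contents of analyze.txt, for example open("analyze.txt").read(). A cluster whose r vector contains NaN entries or has a different length than its c vector, as VL-1 above appears to, is worth a closer look before trusting any quality statistics.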