I copied the directories down to local disk and found that ClusterDumper seems to need the files both in HDFS and on local disk. So I made two folders with exactly the same name, one in HDFS and one on local disk. Now I get the following:

hadoop jar mahout-utils-0.3-SNAPSHOT.jar org.apache.mahout.utils.clustering.ClusterDumper -s /data/clusters-6 -o /data/output
Input Path: /data/clusters-6/part-00000
Exception in thread "main" java.lang.NullPointerException
        at org.apache.mahout.utils.vectors.VectorHelper.vectorToString(VectorHelper.java:60)
        at org.apache.mahout.utils.clustering.ClusterDumper.printClusters(ClusterDumper.java:124)
        at org.apache.mahout.utils.clustering.ClusterDumper.main(ClusterDumper.java:253)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Any thoughts?


Drew Farris wrote:
>
> I suspect you are seeing this error because ClusterDumper doesn't work
> from hadoop/hdfs, you will need to copy the directories down to your
> local disk and run from there using java.
>
> On Thu, Jan 7, 2010 at 4:30 PM, diveman <shilian...@gmail.com> wrote:
>>
>> Thanks!
>>
>> and when I try to run the dumper it gives me the following:
>> hadoop jar mahout-utils-0.3-SNAPSHOT.jar org.apache.mahout.utils.clustering.ClusterDumper -s output/clusters-6/ -o /data/output
>> Exception in thread "main" java.lang.NullPointerException
>>         at org.apache.mahout.utils.clustering.ClusterDumper.printClusters(ClusterDumper.java:112)
>>         at org.apache.mahout.utils.clustering.ClusterDumper.main(ClusterDumper.java:253)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>>
>> Drew Farris wrote:
>>>
>>> Each iteration of k-means clustering will produce a cluster-X file. In
>>> this case, there were 7 iterations prior to the clusters converging.
>>> The final cluster data can be found in clusters-6.
>>>
>>> There is a utility in mahout-util,
>>> o.a.m.utils.clustering.ClusterDumper that can be used to dump the data
>>> from clusters-6 and points into a json-like format. You could use that
>>> code as a starting point for discovering how to get at the data you're
>>> interested in.
>>>
>>> On Thu, Jan 7, 2010 at 3:23 PM, diveman <shilian...@gmail.com> wrote:
>>>>
>>>> I'm new to Mahout. Installed 0.3 in a 4-node cluster and run mahout kmean
>>>> example with syntheticcontrol data. I got outputs like the following:
>>>>
>>>> output/canopies
>>>> output/clusters-0
>>>> output/clusters-1
>>>> output/clusters-2
>>>> output/clusters-3
>>>> output/clusters-4
>>>> output/clusters-5
>>>> output/clusters-6
>>>> output/data
>>>> output/points
>>>>
>>>> by which I understand in the points folder, each point is labeled with a
>>>> cluster id. I'm wondering where I can find the cluster center, radius info,
>>>> etc. And what's in clusters-0~6? BTW, the sample data has 6 groups and the
>>>> result has 7 clusters, any clue?
>>>>
>>>> Thanks!
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Kmeans-clustering-tp27066415p27066415.html
>>>> Sent from the Mahout User List mailing list archive at Nabble.com.
>>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Kmeans-clustering-tp27066415p27067350.html
>> Sent from the Mahout User List mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://old.nabble.com/Kmeans-clustering-tp27066415p27115555.html
Sent from the Mahout User List mailing list archive at Nabble.com.
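
For anyone following along: Drew's description above (one clusters-N directory per k-means iteration, with the last one holding the converged result) can be illustrated with a small stand-alone sketch. This is plain Python, not Mahout code and not Mahout's file format. The function names (`kmeans`, `describe`), the toy points, the starting centers, and the choice of "radius" as the mean member-to-center distance are all assumptions made for illustration only.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centers):
    """Group points by the index of their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, centers, max_iter=20, tol=1e-9):
    """Run k-means and return the centers produced by every iteration.

    history[i] is only an analogy for a clusters-i snapshot.
    """
    history = []
    for _ in range(max_iter):
        groups = assign(points, centers)
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                # Move the center to the mean of its assigned points.
                new_centers.append(tuple(sum(x) / len(g) for x in zip(*g)))
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(c)
        history.append(new_centers)
        moved = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved <= tol:  # no center moved: converged
            break
    return history


def describe(points, centers):
    """Center, radius (mean distance to center, an assumed definition) and size."""
    summary = []
    for c, g in zip(centers, assign(points, centers)):
        radius = sum(dist(p, c) for p in g) / len(g) if g else 0.0
        summary.append({"center": c, "radius": radius, "size": len(g)})
    return summary


# Hypothetical toy data: two well-separated groups of four points each.
POINTS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]

history = kmeans(POINTS, [(0.0, 0.0), (1.0, 1.0)])
for i, snapshot in enumerate(history):
    print("iteration", i, snapshot)
for cluster in describe(POINTS, history[-1]):
    print(cluster)
```

In this sketch the list of snapshots plays the role of clusters-0, clusters-1, and so on, and the last entry is the converged set, the analogue of clusters-6 in the thread. For the real Mahout output, ClusterDumper remains the tool Drew points to.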