I suspect you are seeing this error because ClusterDumper doesn't work
against Hadoop/HDFS. You will need to copy the directories down to your
local disk and run it from there using plain java.
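
Something along these lines might work as a starting point. This is only a
sketch: the local directory, the output file name and the classpath layout
(a lib/ directory holding the Mahout dependencies) are assumptions on my
part, so adjust them to your install. The -s and -o options are the ones
from your command. By default it only prints the commands; set DRY_RUN=0 to
actually run them.

```shell
#!/bin/sh
# Sketch only. LOCAL_DIR, clusters.txt and the lib/* classpath are
# hypothetical placeholders, not something taken from the Mahout docs.
DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands, 0 = execute
LOCAL_DIR="${LOCAL_DIR:-/tmp/kmeans-output}"  # assumed local working directory

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

dump_clusters() {
  run mkdir -p "$LOCAL_DIR"
  # Copy the final iteration's clusters from HDFS to the local disk.
  run hadoop fs -get output/clusters-6 "$LOCAL_DIR/clusters-6"
  # Run ClusterDumper with plain java instead of "hadoop jar".
  run java -cp "mahout-utils-0.3-SNAPSHOT.jar:lib/*" \
    org.apache.mahout.utils.clustering.ClusterDumper \
    -s "$LOCAL_DIR/clusters-6" -o "$LOCAL_DIR/clusters.txt"
}

dump_clusters
```

If the dump is still empty or fails, the classpath is the first thing I
would check, since plain java will not pick up the Hadoop and Mahout jars
the way "hadoop jar" does.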

On Thu, Jan 7, 2010 at 4:30 PM, diveman <[email protected]> wrote:
>
> Thanks!
>
> And when I try to run the dumper, it gives me the following:
> hadoop jar mahout-utils-0.3-SNAPSHOT.jar org.apache.mahout.utils.clustering.ClusterDumper -s output/clusters-6/ -o /data/output
> Exception in thread "main" java.lang.NullPointerException
>        at org.apache.mahout.utils.clustering.ClusterDumper.printClusters(ClusterDumper.java:112)
>        at org.apache.mahout.utils.clustering.ClusterDumper.main(ClusterDumper.java:253)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> Drew Farris wrote:
>>
>> Each iteration of k-means clustering will produce a clusters-X
>> directory. In this case, there were 7 iterations (clusters-0 through
>> clusters-6) before the clusters converged. The final cluster data can
>> be found in clusters-6.
>>
>> There is a utility in mahout-utils,
>> o.a.m.utils.clustering.ClusterDumper, that can be used to dump the data
>> from clusters-6 and points into a JSON-like format. You could use that
>> code as a starting point for discovering how to get at the data you're
>> interested in.
>>
>> On Thu, Jan 7, 2010 at 3:23 PM, diveman <[email protected]> wrote:
>>>
>>> I'm new to Mahout. I installed 0.3 on a 4-node cluster and ran the Mahout
>>> k-means example with the syntheticcontrol data. I got outputs like the
>>> following:
>>>
>>> output/canopies
>>> output/clusters-0
>>> output/clusters-1
>>> output/clusters-2
>>> output/clusters-3
>>> output/clusters-4
>>> output/clusters-5
>>> output/clusters-6
>>> output/data
>>> output/points
>>>
>>> From this I understand that, in the points folder, each point is labeled
>>> with a cluster id. I'm wondering where I can find the cluster centers,
>>> radius info, etc. And what's in clusters-0 through clusters-6? BTW, the
>>> sample data has 6 groups and the result has 7 clusters; any clue?
>>>
>>> Thanks!
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Kmeans-clustering-tp27066415p27066415.html
>>> Sent from the Mahout User List mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>
