Re: Interpretation of cluster output

2014-06-20 Thread Suneel Marthi
There was an issue with an empty cluster file being created for Canopy, which has since been fixed in the present trunk, so you may want to work off of trunk. Also, Canopy has been marked for deprecation in a future release, so whatever you are trying to do, you may want to look at the alternatives. On Fri,

Re: Interpretation of cluster output

2014-06-20 Thread Kamesh
Hi Andrew, I am invoking the Canopy Driver class to perform clustering. I am able to see the results when the output format is either TEXT or CSV. However, when I use JSON, I get the exception I mentioned above. On Wed, Jun 18, 2014 at 10:32 PM, Andrew Musselman < andrew.mussel...@gmail.c

Re: Interpretation of cluster output

2014-06-18 Thread Andrew Musselman
Kamesh, can you please describe the schema of your input data, along with your command to perform the clustering? On Mon, Jun 16, 2014 at 12:44 AM, Kamesh wrote: > Thanks for the response Andrew. I am using Mahout 0.9 version. However, I > tried with trunk version but still I am getting output

Re: Interpretation of cluster output

2014-06-18 Thread Andrew Musselman
Interesting; could be a bug, I'll take a look. On Tue, Jun 17, 2014 at 10:38 AM, Han Fan wrote: > Is this command line what you need? (Replace /user/root/testdataout with > your output directory) > $ mahout seqdumper -i /user/root/testdataout/data/part-m-0 > Key: 9: Value: {0:1.0,2:-0.956,1

Re: Interpretation of cluster output

2014-06-17 Thread Han Fan
Is this command line what you need? (Replace /user/root/testdataout with your output directory)
$ mahout seqdumper -i /user/root/testdataout/data/part-m-0
Key: 9: Value: {0:1.0,2:-0.956,1:-0.213,5:0.091,3:-0.003,7:-0.024,6:0.017,8:1.0,4:0.056}
Key: 9: Value: {0:1.0,2:2.129,1:3.147,5:-0.063,
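For anyone post-processing this kind of dump, here is a minimal Python sketch that parses the `Key: ...: Value: {index:value,...}` lines shown above and groups the vectors by key. It assumes each record sits on one line and that the key identifies the cluster, which the thread does not confirm. The names `parse_seqdumper_line` and `group_by_key` are hypothetical helpers, not part of Mahout.

```python
import re
from collections import defaultdict

# Matches lines shaped like: Key: 9: Value: {0:1.0,2:-0.956}
SEQ_RE = re.compile(r"Key: (\S+): Value: \{([^}]*)\}")

def parse_seqdumper_line(line):
    """Return (key, {index: value}) for one dump line, or None if it does not match."""
    match = SEQ_RE.search(line)
    if match is None:
        return None  # command lines, truncated records, etc.
    key, body = match.groups()
    vector = {}
    for pair in body.split(","):
        if not pair.strip():
            continue
        index, value = pair.split(":")
        vector[int(index)] = float(value)
    return key, vector

def group_by_key(lines):
    """Collect all parsed vectors under their key (assumed here to be a cluster id)."""
    groups = defaultdict(list)
    for line in lines:
        parsed = parse_seqdumper_line(line)
        if parsed is not None:
            key, vector = parsed
            groups[key].append(vector)
    return dict(groups)
```

Lines that do not match the pattern, including records cut off before the closing brace, are skipped rather than guessed at.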

Re: Interpretation of cluster output

2014-06-16 Thread Kamesh
Thanks for the response, Andrew. I am using Mahout 0.9. I also tried the trunk version, but I still get output in the following format:
C-55{n=1 c=[15993058.000] r=[]}
C-56{n=2 c=[15993061.167] r=[]}
C-57{n=1 c=[15993062.000] r=[]}
C-97{n=1 c=[15993103.000] r=[]}
C-98{n=2 c=[1599
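As a rough illustration of how to read these lines: each one appears to carry a cluster id, a point count `n`, a center `c`, and a radius `r`. The Python sketch below pulls those fields out, assuming the center and radius are plain comma-separated numbers as in the lines above (other vector layouts are not handled). `parse_cluster_line` is a hypothetical helper, not a Mahout API.

```python
import re

# Matches lines shaped like: C-55{n=1 c=[15993058.000] r=[]}
LINE_RE = re.compile(r"(\w+-\d+)\{n=(\d+) c=\[([^\]]*)\] r=\[([^\]]*)\]\}")

def parse_floats(text):
    """Parse a comma-separated list of numbers; an empty string gives an empty list."""
    return [float(token) for token in text.split(",") if token.strip()]

def parse_cluster_line(line):
    """Return the id, count, center and radius of one cluster line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    cluster_id, count, center, radius = match.groups()
    return {
        "id": cluster_id,
        "n": int(count),
        "center": parse_floats(center),
        "radius": parse_floats(radius),
    }
```

Note that this only recovers the summary of each cluster; the lines themselves do not list the member points.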

Re: Interpretation of cluster output

2014-06-13 Thread Andrew Musselman
That's going to be easier if you can work off of trunk, since the output of clustering has been cleaned up to write a better format, per https://issues.apache.org/jira/browse/MAHOUT-1505 E.g., { "top_terms": [ {"all":3.0149030685424805}, {"english":3.0149030685424805}, {"best":3.014

Interpretation of cluster output

2014-06-13 Thread Kamesh
Hi All, Please help me get the data points inside each cluster. The output of the clustering algorithm is the center and the radius of each cluster. How do we derive the actual data points inside each cluster from this output? -- Kamesh.
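One generic way to answer this outside of Mahout is to assign every input point to its nearest cluster center. The Python sketch below does that under two assumptions: Euclidean distance is the right measure, and each point belongs to exactly one cluster. It is not Mahout's own assignment logic, and `assign_to_nearest` and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def assign_to_nearest(points, centers):
    """Group points by the id of their nearest center (Euclidean distance).

    points:  iterable of coordinate tuples
    centers: dict mapping cluster id -> coordinate tuple
    """
    members = defaultdict(list)
    for point in points:
        nearest = min(centers, key=lambda cid: math.dist(point, centers[cid]))
        members[nearest].append(point)
    return dict(members)

# Hypothetical toy data, for illustration only.
centers = {"C-0": (0.0, 0.0), "C-1": (10.0, 10.0)}
points = [(1.0, 1.0), (9.0, 8.0), (-1.0, 0.5)]
clusters = assign_to_nearest(points, centers)
```

If the clustering used a different distance measure, the `math.dist` call would need to be swapped for that measure to get matching assignments.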