There was an issue with an empty cluster file being created for Canopy, which
has since been fixed in the current trunk, so you may want to work off of
trunk.
Also, Canopy has been marked for deprecation in a future release, so whatever
you are trying to do, you may want to look at the alternatives.
On Fri,
Hi Andrew,
I am invoking the Canopy Driver class to perform clustering. I am able to see
the results when the output format is either TEXT or CSV. However, when I
use JSON, I get the exception I mentioned above.
On Wed, Jun 18, 2014 at 10:32 PM, Andrew Musselman <
andrew.mussel...@gmail.c
Kamesh, can you please describe the schema of your input data, along with
your command to perform the clustering?
On Mon, Jun 16, 2014 at 12:44 AM, Kamesh wrote:
> Thanks for the response, Andrew. I am using Mahout 0.9. However, I
> tried the trunk version but I am still getting output
Interesting; could be a bug, I'll take a look.
On Tue, Jun 17, 2014 at 10:38 AM, Han Fan wrote:
> Is this command line what you need? (Replace /user/root/testdataout with
> your output directory)
> $ mahout seqdumper -i /user/root/testdataout/data/part-m-0
> Key: 9: Value: {0:1.0,2:-0.956,1
Is this command line what you need? (Replace /user/root/testdataout with
your output directory)
$ mahout seqdumper -i /user/root/testdataout/data/part-m-0
Key: 9: Value:
{0:1.0,2:-0.956,1:-0.213,5:0.091,3:-0.003,7:-0.024,6:0.017,8:1.0,4:0.056}
Key: 9: Value:
{0:1.0,2:2.129,1:3.147,5:-0.063,
Thanks for the response, Andrew. I am using Mahout 0.9. However, I
tried the trunk version but I am still getting output in the following
format:
C-55{n=1 c=[15993058.000] r=[]}
C-56{n=2 c=[15993061.167] r=[]}
C-57{n=1 c=[15993062.000] r=[]}
C-97{n=1 c=[15993103.000] r=[]}
C-98{n=2 c=[1599
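For anyone who needs to post-process output in the text format shown above, here is a minimal Python sketch that parses lines of the form `C-55{n=1 c=[15993058.000] r=[]}`. It assumes the center (`c`) and radius (`r`) vectors are plain comma-separated numbers, as in the sample lines; other vector encodings are not handled. The names `CLUSTER_LINE`, `parse_vector` and `parse_cluster_line` are hypothetical helpers for illustration, not part of Mahout.

```python
import re

# Matches lines shaped like: C-55{n=1 c=[15993058.000] r=[]}
# Assumption: c and r hold plain comma-separated numbers (possibly empty).
CLUSTER_LINE = re.compile(
    r"^(?P<id>\S+?)\{n=(?P<n>\d+) c=\[(?P<c>[^\]]*)\] r=\[(?P<r>[^\]]*)\]\}$"
)


def parse_vector(text):
    """Parse a comma-separated list of floats; an empty string gives []."""
    return [float(x) for x in text.split(",") if x.strip()]


def parse_cluster_line(line):
    """Return a dict for a well-formed cluster line, or None otherwise."""
    m = CLUSTER_LINE.match(line.strip())
    if m is None:
        return None  # truncated or differently formatted line
    return {
        "id": m.group("id"),
        "n": int(m.group("n")),
        "center": parse_vector(m.group("c")),
        "radius": parse_vector(m.group("r")),
    }


lines = [
    "C-55{n=1 c=[15993058.000] r=[]}",
    "C-56{n=2 c=[15993061.167] r=[]}",
]
clusters = [parse_cluster_line(line) for line in lines]
```

Lines that do not match the pattern (for example, a truncated line) yield `None` rather than raising, so they can be filtered out afterwards.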
That's going to be easier if you can work off of trunk, since the output of
clustering has been cleaned up to write a better format, per
https://issues.apache.org/jira/browse/MAHOUT-1505
E.g.,
{
"top_terms": [
{"all":3.0149030685424805},
{"english":3.0149030685424805},
{"best":3.014
Hi All,
Please help me get the data points inside each cluster.
The output of the clustering algorithm is the center and the radius
of each cluster. How do we derive the actual data points inside each cluster
from this output?
--
Kamesh.
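
To illustrate the question above: given only cluster centers, one generic way to recover membership is to assign each input point to its nearest center. The Python sketch below does this with Euclidean distance. It is not Mahout's own mechanism and makes no claim about how Mahout assigns points; the function names, the center ids and the sample data are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_points(points, centers):
    """Map each center id to the list of points nearest to that center."""
    members = {cid: [] for cid in centers}
    for p in points:
        nearest = min(centers, key=lambda cid: euclidean(p, centers[cid]))
        members[nearest].append(p)
    return members


# Hypothetical centers and points, for illustration only.
centers = {"C-0": [0.0, 0.0], "C-1": [10.0, 10.0]}
points = [[1.0, 1.0], [9.0, 8.0], [-1.0, 0.5]]
members = assign_points(points, centers)
```

The distance measure should match the one used during clustering; Euclidean distance is only an assumption here.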