You also need to add the -p argument to clusterdump, pointing it at your
clusteredPoints directory.
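
As a rough sketch of the -p suggestion above: the paths below are copied from
the clusterdump command quoted further down, except for the clusteredPoints
location, which is an assumption (it presumes k-means was run with -cl and
wrote its output under /user/hadoop/articles-kmeans). Adjust it to wherever
your clustered points actually ended up.

```shell
# Assumption: k-means was run with -cl and wrote its output here,
# so the clustered points sit in a clusteredPoints subdirectory.
KMEANS_OUT=/user/hadoop/articles-kmeans
DICT=/user/hadoop/articles-seqdir-sparse-kmeans/dictionary.file-0

# Same arguments as the quoted command, plus -p for the points directory.
CMD="mahout clusterdump -s $KMEANS_OUT/clusters-1 -p $KMEANS_OUT/clusteredPoints -d $DICT -dt sequencefile -b 100 -n 20"

# Print the command for review; run it only if mahout is on the PATH.
echo "$CMD"
if command -v mahout >/dev/null 2>&1; then
  $CMD
fi
```

Printing the command first makes it easy to check the paths before anything
touches the cluster.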

-----Original Message-----
From: Yosep Kim [mailto:[email protected]] 
Sent: Thursday, August 11, 2011 4:11 PM
To: [email protected]
Subject: Re: How to convert

Hello, Jeff:

I reran the commands with the parameters you suggested.  However, when I
ran the following clusterdump command, I still got the same output:

   mahout clusterdump -s /user/hadoop/articles-kmeans/clusters-1 \
     -d /user/hadoop/articles-seqdir-sparse-kmeans/dictionary.file-0 \
     -dt sequencefile -b 100 -n 20

Am I missing some arguments?

Thanks again for your help, Jeff.

On Thu, Aug 11, 2011 at 6:49 PM, Yosep Kim <[email protected]> wrote:

> What a fast response!!!  Thanks for the quick answer. I will let you know
> how it goes!  Thanks!
>
>
> On Thu, Aug 11, 2011 at 6:47 PM, Jeff Eastman <[email protected]> wrote:
>
>> You'll want to add the -nv option to seq2sparse to get NamedVectors out
>> and add the -cl argument to k-means to get the clustered documents. Then the
>> clusterdump should give you what you are seeking.
>>
>> -----Original Message-----
>> From: Yosep Kim [mailto:[email protected]]
>> Sent: Thursday, August 11, 2011 3:43 PM
>> To: [email protected]
>> Subject: How to convert
>>
>> Hello, Everyone!
>>
>> This is Yosep Kim, and I just started playing with Mahout.
>> I successfully installed it on my box and clustered some example data
>> using the K-Means clustering algorithm.  My input data was all text
>> documents (i.e. new articles).  When I ran the clusterdump command, I got
>> some cool information.  However, I was not able to find a way to map it
>> back to the original documents.  It looks like the algorithm created
>> clusters based on all the words inside the documents.  Did I understand
>> this correctly?  How can I create clusters based on documents so I can see
>> that "document1.txt and document2.txt are in Cluster 1"?  I'd appreciate
>> your help!!  Thanks.
>>
>>
>> :CL-16397{n=1032 c=[0:0.125, 0.5:0.019, 0.8m:0.014, 00:0.096, 0000:0.008, 001:0.015, 00139:0.014, 001
>>        Top Terms:
>>                c                                       => 2.458502088406289
>>                software                                => 2.375095306671867
>>                java                                    => 2.2093305677868598
>>                project                                 => 1.989917316871096
>>                application                             => 1.957329582567363
>>                using                                   => 1.916300386652466
>>                web                                     => 1.9046723985856817
>>                development                             => 1.8707247066867443
>>
>> By the way, Mahout is way cool, and I can't wait to be part of this
>> "movement".
>>
>> Yosep
>>
>
>
