Re: How to convert

Yosep Kim Thu, 11 Aug 2011 15:49:53 -0700

What a fast response!!!  Thanks for the quick answer. I will let you know
how it goes!  Thanks!


On Thu, Aug 11, 2011 at 6:47 PM, Jeff Eastman <[email protected]> wrote:

> You'll want to add the -nv option to seq2sparse to get NamedVectors out and
> add the -cl argument to k-means to get the clustered documents. Then the
> clusterdump should give you what you are seeking.
>
> -----Original Message-----
> From: Yosep Kim [mailto:[email protected]]
> Sent: Thursday, August 11, 2011 3:43 PM
> To: [email protected]
> Subject: How to convert
>
> Hello, Everyone!
>
> This is Yosep Kim, and I just started playing with Mahout.
> I successfully installed it on my box and got an example data set clustered
> using the K-Means clustering algorithm.  My input data was all text documents
> (i.e. new articles).  When I ran the clusterdump command, I got some cool
> information.  However, I was not able to find a way to translate this back
> to the original documents.  It looks like the algorithm created clusters
> based on all the words inside of documents.  Did I understand this
> correctly?  How can I create clusters based on documents so I can see that
> "document1.txt and document2.txt are in Cluster 1"?  I'd appreciate your
> help!!  Thanks.
>
>
> :CL-16397{n=1032 c=[0:0.125, 0.5:0.019, 0.8m:0.014, 00:0.096, 0000:0.008,
> 001:0.015, 00139:0.014, 001
>        Top Terms:
>                c                 => 2.458502088406289
>                software          => 2.375095306671867
>                java              => 2.2093305677868598
>                project           => 1.989917316871096
>                application       => 1.957329582567363
>                using             => 1.916300386652466
>                web               => 1.9046723985856817
>                development       => 1.8707247066867443
>
> By the way, Mahout is way cool, and I can't wait to be part of this
> "movement".
>
> Yosep
>
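
For anyone finding this thread later: the suggestion quoted above (-nv on
seq2sparse, -cl on k-means, then clusterdump) might look roughly like the
sketch below. It is not taken from the thread. Every path, the WORK and
CLUSTERS variables, and the -k / -x values are made-up placeholders, and the
clusterdump input flag is an assumption: older Mahout releases took
-s/--seqFileDir while later ones take -i/--input, so check
`mahout clusterdump --help` on your install. The script only prints the
commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: paths, -k and -x are hypothetical placeholders.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}
WORK=${WORK:-/tmp/mahout-work}                 # hypothetical working directory
CLUSTERS=${CLUSTERS:-$WORK/kmeans/clusters-10} # final clusters-* dir; name varies by run and version

run() {
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  else
    echo "$*"
  fi
}

cluster_docs() {
  # 1. Turn the directory of text files into SequenceFiles keyed by file name.
  run mahout seqdirectory -i "$WORK/docs" -o "$WORK/seq"

  # 2. Vectorize; -nv emits NamedVectors so each vector carries its document name.
  run mahout seq2sparse -i "$WORK/seq" -o "$WORK/vectors" -nv

  # 3. Run k-means; -cl also assigns each document to a cluster (clusteredPoints).
  run mahout kmeans -i "$WORK/vectors/tfidf-vectors" -c "$WORK/seeds" \
      -o "$WORK/kmeans" -k 20 -x 10 -cl

  # 4. Dump clusters together with the points (documents) assigned to them.
  #    Assumption: -s is the input flag here; newer releases use -i instead.
  run mahout clusterdump -s "$CLUSTERS" -p "$WORK/kmeans/clusteredPoints" \
      -d "$WORK/vectors/dictionary.file-0" -dt sequencefile
}

cluster_docs
```

With -nv and -cl in place, the points listed under each cluster in the
clusterdump output should carry the document names, which is what lets you
say which files ended up in which cluster.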
