You can use the ClusterDumper class (I just posted an example at
http://lucene.grantingersoll.com/2010/02/16/trijug-intro-to-mahout-slides-and-demo-examples/)
or you can use the SequenceFileDumper.
If you generated from a Lucene index, you can also use a class in utilities
(ClusterLabels?) to print out labels based on LLR calculations.
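
Once the records are dumped as text, the values in the output quoted below appear to be JSON with a second JSON document nested inside the "vector" field. Here is a minimal Python sketch for turning one such record into a plain list of numbers. It assumes that layout holds for every record. The decode_vector helper and the shortened three-dimensional sample record are made up for illustration and are not part of Mahout.

```python
import json


def decode_vector(record_text):
    """Decode one JSON-encoded vector record into (class name, dense list).

    Assumes the layout seen in the quoted dump: an outer object with a
    "class" field and a "vector" field that holds JSON stored as a string.
    """
    outer = json.loads(record_text)
    # The "vector" field is itself a JSON document, so it is parsed again.
    inner = json.loads(outer["vector"])
    # Start from zeros and fill in only the positions that are listed.
    dense = [0.0] * inner["cardinality"]
    indices = inner["values"]["indices"]
    values = inner["values"]["values"]
    for index, value in zip(indices, values):
        dense[index] = value
    return outer["class"], dense


# Hypothetical, shortened record in the same shape as the quoted dump
# (3 dimensions instead of 60). It is not real output.
sample = json.dumps({
    "class": "org.apache.mahout.matrix.SparseVector",
    "vector": json.dumps({
        "values": {
            "indices": [0, 2],
            "values": [28.7812, 31.3381],
            "numMappings": 2,
        },
        "cardinality": 3,
        "lengthSquared": -1.0,
        "name": "",
    }),
})

class_name, point = decode_vector(sample)
print(class_name, point)
```

This only decodes the vector itself. Which cluster a point belongs to comes from the other half of each key/value pair in the points files, which the sketch does not touch.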
HTH,
Grant
On Feb 18, 2010, at 2:32 AM, Cui tony wrote:
> Hi,
> I'm a beginner with Mahout. I have figured out how to run Mahout's k-means,
> but after that, I have no idea how to get the clustered result.
>
> My input data is the standard example data: synthetic_control.data
>
> After running, I got a points folder, which I'm told contains the result.
> The points folder mainly has two files: part-00000 and part-00001
>
>
> The file part-00000 looks like this:
> eq^f^yorg.apache.hadoop.io.text^yorg.apache.hadoop.io.te...@^@^...@^@^...@^@$*hnph?[34m~yyvb}?~\~up_z~n...@^@^C:^...@^@^C8~N^C5{"class":"org.apache.mahout.matrix.Sparsor","vector":"{\"values\":{\"indices\":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59],\"values\":[28.7812,34.4632,31.3381,31.2834,28.9207,33.7596,25.3969,27.7849,35.2479,27.1159,32.8717,29.2171,36.0253,32.337,34.5249,32.8717,34.1173,26.5235,27.6623,26.3693,25.7744,29.27,30.7326,29.5054,33.0292,25.04,28.9167,24.3437,26.1203,34.9424,25.0293,26.6311,35.6541,28.4353,29.1495,28.1584,26.1927,33.3182,30.9772,27.0443,35.5344,26.2353,28.9964,32.0036,31.0558,34.2553,28.0721,28.9402,35.4973,29.747,31.4333,24.5556,33.7431,25.0466,34.9318,34.9879,32.4721,33.3759,25.4652,25.8717],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"}^...@^@^C;^...@^@^C9~N^C6{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59],\"values\":[24.8923,25.741,27.5532,32.8217,27.8789,
>
> I'm so confused by this result: how can I get the data with the cluster
> labels?
>
> thanks~~