I just managed to get my app working. I had to use 
ClusterDumperWriter.gettopfeatures(ar1,arg2,arg3), and that gave me the top 
words in a human-readable format :D
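
For readers wondering what "top words" amounts to, here is a minimal sketch of the idea in plain Java. It is not Mahout's code: the class TopTermsSketch, the method topTerms, and the map-based centroid layout are all hypothetical stand-ins, assuming a centroid can be seen as dictionary index -> weight and the dictionary as an array of terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, NOT Mahout's implementation: ranks the terms of one
// centroid by weight and formats the heaviest ones as readable text.
final class TopTermsSketch {

    // weights:    dictionary index -> weight of that term (assumed layout)
    // dictionary: dictionary[i] is the term stored at index i
    // numTerms:   how many of the heaviest terms to keep
    static String topTerms(Map<Integer, Double> weights, String[] dictionary, int numTerms) {
        List<Map.Entry<Integer, Double>> entries = new ArrayList<>(weights.entrySet());
        // Heaviest first; ties are broken by index so the output is stable.
        entries.sort((a, b) -> {
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : Integer.compare(a.getKey(), b.getKey());
        });
        StringBuilder sb = new StringBuilder();
        int limit = Math.min(numTerms, entries.size());
        for (int i = 0; i < limit; i++) {
            int index = entries.get(i).getKey();
            // Fall back to the raw index if the dictionary has no entry for it.
            String term = (index >= 0 && index < dictionary.length) ? dictionary[index] : "#" + index;
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.ROOT, "%s => %.3f", term, entries.get(i).getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry"};
        Map<Integer, Double> centroid = Map.of(0, 0.5, 1, 2.0, 2, 1.25);
        System.out.println(topTerms(centroid, dictionary, 2));
    }
}
```

In a real run the weights and the dictionary would come from the clustering output and the dictionary produced during vectorization; how they are loaded is left out here.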



-----Original Message-----
From: Paritosh Ranjan [mailto:pran...@xebia.com] 
Sent: Tuesday, 7 August 2012 10:32
To: user@mahout.apache.org
Subject: Re: ClusterDumper eclipse human readable output kmeans

I don't know why ClusterDumper is not working, but I can give an alternate 
solution.

Use ClusterOutputPostProcessor (clusterpp) on the clusters-*-final directory. 
https://cwiki.apache.org/MAHOUT/top-down-clustering.html
It will arrange the vectors in their respective directories. However, they will 
still be in the form of sequence files.

It's very simple to read a sequence file and write it out in a human-readable 
format.

The classes in the org.apache.mahout.common.iterator.sequencefile package can 
help to read the sequence files easily.
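
As a rough illustration of the "write in a human readable format" half of that step, here is a minimal plain-Java sketch. It assumes the key/value records have already been read (for example with the iterator classes in the package named above, whose usage is not shown here); the class ReadableDumpSketch and the method toText are hypothetical names, not part of Mahout or Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Mahout or Hadoop: turns key/value records
// that were already read from somewhere into one tab-separated line each.
final class ReadableDumpSketch {

    // Relies only on toString() of keys and values (via String.valueOf),
    // so it works for any record type that prints sensibly.
    static String toText(Iterable<? extends Map.Entry<?, ?>> records) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<?, ?> record : records) {
            sb.append(String.valueOf(record.getKey()))
              .append('\t')
              .append(String.valueOf(record.getValue()))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in practice the pairs would come from a sequence file.
        Map<String, String> records = new LinkedHashMap<>();
        records.put("cluster-0", "[1.0, 2.0]");
        records.put("cluster-1", "[3.5, 0.25]");
        System.out.print(toText(records.entrySet()));
    }
}
```

The resulting string can be written to a text file with any standard writer; only the reading side depends on the sequence-file classes.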

On 07-08-2012 12:50, Videnova, Svetlana wrote:
> I already generated the points directory when I ran the clustering (kmeans in 
> my case). But for the moment I can't generate the cluster dump because of an 
> error on this line:
> ClusterDumper.readPoints(new Path("output/kmeans/clusters-0"), 2, 
> conf); The second parameter is a double, but it wants an int, yet it does not 
> accept an int ... well, pretty confused ...
>
>
>
> -----Original Message-----
> From: kiran kumar [mailto:kirankumarsm...@gmail.com]
> Sent: Monday, 6 August 2012 18:01
> To: user@mahout.apache.org
> Subject: Re: ClusterDumper eclipse human readable output kmeans
>
> Hello,
> Clusterdump shows you the top terms and the vectors of the centroid and of 
> each document. But to identify which vectors belong to your documents, you 
> need to generate the points directory when running the clustering algorithm, 
> and then use that points directory when generating the cluster dump.
>
> Thanks,
> Kiran Bushireddy.
>
> On Mon, Aug 6, 2012 at 10:33 AM, Videnova, Svetlana < 
> svetlana.viden...@logica.com> wrote:
>
>> Hi,
>>
>> My goal is to transform the vectors created by lucene.vector (after 
>> kmeans clustering) into a human-readable format. For that I am using 
>> the ClusterDumper class in Eclipse. But that code does not generate 
>> any files. What am I missing? What is the best approach to transform 
>> the output of kmeans into a human-readable format (no Unix commands 
>> please; I am on Windows using Eclipse and Cygwin)?
>> This is the code:
>>
>>
>> Code:
>>
>> Map<Integer, List<WeightedVectorWritable>> result =
>>     ClusterDumper.readPoints(new Path("output/kmeans/clusters-0"), 2, conf);
>>
>> System.out.println(result.get(0).toString());
>> for (int j = 0; j < result.size(); j++) {
>>     List<WeightedVectorWritable> list = result.get(j);
>>     for (WeightedVectorWritable vector : list) {
>>         System.out.println(vector.getVector().asFormatString());
>>     }
>> }
>>
>>
>> Error :
>>
>> Exception in thread "main" java.lang.ClassCastException:
>> org.apache.mahout.clustering.iterator.ClusterWritable cannot be cast 
>> to org.apache.mahout.clustering.classify.WeightedVectorWritable
>>        at main.LuceneDemo.main(LuceneDemo.java:260)
>>
>>
>>
>> Thank you
>>
>>
>> Think green - keep it on the screen.
>>
>> This e-mail and any attachment is for authorised use by the intended
>> recipient(s) only. It may contain proprietary material, confidential 
>> information and/or be subject to legal privilege. It should not be 
>> copied, disclosed to, retained or used by, any other party. If you 
>> are not an intended recipient then please promptly delete this e-mail 
>> and any attachment and all copies and inform the sender. Thank you.
>>
>>
>
> --
> Thanks & Regards,
> Kiran Kumar
>



