LDA clustering documentation (mahout-07-snapshot)

antonio d'agata Thu, 12 Apr 2012 06:21:35 -0700

Dear users,

I'm trying to use the LDA clustering algorithm from the command line
(using mahout-07-snapshot). I was able to get the topics (as a text file
containing the top words), but I also need to get the document IDs
associated with the computed topics.


I tried these commands:

  mahout vectordump -i DB-LDA-clusters/docTopics/part-m-00000 -o output/cluster_lda_topics.txt
  mahout vectordump -i DB-LDA-clusters/docTopics/part-m-00000 -o output/cluster_lda_topics.txt -dt text

(the second one also with -dt sequencefile instead of -dt text), but
without success.

Is there a way to get the document IDs for each topic?

Thanks

Antonio Michelangelo D'Agata
