Dear users,

I'm trying to use the LDA clustering algorithm from the command line (using
mahout-07-snapshot). I was able to get the topics (as a text file
containing the top words), but I also need to get the document IDs
associated with the computed topics.

I tried these commands:
mahout vectordump -i DB-LDA-clusters/docTopics/part-m-00000 -o output/cluster_lda_topics.txt
mahout vectordump -i DB-LDA-clusters/docTopics/part-m-00000 -o output/cluster_lda_topics.txt -dt text
(and the same with -dt sequencefile), but without success.

Is there a way to do this?

Thanks

Antonio Michelangelo D'Agata
