Hi, I am a new Mahout user and am using Mahout 0.4 with Eclipse.
I need to generate a document similarity matrix from the vector files I have already created using SparseVectorsFromSequenceFiles. That step gave me the following directory structure:

-> df-count
-> tfidf-vectors
-> tf-vectors
-> tokenized-documents
-> wordcount
-> .dictionary.file-0.crc
-> .frequency.file-0.crc
-> dictionary.file-0
-> frequency.file-0

I am now confused about which of these to use as input. Which Mahout utility computes a document-document similarity matrix?

Can anyone help me?

Regards,
Divya
