Hi,

How many documents do you have, and what kind of similarity measure do you want to use?

--sebastian

On 26.10.2010 08:10, Divya wrote:
Hi,

I am a new Mahout user and am using Mahout 0.4 with Eclipse.

I need to generate a document similarity matrix from the vectors which I
have already created using SparseVectorsFromSequenceFiles.

It gave me the following directory structure:

->  df-count
->  tfidf-vectors
->  tf-vectors
->  tokenized-documents
->  wordcount
->  .dictionary.file-0.crc
->  .frequency.file-0.crc
->  dictionary.file-0
->  frequency.file-0



I am now confused about which of these outputs to use.

Which Mahout utility computes a document-document similarity matrix?

Can anyone help me?





Regards,

Divya


