Hi,

I am a new Mahout user, using Mahout 0.4 with Eclipse.

I have already created vector files using SparseVectorsFromSequenceFiles,
and now I need to generate a document similarity matrix from them.

SparseVectorsFromSequenceFiles gave me the following directory structure:

-> df-count
-> tfidf-vectors
-> tf-vectors
-> tokenized-documents
-> wordcount
-> .dictionary.file-0.crc
-> .frequency.file-0.crc
-> dictionary.file-0
-> frequency.file-0

 

I am now confused about which of these outputs to use as input.

Which Mahout utility computes a document-document similarity matrix?

Can anyone help me?
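
To make clear what I mean by a document-document similarity matrix, here is a minimal plain-Java sketch of the output I am after. It is not Mahout code and does not use any Mahout API. I am assuming cosine similarity as the measure and that each document is a sparse map from term index to TF-IDF weight; the class name, method names, and toy vectors are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: pairwise cosine similarity between documents.
class DocSimilaritySketch {

    // Euclidean length of a sparse vector (term index -> weight).
    static double norm(Map<Integer, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity; returns 0 if either vector is empty or all zeros.
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    // Entry [i][j] is the similarity between document i and document j.
    static double[][] similarityMatrix(List<Map<Integer, Double>> docs) {
        int n = docs.size();
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double s = cosine(docs.get(i), docs.get(j));
                m[i][j] = s;
                m[j][i] = s; // the matrix is symmetric
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Toy vectors standing in for rows of tfidf-vectors (made-up values).
        List<Map<Integer, Double>> docs = new ArrayList<>();
        Map<Integer, Double> d0 = new HashMap<>();
        d0.put(0, 1.0);
        d0.put(1, 2.0);
        Map<Integer, Double> d1 = new HashMap<>();
        d1.put(1, 1.5);
        d1.put(2, 0.5);
        docs.add(d0);
        docs.add(d1);

        for (double[] row : similarityMatrix(docs)) {
            StringBuilder sb = new StringBuilder();
            for (double s : row) {
                sb.append(String.format("%.4f ", s));
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

This compares every pair of documents directly, so it only works for a small collection. I am looking for the Mahout utility that produces the same kind of matrix from the vectors listed above.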

 

 

Regards,

Divya  
