Right now I have only a few documents.
I just want to know what kind of similarity it generates,
as I have no idea on what basis the similarity is computed.

-----Original Message-----
From: Sebastian Schelter [mailto:[email protected]] 
Sent: Tuesday, October 26, 2010 2:37 PM
To: [email protected]
Subject: Re: generate document-document similarity matrix

Hi,

How many documents do you have, and what kind of similarity do you want to use?

--sebastian

On 26.10.2010 08:10, Divya wrote:
> Hi,
>
> I am a new Mahout user, using Mahout 0.4 with Eclipse.
>
> I need to generate a document similarity matrix from the vector files which I
> have already created using SparseVectorsFromSequenceFiles.
>
> That step gave me the following directory structure:
>
> ->  df-count
>
> ->  tfidf-vectors
>
> ->  tf-vectors
>
> ->  tokenized-documents
>
> ->  wordcount
>
> ->  .dictionary.file-0.crc
>
> ->  .frequency.file-0.crc
>
> ->  dictionary.file-0
>
> ->  frequency.file-0
>
>
>
> I am now confused about which one to use.
>
> Which Mahout utility computes the document-document similarity matrix?
>
>
>
> Can anyone help me?
>
>
>
>
>
> Regards,
>
> Divya
>
>
>    


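For readers wondering on what basis a document-document similarity can be computed: one common choice is cosine similarity between the TF-IDF vectors of two documents. That is an assumption here, since the thread does not say which measure Mahout applies. The sketch below is plain Java and uses no Mahout API; the class name, method names, and the toy vectors are all hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Mahout.
final class CosineSimilaritySketch {

    // Euclidean length of a sparse vector (term index -> TF-IDF weight).
    static double norm(Map<Integer, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); 0.0 if either vector is empty.
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    // Full symmetric matrix; entry [i][j] is the similarity of documents i and j.
    static double[][] similarityMatrix(List<Map<Integer, Double>> docs) {
        int n = docs.size();
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double s = cosine(docs.get(i), docs.get(j));
                m[i][j] = s;
                m[j][i] = s;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Made-up TF-IDF weights for three tiny documents.
        List<Map<Integer, Double>> docs = new ArrayList<>();
        Map<Integer, Double> d0 = new HashMap<>();
        d0.put(0, 1.0);
        d0.put(1, 2.0);
        Map<Integer, Double> d1 = new HashMap<>();
        d1.put(0, 2.0);
        d1.put(1, 4.0);
        Map<Integer, Double> d2 = new HashMap<>();
        d2.put(2, 3.0);
        docs.add(d0);
        docs.add(d1);
        docs.add(d2);

        for (double[] row : similarityMatrix(docs)) {
            StringBuilder sb = new StringBuilder();
            for (double s : row) {
                sb.append(String.format("%.3f ", s));
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

The pairwise loop is quadratic in the number of documents, which is fine for a handful of documents but is the reason a distributed job would be preferred for a large collection.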