Re: Similarity between sparse vectors

2011-07-15 Thread Sean Owen
This is simply Euclidean distance squared. Take the square root if you need the simple Euclidean distance. On Fri, Jul 15, 2011 at 12:36 PM, marco turchi wrote: > Dear All, > I'm a newcomer to Mahout and I'm trying to compute the cosine similarity > between two sparse vectors. > I have loaded them u
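
To make the distinction in the reply above concrete, here is a minimal sketch in plain Java of squared Euclidean distance, its square root, and cosine similarity for two sparse vectors. It does not use Mahout's API: the vectors are modeled as a Map from index to non-zero value, and the class and method names (SparseSimilarity, squaredDistance, dot, cosine) are hypothetical, chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SparseSimilarity {

    // Squared Euclidean distance; indices missing from a map count as 0.
    static double squaredDistance(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double d = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
            sum += d * d;
        }
        // Entries present only in b were not visited by the loop above.
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += e.getValue() * e.getValue();
            }
        }
        return sum;
    }

    // Dot product; only indices present in both maps contribute.
    static double dot(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            sum += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        return sum;
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector
    // (an assumption made here to avoid dividing by zero).
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        return norms == 0.0 ? 0.0 : dot(a, b) / norms;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 3.0);
        a.put(2, 4.0);
        Map<Integer, Double> b = new HashMap<>();
        b.put(0, 3.0);

        double squared = squaredDistance(a, b);
        System.out.println("squared distance = " + squared);
        System.out.println("distance         = " + Math.sqrt(squared));
        System.out.println("cosine           = " + cosine(a, b));
    }
}
```

Note that cosine similarity is a different measure from either distance: it depends only on the angle between the vectors, not on their lengths.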

Re: Similarity between sparse vectors

2011-07-15 Thread marco turchi
Hi, thanks a lot. I also have another problem ( :-) ). As I wrote in the previous email, I'm using the RandomAccessSparseVector representation to store sparse vectors. I need to sum some of them together, so I use the plus method, but it seems to require that the vectors have the same cardinality. I set the i

Re: Similarity between sparse vectors

2011-07-15 Thread Sean Owen
Cardinality should be set to whatever the logical dimension of the vector is -- it shouldn't be arbitrary. It's not like an "initial size" of a list. If you're dealing with vectors that have a potentially unbounded maximum dimension, use Integer.MAX_VALUE. As the name suggests, the implementation
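
The point about cardinality can be illustrated with a toy model, sketched below in plain Java. This is not Mahout's RandomAccessSparseVector; ToySparseVector and all of its methods are hypothetical, and the exception type is an assumption. The sketch only shows the idea that cardinality is the logical dimension, that storage grows with the number of non-zero entries rather than with cardinality, and that addition is only defined between vectors of equal cardinality.

```java
import java.util.HashMap;
import java.util.Map;

public class ToySparseVector {
    private final int cardinality;                       // logical dimension
    private final Map<Integer, Double> values = new HashMap<>();

    ToySparseVector(int cardinality) {
        this.cardinality = cardinality;                  // nothing is allocated per dimension
    }

    void set(int index, double value) {
        if (index < 0 || index >= cardinality) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (value == 0.0) {
            values.remove(index);                        // zeros are not stored
        } else {
            values.put(index, value);
        }
    }

    double get(int index) {
        return values.getOrDefault(index, 0.0);
    }

    int cardinality() {
        return cardinality;
    }

    int storedEntries() {
        return values.size();
    }

    // Element-wise sum; rejects operands whose logical dimensions differ.
    ToySparseVector plus(ToySparseVector other) {
        if (cardinality != other.cardinality) {
            throw new IllegalArgumentException(
                "cardinality mismatch: " + cardinality + " vs " + other.cardinality);
        }
        ToySparseVector result = new ToySparseVector(cardinality);
        result.values.putAll(values);
        for (Map.Entry<Integer, Double> e : other.values.entrySet()) {
            result.set(e.getKey(), result.get(e.getKey()) + e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Unbounded dimension: both operands share Integer.MAX_VALUE.
        ToySparseVector x = new ToySparseVector(Integer.MAX_VALUE);
        ToySparseVector y = new ToySparseVector(Integer.MAX_VALUE);
        x.set(5, 1.0);
        x.set(1_000_000, 2.0);
        y.set(5, 3.0);

        ToySparseVector sum = x.plus(y);
        System.out.println(sum.get(5) + " " + sum.get(1_000_000)
            + " stored=" + sum.storedEntries());
    }
}
```

In this model, giving every vector the same cardinality up front (Integer.MAX_VALUE when the dimension is unbounded) is what lets any two of them be summed.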

Re: Similarity between sparse vectors

2011-07-15 Thread marco turchi
Dear Sean, thanks a lot for the advice, everything is working perfectly! Cheers Marco On Fri, Jul 15, 2011 at 2:15 PM, Sean Owen wrote: > Cardinality should be set to whatever the logical dimension of the > vector is -- it shouldn't be arbitrary. It's not like an "initial > size" of a list. If