Hello,

I am currently working on a small database. As I understand it, getting the
similarity between users basically means computing it for all pairs of
users.

Is that right, or is there a better way?
If that is how it works, how can I expect it to run quickly on 1 million rows?
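
To put a number on it, here is a rough count of what "all pairs" means,
assuming the similarity is symmetric so each unordered pair is computed once.
The name `pair_count` and the rate of one million similarities per second are
made up for illustration, not measured.

```python
def pair_count(n_users: int) -> int:
    """Number of unordered user pairs: n * (n - 1) / 2."""
    return n_users * (n_users - 1) // 2


n = 1_000_000
pairs = pair_count(n)

# Hypothetical throughput, chosen only to get an order of magnitude.
assumed_pairs_per_second = 1_000_000
days = pairs / assumed_pairs_per_second / 86_400

print(pairs)           # 499999500000
print(round(days, 1))  # 5.8
```

So at that assumed rate, a single pass over all pairs would take several days.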

I don't see the difference between asking for a user's neighborhood and
computing the similarity for all pairs of users.
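
To show what I mean, here is a small sketch of a naive neighborhood for one
user, assuming cosine similarity over rating vectors held in a plain dict.
The names `cosine` and `nearest_neighbors` and the toy ratings are
hypothetical; this is not Mahout's API. Done this way, one neighborhood costs
n - 1 similarity computations, so asking for every user's neighborhood comes
back to all pairs.

```python
import heapq
import math


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(user: str, ratings: dict, k: int) -> list:
    """Return the k users most similar to `user`, scanning every other user."""
    scored = (
        (cosine(ratings[user], ratings[other]), other)
        for other in ratings
        if other != user
    )
    return [other for _, other in heapq.nlargest(k, scored)]


# Toy data, invented for the example.
ratings = {
    "alice": {"item1": 5.0, "item2": 3.0},
    "bob": {"item1": 4.0, "item2": 3.0},
    "carol": {"item3": 5.0},
}
print(nearest_neighbors("alice", ratings, 1))  # ['bob']
```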

I ask because I had an idea that might be interesting:
group the users into clusters, and only compute the similarity between users
that are in the same cluster.
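
A minimal sketch of that idea, assuming each user has already been assigned a
cluster id somehow (how to build the clusters is part of my question). The
name `within_cluster_pairs` and the sample assignment are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def within_cluster_pairs(cluster_of: dict):
    """Yield each unordered pair of users that share a cluster."""
    members = defaultdict(list)
    for user, cluster in cluster_of.items():
        members[cluster].append(user)
    for users in members.values():
        yield from combinations(sorted(users), 2)


# Hypothetical assignment of five users to two clusters.
cluster_of = {"alice": 0, "bob": 0, "carol": 1, "dave": 1, "erin": 1}
cluster_pairs = list(within_cluster_pairs(cluster_of))
print(len(cluster_pairs))  # 4, versus 10 for all pairs of 5 users
```

With 1 million users split evenly into 1,000 clusters, that would be
1,000 * (1,000 * 999 / 2) = 499,500,000 pairs instead of about 5 * 10^11,
though I suppose similar users who land in different clusters would be missed.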

Thanks
-- 
View this message in context: 
http://www.nabble.com/Compute-similarities-for-an-hudge-quantity-of-data-tp24364711p24364711.html
Sent from the Mahout User List mailing list archive at Nabble.com.
