Hello, I am currently working on a small database. As I understand it, when I need the similarity between users, it basically means computing the similarity between all pairs of users.
Is that right, or is there a better way? If that is right, how can I expect the computation to be fast for 1 million rows? I don't see the difference between asking for a user's neighborhood and computing the similarity for all pairs of users. One idea I thought could be interesting: build some clusters of users, and only compute the similarity between users within the same cluster.

Thanks
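
P.S. To make the clustering idea concrete, here is a rough sketch in plain Python of what I have in mind. It is only an illustration and does not use Mahout's API. The cosine similarity, the fixed `centroids`, the toy rating vectors, and all function names are my own assumptions. The first function counts the pairs an all-pairs approach would need, and the rest compares users only inside their own cluster.

```python
import math
from collections import defaultdict
from itertools import combinations


def all_pairs_count(n):
    """Number of unordered user pairs when every user is compared to every other."""
    return n * (n - 1) // 2


def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_clusters(users, centroids):
    """Put each user in the cluster whose centroid is most similar (assumed scheme)."""
    clusters = defaultdict(list)
    for user_id, vector in users.items():
        best = max(range(len(centroids)), key=lambda c: cosine(vector, centroids[c]))
        clusters[best].append(user_id)
    return clusters


def within_cluster_similarities(users, clusters):
    """Compute similarities only for pairs of users that share a cluster."""
    sims = {}
    for members in clusters.values():
        for u, v in combinations(sorted(members), 2):
            sims[(u, v)] = cosine(users[u], users[v])
    return sims


# Toy data: four hypothetical users rating four items.
users = {
    "a": [5, 4, 0, 0],
    "b": [4, 5, 0, 0],
    "c": [0, 0, 5, 4],
    "d": [0, 1, 4, 5],
}
# Hypothetical centroids; in practice they would come from a clustering step.
centroids = [[1, 1, 0, 0], [0, 0, 1, 1]]

clusters = assign_clusters(users, centroids)
sims = within_cluster_similarities(users, clusters)

print(all_pairs_count(1_000_000))  # pairs needed for 1 million users
print(len(sims), "pairs computed instead of", all_pairs_count(len(users)))
```

With k clusters of roughly equal size, the number of pairs drops by about a factor of k. The price is that two similar users who land in different clusters are never compared, so I am not sure how much the quality of the neighborhood would suffer.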
