Hi there,

my name is Sebastian. I'm a student currently writing my diploma thesis, which compares several recommendation algorithms for a large German e-commerce site.
The algorithms I evaluate include item-based collaborative filtering, which makes me a Taste and Mahout user. One major task in my recent work on item-based collaborative filtering was precomputing the item similarities with Map/Reduce. I decided to use a slightly modified version of the algorithm in [1] to compute the pairwise cosine similarity between all the item vectors.

I'd be happy to donate this code to Mahout if you find it useful for your project. Just let me know and I'll provide a patch!

Regards,
Sebastian

[1] Elsayed et al.: Pairwise Document Similarity in Large Collections with MapReduce, http://www.umiacs.umd.edu/~jimmylin/publications/Elsayed_etal_ACL2008_short.pdf
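
For illustration, the core idea of [1] applied to item vectors can be sketched roughly as follows. This is a minimal single-process sketch in Python, not the Mahout/Hadoop code offered above and not the modified variant mentioned there. The function names (map_user, pairwise_cosine) and the input layout (a dict of user -> {item: rating}) are assumptions made for the example. Each user plays the role of a "term" in [1]: the map step emits one partial dot product per pair of items the user rated, and the reduce step sums these per item pair before normalizing by the vector lengths.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def map_user(ratings):
    """Map step (hypothetical helper): for one user's {item: rating} dict,
    emit ((item_a, item_b), rating_a * rating_b) for every item pair."""
    # Sorting makes the pair key canonical, i.e. always (smaller_id, larger_id).
    for (a, ra), (b, rb) in combinations(sorted(ratings.items()), 2):
        yield (a, b), ra * rb


def pairwise_cosine(user_ratings):
    """Return {(item_a, item_b): cosine similarity} for all co-rated item pairs.

    user_ratings is assumed to be {user: {item: rating}}.
    """
    squared_norms = defaultdict(float)  # item -> sum of squared ratings
    dots = defaultdict(float)           # (item_a, item_b) -> dot product

    for ratings in user_ratings.values():
        for item, r in ratings.items():
            squared_norms[item] += r * r
        # Reduce step: sum the partial products emitted for each item pair.
        for pair, partial in map_user(ratings):
            dots[pair] += partial

    return {
        (a, b): dot / (sqrt(squared_norms[a]) * sqrt(squared_norms[b]))
        for (a, b), dot in dots.items()
        # Skip items whose vector is all zeros to avoid dividing by zero.
        if squared_norms[a] > 0 and squared_norms[b] > 0
    }
```

Pairs of items that no user has rated together never appear in the output, which is what keeps the number of emitted pairs manageable on sparse data. In an actual Map/Reduce job the two loops would be separate mapper and reducer classes, and the norms would have to be computed in an extra pass or shipped alongside the partial products.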