By binary, do you mean you just have a "yes, the user likes, has seen, or has bought the book" signal versus no relation at all? Yes, that is a special case of what CF normally works on, where you have some degree of preference rather than a plain yes/no. In that sense it is supported, and in theory it should be a lot faster. In practice it's only easy to gain a little speedup, since really re-orienting the algorithms to take advantage of this case would take a lot of change.
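
To make the binary case concrete, here is a rough sketch of how I picture it. This is not Mahout code: the class and method names are made up for illustration, and I'm assuming item IDs are longs. Each user is just a set of item IDs, and similarity between two users is the Tanimoto coefficient, i.e. the size of the intersection divided by the size of the union, so there is no rating arithmetic at all.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout.
public class TanimotoSketch {

    /**
     * Tanimoto coefficient of two users' item sets:
     * (items in both) / (items in either).
     */
    public static double tanimoto(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            // Assumption: with no data on either side, call it "no similarity".
            return 0.0;
        }
        // Iterate over the smaller set and probe the larger one.
        Set<Long> smaller = a.size() <= b.size() ? a : b;
        Set<Long> larger = a.size() <= b.size() ? b : a;
        int intersection = 0;
        for (Long item : smaller) {
            if (larger.contains(item)) {
                intersection++;
            }
        }
        // |A union B| = |A| + |B| - |A intersect B|
        int union = a.size() + b.size() - intersection;
        return (double) intersection / union;
    }

    public static void main(String[] args) {
        // Binary preferences: the book IDs each user has bought.
        Set<Long> alice = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> bob = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        // 2 shared books out of 4 distinct books.
        System.out.println(tanimoto(alice, bob)); // prints 0.5
    }
}
```

The same function works for item-item similarity if you flip the data around, so each item is the set of user IDs that have it.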

Thinking it through... I am not sure slope one would work in this case. It operates on relative differences in ratings across items, and if all your ratings are "1.0" whenever they exist at all, every difference is zero and it falls apart. So perhaps the other algorithms are a better place to start after all.

The binary case does allow you to use fast similarity metrics like the Tanimoto measure, and if you have a fast similarity metric you generally have a fast algorithm, since most algorithms rely heavily on computing similarity metrics.

Do you have relatively many users, or many items? If you have relatively few items, an item-based recommender is ideal; if you have relatively few users, a user-based recommender is the better fit.

How is that sounding? What form is your data in? I could send over rough draft code to try out.

On Mon, Jan 19, 2009 at 12:37 PM, Goel, Ankur <[email protected]> wrote:
> Yep! I actually want to recommend items of interest, where item depends
> on the context say for an online bookshop it is books. Few question
> regarding slope one.
> 1. Can I be applied to a binary data setting like mine?
> 2. Do we have an implementation for it in Mahout?
> 3. Will it scale well?
