Re: RE: RE: [jira] Commented: (MAHOUT-19) Hierarchial clusterer

Sean Owen Mon, 19 Jan 2009 05:32:14 -0800

Binary meaning you just have a "yes, the user likes or has seen or
bought the book" versus no relation at all? Yeah, that should be a
special case of what CF normally works on, where you have some degree
of preference rather than yes/no. In that sense yes, it is supported,
and in theory it should be a lot faster. In practice it's only easy to
gain a little speedup, since really re-orienting the algorithms to
take advantage of this case would take a lot of change.

Thinking it through... I am not sure slope one would work in this
case. It operates on relative differences in ratings across items, and
if every rating that exists is "1.0", then all of those differences
are zero and it falls apart.
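
To make that concrete, here is a minimal sketch of basic (unweighted)
slope one run on all-1.0 data. It is not Mahout code; the function
names and the toy ratings are hypothetical, chosen only for
illustration.

```python
from collections import defaultdict


def slope_one_deviations(ratings):
    """Average rating difference (item_j - item_i) over users who rated both."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    sums[(j, i)] += r_j - r_i
                    counts[(j, i)] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def predict(ratings, deviations, user, target):
    """Unweighted slope one estimate of `user`'s rating for `target`."""
    estimates = [
        deviations[(target, item)] + rating
        for item, rating in ratings[user].items()
        if (target, item) in deviations
    ]
    return sum(estimates) / len(estimates) if estimates else None


# Hypothetical binary data: a rating is 1.0 whenever it exists at all.
ratings = {
    "u1": {"A": 1.0, "B": 1.0},
    "u2": {"B": 1.0, "C": 1.0},
    "u3": {"A": 1.0, "C": 1.0, "D": 1.0},
}

deviations = slope_one_deviations(ratings)
pred_c = predict(ratings, deviations, "u1", "C")
pred_d = predict(ratings, deviations, "u1", "D")
```

Every deviation comes out as zero, so every candidate item gets the
same predicted rating and there is nothing left to rank by.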

So perhaps the other algorithms are a better place to start after all.
The binary case does allow you to use fast similarity metrics like the
Tanimoto measure, and if you have a fast similarity metric you
generally have a fast algorithm since most algorithms rely heavily on
computing similarity metrics.
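
For reference, a minimal sketch of the Tanimoto measure on binary
data, where each user is just the set of items they have a relation
to. This is an illustration, not Mahout's implementation; the function
name and the sample users are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: |intersection| / |union| of two item sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        # Two empty sets: treat as no similarity rather than divide by zero.
        return 0.0
    return len(a & b) / union


# Hypothetical users, each represented by the books they bought.
alice = {"book1", "book2", "book3"}
bob = {"book2", "book3", "book4"}

similarity = tanimoto(alice, bob)  # 2 shared items out of 4 distinct
```

It needs only set intersection and union sizes, with no rating values
involved, which is why it is cheap in the binary case.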

Do you have relatively many users, or relatively many items? If you
have relatively few items, an item-based recommender is ideal; if you
have relatively few users, a user-based recommender is.

How does that sound? What form is your data in? I could send over
rough draft code to try out.

On Mon, Jan 19, 2009 at 12:37 PM, Goel, Ankur <[email protected]> wrote:
> Yep! I actually want to recommend items of interest, where the item
> depends on the context; say, for an online bookshop it is books. A few
> questions regarding slope one.
> 1. Can it be applied to a binary data setting like mine?
> 2. Do we have an implementation for it in Mahout?
> 3. Will it scale well?
