Re: Mahout performance issues

Sean Owen Thu, 01 Dec 2011 14:46:52 -0800

These users shouldn't cause problems, though. They don't add anything to the
set of candidates. Taking them away means you can't recommend anything to them.
I doubt this is quite the issue.

(That item with 400K interactions might be just fine to remove!)

You are certainly bottlenecked on item-item similarity, judging from your
graph -- intersectionSize() is the heart of the log-likelihood computation.
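
To make the cost concrete, here is a minimal, self-contained sketch of how a
log-likelihood ratio between two items can be derived from the size of the
intersection of their user sets. It is not Mahout's code: the class name
LlrSketch, the method itemItemLlr and the use of java.util.Set are assumptions
for illustration only. It uses the standard entropy form of the
log-likelihood ratio on a 2x2 contingency table.

```java
import java.util.Set;

// Hypothetical illustration, not Mahout's implementation.
final class LlrSketch {
    private LlrSketch() {}

    // Counts user IDs present in both sets, iterating over the smaller one.
    static int intersectionSize(Set<Long> a, Set<Long> b) {
        Set<Long> small = a.size() <= b.size() ? a : b;
        Set<Long> large = small == a ? b : a;
        int count = 0;
        for (Long id : small) {
            if (large.contains(id)) {
                count++;
            }
        }
        return count;
    }

    private static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy term: xLogX(sum) minus the sum of xLogX(count).
    private static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // k11: users with both items, k12: only A, k21: only B, k22: neither.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double row = entropy(k11 + k12, k21 + k22);
        double col = entropy(k11 + k21, k12 + k22);
        double mat = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (row + col - mat));
    }

    // Builds the 2x2 table from the two user sets and the total user count.
    static double itemItemLlr(Set<Long> usersOfA, Set<Long> usersOfB, long numUsers) {
        long both = intersectionSize(usersOfA, usersOfB);
        long onlyA = usersOfA.size() - both;
        long onlyB = usersOfB.size() - both;
        long neither = numUsers - usersOfA.size() - usersOfB.size() + both;
        return logLikelihoodRatio(both, onlyA, onlyB, neither);
    }
}
```

Every item pair that gets scored pays for one intersection, so the work grows
with both the number of candidate pairs and the size of the user sets involved.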

I still do not understand why your proposed change does not solve the
problem! You can turn down the candidate set size as low as you want. At a
"reasonable" size quality will still be OK. I'm missing something here.
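
As a rough sketch of what capping the candidate set means, the following
self-contained example stops collecting candidate items once a limit is
reached. It is only an illustration under assumptions: the class
CandidateCapSketch, the two lookup maps and the maxCandidates parameter are
hypothetical names, not Mahout's API, and simple early termination stands in
for whatever sampling the real change uses.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not Mahout's implementation.
final class CandidateCapSketch {
    private CandidateCapSketch() {}

    // Collects items co-occurring with the user's items, up to maxCandidates.
    static Set<Long> candidates(Set<Long> userItems,
                                Map<Long, List<Long>> usersByItem,
                                Map<Long, List<Long>> itemsByUser,
                                int maxCandidates) {
        Set<Long> result = new LinkedHashSet<>();
        if (maxCandidates <= 0) {
            return result;
        }
        for (Long item : userItems) {
            for (Long otherUser : usersByItem.getOrDefault(item, List.of())) {
                for (Long candidate : itemsByUser.getOrDefault(otherUser, List.of())) {
                    // Skip items the user already has; stop once the cap is hit.
                    if (!userItems.contains(candidate)
                            && result.add(candidate)
                            && result.size() >= maxCandidates) {
                        return result;
                    }
                }
            }
        }
        return result;
    }
}
```

In this sketch a user who has only the one item already in userItems
contributes no candidates at all, and the number of similarity computations is
bounded by maxCandidates regardless of how popular an item is.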

On Thu, Dec 1, 2011 at 10:35 PM, Daniel Zohar <[email protected]> wrote:

> Sebastian, as I wrote before, it's the other way around. ~8.5M users had
> only chosen a single item. The item with the most interactions is about
> 400k.
> This is why I'm looking now into improving GenericBooleanPrefDataModel to
> not take into account users which made one interaction under the
> 'preferenceForItems' Map. What do you think about this approach?
>
>
