These users shouldn't cause problems, though. They don't add anything to a set of candidates, and taking them away means you can't recommend anything to them. I doubt this is quite the issue.
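To make the point about single-item users concrete, here is a minimal sketch in plain Java. It does not use Mahout; the class and method names are hypothetical, and the toy data is made up for illustration. It assumes candidates come from item co-occurrence, and shows that a user with a single item contributes no co-occurring pairs, so dropping such users leaves the co-occurrence sets unchanged.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Mahout: a toy boolean-preference model.
final class SingleItemUserSketch {

    // For every item, collect the other items that share at least one user with it.
    static Map<Long, Set<Long>> coOccurringItems(Map<Long, Set<Long>> itemsByUser) {
        Map<Long, Set<Long>> result = new HashMap<>();
        for (Set<Long> items : itemsByUser.values()) {
            for (Long a : items) {
                for (Long b : items) {
                    if (!a.equals(b)) {
                        // An entry is only created when a real pair exists.
                        result.computeIfAbsent(a, k -> new HashSet<>()).add(b);
                    }
                }
            }
        }
        return result;
    }

    // Keep only users who interacted with more than one item.
    static Map<Long, Set<Long>> withoutSingleItemUsers(Map<Long, Set<Long>> itemsByUser) {
        Map<Long, Set<Long>> kept = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : itemsByUser.entrySet()) {
            if (e.getValue().size() > 1) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> itemsByUser = new HashMap<>();
        itemsByUser.put(1L, Set.of(10L, 20L));
        itemsByUser.put(2L, Set.of(20L, 30L));
        itemsByUser.put(3L, Set.of(10L)); // single-item user

        Map<Long, Set<Long>> full = coOccurringItems(itemsByUser);
        Map<Long, Set<Long>> pruned = coOccurringItems(withoutSingleItemUsers(itemsByUser));

        // Co-occurrence sets are identical with and without user 3.
        System.out.println(full.equals(pruned));
        // But user 3 is gone from the pruned model, so nothing can be recommended to them.
        System.out.println(withoutSingleItemUsers(itemsByUser).containsKey(3L));
    }
}
```

The sketch only covers co-occurrence; it says nothing about how the per-item user counts that feed the similarity computation change when those users are removed.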
(That item with 400K interactions might be just fine to remove!) You are certainly bottlenecked on item-item similarity, judging from your graph -- intersectionSize() is the heart of the log-likelihood computation.

I still do not understand why your proposed change does not solve the problem! You can turn the candidate set size down as low as you want. At a "reasonable" size, quality will still be OK. I'm missing something here.

On Thu, Dec 1, 2011 at 10:35 PM, Daniel Zohar <[email protected]> wrote:
> Sebastian, as I wrote before, it's the other way around. ~8.5M users had
> only chosen a single item. The item with the most interactions is about
> 400k.
> This is why I'm looking now into improving GenericBooleanPrefDataModel to
> not take into account users which made one interaction under the
> 'preferenceForItems' Map. What do you think about this approach?
