Re: Mahout performance issues

Ted Dunning Fri, 02 Dec 2011 09:57:34 -0800

On Fri, Dec 2, 2011 at 3:10 AM, Sean Owen <[email protected]> wrote:

> > I just ran the fix I proposed earlier and I got great results! The query
> > time was reduced to about a third for the 'heavy users'. Before it was 1-5
> > secs and now it's 0.5-1.5. The best part is that the accuracy level should
> > remain exactly the same. I also believe it should reduce memory
> > consumption, as the GenericBooleanPrefDataModel.preferenceForItems gets
> > significantly smaller (in my case at least).
> >
> > The fix is merely adding two lines of code to one of
> > the GenericBooleanPrefDataModel constructors. See
> > http://pastebin.com/K5PB68Et, the lines I added are #11, #22.
> >
>
> I don't think this works though, because you've deleted the one data point
> you have for those users. They can't get recommendations now.
>
> I can't figure out how that speeds up recommendations, though. What am I
> missing? These users aren't providing any more item-item interactions to
> consider.
>



Actually, if these users' single item is a fantastically popular item, then
all of those users will be roped into the computation (with no effect on the
results).

Sean's argument would be correct if the users were each interacting with
some item that is way out in the low frequency tail.  By Murphy, this won't
be the case.

Better to dump the uninformative items using a kill list.
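
A minimal sketch of what I mean by a kill list, in plain Java with no Mahout
classes. Everything named here (KillListSketch, buildKillList, applyKillList,
the maxUserFraction cutoff) is hypothetical and only for illustration; the
right cutoff depends on your data. It assumes boolean preferences held as a
map from user ID to the set of item IDs, and it would run before the data
model is built.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Mahout.
final class KillListSketch {

    // Items seen by more than maxUserFraction of all users go on the kill list.
    // The fraction is an assumed tuning knob, not a recommended value.
    static Set<Long> buildKillList(Map<Long, Set<Long>> itemsByUser,
                                   double maxUserFraction) {
        // Count how many users interacted with each item.
        Map<Long, Integer> userCounts = new HashMap<>();
        for (Set<Long> items : itemsByUser.values()) {
            for (Long item : items) {
                userCounts.merge(item, 1, Integer::sum);
            }
        }
        double limit = maxUserFraction * itemsByUser.size();
        Set<Long> killList = new HashSet<>();
        for (Map.Entry<Long, Integer> e : userCounts.entrySet()) {
            if (e.getValue() > limit) {
                killList.add(e.getKey());
            }
        }
        return killList;
    }

    // Returns a filtered copy of the data with killed items removed.
    // A user whose only items were killed is left out of the copy here;
    // whether to keep such users is a separate decision.
    static Map<Long, Set<Long>> applyKillList(Map<Long, Set<Long>> itemsByUser,
                                              Set<Long> killList) {
        Map<Long, Set<Long>> filtered = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : itemsByUser.entrySet()) {
            Set<Long> kept = new HashSet<>(e.getValue());
            kept.removeAll(killList);
            if (!kept.isEmpty()) {
                filtered.put(e.getKey(), kept);
            }
        }
        return filtered;
    }
}
```

The point of filtering by item rather than by user is that the hugely popular
item is what drags everybody into the computation, so that is the thing to
remove.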
