Re: Mahout performance issues

Ted Dunning Mon, 05 Dec 2011 00:28:14 -0800

The down-sampling should take a target size, so that users with that many
ratings or fewer are not down-sampled at all.

This is easy to do using reservoir sampling or anything similar.  You can
also just keep the first or most recent ratings.  Or you can use a sampler
biased toward either of those extremes.
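
For illustration only, here is a minimal Python sketch of the options above.
It is not Mahout code: the function names, the `target_size` parameter, and
the assumption that a user's ratings arrive as a plain sequence ordered
oldest-first are all hypothetical. In each variant a user with `target_size`
or fewer ratings comes back unchanged.

```python
import random
from collections import deque
from itertools import islice


def downsample_ratings(ratings, target_size, rng=None):
    """Uniform reservoir sampling (Algorithm R) over one user's ratings.

    Returns every rating if there are target_size or fewer; otherwise
    returns target_size ratings, each kept with equal probability.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, rating in enumerate(ratings):
        if i < target_size:
            # Fill the reservoir with the first target_size ratings.
            reservoir.append(rating)
        else:
            # Replace a random slot with probability target_size / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < target_size:
                reservoir[j] = rating
    return reservoir


def keep_first(ratings, target_size):
    """Keep only the earliest target_size ratings."""
    return list(islice(ratings, target_size))


def keep_most_recent(ratings, target_size):
    """Keep only the latest target_size ratings (input is oldest-first)."""
    return list(deque(ratings, maxlen=target_size))
```

The reservoir version needs only one pass and does not have to know the
number of ratings in advance, which is why it suits a streaming job. A sampler
biased toward old or recent ratings would need a weighted variant, which is
not shown here.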

On Mon, Dec 5, 2011 at 12:20 AM, Daniel Zohar <[email protected]> wrote:

> Another issue to take into account is not to down-sample too much, so
> that users with 1-2 preferences still get decent results.
>
