I have heard various arguments in favor of retaining the most recent interactions, taking a fair sample, or keeping the earliest interactions. These can even be combined with biased samples. I haven't seen much difference between these approaches. I think the lack of difference is largely because the sampling falls most heavily on the items we care least about in recommendations: the most popular items, which are obviously getting plenty of traffic anyway.
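To make the three options concrete, here is a rough sketch of capping the interactions kept per item. It assumes the input is a chronologically ordered stream of (user, item) pairs; the function name, the parameter names, and the default cap of 500 are made up for illustration and are not taken from any Mahout job. The "fair" branch uses standard reservoir sampling, so each interaction for an item has the same chance of being kept.

```python
import random
from collections import defaultdict


def cap_interactions(interactions, max_per_item=500, strategy="fair", seed=0):
    """Keep at most max_per_item users per item.

    interactions: iterable of (user, item) pairs, assumed to be in
    chronological order. strategy is one of "earliest", "recent", "fair".
    All names here are hypothetical, chosen only for this sketch.
    """
    if strategy not in ("earliest", "recent", "fair"):
        raise ValueError("unknown strategy: %r" % (strategy,))

    rng = random.Random(seed)
    kept = defaultdict(list)   # item -> users currently retained
    seen = defaultdict(int)    # item -> interactions observed so far

    for user, item in interactions:
        seen[item] += 1
        bucket = kept[item]
        if len(bucket) < max_per_item:
            # Below the cap, every strategy keeps everything.
            bucket.append(user)
        elif strategy == "recent":
            # Sliding window: drop the oldest, keep the newest.
            bucket.pop(0)
            bucket.append(user)
        elif strategy == "fair":
            # Reservoir sampling: the n-th interaction replaces a random
            # slot with probability max_per_item / n.
            j = rng.randrange(seen[item])
            if j < max_per_item:
                bucket[j] = user
        # strategy == "earliest": ignore anything past the cap.

    return dict(kept)
```

Only items with more interactions than the cap are affected at all, so rare items pass through unchanged under every strategy, which fits the observation that the choice of strategy mostly matters for the popular items.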
Sent from my iPad

On Aug 13, 2011, at 2:31 AM, Sebastian Schelter <[email protected]> wrote:

> One thing I'm currently looking into is how to sample the input. Ted has
> stated that you usually only need to look at a few hundred or thousand
> ratings per item as you don't learn anything new from the rest. Would it be
> sufficient to randomly sample the ratings of an item then? That's what I'm
> currently doing but I wonder whether there are more clever ways to do this.
