I understand the idea, but this boils down to the current implementation, plus going back and throwing out some additional lower-rated training data, which then ends up in neither the test nor the training set. Anything's possible, but I don't imagine this is a helpful practice in general.
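
To make the comparison concrete, here is a rough sketch of the two splits under discussion. It is written in Python for brevity and is not the actual Mahout code; the function names, the `>=` threshold comparison, the `holdout_fraction` parameter and the toy ratings are all assumptions made for illustration.

```python
import random


def split_top_rated(ratings, threshold):
    """Hold out only the items rated at or above the threshold (assumed rule)."""
    test = {item: r for item, r in ratings.items() if r >= threshold}
    train = {item: r for item, r in ratings.items() if r < threshold}
    return train, test


def split_random_then_filter(ratings, threshold, holdout_fraction, rng):
    """Hold out a random subset, then drop the low-rated held-out items."""
    items = sorted(ratings)
    k = int(len(items) * holdout_fraction)
    held_out = set(rng.sample(items, k))
    train = {i: r for i, r in ratings.items() if i not in held_out}
    test = {i: ratings[i] for i in held_out if ratings[i] >= threshold}
    # Low-rated held-out items land in neither the test nor the training set.
    discarded = {i: ratings[i] for i in held_out if ratings[i] < threshold}
    return train, test, discarded


# Hypothetical ratings for a single test user.
ratings = {"a": 5, "b": 4, "c": 2, "d": 1, "e": 3}

train, test = split_top_rated(ratings, threshold=4)
# test == {"a": 5, "b": 4}; train == {"c": 2, "d": 1, "e": 3}

rng = random.Random(0)
train2, test2, discarded = split_random_then_filter(ratings, 4, 0.6, rng)
```

In the second split the `discarded` items are the extra lower-rated data that gets thrown away: they are used neither for training nor for scoring.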

On Sat, Feb 16, 2013 at 10:29 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:

> I'm suggesting the second one. In that way the test user's ratings in
> the training set will be composed of both low and high rated items, which
> prevents the problem pointed out by Ahmet.
>
> On Sat, Feb 16, 2013 at 11:19 PM, Sean Owen <sro...@gmail.com> wrote:
> > If you're suggesting that you hold out only high-rated items, and then
> > sample them, then that's what is done already in the code, except without
> > the sampling. The sampling doesn't buy anything that I can see.
> >
> > If you're suggesting holding out a random subset and then throwing away the
> > held-out items with low rating, then it's also the same idea, except you're
> > randomly throwing away some lower-rated data from both test and train. I
> > don't see what that helps either.
> >
> > On Sat, Feb 16, 2013 at 9:41 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:
> >
> >> What I mean is you can choose ratings randomly and try to recommend
> >> the ones above the threshold