Time splits are fine, but they may contain anomalies that bias the data. If you are going to compare two recommenders based on time splits, make sure the data is exactly the same for each recommender. One time split we did to create a 90/10 training-to-test set had a split date of 12/24! Some form of random hold-out will be less prone to time-based systematic variation like seasonality, holidays, day of week, and the like. Stay with the same data when comparing, and at least the tests will vary together.
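To make the trade-off concrete, here is a minimal sketch of the two splitting strategies being discussed; all names and the toy data are illustrative, not from any particular recommender library:

```python
import random

# Toy interactions: (user, item, rating, timestamp). Illustrative only.
data = [("u%d" % (i % 50), "m%d" % (i % 200), random.randint(1, 5), i)
        for i in range(1000)]

def time_split(interactions, train_frac=0.9):
    """Chronological split: train on the earliest train_frac, test on the rest.
    Emulates 'training data is all from the past'."""
    ordered = sorted(interactions, key=lambda r: r[3])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]

def random_split(interactions, train_frac=0.9, seed=42):
    """Random hold-out: shuffles away time-based effects like seasonality,
    but lets the test set 'see' the future of each user."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

train_t, test_t = time_split(data)
train_r, test_r = random_split(data)
```

Whichever split you pick, feed the *same* train/test pair to both recommenders being compared, so systematic variation (a 12/24 split date, day-of-week effects) hits both equally.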
We still use time based splits, partly for the reasons Ted mentions, but knowing the limitations is always good.

On Feb 16, 2013, at 3:12 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

There are a variety of common time based effects which make time splits best in many practical cases. Having the training data all be from the past emulates this better than random splits.

For one thing, you can have the same user under different names in training and test. For another thing, in real life you get data from the past of the user under consideration. As a third consideration, topical events can influence all users in common.

These all mean that random training splits can have very large error in estimated performance.

Sent from my iPhone

On Feb 16, 2013, at 1:41 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:

> What I mean is you can choose ratings randomly and try to recommend
> the ones above the threshold
>
> On Sat, Feb 16, 2013 at 10:32 PM, Sean Owen <sro...@gmail.com> wrote:
>
>> Sure, if you were predicting ratings for one movie given a set of ratings
>> for that movie and the ratings for many other movies. That isn't what the
>> recommender problem is. Here, the problem is to list N movies most likely
>> to be top-rated. The precision-recall test is, in turn, a test of top N
>> results, not a test over prediction accuracy. We aren't talking about RMSE
>> here or even any particular means of generating top N recommendations. You
>> don't even have to predict ratings to make a top N list.
>>
>> On Sat, Feb 16, 2013 at 9:28 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:
>>
>>> No, rating prediction is clearly a supervised ML problem
>>>
>>> On Sat, Feb 16, 2013 at 10:15 PM, Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> This is a good answer for evaluation of supervised ML, but this is
>>>> unsupervised. Choosing randomly is choosing the 'right answers' randomly,
>>>> and that's plainly problematic.
>>>> On Sat, Feb 16, 2013 at 8:53 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:
>>>>
>>>>> I think it is better to choose ratings of the test user in a random
>>>>> fashion.
>>>>>
>>>>> On Sat, Feb 16, 2013 at 9:37 PM, Sean Owen <sro...@gmail.com> wrote:
>>>>>
>>>>>> Yes. But: the test sample is small. Using 40% of your data to test is
>>>>>> probably quite too much.
>>>>>>
>>>>>> My point is that it may be the least-bad thing to do. What test are you
>>>>>> proposing instead, and why is it coherent with what you're testing?
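The precision-recall test Sean refers to can be sketched in a few lines; this is a generic precision@N over a top-N list, not the exact evaluator from any specific codebase, and the item IDs below are made up:

```python
def precision_at_n(recommended, relevant, n=10):
    """Fraction of the top-n recommended items that are in the relevant set.
    No rating prediction is involved, only the ranked list itself."""
    top = recommended[:n]
    hits = sum(1 for item in top if item in relevant)
    return hits / float(n)

# Illustrative: a recommender's top-5 list vs. held-out 'relevant' items.
recs = ["m1", "m7", "m3", "m9", "m2"]
liked = {"m3", "m2", "m8"}
score = precision_at_n(recs, liked, n=5)  # 2 of the 5 top items hit -> 0.4
```

This makes the point in the thread concrete: the test scores the ranked top-N list directly, so any method of producing that list (with or without predicted ratings) can be evaluated the same way.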