[ https://issues.apache.org/jira/browse/MAHOUT-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169279#comment-13169279 ]
Anatoliy Kats commented on MAHOUT-906:
--------------------------------------

The IR tests make recommendations for one user at a time, true, but they build a model from all other users to make a recommendation for that one. So, as we try to recover each preference P, we build a model from all users and *all preferences expressed earlier than time(P)*. You're right that sorting is not necessary, because preferences are usually assumed to stay constant within some time period, say, a day. Is there an existing TopN class you are referring to, or should I write my own?

I am thinking I need to write a brand-new evaluator and make the existing GenericRecommenderIRStatsEvaluator its subclass, rather than the other way around. The reason is that the outer loop of a temporal evaluator is over the time range of preferences, and only then over the users, as in GenericRecommenderIRStatsEvaluator. It's natural to see the generic evaluator as a special case of the temporal one, with a single pass over the outer loop. What do you think?

So, I'd write a loop like this:

    for i in 1...N:
        let the training data be the bottom (i/N * 100)% of preferences by time
        let the testing data be the (i/N * 100)% to ((i+1)/N * 100)% slice
        (alternatively, split by time period: days 1...i are training, day i+1 is testing)
        generate the same number of preferences for each user as in the testing data
        compute IR statistics on the intersection of actual and predicted preferences

How does that sound?
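The loop above can be sketched in plain Java. This is only an illustration of the proposed time-based fold split, assuming a minimal Preference record (user, item, rating, timestamp); the class and method names are hypothetical, not Mahout's API:

```java
import java.util.*;
import java.util.stream.*;

// Minimal sketch of the proposed temporal split. The Preference class and
// split() method are hypothetical stand-ins, not part of Mahout.
public class TemporalSplitSketch {
    static final class Preference {
        final long user, item, timestamp;
        final float rating;
        Preference(long user, long item, float rating, long timestamp) {
            this.user = user; this.item = item;
            this.rating = rating; this.timestamp = timestamp;
        }
    }

    /**
     * For fold i of N, the oldest i/N of preferences (by timestamp) become
     * training data and the next 1/N slice becomes testing data, mirroring
     * the pseudocode loop in the comment.
     */
    static List<List<Preference>> split(List<Preference> prefs, int i, int n) {
        List<Preference> byTime = prefs.stream()
            .sorted(Comparator.comparingLong(p -> p.timestamp))
            .collect(Collectors.toList());
        int trainEnd = byTime.size() * i / n;
        int testEnd = byTime.size() * (i + 1) / n;
        return Arrays.asList(byTime.subList(0, trainEnd),
                             byTime.subList(trainEnd, testEnd));
    }

    public static void main(String[] args) {
        // Ten preferences with timestamps 0..9 for a single user.
        List<Preference> prefs = new ArrayList<>();
        for (int t = 0; t < 10; t++) {
            prefs.add(new Preference(1, t, 5.0f, t));
        }
        // Fold 3 of 5: the oldest 60% train, the next 20% test.
        List<List<Preference>> fold = split(prefs, 3, 5);
        System.out.println(fold.get(0).size() + " train, "
                           + fold.get(1).size() + " test"); // 6 train, 2 test
    }
}
```

A real evaluator would then build the recommender model on the training slice and score the testing slice, once per fold of the outer loop.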
> Allow collaborative filtering evaluators to use custom logic in splitting data set
> ----------------------------------------------------------------------------------
>
>                  Key: MAHOUT-906
>                  URL: https://issues.apache.org/jira/browse/MAHOUT-906
>              Project: Mahout
>           Issue Type: Improvement
>           Components: Collaborative Filtering
>     Affects Versions: 0.5
>             Reporter: Anatoliy Kats
>             Priority: Minor
>               Labels: features
>    Original Estimate: 48h
>   Remaining Estimate: 48h
>
> I want to start a discussion about factoring out the logic used to split the data set into training and testing sets. Here is how things stand: there are two independent evaluator base classes. AbstractDifferenceRecommenderEvaluator splits all the preferences randomly into a training and a testing set. GenericRecommenderIRStatsEvaluator takes one user at a time, removes their top AT preferences, and counts how many of them the system recommends back.
>
> I have two use cases, both dealing with temporal dynamics. In one, there may be expired items that can be used for building a training model but not a test model. In the other, I may want to simulate the behavior of a real system by building a preference matrix on days 1-k and testing on the ratings the user generated on day k+1. In this case it's not items but preferences (user, item, rating triplets) that may belong only to the training set. Before we discuss an appropriate design, are there any other use cases we need to keep in mind?
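The custom split logic the issue asks for could be factored into a small strategy interface that both use cases implement. A hedged sketch, assuming hypothetical names (this is not Mahout's actual API, just one possible shape for the design discussion):

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical strategy interface for deciding which preferences may enter
// the training set; all names are illustrative, not Mahout API.
public class SplitPolicySketch {
    interface PreferenceSplitter {
        /** Returns true if a preference at this timestamp is training data. */
        boolean isTraining(long timestamp);
    }

    /** The "days 1..k train, day k+1 test" use case (timestamps in days, 1-based). */
    static PreferenceSplitter firstKDays(long k) {
        return timestamp -> timestamp <= k;
    }

    public static void main(String[] args) {
        PreferenceSplitter splitter = firstKDays(3);
        long[] days = {1, 2, 3, 4};
        String labels = Arrays.stream(days)
            .mapToObj(d -> splitter.isTraining(d) ? "train" : "test")
            .collect(Collectors.joining(","));
        System.out.println(labels); // train,train,train,test
    }
}
```

The expired-items use case would be another implementation of the same interface, keyed on item rather than timestamp; the evaluators would then take a splitter instead of hard-coding their split.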