Nice point about before-time/after-time training & prediction sets!

On Fri, Dec 16, 2011 at 12:52 AM, Anatoliy Kats (Commented) (JIRA)
<j...@apache.org> wrote:
>
>     [ https://issues.apache.org/jira/browse/MAHOUT-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170836#comment-13170836 ]
>
> Anatoliy Kats commented on MAHOUT-906:
> --------------------------------------
>
> Not yet, just a refactoring so far.  Still working on it.
>
>> Allow collaborative filtering evaluators to use custom logic in
>> splitting data set
>> ---------------------------------------------------------------
>>
>>                 Key: MAHOUT-906
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-906
>>             Project: Mahout
>>          Issue Type: Improvement
>>          Components: Collaborative Filtering
>>    Affects Versions: 0.5
>>            Reporter: Anatoliy Kats
>>            Priority: Minor
>>              Labels: features
>>         Attachments: MAHOUT-906.patch, MAHOUT-906.patch,
>>                      MAHOUT-906.patch, MAHOUT-906.patch
>>
>>   Original Estimate: 48h
>>  Remaining Estimate: 48h
>>
>> I want to start a discussion about factoring out the logic used in
>> splitting the data set into training and testing.  Here is how things
>> stand: there are two independent evaluator-based classes.
>> AbstractDifferenceRecommenderEvaluator splits all the preferences
>> randomly into a training and a testing set.
>> GenericRecommenderIRStatsEvaluator takes one user at a time, removes
>> their top AT preferences, and counts how many of them the system
>> recommends back.
>>
>> I have two use cases that both deal with temporal dynamics.  In one
>> case, there may be expired items that can be used for building a
>> training model, but not a test model.  In the other, I may want to
>> simulate the behavior of a real system by building a preference matrix
>> on days 1 to k, and testing on the ratings the user generated on day
>> k+1.  In this case, it's not items but preferences ((user, item,
>> rating) triplets) which may belong only to the training set.  Before
>> we discuss an appropriate design, are there any other use cases we
>> need to keep in mind?
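
The two temporal use cases in the quoted issue (train on days 1 to k and test on day k+1; keep expired items out of the test set) could be sketched roughly as below. This is plain Java with no Mahout dependency. The names TemporalSplitSketch, Pref, Split, splitByDay and dropExpired are hypothetical, chosen only for illustration, and are not part of Mahout's evaluator API or of the attached patches. It also assumes each preference carries an integer day index, which the issue does not specify.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: none of these types exist in Mahout.
final class TemporalSplitSketch {

    /** A (user, item, rating) triplet plus the day it was recorded (assumed field). */
    static final class Pref {
        final long userId;
        final long itemId;
        final float rating;
        final int day;

        Pref(long userId, long itemId, float rating, int day) {
            this.userId = userId;
            this.itemId = itemId;
            this.rating = rating;
            this.day = day;
        }
    }

    /** The two halves produced by a split. */
    static final class Split {
        final List<Pref> training = new ArrayList<>();
        final List<Pref> test = new ArrayList<>();
    }

    /**
     * Preferences from days up to and including k go to training,
     * preferences from day k+1 go to test, anything later is ignored.
     */
    static Split splitByDay(List<Pref> prefs, int k) {
        Split split = new Split();
        for (Pref p : prefs) {
            if (p.day <= k) {
                split.training.add(p);
            } else if (p.day == k + 1) {
                split.test.add(p);
            }
        }
        return split;
    }

    /** Returns a copy of the given test preferences without expired items. */
    static List<Pref> dropExpired(List<Pref> test, Set<Long> expiredItemIds) {
        List<Pref> kept = new ArrayList<>();
        for (Pref p : test) {
            if (!expiredItemIds.contains(p.itemId)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Pref> prefs = new ArrayList<>();
        prefs.add(new Pref(1L, 10L, 4.0f, 1));
        prefs.add(new Pref(1L, 11L, 3.0f, 3));
        prefs.add(new Pref(2L, 10L, 5.0f, 4));
        prefs.add(new Pref(2L, 12L, 2.0f, 5));

        Split split = splitByDay(prefs, 3);
        Set<Long> expired = new HashSet<>();
        expired.add(10L);

        System.out.println("training: " + split.training.size());
        System.out.println("test: " + split.test.size());
        System.out.println("test without expired items: "
                + dropExpired(split.test, expired).size());
    }
}
```

Keeping the split in its own method, separate from any evaluator, is the point of the sketch: a random split, a day-based split, or an expired-item filter could then be swapped in without touching the evaluation loop.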
--
Lance Norskog
goks...@gmail.com