[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836733#action_12836733 ]
Sean Owen commented on MAHOUT-305:
----------------------------------

Say I've made the following ratings:

5 stars: Harry Potter
5 stars: Harry Potter 2
1 star: Maid in Manhattan

Say I remove Maid in Manhattan as test data. I run recommendations and it recommends Harry Potter 3 to me (which presumably I would rate highly). The implementation would be penalized for not returning Maid in Manhattan, when that's surely not what it should have returned.

Even if you take out only the most highly rated movies as test data (this is what the existing CF precision/recall evaluator does), this phenomenon can still occur: the recommender could return a movie that's better than anything you've seen so far, yet this evaluation style would count it as 'bad'. It's still not a fair test, but it's less unfair.

Yes, you could take the 20% most highly rated movies from each user as test data if you like, not just the 5-star ones.

Say I ask for 10 recommendations. Precision @ 10 is the proportion of those 10 that were in the user's history (top ratings). Recall @ 10 is the proportion of all top-rated items that appeared in those 10. I think this is a little different from what you're saying?

> Combine both cooccurrence-based CF M/R jobs
> -------------------------------------------
>
>                 Key: MAHOUT-305
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-305
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Sean Owen
>            Assignee: Ankur
>            Priority: Minor
>
> We have two different but essentially identical MapReduce jobs to make
> recommendations based on item co-occurrence:
> org.apache.mahout.cf.taste.hadoop.{item,cooccurrence}. They ought to be
> merged. Not sure exactly how to approach that but noting this in JIRA, per
> Ankur.
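
For reference, the Precision @ 10 and Recall @ 10 definitions given in the comment can be sketched in a few lines. This is a standalone illustration, not Mahout's evaluator: the function name precision_recall_at_n, the "Other Movie" placeholder titles, and the choice to divide by the number of items actually returned are all assumptions made for the example.

```python
def precision_recall_at_n(recommended, relevant, n=10):
    """Return (precision@n, recall@n) for a single user.

    recommended: ranked list of item IDs returned by the recommender
    relevant:    set of held-out top-rated item IDs for that user
    """
    top_n = recommended[:n]
    if not top_n or not relevant:
        return 0.0, 0.0
    # A "hit" is a recommended item that was among the held-out top ratings.
    hits = sum(1 for item in top_n if item in relevant)
    # Assumption: precision is measured over the items actually returned.
    return hits / len(top_n), hits / len(relevant)


# Hypothetical data: both 5-star movies are held out as test data.
held_out = {"Harry Potter", "Harry Potter 2"}
recommended = ["Harry Potter 3", "Harry Potter"] + [
    f"Other Movie {i}" for i in range(1, 9)
]

precision, recall = precision_recall_at_n(recommended, held_out, n=10)
print(precision, recall)  # 0.1 0.5
```

Note that Harry Potter 3 counts as a miss here even though the user would probably rate it highly, which is the unfairness described in the comment.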