[ 
https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836733#action_12836733
 ] 

Sean Owen commented on MAHOUT-305:
----------------------------------

Say I've made the following ratings:

5 stars: Harry Potter
5 stars: Harry Potter 2
1 star: Maid in Manhattan

Say I remove Maid in Manhattan as test data. I run recommendations and it 
recommends to me Harry Potter 3 (which presumably I would rate highly). The 
implementation would be penalized for not returning Maid in Manhattan, when 
that's surely not what it should have returned.

Even if you take out only the most highly-rated movies as test data (this is 
what the existing CF precision/recall evaluator does), this phenomenon can still 
occur: the recommender could return a movie that's better than anything you've 
yet seen but that would be considered 'bad' by this evaluation style. It's 
still not a fair test, but it's less unfair.

Yes, you could take the 20% most-highly-rated movies from each user as test data 
if you like, not just the 5-star ones.
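
To make that split concrete, here is a minimal sketch in Python. It is not the Mahout evaluator's code; the function name `split_top_rated`, the `fraction` parameter, and the "Movie A" / "Movie B" titles are made up for illustration. It assumes ties are broken by insertion order and that at least one item is always held out.

```python
import math

def split_top_rated(ratings, fraction=0.2):
    """Split one user's {item: rating} dict into (training, test).

    The test set holds the top `fraction` of items by rating
    (assumption: at least one item whenever the user has any ratings).
    """
    # Highest ratings first; sorted() is stable, so ties keep insertion order.
    ranked = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
    n_test = max(1, math.ceil(len(ranked) * fraction)) if ranked else 0
    test = dict(ranked[:n_test])
    training = dict(ranked[n_test:])
    return training, test

ratings = {
    "Harry Potter": 5,
    "Harry Potter 2": 5,
    "Maid in Manhattan": 1,
    "Movie A": 3,  # hypothetical extra rating
    "Movie B": 4,  # hypothetical extra rating
}
training, test = split_top_rated(ratings)
```

With five ratings and a 20% fraction, one top-rated item is held out and the other four remain as training data.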

Say I ask for 10 recommendations. Precision @ 10 is the proportion of those 10 
that were in the user's history (top ratings). Recall @ 10 is the proportion of 
all top-rated items that appeared in those 10. I think this is a little 
different from what you're saying?
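
The two definitions above can be sketched as follows, in Python. This is an illustration only, not the existing evaluator's implementation; the function name `precision_recall_at_n` is hypothetical, and it assumes precision is divided by the number of recommendations actually returned when that is fewer than n.

```python
def precision_recall_at_n(recommended, relevant, n=10):
    """Return (precision@n, recall@n) for a single user.

    recommended: ranked list of recommended items
    relevant:    the user's held-out top-rated items
    """
    top_n = list(recommended)[:n]
    relevant = set(relevant)
    hits = sum(1 for item in top_n if item in relevant)
    # Assumption: divide by what was actually returned, not by n itself.
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# "Harry Potter 3" counts as a miss here because it is not in the
# held-out history, even though the user might well have rated it highly.
precision, recall = precision_recall_at_n(
    ["Harry Potter 3", "Harry Potter 2"],
    {"Harry Potter", "Harry Potter 2"},
)
```

Note how the example reproduces the unfairness described earlier: a plausible recommendation lowers precision simply because the user has not rated it yet.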

> Combine both cooccurrence-based CF M/R jobs
> -------------------------------------------
>
>                 Key: MAHOUT-305
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-305
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Sean Owen
>            Assignee: Ankur
>            Priority: Minor
>
> We have two different but essentially identical MapReduce jobs to make 
> recommendations based on item co-occurrence: 
> org.apache.mahout.cf.taste.hadoop.{item,cooccurrence}. They ought to be 
> merged. Not sure exactly how to approach that but noting this in JIRA, per 
> Ankur.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
