[ 
https://issues.apache.org/jira/browse/MAHOUT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781858#action_12781858
 ] 

Sean Owen commented on MAHOUT-103:
----------------------------------

Yes, this is basically item-based recommendation. With some superficial 
changes, it would fit that model exactly. Co-occurrence here acts as a 
similarity metric, which is ultimately used as a weighting. Canonically this 
value would be in [-1,1], and of course you can easily map co-occurrence 
counts, which lie in [1,...), into that range.
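
To illustrate the mapping, here is a rough sketch only. The comment does not 
prescribe a particular function; the class name, the method name and the 
choice of 1 - 2/c are assumptions made for this example. It sends a count of 
1 to -1 and approaches 1 as the count grows.

```java
// Hypothetical helper, not part of Mahout: squashes a raw co-occurrence
// count c >= 1 into [-1, 1) using the assumed mapping 1 - 2/c.
final class CooccurrenceSimilarity {

    private CooccurrenceSimilarity() {
    }

    static double toSimilarity(long cooccurrenceCount) {
        if (cooccurrenceCount < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        // c = 1 gives -1.0; larger counts move monotonically towards 1.0
        return 1.0 - 2.0 / cooccurrenceCount;
    }
}
```

Any monotonically increasing function with the right range would do equally 
well; only the ordering of the values matters when they are used as weights 
for ranking.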

Next you're sort of estimating preferences when you add up co-occurrence 
values. Canonically, you'd be doing a weighted average over M1 - M3. This is 
the same thing -- you're just not dividing by 3.
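
A small sketch of that equivalence, under the assumption that preferences are 
implicit (every preferred item counts as 1.0), so that the estimate for a 
candidate item is built purely from its co-occurrence weights with M1 - M3. 
The class and method names are made up for illustration and are not Mahout 
API.

```java
// Hypothetical example, not Mahout code: compares the summed score with the
// averaged score for one candidate item.
final class PreferenceEstimate {

    private PreferenceEstimate() {
    }

    // Patch-style score: add up the co-occurrence weights.
    static double summed(double[] weights) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        return total;
    }

    // Averaged score: the same sum divided by the number of contributing
    // items (3 in the M1 - M3 example).
    static double averaged(double[] weights) {
        if (weights.length == 0) {
            throw new IllegalArgumentException("need at least one weight");
        }
        return summed(weights) / weights.length;
    }
}
```

When every candidate is scored against the same number of items, the two 
scores differ only by that constant factor, so they rank candidates 
identically.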

The result is conceptually the same, though different approaches would yield 
slightly different results. I'm not necessarily suggesting you change the 
algorithm. At the same time I am also about to implement this very same thing 
-- the more 'canonical' form, to go hand-in-hand with the existing 
GenericItemBasedRecommender. I'd rather avoid duplication, and would like to 
make the Hadoop-based implementation as analogous to the existing code as 
possible. All I'd say is, go ahead, and maybe we look at generalizing it or 
shifting these concepts towards the canonical setup later.

Look at GenericIRStatsEvaluator and subclass for precision-recall approaches.

> Co-occurence based nearest neighbourhood
> ----------------------------------------
>
>                 Key: MAHOUT-103
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-103
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Ankur
>            Assignee: Ankur
>         Attachments: jira-103.patch, mahout-103.patch.v1
>
>
> Nearest neighborhood type queries for users/items can be answered efficiently 
> and effectively by analyzing the co-occurrence model of a user/item w.r.t 
> another. This patch aims at providing an implementation for answering such 
> queries based upon simple co-occurrence counts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
