[ https://issues.apache.org/jira/browse/MAHOUT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781858#action_12781858 ]
Sean Owen commented on MAHOUT-103:
----------------------------------

Yes, this is basically item-based recommendation. With some superficial changes, it would fit that model exactly.

Co-occurrence here acts like a similarity metric, which is ultimately used as a weighting. Canonically this value would be in [-1,1], and you can easily map [1,...) into that range.

Next, you're in effect estimating preferences when you add up co-occurrence values. Canonically, you'd be doing a weighted average over M1 - M3. This is the same thing -- you're just not dividing by 3. The result is conceptually the same, though the different approaches would yield slightly different results.

I'm not necessarily suggesting you change the algorithm. At the same time, I am about to implement this very same thing -- the more 'canonical' form, to go hand-in-hand with the existing GenericItemBasedRecommender. I'd rather avoid duplication, and would like to make the Hadoop-based implementation as analogous to the existing code as possible.

All I'd say is: go ahead, and maybe we look at generalizing it or shifting these concepts towards the canonical setup later.

Look at GenericIRStatsEvaluator and subclass for precision-recall approaches.

> Co-occurence based nearest neighbourhood
> ----------------------------------------
>
>                 Key: MAHOUT-103
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-103
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Ankur
>            Assignee: Ankur
>         Attachments: jira-103.patch, mahout-103.patch.v1
>
>
> Nearest neighborhood type queries for users/items can be answered efficiently
> and effectively by analyzing the co-occurrence model of a user/item w.r.t
> another. This patch aims at providing an implementation for answering such
> queries based upon simple co-occurrence counts.
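
To make the comment's point about summing co-occurrence values versus averaging them concrete, here is a minimal Java sketch. It is not Mahout code and not what the attached patch does: the class name CooccurrenceScoring, the method names, the sample counts, and the 1 - 1/count mapping are all assumptions chosen for illustration. The mapping is just one of many ways to squeeze a count in [1,...) into the canonical similarity range.

```java
import java.util.Arrays;

// Illustrative only: hypothetical names, not part of Mahout or the MAHOUT-103 patch.
public class CooccurrenceScoring {

    // One possible (assumed) mapping of a co-occurrence count in [1, inf)
    // into [0, 1), which lies inside the canonical [-1, 1] similarity range.
    public static double toSimilarity(long count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        return 1.0 - 1.0 / count;
    }

    // Score a candidate item by adding up its co-occurrence counts
    // with each of the user's items (M1..M3 in the comment).
    public static double summedScore(long[] counts) {
        return Arrays.stream(counts).sum();
    }

    // The same score divided by the number of user items, i.e. an average.
    public static double averagedScore(long[] counts) {
        if (counts.length == 0) {
            throw new IllegalArgumentException("need at least one count");
        }
        return summedScore(counts) / counts.length;
    }

    public static void main(String[] args) {
        // Made-up co-occurrence counts of two candidates with M1, M2, M3.
        long[] candidateA = {4, 2, 3};
        long[] candidateB = {1, 1, 5};

        System.out.println("sum A = " + summedScore(candidateA));
        System.out.println("sum B = " + summedScore(candidateB));
        System.out.println("avg A = " + averagedScore(candidateA));
        System.out.println("avg B = " + averagedScore(candidateB));
        System.out.println("similarity for count 4 = " + toSimilarity(4));
    }
}
```

Because every candidate is scored against the same three items, dividing by 3 rescales all scores by the same constant and leaves the ranking unchanged. The two forms only start to diverge once the weights are normalised similarities or the number of contributing items differs per candidate.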