[ 
https://issues.apache.org/jira/browse/MAHOUT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur updated MAHOUT-103:
-------------------------

    Attachment: mahout-103.patch.v1

OK, here is the revised version of the algorithm that this JIRA issue proposes 
to implement. I have tried to make the code as clean and readable as possible. 
Next I plan to write some test code for preparing the Netflix Prize dataset 
and running the algorithm on it. As part of data preparation, the 'dates' and 
'ratings' will be dropped, and the algorithm will run on (user-id, item-id) pairs. 
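
To make the idea concrete, here is a minimal in-memory sketch of counting item 
co-occurrences from (user-id, item-id) pairs and using the counts to answer a 
nearest-neighbour query. It is only an illustration and is not taken from the 
attached patch: the class name CooccurrenceSketch, its method names, the tie 
breaking by item id and the toy data are all assumptions, and a real 
implementation for a dataset of Netflix size would need to be distributed 
rather than held in one map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not the code in mahout-103.patch.v1.
class CooccurrenceSketch {

    // Groups (user-id, item-id) pairs into one de-duplicated item set per user.
    static Map<Integer, Set<Integer>> itemsByUser(int[][] pairs) {
        Map<Integer, Set<Integer>> byUser = new HashMap<>();
        for (int[] pair : pairs) {
            byUser.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
        }
        return byUser;
    }

    // For each ordered item pair (a, b) with a != b, counts the users who have both items.
    static Map<Integer, Map<Integer, Integer>> cooccurrence(int[][] pairs) {
        Map<Integer, Map<Integer, Integer>> counts = new HashMap<>();
        for (Set<Integer> items : itemsByUser(pairs).values()) {
            for (int a : items) {
                for (int b : items) {
                    if (a != b) {
                        counts.computeIfAbsent(a, k -> new HashMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    // Returns up to k items that co-occur most often with the given item.
    // Ties are broken by the smaller item id (an arbitrary choice for this sketch).
    static List<Integer> nearestItems(Map<Integer, Map<Integer, Integer>> counts,
                                      int item, int k) {
        Map<Integer, Integer> row = counts.getOrDefault(item, new HashMap<>());
        List<Integer> neighbours = new ArrayList<>(row.keySet());
        neighbours.sort((x, y) -> {
            int byCount = Integer.compare(row.get(y), row.get(x));
            return byCount != 0 ? byCount : Integer.compare(x, y);
        });
        return neighbours.subList(0, Math.min(k, neighbours.size()));
    }

    public static void main(String[] args) {
        // Toy data: each row is {user-id, item-id}; dates and ratings already dropped.
        int[][] pairs = { {1, 10}, {1, 20}, {2, 10}, {2, 20}, {2, 30}, {3, 20}, {3, 30} };
        Map<Integer, Map<Integer, Integer>> counts = cooccurrence(pairs);
        System.out.println(nearestItems(counts, 10, 2)); // prints [20, 30]
    }
}
```

The same counting applies to user neighbourhoods if the two columns of each 
pair are swapped, so that users are grouped by item instead.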

I am not sure how we can include age-related decay/boost when counting 
co-occurrences. Maybe others can pitch in once we have the basic stuff working 
fine.

> Co-occurence based nearest neighbourhood
> ----------------------------------------
>
>                 Key: MAHOUT-103
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-103
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Ankur
>            Assignee: Ankur
>         Attachments: jira-103.patch, mahout-103.patch.v1
>
>
> Nearest neighborhood type queries for users/items can be answered efficiently 
> and effectively by analyzing the co-occurrence model of a user/item w.r.t 
> another. This patch aims at providing an implementation for answering such 
> queries based upon simple co-occurrence counts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
