[ 
https://issues.apache.org/jira/browse/MAHOUT-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved MAHOUT-393.
------------------------------

         Assignee: Sean Owen
    Fix Version/s: 0.4
       Resolution: Fixed

Done. I committed the patch with only two substantive tweaks:

- I switched from LongWritable to VLongWritable. Most IDs in use don't 
need anywhere near 8 bytes, so variable-length encoding saves a lot of space.
- CountUsersKeyWritable didn't define equals() and hashCode() non-trivially, 
which left them inconsistent with compareTo(). Am I missing something here?
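
To illustrate the second point, here is a minimal sketch of a key type whose 
equals() and hashCode() agree with compareTo(). UserKey and its userID field 
are hypothetical names chosen for this example; this is not the actual 
CountUsersKeyWritable from the patch, and the Writable serialization methods 
are left out.

```java
// Hypothetical key type for illustration only; not Mahout's CountUsersKeyWritable.
final class UserKey implements Comparable<UserKey> {

  private final long userID;

  UserKey(long userID) {
    this.userID = userID;
  }

  // Ordering is defined by the user ID alone.
  @Override
  public int compareTo(UserKey other) {
    return Long.compare(userID, other.userID);
  }

  // Two keys are equal exactly when compareTo() returns 0.
  @Override
  public boolean equals(Object o) {
    return o instanceof UserKey && ((UserKey) o).userID == userID;
  }

  // Equal keys must produce equal hash codes.
  @Override
  public int hashCode() {
    return Long.hashCode(userID);
  }
}
```

With the three methods derived from the same field, two keys that compare as 
0 are also equal and land in the same hash bucket, so hash-based and sorted 
collections treat them the same way.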

> Distributed item similarity functions
> -------------------------------------
>
>                 Key: MAHOUT-393
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-393
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>            Assignee: Sean Owen
>             Fix For: 0.4
>
>         Attachments: MAHOUT-393.patch
>
>
> To complete the work started in MAHOUT-389, I've created a distributed 
> version of every item similarity function that is already available in a 
> non-distributed form. An additional M/R job was necessary to compute the 
> total number of users, which some similarity functions need 
> (LogLikelihoodSimilarity, for example).
> There is still some potential for optimization, as not every similarity 
> function needs all of the information that is currently extracted (the 
> number of users, for example). That optimization would have made the code 
> much less readable, though, so I did not pursue it.
> I hope you consider this a useful addition.
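
As a rough illustration of why a log-likelihood-based similarity needs the 
total number of users, the sketch below computes a log-likelihood ratio over a 
2x2 contingency table of user counts for two items. The class and method names 
are hypothetical and the code is not taken from the patch; it only assumes the 
standard entropy form of the log-likelihood ratio test, mapped into [0, 1) via 
1 - 1 / (1 + LLR).

```java
// Hypothetical sketch; names and structure are assumptions, not the patch's code.
final class LogLikelihoodSketch {

  private LogLikelihoodSketch() {
  }

  // x * ln(x), with 0 * ln(0) taken as 0.
  static double xLogX(long x) {
    return x == 0 ? 0.0 : x * Math.log(x);
  }

  // Unnormalized entropy of a set of counts: N * H, where N is their sum.
  static double entropy(long... counts) {
    long sum = 0;
    double result = 0.0;
    for (long count : counts) {
      result += xLogX(count);
      sum += count;
    }
    return xLogX(sum) - result;
  }

  // Log-likelihood ratio for a 2x2 table of counts.
  static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
    double rowEntropy = entropy(k11 + k12, k21 + k22);
    double columnEntropy = entropy(k11 + k21, k12 + k22);
    double matrixEntropy = entropy(k11, k12, k21, k22);
    // Clamp at 0 to absorb small negative values caused by rounding.
    return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
  }

  // numUsers is required to fill in k22, the users who prefer neither item.
  static double similarity(long usersWithA, long usersWithB, long usersWithBoth, long numUsers) {
    long k11 = usersWithBoth;
    long k12 = usersWithA - usersWithBoth;
    long k21 = usersWithB - usersWithBoth;
    long k22 = numUsers - usersWithA - usersWithB + usersWithBoth;
    double llr = logLikelihoodRatio(k11, k12, k21, k22);
    return 1.0 - 1.0 / (1.0 + llr);
  }
}
```

In this sketch the same co-occurrence counts yield a different similarity for 
a different numUsers, because the count of users who prefer neither item 
changes with it. That is the dependency the additional M/R job addresses.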

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
