[ https://issues.apache.org/jira/browse/MAHOUT-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879699#action_12879699 ]
Hudson commented on MAHOUT-407:
-------------------------------
Integrated in Mahout-Quality #85 (See [http://hudson.zones.apache.org/hudson/job/Mahout-Quality/85/])
MAHOUT-407 committed per Sebastian
> Limit the number of similar items per item in the ItemSimilarityJob
> -------------------------------------------------------------------
>
> Key: MAHOUT-407
> URL: https://issues.apache.org/jira/browse/MAHOUT-407
> Project: Mahout
> Issue Type: New Feature
> Components: Collaborative Filtering
> Reporter: Sebastian Schelter
> Fix For: 0.4
>
> Attachments: MAHOUT-407-2.patch, MAHOUT-407.patch
>
>
> In order to keep the item-similarity-matrix sparse, it would be a useful
> improvement to add an option like "maxSimilaritiesPerItem" to
> o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob, which would make it
> try to cap the number of similar items per item.
> However, because each similarity pair is stored only once, a single item can
> still end up with more than "maxSimilaritiesPerItem" similar items: a pair
> cannot be dropped if the other item in the pair would otherwise be left with
> too few similarities.
> A default value of 100 co-occurrences (similarities) will be used because
> this is already the default in the distributed recommender.
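
The capping behaviour described above can be illustrated with a small sketch.
This is not the code from the attached patches and not the actual
ItemSimilarityJob implementation. It is a minimal in-memory illustration in
Python under one assumption: a pair is kept if it ranks among the top
"maxSimilaritiesPerItem" similarities of at least one of its two items. The
function name cap_similarities and the sample data are hypothetical; only the
default of 100 comes from the issue description.

```python
from collections import defaultdict
import heapq


def cap_similarities(pairs, max_similarities_per_item=100):
    """Keep a pair if it is among the top-N similarities of either item.

    pairs: iterable of (item_a, item_b, similarity), each unordered pair
    stored only once. Hypothetical helper, not part of Mahout.
    """
    pairs = list(pairs)

    # Index every pair under both of its items.
    by_item = defaultdict(list)
    for index, (item_a, item_b, similarity) in enumerate(pairs):
        by_item[item_a].append((similarity, index))
        by_item[item_b].append((similarity, index))

    # A pair survives if at least one of its items ranks it in its top N.
    keep = set()
    for entries in by_item.values():
        top = heapq.nlargest(max_similarities_per_item, entries,
                             key=lambda entry: entry[0])
        keep.update(index for _, index in top)

    # Return the surviving pairs in their original order.
    return [pairs[index] for index in sorted(keep)]


# Item 1 is similar to three other items; the cap is 1.
pairs = [(1, 2, 0.9), (1, 3, 0.8), (1, 4, 0.7), (2, 3, 0.1)]
kept = cap_similarities(pairs, max_similarities_per_item=1)

# Count how many similar items each item still has after capping.
similar_count = defaultdict(int)
for item_a, item_b, _ in kept:
    similar_count[item_a] += 1
    similar_count[item_b] += 1
```

With a cap of 1, the weak pair (2, 3) is dropped, but item 1 still keeps three
similar items: each of (1, 3) and (1, 4) is the only remaining similarity for
item 3 and item 4 respectively, so dropping it would leave that item with none.
This is the "more than maxSimilaritiesPerItem" case the description mentions.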
--
This message is automatically generated by JIRA.