[ 
https://issues.apache.org/jira/browse/MAHOUT-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881207#action_12881207
 ] 

Sean Owen commented on MAHOUT-423:
----------------------------------

Sure, sounds good. Feel free to post a patch and I'll look at it.

> Optimize getNumUsersWithPreferenceFor(long... itemIDs)
> ------------------------------------------------------
>
>                 Key: MAHOUT-423
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-423
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.3
>            Reporter: Jonathan Young
>
> I ran a simple collaborative filtering application using a 
> GenericBooleanPrefDataModel built from (a subset of) the Netflix data, 
> Tanimoto similarity, and the GenericItemBasedRecommender, and then called 
> recommender.mostSimilarItems() (a lot).  
> Profiling indicated that the majority of the time was spent in 
> GenericBooleanPrefDataModel.getNumUsersWithPreferenceFor(long... itemIDs).  
> The version in GenericDataModel is optimized for the cases of one and two 
> itemIDs, but the version in GenericBooleanPrefDataModel always computes the 
> intersection set.
> I can create a patch which optimizes the two cases of itemIDs.length == 1 and 
> itemIDs.length == 2 (similar to the version in GenericDataModel), but perhaps 
> the code should be refactored if these are really the most common cases.
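
For illustration, the one-item and two-item fast paths described in the issue could look roughly like the sketch below. It is not the Mahout code and not the eventual patch: the class ItemUserIndex and its methods are hypothetical, and it uses plain java.util collections where Mahout has its own ID-keyed map and set types. Returning 0 for an empty argument list is also an assumption of this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a boolean-preference data model; not Mahout's API. */
final class ItemUserIndex {

  // itemID -> IDs of the users who expressed a preference for that item
  private final Map<Long, Set<Long>> usersByItem = new HashMap<>();

  void addPreference(long userID, long itemID) {
    usersByItem.computeIfAbsent(itemID, k -> new HashSet<>()).add(userID);
  }

  private Set<Long> usersFor(long itemID) {
    return usersByItem.getOrDefault(itemID, Collections.emptySet());
  }

  int numUsersWithPreferenceFor(long... itemIDs) {
    if (itemIDs.length == 0) {
      return 0; // assumption of this sketch: no items means no users
    }
    if (itemIDs.length == 1) {
      // Fast path 1: the answer is just the size of one set.
      return usersFor(itemIDs[0]).size();
    }
    if (itemIDs.length == 2) {
      // Fast path 2: count the overlap without building an intersection set,
      // iterating over the smaller set and probing the larger one.
      Set<Long> a = usersFor(itemIDs[0]);
      Set<Long> b = usersFor(itemIDs[1]);
      Set<Long> smaller = a.size() <= b.size() ? a : b;
      Set<Long> larger = smaller == a ? b : a;
      int count = 0;
      for (Long userID : smaller) {
        if (larger.contains(userID)) {
          count++;
        }
      }
      return count;
    }
    // General case: materialize the intersection, stopping early once empty.
    Set<Long> intersection = new HashSet<>(usersFor(itemIDs[0]));
    for (int i = 1; i < itemIDs.length && !intersection.isEmpty(); i++) {
      intersection.retainAll(usersFor(itemIDs[i]));
    }
    return intersection.size();
  }
}
```

The two-item path matters for pairwise similarities such as Tanimoto, which need the overlap count for every item pair they compare; avoiding a temporary set per call is where the saving would come from.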

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
