[ 
https://issues.apache.org/jira/browse/MAHOUT-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162899#comment-13162899
 ] 

Sean Owen commented on MAHOUT-910:
----------------------------------

Daniel says:

Hi Sean,
I have been playing around with your patch. It looks good.
From the little testing I did, I can also say that the recommendations seem
to be more accurate than in my initial proposal (#4).

I have just one suggestion, though. I think the current parameters (int
defaultMaxPrefsPerItemConsidered, int userItemCountMultiplier) are not very
clear and don't give enough control over the sampling.
I would introduce two other parameters instead (this won't be
backwards-compatible, though):
- maxSourcePrefsConsidered: used in conjunction with
SamplingLongPrimitiveIterator to do #1.
- maxFinalPrefs: sets the value of 'int max' in your patch directly
(i.e. gets rid of max = (int) Math.max(defaultMaxPrefsPerItemConsidered,
userItemCountMultiplier * Math.log(Math.max(dataModel.getNumUsers(),
dataModel.getNumItems()))); )
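
To make the difference concrete, here is a minimal standalone sketch comparing
the two ways of choosing the cap. The first method copies the expression
quoted above; the class name, the method names, and passing the user and item
counts as plain ints (instead of reading them from a DataModel) are
assumptions made for illustration and are not taken from the patch.

```java
class SamplingCapSketch {

    // Cap as derived in the quoted expression: the larger of a fixed floor and
    // a multiple of the natural log of the bigger of the two counts.
    static int currentMax(int defaultMaxPrefsPerItemConsidered,
                          int userItemCountMultiplier,
                          int numUsers,
                          int numItems) {
        return (int) Math.max(defaultMaxPrefsPerItemConsidered,
            userItemCountMultiplier * Math.log(Math.max(numUsers, numItems)));
    }

    // Cap as proposed (hypothetical method): the caller states it directly,
    // so it no longer depends on the size of the data.
    static int proposedMax(int maxFinalPrefs) {
        return maxFinalPrefs;
    }

    public static void main(String[] args) {
        // Large data set: the logarithmic term exceeds the floor.
        System.out.println(currentMax(100, 20, 1000000, 50000));
        // Tiny data set: the floor wins.
        System.out.println(currentMax(100, 20, 10, 10));
        // Proposed: whatever the caller asked for.
        System.out.println(proposedMax(150));
    }
}
```

With the current parameters the effective cap changes as the data grows, which
is what makes it hard to reason about; with maxFinalPrefs it would be exactly
what the caller passes in.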

In the future it would be possible to add a strategy that controls how the
maxSourcePrefsConsidered preferences are sampled: for example, most recent
items, least recent items, or random sampling (as we have now). That said,
this might not be the right place to do so, since it's not in the context of
the user.
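
A rough sketch of what such a strategy could look like, purely for
discussion. Everything here is hypothetical: the Strategy enum, the TimedPref
class, and the sample method are invented for this example and do not exist in
Mahout, and the random case uses a plain shuffle rather than
SamplingLongPrimitiveIterator. It also assumes a timestamp is available for
each preference.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class SourcePrefSamplingSketch {

    /** Hypothetical strategy names; not part of Mahout. */
    enum Strategy { MOST_RECENT, LEAST_RECENT, RANDOM }

    /** Hypothetical record: an item ID plus the time the preference was made. */
    static final class TimedPref {
        final long itemID;
        final long timestamp;

        TimedPref(long itemID, long timestamp) {
            this.itemID = itemID;
            this.timestamp = timestamp;
        }
    }

    /** Returns at most maxSourcePrefsConsidered item IDs, ordered per the strategy. */
    static List<Long> sample(List<TimedPref> prefs,
                             int maxSourcePrefsConsidered,
                             Strategy strategy,
                             Random random) {
        // Work on a copy so the caller's list is left untouched.
        List<TimedPref> copy = new ArrayList<>(prefs);
        switch (strategy) {
            case MOST_RECENT:
                copy.sort(Comparator.comparingLong((TimedPref p) -> p.timestamp).reversed());
                break;
            case LEAST_RECENT:
                copy.sort(Comparator.comparingLong((TimedPref p) -> p.timestamp));
                break;
            default:
                // RANDOM: a uniform shuffle, then keep the first few.
                Collections.shuffle(copy, random);
        }
        int limit = Math.max(0, Math.min(maxSourcePrefsConsidered, copy.size()));
        List<Long> result = new ArrayList<>(limit);
        for (TimedPref p : copy.subList(0, limit)) {
            result.add(p.itemID);
        }
        return result;
    }
}
```

Keeping the choice behind one parameter would let the default stay random
sampling, as now, while leaving room for the time-based variants later.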

What do you think?
                
> Improve sampling in SamplingCandidateItemStrategy, optimize intersection 
> computations
> -------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-910
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-910
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>             Fix For: 0.6
>
>         Attachments: MAHOUT-910.patch, MAHOUT-910.patch
>
>
> Per the lengthy discussion on the mailing list about optimizing 
> SamplingCandidateItemStrategy and related code, I'm opening this placeholder 
> issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
