Combining the latest commits with my optimized
SamplingCandidateItemsStrategy (http://pastebin.com/6n9C8Pw1), I got
satisfactory results: all queries completed in under one second.

Sebastian, I took a look at your patch and I think it's more practical than
the current SamplingCandidateItemsStrategy. However, it still doesn't put a
strict cap on the number of possible item IDs the way my implementation does.
Perhaps there is room for both implementations?
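
To make "strict cap" concrete, here is a minimal plain-Java sketch. It is
not the pastebin code and it does not use any Mahout API; the class name
CappedCandidateSampler and the method sampleAtMost are hypothetical, chosen
only for illustration. It assumes the candidate item IDs can be iterated
once, and it uses reservoir sampling so the result never holds more than
maxCandidates IDs, however many candidates exist.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; not part of Mahout and not the pastebin code.
final class CappedCandidateSampler {

    private CappedCandidateSampler() {}

    /**
     * Returns at most maxCandidates item IDs, chosen uniformly at random
     * from itemIDs in a single pass (reservoir sampling).
     */
    static long[] sampleAtMost(Iterable<Long> itemIDs, int maxCandidates, Random random) {
        if (maxCandidates < 0) {
            throw new IllegalArgumentException("maxCandidates must be >= 0");
        }
        long[] reservoir = new long[maxCandidates];
        int seen = 0;
        for (long itemID : itemIDs) {
            if (seen < maxCandidates) {
                // Fill the reservoir with the first maxCandidates IDs.
                reservoir[seen] = itemID;
            } else {
                // Keep the new ID with probability maxCandidates / (seen + 1).
                int slot = random.nextInt(seen + 1);
                if (slot < maxCandidates) {
                    reservoir[slot] = itemID;
                }
            }
            seen++;
        }
        // Fewer candidates than the cap: return only the filled part.
        return seen < maxCandidates ? Arrays.copyOf(reservoir, seen) : reservoir;
    }
}
```

The point of the sketch is only that the output size is bounded by
maxCandidates regardless of how many items the user's preferences lead to,
which is the property I am after.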



On Sun, Dec 4, 2011 at 11:13 AM, Sebastian Schelter <s...@apache.org> wrote:

> I created a jira to supply a non-distributed counterpart of the
> sampling that is done in the distributed item similarity computation:
>
> https://issues.apache.org/jira/browse/MAHOUT-914
>
>
> 2011/12/2 Sean Owen <sro...@gmail.com>:
> > For your purposes, it's LogLikelihoodSimilarity. I made similar changes in
> > other files. Ideally, just svn update to get all recent changes.
> >
> > On Fri, Dec 2, 2011 at 6:43 PM, Daniel Zohar <disso...@gmail.com> wrote:
> >
> >> Sean, can you tell me which files you committed the changes to? Thanks
>
