CachingUserSimilarity and CachingItemSimilarity have wrong (far too small)
default maxSizes
------------------------------------------------------------------------------------------
Key: MAHOUT-905
URL: https://issues.apache.org/jira/browse/MAHOUT-905
Project: Mahout
Issue Type: Bug
Components: Collaborative Filtering
Affects Versions: 0.5
Environment: Mac OS X 10.6.8
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-10M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)
The environment should not matter; the issue should be reproducible on every system.
Reporter: Manuel Blechschmidt
Assignee: Sean Owen
Attachments: CachingSimilariyAdjustedDefaultSize.patch
I am currently tuning my recommender, discussed here:
http://thread.gmane.org/gmane.comp.apache.mahout.user/10433.
As a first step I wrapped my LogLikelihoodSimilarity in a
CachingUserSimilarity and used Java VisualVM to profile the calls. I noticed
that I didn't get any performance benefit, so I had a look at the code.
Line 47 of CachingUserSimilarity.java, this(similarity, dataModel.getNumItems());,
is wrong. There is one similarity per unordered pair of items, so if we want to
cache all item similarities we need a cache with
(dataModel.getNumItems()*(dataModel.getNumItems()-1))/2 possible entries.
I now compute this size in the constructor. I attached a patch that adjusts
this in the trunk build.
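To illustrate the arithmetic behind the proposed default, here is a minimal sketch in plain Java. It is not the attached patch and does not use any Mahout classes; the names PairCacheSize, pairCount, and cacheSizeFor are hypothetical. It assumes the cache size is ultimately passed as an int, so the product is computed as a long and clamped, because n*(n-1) overflows int once n exceeds 46341.

```java
// Hypothetical helper, not part of Mahout: computes how many entries a cache
// needs to hold one similarity value per unordered pair of n distinct IDs.
final class PairCacheSize {

    private PairCacheSize() {
    }

    // Number of unordered pairs among n IDs: n * (n - 1) / 2.
    // The cast to long happens before the multiplication, so the product
    // cannot overflow for any non-negative int n.
    static long pairCount(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return (long) n * (n - 1) / 2;
    }

    // Same value clamped to the int range, assuming the cache constructor
    // takes its maximum size as an int.
    static int cacheSizeFor(int n) {
        return (int) Math.min(pairCount(n), (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(pairCount(1000));      // 499500
        System.out.println(cacheSizeFor(100000)); // clamped to Integer.MAX_VALUE
    }
}
```

With 1000 items, the current default of getNumItems() gives room for 1000 entries, while caching every pair needs 499500. For large models the full pair count may be too big to hold in memory, so a capped value may be more practical than the full count.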