Hello,

I'm using TanimotoCoefficientSimilarity.  With or without Rescorer, virtually 
all time gets spent in TanimotoCoefficientSimilarity.itemCorrelation (see 
below).
I have not profiled things yet, but looking at
TanimotoCoefficientSimilarity.itemCorrelation I don't see much room for
performance improvement.
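
For reference, the Tanimoto coefficient of two items is the number of users
who have both, divided by the number of users who have either.  Below is a
minimal sketch in plain Java, not the actual Mahout code: the class and
method names are made up, and I'm assuming each item is represented as the
set of user IDs that expressed a preference for it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not TanimotoCoefficientSimilarity itself.
class TanimotoSketch {

    // Returns |A intersect B| / |A union B|, or NaN when both sets are empty.
    static double tanimoto(Set<Long> usersA, Set<Long> usersB) {
        if (usersA.isEmpty() && usersB.isEmpty()) {
            return Double.NaN;
        }
        // Iterate over the smaller set and probe the larger one.
        Set<Long> smaller = usersA.size() <= usersB.size() ? usersA : usersB;
        Set<Long> larger = (smaller == usersA) ? usersB : usersA;
        int intersection = 0;
        for (Long userId : smaller) {
            if (larger.contains(userId)) {
                intersection++;
            }
        }
        // Inclusion-exclusion gives the union size without building the union.
        int union = usersA.size() + usersB.size() - intersection;
        return (double) intersection / union;
    }

    public static void main(String[] args) {
        Set<Long> a = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> b = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        System.out.println(tanimoto(a, b));
    }
}
```

A single call like this is cheap.  Judging by the trace below, the cost
presumably comes from how many item pairs get compared per recommend() call,
not from any one comparison.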

So how can this puppy scale?  From what I can tell so far, the only way
to scale is to really pre-compute recommendations for all users ahead
of time and simply store them somewhere (e.g. DB, FS, memcached) for a quick
user->recommendations lookup.  It looks like real-time computation
is out of the question.  Since CF/Taste sort of requires access to all
users' data in order to compute recommendations, I don't yet see how
data could be broken into smaller chunks and processed
in distributed MapReduce-style... or does anyone see how this could be done? [1]
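
To make the pre-compute idea concrete, here is a rough sketch in plain Java.
Everything in it is hypothetical: the BiFunction stands in for whatever
recommender is used (userID, howMany -> item IDs), and the HashMap stands in
for the real store (DB, FS, memcached).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical illustration only; no Mahout classes are used here.
class PrecomputeSketch {

    // Runs the (slow) recommender once per user, offline, and keeps the
    // results so that serving a request is a single map lookup.
    static Map<Long, List<Long>> precompute(
            List<Long> userIds,
            int howMany,
            BiFunction<Long, Integer, List<Long>> recommender) {
        Map<Long, List<Long>> store = new HashMap<>();
        for (Long userId : userIds) {
            // Copy the result so the store does not share the recommender's list.
            store.put(userId, new ArrayList<>(recommender.apply(userId, howMany)));
        }
        return store;
    }

    // Serving path: no similarity computation at request time.
    static List<Long> lookup(Map<Long, List<Long>> store, long userId) {
        List<Long> items = store.get(userId);
        return items != null ? items : new ArrayList<>();
    }
}
```

The obvious cost is staleness: recommendations only change when the batch
job is re-run.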

I looked at Ian's emails again and see that he, too, says there is no real-time 
aspect in their system, plus it looks like they do aggregation and store 
aggregation summaries for quick lookup in a DB, but don't really use Taste for 
recommending items to individual users.

[1]
But this really brings me back to a thread from the end of August, whose 
key messages are:

http://markmail.org/message/jo66sxyyn2pklsgv
http://markmail.org/message/cfntfbhshn5qz36n
http://markmail.org/message/27ijhgs4ghpr6cjv
http://markmail.org/message/eu3npmt7ggzc2jaq

It sounds like the next things to try are TreeClusteringRecommender and 
TreeClusteringRecommender2...

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

"qtp0-0" prio=10 tid=0x08af3000 nid=0x5b94 runnable [0x6c0c6000..0x6c0c6fc0]
   java.lang.Thread.State: RUNNABLE
    at org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity.itemCorrelation(TanimotoCoefficientSimilarity.java:161)
    at org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender.doEstimatePreference(GenericItemBasedRecommender.java:206)
    at org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender.access$400(GenericItemBasedRecommender.java:59)
    at org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender$Estimator.estimate(GenericItemBasedRecommender.java:265)
    at org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender$Estimator.estimate(GenericItemBasedRecommender.java:256)
    at org.apache.mahout.cf.taste.impl.recommender.TopItems.getTopItems(TopItems.java:54)
    at org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender.recommend(GenericItemBasedRecommender.java:101)
    at org.apache.mahout.cf.taste.impl.recommender.AbstractRecommender.recommend(AbstractRecommender.java:52)
    at org.apache.mahout.cf.taste.impl.recommender.CachingRecommender$RecommendationRetriever.get(CachingRecommender.java:170)
    at org.apache.mahout.cf.taste.impl.recommender.CachingRecommender$RecommendationRetriever.get(CachingRecommender.java:158)
    at org.apache.mahout.cf.taste.impl.common.Cache.getAndCacheValue(Cache.java:102)
    at org.apache.mahout.cf.taste.impl.common.Cache.get(Cache.java:76)
    at org.apache.mahout.cf.taste.impl.recommender.CachingRecommender.recommend(CachingRecommender.java:93)
