Thanks for the reply, Sean. Another question: does the ReloadFromJDBCDataModel fit my case? Is it an all-in-memory strategy?

Abmar

On Tue, Jul 12, 2011 at 1:22 PM, Sean Owen <sro...@gmail.com> wrote:

> Instead of pre-processing, you can put a CachingItemSimilarity on top of
> your ItemSimilarity. At least it will remember what it has already computed,
> and you don't have to pre-compute everything, most of which is wasted.
>
> You can also look at different CandidateItemsStrategy classes. You can use
> one to have it consider fewer item-item pairs.
>
> But for MapReduce, you want to look at
> org.apache.mahout.cf.taste.hadoop.item. There's a job there that will
> compute all-pairs item-item similarity.
>
> Sean
>
> On Tue, Jul 12, 2011 at 4:32 PM, Abmar Barros <abma...@gmail.com> wrote:
>
> > Hi all,
> >
> > I am new to Mahout and I am putting up a Recommender for buddycloud
> > (http://buddycloud.com/) as a part of my GSoC project
> > (https://github.com/buddycloud/channel-directory).
> > In the testing snapshot, I have ~100k users, ~20k items and ~230k boolean
> > taste preferences.
> >
> > At first I tried a UserBasedRecommender with an all-in-memory DataModel
> > (read from a dump file into a GenericDataModel). The recommendations
> > performed great, almost real time. However, I thought this strategy
> > wouldn't scale: as the number of users and items increases, the
> > service could run out of memory.
> >
> > Then I tried a PostgreSQLBooleanPrefJDBCDataModel, and, as expected, the
> > performance dropped drastically. After reading the blog post at
> > http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/,
> > I decided to try an ItemBasedRecommender, using a preprocessed
> > ItemSimilarity table. I am trying not to use MapReduce at first, so I
> > tried to compute the log-likelihood similarity for every pair of items.
> > This took too long, and I gave up.
> >
> > Finally, my questions are: Am I doing things right? What is the best way
> > to compute item similarity offline without MapReduce?
> >
> > Thanks in advance!
> >
> > Abmar
> >
> > --
> > Abmar Barros
> > MSc candidate on Computer Science at Federal University of Campina Grande -
> > www.ufcg.edu.br
> > OurGrid Team Member - www.ourgrid.org
> > Paraíba - Brazil

--
Abmar Barros
MSc candidate on Computer Science at Federal University of Campina Grande - www.ufcg.edu.br
OurGrid Team Member - www.ourgrid.org
Paraíba - Brazil
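
A note on the caching suggestion quoted in this message: the idea of remembering what has already been computed, instead of pre-computing every item pair, can be illustrated without Mahout. The sketch below is plain Java and is not Mahout's CachingItemSimilarity. The names CachingSketch, PairSimilarity, MemoizingSimilarity and tanimoto are hypothetical, the Tanimoto coefficient is only a stand-in for the log-likelihood measure discussed in the thread, and the cache key assumes the similarity is symmetric.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CachingSketch {

    /** Hypothetical pairwise similarity; NOT Mahout's ItemSimilarity interface. */
    interface PairSimilarity {
        double similarity(long itemA, long itemB);
    }

    /** Computes each unordered item pair at most once, on first request. */
    static final class MemoizingSimilarity implements PairSimilarity {
        private final PairSimilarity delegate;
        private final Map<String, Double> cache = new HashMap<>();
        private int delegateCalls = 0;

        MemoizingSimilarity(PairSimilarity delegate) {
            this.delegate = delegate;
        }

        @Override
        public double similarity(long itemA, long itemB) {
            // Assumes a symmetric measure, so (a, b) and (b, a) share one key.
            long lo = Math.min(itemA, itemB);
            long hi = Math.max(itemA, itemB);
            String key = lo + ":" + hi;
            Double cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
            delegateCalls++;
            double value = delegate.similarity(lo, hi);
            cache.put(key, value);
            return value;
        }

        /** Number of times the underlying (expensive) measure was invoked. */
        int delegateCalls() {
            return delegateCalls;
        }
    }

    /** Tanimoto coefficient over boolean preferences (stand-in measure). */
    static double tanimoto(Set<Long> usersA, Set<Long> usersB) {
        Set<Long> both = new HashSet<>(usersA);
        both.retainAll(usersB);
        int union = usersA.size() + usersB.size() - both.size();
        return union == 0 ? 0.0 : (double) both.size() / union;
    }

    public static void main(String[] args) {
        // Toy data: item ID -> IDs of users who expressed a preference for it.
        Map<Long, Set<Long>> usersByItem = new HashMap<>();
        usersByItem.put(1L, Set.of(10L, 11L, 12L));
        usersByItem.put(2L, Set.of(11L, 12L, 13L));

        MemoizingSimilarity sim = new MemoizingSimilarity(
            (a, b) -> tanimoto(usersByItem.get(a), usersByItem.get(b)));

        System.out.println(sim.similarity(1L, 2L)); // computed by the delegate
        System.out.println(sim.similarity(2L, 1L)); // served from the cache
        System.out.println("delegate calls: " + sim.delegateCalls());
    }
}
```

Only the pairs that a recommender actually asks about ever reach the underlying measure, which is why this tends to be cheaper than filling a complete item-item table up front. The map here is unbounded; a real cache would normally cap its size.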