Quick one for the original poster: You could also use Solr/Lucene's MoreLikeThis, for example.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Sean Owen <[email protected]>
> To: [email protected]
> Sent: Friday, April 3, 2009 12:54:40 AM
> Subject: Re: Using Taste to recommend documents
>
> You could do that. But then, the system would be recommending words to
> documents! Not quite what you want. I assume you still want to
> recommend documents to (real) users.
>
> I would use other techniques to determine document similarity. Others
> on this list can suggest ideas, but, simple metrics based on word
> frequency should do well. Then, use that logic to create an
> implementation of ItemSimilarity. Then build a DataModel, perhaps a
> FileDataModel, maybe from a file containing user IDs, document IDs,
> and preference values. Then try a GenericItemBasedRecommender based on
> these components. We can discuss these more in detail later.
>
> Assuming you go this way, a couple thousand documents (and a couple
> thousand users?) should be no problem to process in memory. It should
> be fast. I would, perhaps, make sure that your ItemSimilarity caches
> results, or perhaps is based on pre-computed values, since that would
> be slow to re-compute those over and over at runtime.
>
> Sean
>
> On Apr 3, 2009 7:14 AM, "Vinicius Carvalho" wrote:
>
> Hi there! I would like to build a document recommendation system, and one of
> the approaches I wish to experiment is use taste for that task. One idea I
> had was to model users as documents, words as items and word frequencies on
> documents as preferences.
>
> Am I going on the right direction here?
>
> Also, I'm a bit afraid about memory consumption here. So far we only have 6k
> documents (which may have a few hundred words per doc). But would taste
> scale to let's say 100k documents with few hundreds of words?
>
> Best regards
>
> --
> The intuitive mind is a sacred gift and the
> rational mind is a faithful servant. We have
> created a society that honors the servant and
> has forgotten the gift.
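
For reference, the "simple metrics based on word frequency" idea in the quoted message could look roughly like the sketch below: cosine similarity over raw term counts. This is only an illustration, not Taste or Lucene code. The class name WordFrequencySimilarity and its methods are hypothetical, the tokenization is deliberately naive, and wrapping the score in an ItemSimilarity implementation (with the caching or pre-computation suggested in the thread) is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Taste/Mahout or Lucene.
public class WordFrequencySimilarity {

    /** Counts how often each lowercased word occurs in the text (naive tokenizer). */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Euclidean length of a term-frequency vector. */
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Cosine similarity of two term-frequency vectors; 0.0 if either is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        // Made-up example documents, for illustration only.
        Map<String, Integer> doc1 = termFrequencies("Lucene indexes documents for search");
        Map<String, Integer> doc2 = termFrequencies("Solr builds on Lucene to search documents");
        System.out.println("similarity = " + cosine(doc1, doc2));
    }
}
```

Scores range from 0 (no words in common) to 1 (identical word distributions). With a few thousand documents, the pairwise scores could be computed once up front and looked up afterwards, which is the pre-computed approach mentioned in the thread.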
