On Sun, May 19, 2013 at 8:34 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
> Won't argue with how fast Solr is. It's another fast and scalable lookup
> engine and another option. Especially if you don't need to look up anything
> else by user, in which case you are back to a db...

But remember, it is also doing more than lookup. It is computing scores on
items and retaining the highest-scoring items.

> Using a cooccurrence matrix means you are doing item similarity since
> there is no user data in the matrix. Or are you talking about using the
> user history as the query? In which case you have to remember somewhere
> all users' history and look it up for the query, no?

Yes. You do. And that is the key to making this orders of magnitude faster.

But that is generally fairly trivial to do. One option is to keep it in a
cookie. Another is to use browser persistent storage. Another is to use a
memory-based user profile database. Yet another is to use M7 tables on MapR
or HBase on other Hadoop distributions.

> On May 19, 2013, at 8:09 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > On Sun, May 19, 2013 at 8:04 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
> >
> > > Two basic solutions to this are: factorize (reduces 100s of thousands
> > > of items to hundreds of 'features') and continue to calculate recs at
> > > runtime, which you have to do with Myrrix since Mahout does not have an
> > > in-memory ALS impl, or move to the Mahout Hadoop recommenders and
> > > pre-calculate recs.
> >
> > Or sparsify the cooccurrence matrix and run recommendations out of a
> > search engine.
> >
> > This will scale to thousands or tens of thousands of recommendations per
> > second against tens of millions of items. The number of users doesn't
> > matter.
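[Editor's note: the scheme described above — keep each user's history somewhere cheap, then use that history as the query against an index of sparsified cooccurrence indicators — can be sketched in miniature without a real search engine. Everything below is illustrative: the item names, the toy indicator rows, and the plain overlap score are stand-ins; a real deployment would index each item's indicator list as a document in Solr and let its TF-IDF-style scoring do the ranking.]

```python
from collections import defaultdict

# Toy "index": one document per catalog item, whose body is the list of
# indicator items that significantly cooccur with it (the sparsified
# cooccurrence row, e.g. items that survive an LLR significance test).
# These rows are made up for illustration.
indicator_docs = {
    "itemA": ["itemB", "itemC"],
    "itemB": ["itemA", "itemD"],
    "itemC": ["itemA"],
    "itemD": ["itemB", "itemC"],
}

def recommend(user_history, docs, top_n=2):
    """Score each candidate item by how many of the user's history items
    appear among its indicators. A search engine would run the same query
    (history items OR'ed together) with TF-IDF weighting instead of a
    raw count."""
    scores = defaultdict(int)
    for item, indicators in docs.items():
        if item in user_history:
            continue  # don't recommend items the user already interacted with
        for h in user_history:
            if h in indicators:
                scores[item] += 1
    # Rank by score descending, breaking ties by item name for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

# A user who interacted with itemA gets the items whose indicator lists
# contain itemA.
print(recommend(["itemA"], indicator_docs))
```

Note that the per-request work is just one lookup (the user's history) plus one search query, which is why the number of users doesn't affect throughput.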