Do you have relatively few users? A user-user similarity-based algorithm would be a lot faster in that case.
I'm guessing that the number of items is unusually large relative to the number of user-item interactions you'd otherwise expect -- that the data is very sparse? Matrix-factorization techniques will probably do well here, since they squeeze out a lot of the accuracy and scale problems that come with very sparse data.

Yes, a precision test has the problem you describe, though that's a general problem and not specific to this situation. It's just very hard to define a "relevant" vs. "non-relevant" item: most items get counted as non-relevant by default even when that's not true. (A bare-bones factorization sketch and one common workaround for the Precision@N issue are at the bottom of this mail.)

On Mon, Nov 21, 2011 at 4:26 PM, James Li <[email protected]> wrote:

> Hi,
>
> I was wondering if anybody has dealt with the issue where your recommender
> system has to deal with a really large number of items which can be
> recommended, say 10 million. It would be impractical for the recommender to
> predict a rating on every single item before ranking them. Can anybody
> point me to any papers or links for a solution?
>
> This issue also causes some problems for performance tests if we adopt a
> rank-based measure such as Precision@5. If I want to use Precision@N to
> test a recommender system where there are a large number of items to
> recommend, the likelihood of an item consumed by a user getting into the
> top-N list should be really low. Any suggestions as to how to handle this
> case?
>
> Thanks,
>
> James
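For reference, here's a bare-bones sketch of the matrix-factorization idea -- plain NumPy ALS with made-up function names, not code from Mahout or any other library. The point is that the sparse interactions get compressed into k-dimensional user and item factors, and scoring any (user, item) pair afterwards is just a dot product, so you can restrict scoring to whatever candidate subset you like instead of all 10M items.

import numpy as np

def als_factorize(ratings, k=20, reg=0.1, iters=10, seed=0):
    """Alternating least squares on a dict of {(user, item): rating}.
    Returns user factors U, item factors V, and the index maps."""
    rng = np.random.default_rng(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    uidx = {u: n for n, u in enumerate(users)}
    iidx = {i: n for n, i in enumerate(items)}
    U = rng.normal(scale=0.1, size=(len(users), k))
    V = rng.normal(scale=0.1, size=(len(items), k))

    # Group observed ratings by user and by item (only the sparse data).
    by_user, by_item = {}, {}
    for (u, i), r in ratings.items():
        by_user.setdefault(uidx[u], []).append((iidx[i], r))
        by_item.setdefault(iidx[i], []).append((uidx[u], r))

    eye = reg * np.eye(k)
    for _ in range(iters):
        # Fix V and solve a small k x k system per user; then the reverse.
        for u, pairs in by_user.items():
            idx = [i for i, _ in pairs]
            r = np.array([x for _, x in pairs], dtype=float)
            U[u] = np.linalg.solve(V[idx].T @ V[idx] + eye, V[idx].T @ r)
        for i, pairs in by_item.items():
            idx = [u for u, _ in pairs]
            r = np.array([x for _, x in pairs], dtype=float)
            V[i] = np.linalg.solve(U[idx].T @ U[idx] + eye, U[idx].T @ r)
    return U, V, uidx, iidx

# Predicting one pair is then just a dot product, e.g.:
#   score = U[uidx[user]] @ V[iidx[item]]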

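On the Precision@N side, one common workaround is to rank each held-out item against a random sample of items the user has not interacted with, rather than against the full catalog, and count how often it lands in the top N. Again a rough sketch with made-up names, not tied to any particular library:

import numpy as np

def precision_at_n_sampled(predict, train, test, all_items,
                           n=5, sample_size=1000, seed=42):
    """Estimate Precision@N by ranking each held-out (user, item) pair
    against a random sample of unseen items instead of all items.
    `predict(user, item)` is assumed to return a preference score."""
    rng = np.random.default_rng(seed)
    hits, total = 0, 0
    for user, held_out_items in test.items():
        seen = train.get(user, set())
        for held_out in held_out_items:
            # Sample "presumed non-relevant" candidates the user hasn't seen.
            sample = rng.choice(all_items, size=sample_size, replace=False)
            candidates = [i for i in sample if i not in seen and i != held_out]
            scored = [(predict(user, i), i) for i in candidates]
            scored.append((predict(user, held_out), held_out))
            ranked = sorted(scored, key=lambda t: t[0], reverse=True)
            top_n = {i for _, i in ranked[:n]}
            hits += held_out in top_n
            total += 1
    return hits / max(total, 1)

This doesn't fix the underlying "non-relevant by default" issue, but it keeps the evaluation tractable when the catalog is huge and makes the number comparable across models.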