Actually, collaborative filtering as originally stated, and as often reiterated in the form of predicting ratings, has very limited utility. It is, however, an example of a more important super-class of algorithms that I call reflected intelligence, where you use user behavior in a more or less simple way to cause a system to behave in an apparently intelligent way.
The problem with ratings predictors is that you pose some very difficult problems for yourself from the outset. You have to get people to rate things at all (very doable on Netflix, almost impossible in my experience on Veoh, Musicmatch and all the fraud systems I have built), and you have to translate a ratings prediction into useful action (very, very hard to do well in almost all cases).

Predicting the next URL is an important reflected-intelligence (RI) problem, and using an item-set predictor, an indicator-set predictor or a latent-variable predictor is likely to work reasonably well. The asymmetry of the prediction is not particularly a problem since it captures important structural cues (web links are unidirectional).

It is important, however, to subtract away the link structure of the web pages before evaluating the system, since suggesting that the user simply follow links that already exist is less than interesting. As such, a raw Markov chain isn't likely to work well.

On Sat, Jan 17, 2009 at 5:24 AM, Sean Owen <[email protected]> wrote:

> But then again is this a CF problem? Sounds like markov chains... given the
> last 1 or 2 or 3 URLs visited, which URL has been next, most often? I think
> that's relatively easy and fast, does that work?
>

--
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)
