On Thu, Aug 28, 2008 at 9:20 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> 1. Findory's personalization used a type of hybrid collaborative filtering
> algorithm that recommended articles based on a combination of
> similarity of content and articles that tended to interest other
> Findory users with similar tastes.

Interesting. Yeah, that would be a hybrid of user-based and item-based approaches. Usually, in a user-based approach, you find similar users and then estimate a rating for a new item by averaging those users' ratings for that item, weighted by user similarity of course. Here, I imagine Findory doesn't have a rating per se for articles, just a boolean yes/no. So you substitute a similarity metric between the items the user has read and a given new item. Yeah... that likely does add up to an interesting new approach. I'd have to digest that a bit more to think about how to implement it right.

> The way Findory does this is
> that it pre-computes as much of the expensive personalization as it
> can. Much of the task of matching interests to content is moved to an
> offline batch process. The online task of personalization, the part
> while the user is waiting, is reduced to a few thousand data lookups.

Ah-ha, yeah, computing offline is not surprising. Good news, because that is the only option for the sorts of parallelization we are considering via Hadoop.

There is a notion of "Rescorer" in the code which allows for injecting arbitrary logic to re-rank recommendations. That maps to the "online personalization" part, and indeed I think it is useful to allow some cheap, real-time logic to affect rankings on top of recommendations computed offline.
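
To make the three ideas above a bit more concrete, here is a rough Python sketch. It is not code from the project and not Findory's actual algorithm, which I can only guess at. Every name in it (`estimate_rating`, `jaccard`, `boolean_score`, `rescore`) is made up for illustration. Representing an article as a set of content tokens and using Jaccard overlap as the item similarity are both assumptions on my part.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def estimate_rating(user_sims: Dict[str, float],
                    ratings: Dict[str, float]) -> Optional[float]:
    """User-based estimate: similarity-weighted average of the ratings
    that similar users gave to one item. Returns None if no similar
    user has rated the item."""
    num = 0.0
    den = 0.0
    for user, sim in user_sims.items():
        if user in ratings and sim > 0:
            num += sim * ratings[user]
            den += sim
    return num / den if den > 0 else None


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed content similarity: overlap of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def boolean_score(read_items: List[Set[str]], candidate: Set[str]) -> float:
    """Boolean (read / not read) case: with no ratings to average,
    score a new item by its mean similarity to the items already read."""
    if not read_items:
        return 0.0
    return sum(jaccard(item, candidate) for item in read_items) / len(read_items)


def rescore(recs: List[Tuple[str, float]],
            rescorer: Callable[[str, float], float],
            top_n: int) -> List[Tuple[str, float]]:
    """Cheap online step: adjust scores that were computed offline,
    then re-rank and keep the top_n items."""
    adjusted = [(item, rescorer(item, score)) for item, score in recs]
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted[:top_n]
```

For example, `rescore(offline_recs, lambda item, s: s * 0.5 if item in already_shown else s, 10)` would demote items the user has just been shown without recomputing anything expensive; `offline_recs` and `already_shown` are hypothetical here. The point is only that the heavy similarity work can happen in a batch job while the online part stays a handful of lookups and a sort.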
