Great description, Sean. I would also add: have a look at http://lucene.apache.org/mahout/taste.html and try out the GroupLens demo.

-Grant

On Mar 31, 2009, at 2:16 AM, Sean Owen wrote:

Sounds like a classic application of CF indeed, so, yes it is a fit.

The first step is to decide what your 'users' and 'items' are. Here they
are clearly users and articles. Then decide whether you have any notion of
'likes', 'loves', 'dislikes' between users and articles, or whether you
simply have an association or no association. Sounds like the latter,
unless users are rating articles?

I only have a sec to describe roughly how to begin with Mahout, since I am
away from a proper workstation, but we can follow up later.

First make a file with 'userID,itemID' on each line. The IDs can be whatever
you like. Make a FileDataModel with this file. (Unless I am crazy and never
submitted the change, it should work with this input format; normally there
is a third element per line, the preference value.)
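
To make the input format concrete, here is a minimal sketch in plain Java of reading 'userID,itemID' lines into a user-to-items map. It is not Mahout code and not how FileDataModel is implemented; the class name PrefFileSketch and the sample IDs are made up for illustration, and an optional third field (the preference value) is simply ignored.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Mahout).
class PrefFileSketch {

    // Parses lines of the form "userID,itemID" into user -> set of items.
    // A third field, if present, is treated as a preference value and ignored.
    public static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> itemsByUser = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split(",");
            if (fields.length < 2) {
                throw new IllegalArgumentException("Expected userID,itemID but got: " + line);
            }
            itemsByUser
                .computeIfAbsent(fields[0].trim(), k -> new LinkedHashSet<>())
                .add(fields[1].trim());
        }
        return itemsByUser;
    }

    public static void main(String[] args) {
        // Made-up sample data: two users, the last line carries a preference value.
        List<String> lines = Arrays.asList(
            "alice,article1",
            "alice,article2",
            "bob,article2,5.0");
        System.out.println(parse(lines));
    }
}
```

With boolean data like this, each user ends up as nothing more than a set of item IDs, which is what a set-overlap similarity needs.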

Then make a TanimotoCoefficientSimilarity with this model. Then make a
NearestNUserNeighborhood with these objects. Then make a
GenericUserBasedRecommender with these. Try calling recommend() and see what
happens!
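
For intuition about what those three pieces do, here is a rough, self-contained sketch in plain Java. It is not Mahout's implementation and does not use the Mahout API: the class TanimotoSketch, its methods, and the sample data are hypothetical, and the scoring rule (summing the similarities of neighbours who have an item) is an assumption chosen for simplicity. It computes the Tanimoto coefficient between users' item sets, picks the N most similar users, and recommends items they have that the target user lacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only (not part of Mahout).
class TanimotoSketch {

    // Tanimoto coefficient: size of the intersection divided by size of the union.
    public static double tanimoto(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return (double) common.size() / union;
    }

    // Recommends up to howMany items for user, drawn from the n most similar users.
    public static List<String> recommend(Map<String, Set<String>> data,
                                         String user, int n, int howMany) {
        Set<String> own = data.getOrDefault(user, Collections.<String>emptySet());

        // Similarity of every other user to the target; users with no overlap are dropped.
        Map<String, Double> sims = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : data.entrySet()) {
            if (e.getKey().equals(user)) {
                continue;
            }
            double sim = tanimoto(own, e.getValue());
            if (sim > 0.0) {
                sims.put(e.getKey(), sim);
            }
        }

        // Neighbourhood: the n most similar users (ties broken by user ID).
        List<String> neighbours = new ArrayList<>(sims.keySet());
        neighbours.sort((x, y) -> {
            int c = Double.compare(sims.get(y), sims.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        neighbours = neighbours.subList(0, Math.min(n, neighbours.size()));

        // Score each unseen item by the summed similarity of neighbours who have it.
        Map<String, Double> scores = new HashMap<>();
        for (String neighbour : neighbours) {
            for (String item : data.get(neighbour)) {
                if (!own.contains(item)) {
                    scores.merge(item, sims.get(neighbour), Double::sum);
                }
            }
        }

        // Highest score first (ties broken by item ID).
        List<String> items = new ArrayList<>(scores.keySet());
        items.sort((x, y) -> {
            int c = Double.compare(scores.get(y), scores.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        return new ArrayList<>(items.subList(0, Math.min(howMany, items.size())));
    }

    public static void main(String[] args) {
        // Made-up data: which articles each user is associated with.
        Map<String, Set<String>> data = new HashMap<>();
        data.put("alice", new HashSet<>(Arrays.asList("a1", "a2", "a3")));
        data.put("bob", new HashSet<>(Arrays.asList("a2", "a3", "a4")));
        data.put("carol", new HashSet<>(Arrays.asList("a3", "a5")));
        System.out.println(recommend(data, "alice", 2, 5));
    }
}
```

The real Mahout classes named above handle much more (file loading, refresh, different ways of estimating preferences), so treat this only as a picture of the general idea.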

This basic setup can be further tweaked, customized and optimized for your
domain but that is the basic approach.

On Mar 31, 2009 4:07 AM, "Joshua Bronson" <[email protected]> wrote:

I'm working on an experimental web-based feed reader[1], and in our next
release we would like to feature collaborative filtering-based article
recommendation. For starters, articles will be recommended to you based on
how similar they are to other articles that either you or people you're
following have starred. I am just getting started reading up on Mahout and
the problem space in general[2], and thought I would inquire here about
whether it would be a good choice for us.
Thanks!
Josh

P.S. Do you guys hang out in an IRC channel by any chance?


[1] http://melkjug.org, http://melkjug.openplans.org/about
[2] http://oreilly.com/catalog/9780596529321/

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
