Great description, Sean. I would also add: have a look at
http://lucene.apache.org/mahout/taste.html and try out the GroupLens demo.
-Grant
On Mar 31, 2009, at 2:16 AM, Sean Owen wrote:
Sounds like a classic application of CF indeed, so yes, it is a fit.
The first step is to decide what your 'users' and 'items' are. Here
they are clearly users and articles. Then decide whether you have any
notion of 'likes', 'loves', 'dislikes' between users and articles, or
whether you simply have an association or no association. Sounds like
the latter, unless users are rating articles?
I only have a sec to describe roughly how to begin with Mahout since I
am away from a proper workstation, but we can follow up later.
First make a file with 'userID,itemID' on each line. The IDs can be
whatever you like. Make a FileDataModel with this file. (Unless I am
crazy and never submitted the change, it should work with this input
format; normally there is a third element per line, the preference
value.)
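To make the input format concrete, here is a rough sketch in plain Java
of what reading such a file amounts to. It is not Mahout's FileDataModel,
just a stand-in: the class name BooleanPrefFile, the parse() helper and
the sample IDs are all made up for illustration. A third field, if
present, is simply ignored here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; not part of Mahout.
class BooleanPrefFile {

    /** Turns lines of the form "userID,itemID" into a map of user -> items. */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> itemsByUser = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split(",");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            // parts[2], when present, would be the preference value;
            // with plain associations we do not need it.
            itemsByUser
                .computeIfAbsent(parts[0].trim(), k -> new TreeSet<>())
                .add(parts[1].trim());
        }
        return itemsByUser;
    }

    public static void main(String[] args) {
        // Made-up sample data: each line is one "user starred article" event.
        List<String> lines = Arrays.asList(
            "josh,article1",
            "josh,article2",
            "sean,article1");
        System.out.println(parse(lines));
    }
}
```

In practice the lines would come from the file on disk rather than a
hard-coded list.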
Then make a TanimotoCoefficientSimilarity with this model. Then make a
NearestNUserNeighborhood with the similarity and the model. Then make a
GenericUserBasedRecommender with the model, the neighborhood and the
similarity. Try calling recommend() and see what happens!
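For intuition about what that pipeline computes, here is a small
self-contained sketch in plain Java. It does not use Mahout and is not
how Mahout implements these classes; the names TanimotoSketch,
tanimoto() and recommend(), the scoring rule (sum of neighbor
similarities) and the sample data are assumptions made for illustration.
The Tanimoto coefficient of two users is the number of items they share
divided by the number of items either of them has.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Mahout's implementation.
class TanimotoSketch {

    /** |A intersect B| / |A union B|, or 0 when both sets are empty. */
    static double tanimoto(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        int union = a.size() + b.size() - common.size();
        return (double) common.size() / union;
    }

    /** Recommends up to howMany unseen items from the n most similar users. */
    static List<String> recommend(Map<String, Set<String>> data,
                                  String user, int n, int howMany) {
        Set<String> mine = data.getOrDefault(user, Collections.<String>emptySet());

        // Neighborhood: the n other users with the highest non-zero similarity.
        List<Map.Entry<String, Double>> neighbors = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : data.entrySet()) {
            if (e.getKey().equals(user)) {
                continue;
            }
            double sim = tanimoto(mine, e.getValue());
            if (sim > 0) {
                neighbors.add(new AbstractMap.SimpleEntry<String, Double>(e.getKey(), sim));
            }
        }
        neighbors.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        if (neighbors.size() > n) {
            neighbors = neighbors.subList(0, n);
        }

        // Score each item the user lacks by summing its neighbors' similarities.
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Double> nb : neighbors) {
            for (String item : data.get(nb.getKey())) {
                if (!mine.contains(item)) {
                    scores.merge(item, nb.getValue(), Double::sum);
                }
            }
        }

        // Highest score first; ties broken by item ID for a stable order.
        List<String> items = new ArrayList<>(scores.keySet());
        items.sort((x, y) -> {
            int c = Double.compare(scores.get(y), scores.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        return items.size() > howMany
            ? new ArrayList<>(items.subList(0, howMany))
            : items;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        Map<String, Set<String>> data = new HashMap<>();
        data.put("josh", new HashSet<>(Arrays.asList("a1", "a2")));
        data.put("sean", new HashSet<>(Arrays.asList("a1", "a2", "a3")));
        data.put("grant", new HashSet<>(Arrays.asList("a4")));
        System.out.println(recommend(data, "josh", 2, 5)); // prints [a3]
    }
}
```

Here josh and sean share two of three distinct articles (similarity
2/3), grant shares none and drops out of the neighborhood, so the only
candidate for josh is a3.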
This basic setup can be further tweaked, customized and optimized for
your domain, but that is the basic approach.
On Mar 31, 2009 4:07 AM, "Joshua Bronson" <[email protected]> wrote:
I'm working on an experimental web-based feed reader[1], and in our
next release we would like to feature collaborative filtering-based
article recommendation. For starters, articles will be recommended to
you based on how similar they are to other articles that either you or
people you're following have starred. I am just getting started reading
up on Mahout and the problem space in general[2], and thought I would
inquire here about whether it would be a good choice for us.
Thanks!
Josh
P.S. Do you guys hang out in an IRC channel by any chance?
[1] http://melkjug.org, http://melkjug.openplans.org/about
[2] http://oreilly.com/catalog/9780596529321/
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search