@Pat, you described my situation very well. The only addition is that I am also interested in building some kind of profile of the user from all the information s/he has provided by interacting with the articles, rather than only recommending similar items (news) based on a specific input. That is why I thought using the output of RowSimilarityJob as the ItemSimilarity of an ItemBasedRecommender would behave as I want, since I use the Mahout DataModel to create that profile.
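To make the idea concrete, here is a minimal sketch in plain Python (not Mahout code) of scoring unrated articles from a precomputed item-item similarity matrix plus one user's ratings. The `recommend` function, the dict layout, and the sample article ids and values are all hypothetical; the scoring is the usual similarity-weighted average of the user's own ratings, which is only an approximation of what an item-based recommender does internally.

```python
from collections import defaultdict


def recommend(similarity, user_ratings, top_n=3):
    """Rank unrated items by a similarity-weighted average of the user's ratings.

    similarity:   dict mapping (item_a, item_b) -> similarity in (0, 1];
                  each unordered pair is assumed to be stored once.
    user_ratings: dict mapping item -> rating given by this user (the "profile").
    """
    weighted = defaultdict(float)    # sum of similarity * rating per candidate
    weight_sum = defaultdict(float)  # sum of similarity per candidate

    for (a, b), sim in similarity.items():
        # A pair contributes in whichever direction links a rated item
        # to an item the user has not rated yet.
        for rated, candidate in ((a, b), (b, a)):
            if rated in user_ratings and candidate not in user_ratings:
                weighted[candidate] += sim * user_ratings[rated]
                weight_sum[candidate] += sim

    scores = {
        item: weighted[item] / weight_sum[item]
        for item in weighted
        if weight_sum[item] > 0
    }
    # Highest estimated rating first; ties broken by item id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    # Hypothetical content-based similarities between four articles.
    similarity = {
        ("a1", "a2"): 0.75,
        ("a1", "a4"): 0.5,
        ("a3", "a4"): 0.5,
        ("a2", "a4"): 0.25,
    }
    # The only ratings in the system: one user, two articles.
    user_ratings = {"a1": 5.0, "a3": 1.0}
    print(recommend(similarity, user_ratings))
```

Because the similarities come from the article text rather than from co-ratings, the candidates here get a score even though no other user has rated anything.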

On Wed, Mar 5, 2014 at 3:40 PM, Pat Ferrel <p...@occamsmachete.com> wrote:

> I am ignoring the rest of the thread because I suspect it may have gotten
> off track.
>
> Your data is new articles, right? You would like to recommend from known
> articles to any user based on an article they rate or even view. You have
> no collaborative filtering data because the lifetime of a news article is
> short and so there is not enough usage data to create a CF type
> recommender. Is this a correct problem statement? If so I don't believe you
> should be using a CF recommender from Mahout's collection.
>
> However you can use the Mahout text analysis pipeline to find all articles
> that are similar to each other. In this case when a user views any article
> in the training data you can show the most similar items precalculated with
> RowSimilarityJob and the rest of the text prep jobs. The pipeline is
> outlined here:
> https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
>
> But this will only work for news articles already in the training data.
> Another approach is to not use Mahout at all. Simply index all docs as they
> come in with Solr. Then when a user rates or even views an article, even if
> it has not been indexed yet, you can use the viewed article as the query on
> the indexed articles and Solr will return articles ranked by similarity.
> This is a content based recommender based solely on Solr.
>
> Does this describe your situation?
>
>
> On Mar 4, 2014, at 1:16 AM, Juan José Ramos <jjar...@gmail.com> wrote:
>
> First thing is that I know this requirement would not make sense in a CF
> Recommender. In my case, I am trying to use Mahout to create something
> closer to a Content-Based Recommender.
>
> In particular, I am pre-computing a similarity matrix between all the
> documents (items) of my catalogue and using that matrix as the
> ItemSimilarity for my Item-Based Recommender.
>
> So, when a user rates a document, how could I make the recommender output
> documents similar to the ones the user has already rated even if no other
> user in the system has rated them yet? Is that even possible in the first
> place?
>
> Thanks a lot.