As I said below, RSJ (RowSimilarityJob) is actually all that is needed. But with
the entire recommender also integrated, we can compare the two in the demo
framework. For instance, one of the lines of recs on a video detail page (the
top one) is the actual RSJ output. When I get time, the recommend page will have
a line of precalculated recs from the Mahout item recommender, since those are
already being generated. It will be interesting to see them side by side; we
could even form an A/B test around that if there were any traffic.

One thing I’ve noticed is that Solr recs are so much more flexible, especially 
when blended with metadata, that I can’t imagine wanting to go back to the old 
way. Even if the Mahout precalculated recs were marginally better, the Solr 
method allows you to fill pages with recs biased in different ways. It’s almost 
like turning the catalog browser into one customized by the user’s preferences.
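
To make the "biased in different ways" idea concrete, here is a minimal sketch of how one blended query might be assembled, following the "Comedies from Netflix" example in the original post quoted below. The field names (similar_videos, genre, sources), the boost value, and the helper name are assumptions for illustration and are not taken from the demo site's actual schema. It relies only on standard Solr query syntax: a `^` boost on a term, and an `fq` filter query that restricts results without affecting scores.

```python
from urllib.parse import urlencode


def build_rec_params(liked_ids, genre=None, genre_boost=2.0, source=None, rows=24):
    """Build Solr request parameters for a blended recommendation query.

    Field names are hypothetical. liked_ids are assumed to be plain tokens
    with no characters that need escaping in the Solr query syntax.
    """
    # The user's history is the query against the indexed similarity tokens.
    clauses = ["similar_videos:(" + " ".join(liked_ids) + ")"]
    if genre:
        # With the default OR operator this clause only raises the score of
        # matching videos; it does not exclude the others.
        clauses.append('genre:"%s"^%s' % (genre, genre_boost))
    params = [("q", " ".join(clauses)), ("rows", str(rows))]
    if source:
        # A filter query is a hard constraint on the result set.
        params.append(("fq", 'sources:"%s"' % source))
    return params


params = build_rec_params(["v101", "v205"], genre="Comedy", source="Netflix")
query_string = urlencode(params)  # append to the core's /select URL
```

Changing only the genre, boost, or source arguments yields a differently biased line of recs from the same index, which is the flexibility described above.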

BTW dithering and anti-repeat/anti-flood are implemented on the recommend page. 
Dithering is done with varying lambdas; very high values are used on lists that 
seldom change, like “Recently Popular”.
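
For readers unfamiliar with dithering, the sketch below shows one common formulation: add random noise to the log of each item's rank and re-sort, so the top of the list stays mostly stable while deeper items occasionally surface. The message does not say how lambda enters the computation on the site, so the `epsilon` noise parameter here is a stand-in and an assumption, not the site's actual code.

```python
import math
import random


def dither(ranked_items, epsilon=2.0, rng=None):
    """Reorder a ranked list by adding Gaussian noise to log(rank).

    epsilon is a hypothetical tuning knob (expected >= 1): epsilon == 1 adds
    no noise, and larger values shuffle the list more aggressively.
    """
    rng = rng or random.Random()
    sigma = math.log(epsilon)
    scored = [
        (math.log(rank) + rng.gauss(0.0, sigma), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    # Lower noisy score means closer to the top of the list.
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]


recs = ["a", "b", "c", "d", "e", "f"]
shuffled = dither(recs, epsilon=3.0, rng=random.Random(42))
```

Because the noise is added to log(rank), swaps near the top are less likely than swaps further down. A list whose underlying order rarely changes would use a larger noise setting so that repeat visitors still see some variety.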


On Apr 6, 2014, at 4:28 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

This can actually be simplified a bit by using ItemSimilarityJob to call
RowSimilarityJob.

Nice work overall.


On Sun, Apr 6, 2014 at 10:21 PM, Andrew Musselman <
andrew.mussel...@gmail.com> wrote:

> Pat, do you still want help putting this into a new mahout/examples, or
> work out how to do the distribution via "github pointer"?  There's an open
> bug for that.
> 
>> On Apr 6, 2014, at 1:13 PM, Sebastian Schelter <s...@apache.org> wrote:
>> 
>> The top 3 recommendations "based on videos you liked" are very good!
>> 
>> Nice job.
>> 
>> 
>>> On 04/06/2014 07:26 PM, Pat Ferrel wrote:
>>> After having integrated several versions of the Mahout and Myrrix
> recommenders at fairly large scale, I was interested in solving three
> problems that these did not directly provide for:
>>> 1) realtime queries for recs using data not yet incorporated into the
> training set. Myrrix allows this but Mahout using the hadoop mr version
> does not.
>>> 2) cross-recommendations from two or more action types (say purchase
> and detail-view)
>>> 3) blending metadata and user preference data to return recs (for
> example category & user preferences => recs)
>>> 
>>> Using Solr + Mahout provided an amazingly flexible and performant way
> to do this. Ted wrote about his experience with this basic approach in his
> recent book. Take user preferences, run them through RowSimilarityJob and
> you get an item by item similarity Matrix. This is the core of an
> item-based cooccurrence recommender. If you take the similarity matrix, and
> convert it into a list of tokens per row, you have something Solr can
> index. If you then use a user’s history as a query on the indexed data you
> get an ordered list of recommendations.
>>> 
>>> When I set out to do #1 and #3 the need for CF data AND metadata was
> the first problem. So I mined the web for video reviews and video metadata.
> Then logging any users who visit the site will lead to data for #2 and #1.
>>> 
>>> The demo site is https://guide.finderbots.com and instructions are at
> the end of this for anyone who would like to test it out. As a crude user
> test there is a procedure we ask you to follow to help gather quality of
> recommendations data. It’s running out of my closet over Comcast, so if it’s
> down I may have tripped over a cord; sorry, try again later.
>>> 
>>> There are a bunch of different methods for making recs illustrated on
> the site. One method that illustrates blending metadata uses preference
> data from you, and metadata to bias and filter recs. Imagine that you have
> trained the system with your preferences by making some video picks. Now
> imagine you’d like to get recommendations for Comedies from Netflix based on
> your previous video preferences. This is done with a single Solr query on
> indexed video fields that hold genre, similar videos (from the similarity
> matrix), and sources. The query finds similar videos to the ones you have
> liked, with the genre “Comedy” boosted by some amount, but only those that
> have at least one source = “Netflix”.
>>> 
>>> I’ll be doing some blog posts covering the specifics of how each rec
> type is done, the site and DB architecture, and Solr setup.
>>> 
>>> The project uses the Solr recommender prep code here:
> https://github.com/pferrel/solr-recommender
>>> 
>>> BTW I plan to publish obfuscated usage data in the github repo.
>>> 
>>> begin form letter =======================================
>>> 
>>> Please use a very newly updated browser (latest Firefox, Chrome,
> Safari, and nothing older than IE10). The site doesn’t yet check browser
> compatibility but relies on HTML5 and CSS3 rather heavily.
>>> 
>>> 1) go to https://guide.finderbots.com/users/sign_up to create an
> account
>>> 2) go to https://guide.finderbots.com/trainers to ‘train’ the
> recommender: hit thumbs up on videos you like. There are 20 pages of
> training videos. You can leave at any time, but if you can go through them
> all it would be appreciated.
>>> 3) go to https://guide.finderbots.com/guides/recommend to immediately
> get personalized recs from your training data. If you completed the trainer,
> check the top line of recs and count how many are videos you liked or would
> like to see. Scroll right or left to see a total of 24 in four batches of
> 6. If you could report to me the total you thought were good recs it would
> be greatly appreciated.
>>> 4) browse videos by various criteria here:
> https://guide.finderbots.com/guides These are not recommendations, they
> are simply a catalog.
>>> 5) control how you browse videos by clicking the gears icon. You can
> set all videos to be from one or more sources here. If you choose Netflix
> alone (don’t forget to uncheck ‘all’) then recs and browsed videos will all
> be available on Netflix.
>> 
> 
