Cross Recommender Proposal

2014-01-23 Thread Pat Ferrel
Part of the solr-recommender project is a cross-recommender based on Mahout. It uses the mapreduce version of the RecommenderJob as a template and implements an XRecommenderJob. Unfortunately the key part of the algorithm—the part handled by RowSimilarityJob—is done with a simple matrix

Re: More Cross-recommender thoughts

2013-05-17 Thread Ted Dunning
Anonymizing the id's is a good start, especially if you have a relatively small subset of the entire social graph and if the graph is publicly visible in any case. If you have a complete crawl of the graph, then many id's will be recoverable by reference back to the public version of the graph. Sinc

More Cross-recommender thoughts

2013-05-17 Thread Pat Ferrel
dicts the held out follows with different precision. Since all three actions (and a few others I can think of) have user IDs of the primary user in common, they exist in the same user space and I can create a cross recommender ensemble (see previous cross recommender emails) so R_f + aR_fb +

Re: cross recommender

2013-04-16 Thread Ted Dunning
h for "ACM hackathon" and you should see > > it. Feel free to ping me off list with specific questions on that. > > > > > > On Tue, Apr 16, 2013 at 10:29 AM, Ted Dunning > > wrote: > > > > > Primary action can be emitting a search term. S

Re: cross recommender

2013-04-16 Thread Nick Kolegraff
ith specific questions on that. > > > On Tue, Apr 16, 2013 at 10:29 AM, Ted Dunning > wrote: > > > Primary action can be emitting a search term. Secondary can be click to > > view. > > > > > > On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel > wrote: > > >

Re: cross recommender

2013-04-16 Thread Pat Ferrel
16, 2013 at 10:29 AM, Ted Dunning wrote: > Primary action can be emitting a search term. Secondary can be click to > view. > > > On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote: > >> For the cross-recommender we need some replacement for a primary >>

Re: cross recommender

2013-04-16 Thread Pat Ferrel
k to > view. > > > On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote: > >> For the cross-recommender we need some replacement for a primary >> action--purchases and a secondary action--views, clicks, impressions, >> something. >> >> To use this da

Re: cross recommender

2013-04-16 Thread Nick Kolegraff
6, 2013 at 4:53 PM, Pat Ferrel wrote: > > > For the cross-recommender we need some replacement for a primary > > action--purchases and a secondary action--views, clicks, impressions, > > something. > > > > To use this data we would treat clicks like a purchase--the pr

Re: cross recommender

2013-04-16 Thread Ted Dunning
Primary action can be emitting a search term. Secondary can be click to view. On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote: > For the cross-recommender we need some replacement for a primary > action--purchases and a secondary action--views, clicks, impressions, > something.

Re: cross recommender

2013-04-16 Thread Pat Ferrel
For the cross-recommender we need some replacement for a primary action--purchases and a secondary action--views, clicks, impressions, something. To use this data we would treat clicks like a purchase--the primary action we want to recommend. Then the search-result-item-impressions is like a

Re: cross recommender

2013-04-15 Thread Nick Kolegraff
kathon-big/data On Mon, Apr 15, 2013 at 2:03 PM, Pat Ferrel wrote: > MAJOR may be too tame a word. > > Furthermore there are several enhancements the community could make to > support retail data and retail recommenders. For one thing without public > data a *public* cross-recommen

Re: cross recommender

2013-04-15 Thread Pat Ferrel
MAJOR may be too tame a word. Furthermore there are several enhancements the community could make to support retail data and retail recommenders. For one thing, without public data a *public* cross-recommender will probably not get built. The cross-recommender needs to separate action types

Re: cross recommender

2013-04-15 Thread Koobas
Definitely of MAJOR interest. I am sure it would also draw all kinds of desired attention to your business. Movie Lens is way too small to be meaningful any more. Wikipedia articles and Stackoverflow tags are not retail data! By all means, post some real retail data, if you can. Meaningful sizes wo

Re: cross recommender

2013-04-15 Thread Robin Morris
I asked management here a while ago whether there would be a problem with releasing an anonymized set of data from one of our retail customers, and didn't get too much push-back. If this is something that would be of major interest, I can ask again and see whether there's something we can put out

Re: cross recommender

2013-04-12 Thread Ted Dunning
: > That looks like the best shortcut. It is one of the few places where the > rows of one and the columns of the other are seen together. Now I know why > you transpose the first input :-) > > But, I have begun to wonder whether it is the right thing to do for a > cross recommend

Re: cross recommender

2013-04-12 Thread Pat Ferrel
That looks like the best shortcut. It is one of the few places where the rows of one and the columns of the other are seen together. Now I know why you transpose the first input :-) But, I have begun to wonder whether it is the right thing to do for a cross recommender because you are

Re: cross recommender

2013-04-11 Thread Sebastian Schelter
> Do I have to create a SimilarityJob( matrixB, matrixA, similarityType ) to get this or have I missed something already in Mahout? It could be worth investigating whether MatrixMultiplicationJob could be extended to compute similarities instead of dot products. Best, Sebastian

Re: cross recommender

2013-04-11 Thread Pat Ferrel
Getting this running with co-occurrence rather than using a similarity calc on user rows finally forced me to understand what is going on in the base recommender. And the answer implies further work. [B'B] is usually not calculated in the usual item based recommender. The matrix that comes out

Re: cross recommender

2013-04-10 Thread Pat Ferrel
I have retail data but can't publish results from it. If I could get a public sample I'd share how the technique worked out. Not sure how to simulate this data. It has the important characteristic that every purchase is also a view but not the other way around and Ted's technique is a way to sc

Re: cross recommender

2013-04-10 Thread Koobas
Retail data may be hard to impossible, but one can improvise. It seems to be fairly common to use Wikipedia articles (Myrrix, GraphLab). Another idea is to use StackOverflow tags (Myrrix examples). Although they are only good for emulating implicit feedback. On Wed, Apr 10, 2013 at 6:48 PM, Ted D

Re: cross recommender

2013-04-10 Thread Ted Dunning
On Wed, Apr 10, 2013 at 10:38 AM, Pat Ferrel wrote: > Does anyone know of a public data set that provides things like views and > purchases? > I don't.

Re: cross recommender

2013-04-10 Thread Pat Ferrel
BTW I have this working on trivial data and am in the process of measuring its results on some real world data. It does a lot with DistributedRowMatrix and so I'll be interested to see how it performs with a larger data set. Does anyone know of a public data set that provides things like views

Re: cross recommender

2013-04-08 Thread Ted Dunning
On Sat, Apr 6, 2013 at 3:26 PM, Pat Ferrel wrote: > I guess I don't understand this issue. > > In my case both the item ids and user ids of the separate DistributedRow > Matrix will match and I know the size for the entire space from a previous > step where I create id maps. I suppose you are say

Re: cross recommender

2013-04-06 Thread Pat Ferrel
I need to do the equivalent of the xrecommender.mostSimilarItems(long[] itemIDs, int howMany). To oversimplify, in the standard Item-Based Recommender this is equivalent to looking at the item similarities from the preference matrix (similarity of item purchases by user). In the xrecommen

Re: cross recommender

2013-04-06 Thread Pat Ferrel
I guess I don't understand this issue. In my case both the item ids and user ids of the separate DistributedRowMatrix will match and I know the size for the entire space from a previous step where I create id maps. I suppose you are saying that the m/r code would be super simple if a row of B'

Re: cross recommender

2013-04-06 Thread Sebastian Schelter
Completely concur with that. MatrixMultiplicationJob is already using a mapside merge-join AFAIK. On 05.04.2013 15:04, Ted Dunning wrote: > This may not quite be true because the RSJ is able to take some liberties. > > The origin of these is that A'A can be viewed as a self join. Thus as rows

Re: cross recommender

2013-04-06 Thread Ted Dunning
This may not quite be true because the RSJ is able to take some liberties. The origin of these is that A'A can be viewed as a self join. Thus as rows of A are read, the cooccurrences can be emitted as they are read. For B'A, we have to somehow get corresponding rows of A and B at the same time
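The contrast drawn in the message above, between A'A as a self-join and B'A as a join of two inputs, can be sketched on a single machine. This is only an illustration, not Mahout's RowSimilarityJob or MatrixMultiplicationJob. The function names, the dict-of-lists row format, and the sample data are hypothetical, and A is assumed to hold views and B purchases, keyed by user.

```python
from collections import Counter


def self_cooccurrence(rows):
    """Counts for A'A: each user's row is enough on its own (a self-join)."""
    counts = Counter()
    for items in rows.values():
        for i in items:
            for j in items:
                counts[(i, j)] += 1  # includes the diagonal (i == j)
    return counts


def cross_cooccurrence(b_rows, a_rows):
    """Counts for B'A: rows of B and A must be matched by user id first."""
    counts = Counter()
    for user in b_rows.keys() & a_rows.keys():  # the join step
        for i in b_rows[user]:
            for j in a_rows[user]:
                counts[(i, j)] += 1
    return counts


# Hypothetical toy data: user id -> list of item ids.
views = {"u0": [0, 1], "u1": [1, 2], "u2": [0, 2]}
purchases = {"u0": [0], "u1": [1, 2], "u2": [0, 2]}

views_self = self_cooccurrence(views)
cross = cross_cooccurrence(purchases, views)
```

In the self-join case a row can be processed the moment it is read. In the cross case both inputs have to be brought together per user first, which is the extra work the message refers to.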

Re: cross recommender

2013-04-06 Thread Ted Dunning
On Apr 4, 2013, at 5:17 PM, Pat Ferrel wrote: > One issue with the method below is that the two source matrices would not > have values for all users or items (rows or columns). I do know the entire > user and item id space from a previous step so I know the # of rows including > blank ones an

Re: cross recommender

2013-04-06 Thread Ted Dunning
inline On Apr 3, 2013, at 6:15 PM, Pat Ferrel wrote: > The non-symmetry of the [B'A] and the fact that it is calculated from two > models leads me to a rather heavy handed approach at least for a first cut. > > Let me know if this seems right: > > //calculate the 'cross' co-occurrence matrix

Re: cross recommender

2013-04-04 Thread Pat Ferrel
top-k similarities for item j in the j-th row. This means it is not symmetric. I don't think you need to run RowSimilarityJob on B'A, I think you would need an equivalent of RowSimilarityJob to compute B'A. I guess you could extend the MatrixMultiplicationJob to use the similarit

Re: cross recommender

2013-04-03 Thread Pat Ferrel
'A, I think you would need an equivalent of RowSimilarityJob to compute B'A. I guess you could extend the MatrixMultiplicationJob to use the similarity measures from RowSimilarityJob instead of standard dot products. I really like the idea of such a cross recommender. On 03.04.2013 08:33, Te

Re: cross recommender

2013-04-03 Thread Sebastian Schelter
eed to run RowSimilarityJob on B'A, I think you would need an equivalent of RowSimilarityJob to compute B'A. I guess you could extend the MatrixMultiplicationJob to use the similarity measures from RowSimilarityJob instead of standard dot products. I really like the idea of such a cross recommende

Re: cross recommender

2013-04-02 Thread Ted Dunning
ilarity measure. > > > On 02.04.2013 23:43, Pat Ferrel wrote: > > Taking an idea from Ted, I'm working on a cross recommender starting > from mahout's m/r implementation of an item-based recommender. We have > purchases and views for items by user. It is straightforward

Re: cross recommender

2013-04-02 Thread Sebastian Schelter
n use o.a.m.math.hadoop.similarity.cooccurrence.measures.CooccurrenceCountSimilarity as similarity measure. On 02.04.2013 23:43, Pat Ferrel wrote: > Taking an idea from Ted, I'm working on a cross recommender starting from > mahout's m/r implementation of an item-based recommender. We have purch

cross recommender

2013-04-02 Thread Pat Ferrel
Taking an idea from Ted, I'm working on a cross recommender starting from mahout's m/r implementation of an item-based recommender. We have purchases and views for items by user. It is straightforward to create a recommender on purchases but using views as a predictor of purchases doe

Re: [B'A] h_v cross recommender

2013-03-19 Thread Pat Ferrel
013, at 6:47 AM, Pat Ferrel wrote: To pick up an old thread… A = views items x users B = purchases items x users A cross recommender B'A h_v + B'B h_p = r_p The B'B h_p is the basic boolean mahout recommender trained on purchases and we'll use that implementation I assum

[B'A] h_v cross recommender

2013-03-19 Thread Pat Ferrel
To pick up an old thread… A = views items x users B = purchases items x users A cross recommender B'A h_v + B'B h_p = r_p The B'B h_p is the basic boolean mahout recommender trained on purchases and we'll use that implementation I assume. B'A gives cooccurrenc
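The formula B'A h_v + B'B h_p = r_p from the message above can be worked through on toy data. The sketch below is plain Python, not the Mahout implementation. It assumes A and B are stored with one row per user and one column per item, so that B'A and B'B are item-by-item matrices (the "items x users" wording in the snippet leaves the orientation open). The helper names and the sample matrices are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def cross_recommend(A, B, h_v, h_p):
    """Return r_p = B'A h_v + B'B h_p as raw co-occurrence scores."""
    Bt = transpose(B)
    BtA = matmul(Bt, A)  # cross co-occurrence: purchases vs. views
    BtB = matmul(Bt, B)  # ordinary co-occurrence: purchases vs. purchases
    return [x + y for x, y in zip(matvec(BtA, h_v), matvec(BtB, h_p))]


# Hypothetical data: 3 users (rows) x 3 items (columns); every purchase is also a view.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]  # views
B = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]  # purchases

h_v = A[0]  # user 0's view history
h_p = B[0]  # user 0's purchase history

r_p = cross_recommend(A, B, h_v, h_p)
BtA = matmul(transpose(B), A)
```

On this data B'A differs from its own transpose, which is consistent with the remarks elsewhere in the thread that the cross co-occurrence matrix is not symmetric. A real recommender would also replace raw counts with a similarity measure and drop items the user has already purchased.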