Part of the solr-recommender project is a cross-recommender based on Mahout. It
uses the mapreduce version of the RecommenderJob as a template and implements
an XRecommenderJob. Unfortunately the key part of the algorithm—the part
handled by RowSimilarityJob—is done with a simple matrix
Anonymizing the IDs is a good start, especially if you have a relatively
small subset of the entire social graph and if the graph is publicly
visible in any case. If you have a complete crawl of the graph, then many
IDs will be recoverable by reference back to the public version of the graph.
Sinc
dicts
the held out follows with different precision.
Since all three actions (and a few others I can think of) have user IDs of the
primary user in common, they exist in the same user space and I can create a
cross recommender ensemble (see previous cross recommender emails) so R_f +
aR_fb +
h for "ACM hackathon" and you should see
> > it. Feel free to ping me off list with specific questions on that.
> >
> >
> > On Tue, Apr 16, 2013 at 10:29 AM, Ted Dunning
> > wrote:
> >
> > > Primary action can be emitting a search term. S
ith specific questions on that.
>
>
> On Tue, Apr 16, 2013 at 10:29 AM, Ted Dunning
> wrote:
>
> > Primary action can be emitting a search term. Secondary can be click to
> > view.
> >
> >
> > On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel
> wrote:
> >
>
16, 2013 at 10:29 AM, Ted Dunning wrote:
> Primary action can be emitting a search term. Secondary can be click to
> view.
>
>
> On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote:
>
>> For the cross-recommender we need some replacement for a primary
>>
k to
> view.
>
>
> On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote:
>
>> For the cross-recommender we need some replacement for a primary
>> action--purchases and a secondary action--views, clicks, impressions,
>> something.
>>
>> To use this da
6, 2013 at 4:53 PM, Pat Ferrel wrote:
>
> > For the cross-recommender we need some replacement for a primary
> > action--purchases and a secondary action--views, clicks, impressions,
> > something.
> >
> > To use this data we would treat clicks like a purchase--the pr
Primary action can be emitting a search term. Secondary can be click to
view.
On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote:
> For the cross-recommender we need some replacement for a primary
> action--purchases and a secondary action--views, clicks, impressions,
> something.
For the cross-recommender we need some replacement for a primary
action--purchases and a secondary action--views, clicks, impressions, something.
To use this data we would treat clicks like a purchase--the primary action we
want to recommend. Then the search-result-item-impressions is like a
kathon-big/data
On Mon, Apr 15, 2013 at 2:03 PM, Pat Ferrel wrote:
> MAJOR may be too tame a word.
>
> Furthermore there are several enhancements the community could make to
> support retail data and retail recommenders. For one thing without public
> data a *public* cross-recommen
MAJOR may be too tame a word.
Furthermore there are several enhancements the community could make to support
retail data and retail recommenders. For one thing without public data a
*public* cross-recommender will probably not get built.
The cross-recommender needs to separate action types
Definitely of MAJOR interest.
I am sure it would also draw all kinds of desired attention to your
business.
Movie Lens is way too small to be meaningful any more.
Wikipedia articles and Stackoverflow tags are not retail data!
By all means, post some real retail data, if you can.
Meaningful sizes wo
I asked management here a while ago whether there would be a problem with
releasing an anonymized set of data from one of our retail customers, and
didn't get too much push-back. If this is something that would be of
major interest, I can ask again and see whether there's something we can
put out
:
> That looks like the best shortcut. It is one of the few places where the
> rows of one and the columns of the other are seen together. Now I know why
> you transpose the first input :-)
>
> But, I have begun to wonder whether it is the right thing to do for a
> cross recommend
That looks like the best shortcut. It is one of the few places where the rows
of one and the columns of the other are seen together. Now I know why you
transpose the first input :-)
But, I have begun to wonder whether it is the right thing to do for a cross
recommender because you are
> Do I have to create a SimilarityJob( matrixB, matrixA, similarityType
) to get this or have I missed something already in Mahout?
It could be worth investigating whether MatrixMultiplicationJob could
be extended to compute similarities instead of dot products.
Best,
Sebastian
Getting this running with co-occurrence rather than using a similarity calc on
user rows finally forced me to understand what is going on in the base
recommender. And the answer implies further work.
[B'B] is usually not calculated in the usual item based recommender. The matrix
that comes out
I have retail data but can't publish results from it. If I could get a public
sample I'd share how the technique worked out.
Not sure how to simulate this data. It has the important characteristic that
every purchase is also a view but not the other way around and Ted's technique
is a way to sc
Retail data may be hard to impossible, but one can improvise.
It seems to be fairly common to use Wikipedia articles (Myrrix, GraphLab).
Another idea is to use StackOverflow tags (Myrrix examples).
Although they are only good for emulating implicit feedback.
On Wed, Apr 10, 2013 at 6:48 PM, Ted D
On Wed, Apr 10, 2013 at 10:38 AM, Pat Ferrel wrote:
> Does anyone know of a public data set that provides things like views and
> purchases?
>
I don't.
BTW I have this working on trivial data and am in the process of measuring its
results on some real world data. It does a lot with DistributedRowMatrix and so
I'll be interested to see how it performs with a larger data set.
Does anyone know of a public data set that provides things like views
On Sat, Apr 6, 2013 at 3:26 PM, Pat Ferrel wrote:
> I guess I don't understand this issue.
>
> In my case both the item ids and user ids of the separate
> DistributedRowMatrix will match and I know the size for the entire space from a previous
> step where I create id maps. I suppose you are say
I need to do the equivalent of the xrecommender.mostSimilarItems(long[]
itemIDs, int howMany)
To oversimplify this, in the standard Item-Based Recommender this is
equivalent to looking at the item similarities from the preference matrix
(similarity of item purchases by user). In the xrecommen
I guess I don't understand this issue.
In my case both the item ids and user ids of the separate DistributedRowMatrix
will match and I know the size for the entire space from a previous step where
I create id maps. I suppose you are saying that the m/r code would be super
simple if a row of B'
Completely concur with that. MatrixMultiplicationJob is already using a
mapside merge-join AFAIK.
On 05.04.2013 15:04, Ted Dunning wrote:
> This may not quite be true because the RSJ is able to take some liberties.
>
> The origin of these is that A'A can be viewed as a self join. Thus as rows
This may not quite be true because the RSJ is able to take some liberties.
The origin of these is that A'A can be viewed as a self join. Thus as rows of
A are read, the cooccurrences can be emitted as they are read.
For B'A, we have to somehow get corresponding rows of A and B at the same time
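The self-join point above can be illustrated with a small, non-distributed sketch in plain Python. It assumes each row holds one user's boolean item vector; the function names and sample data are made up for illustration, and this is not the code of RowSimilarityJob or MatrixMultiplicationJob.

```python
from collections import defaultdict


def self_cooccurrence(rows_a):
    """Accumulate A'A while streaming over rows of A (one row per user).

    Each row can be handled on its own, which is what makes A'A a self join.
    """
    counts = defaultdict(int)
    for row in rows_a:
        items = [j for j, value in enumerate(row) if value]
        for i in items:
            for j in items:
                counts[(i, j)] += 1
    return dict(counts)


def cross_cooccurrence(rows_b, rows_a):
    """Accumulate B'A; the same user's rows of B and A must arrive together."""
    counts = defaultdict(int)
    for row_b, row_a in zip(rows_b, rows_a):
        b_items = [i for i, value in enumerate(row_b) if value]
        a_items = [j for j, value in enumerate(row_a) if value]
        for i in b_items:
            for j in a_items:
                counts[(i, j)] += 1
    return dict(counts)


# Hypothetical toy data: 3 users x 3 items, every purchase is also a view.
views = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
purchases = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]

ata = self_cooccurrence(views)
bta = cross_cooccurrence(purchases, views)
```

In this toy data A'A comes out symmetric, while B'A does not: the count for (purchased item 1, viewed item 2) differs from the count for (purchased item 2, viewed item 1).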
On Apr 4, 2013, at 5:17 PM, Pat Ferrel wrote:
> One issue with the method below is that the two source matrices would not
> have values for all users or items (rows or columns). I do know the entire
> user and item id space from a previous step so I know the # of rows including
> blank ones an
inline
On Apr 3, 2013, at 6:15 PM, Pat Ferrel wrote:
> The non-symmetry of the [B'A] and the fact that it is calculated from two
> models leads me to a rather heavy handed approach at least for a first cut.
>
> Let me know if this seems right:
>
> //calculate the 'cross' co-occurrence matrix
top-k similarities for
item j in the j-th row. This means it is not symmetric.
I don't think you need to run RowSimilarityJob on B'A, I think you would
need an equivalent of RowSimilarityJob to compute B'A. I guess you could
extend the MatrixMultiplicationJob to use the similarit
B'A, I think you would
need an equivalent of RowSimilarityJob to compute B'A. I guess you could
extend the MatrixMultiplicationJob to use the similarity measures from
RowSimilarityJob instead of standard dot products.
I really like the idea of such a cross recommender.
On 03.04.2013 08:33, Te
eed to run RowSimilarityJob on B'A, I think you would
need an equivalent of RowSimilarityJob to compute B'A. I guess you could
extend the MatrixMultiplicationJob to use the similarity measures from
RowSimilarityJob instead of standard dot products.
I really like the idea of such a cross recommende
ilarity measure.
>
>
> On 02.04.2013 23:43, Pat Ferrel wrote:
> > Taking an idea from Ted, I'm working on a cross recommender starting
> from mahout's m/r implementation of an item-based recommender. We have
> purchases and views for items by user. It is straightforward
n use
o.a.m.math.hadoop.similarity.cooccurrence.measures.CooccurrenceCountSimilarity
as similarity measure.
On 02.04.2013 23:43, Pat Ferrel wrote:
> Taking an idea from Ted, I'm working on a cross recommender starting from
> mahout's m/r implementation of an item-based recommender. We have purch
Taking an idea from Ted, I'm working on a cross recommender starting from
mahout's m/r implementation of an item-based recommender. We have purchases and
views for items by user. It is straightforward to create a recommender on
purchases but using views as a predictor of purchases doe
013, at 6:47 AM, Pat Ferrel wrote:
To pick up an old thread…
A = views items x users
B = purchases items x users
A cross recommender B'A h_v + B'B h_p = r_p
The B'B h_p is the basic boolean mahout recommender trained on purchases and
we'll use that implementation I assum
To pick up an old thread…
A = views items x users
B = purchases items x users
A cross recommender B'A h_v + B'B h_p = r_p
The B'B h_p is the basic boolean mahout recommender trained on purchases and
we'll use that implementation I assume.
B'A gives cooccurrenc
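As a rough illustration of the formula above, here is a tiny in-memory sketch in plain Python. It assumes that rows of A and B index users and columns index items, so that B'A and B'B are item-by-item, and that h_v and h_p are one user's boolean view and purchase histories. The function names and data are hypothetical, and this is not how Mahout's m/r recommender is implemented.

```python
def transpose_times(left, right):
    """Return left' * right for dense row-major matrices with equal row counts."""
    out = [[0] * len(right[0]) for _ in range(len(left[0]))]
    for row_l, row_r in zip(left, right):
        for i, lv in enumerate(row_l):
            if lv:
                for j, rv in enumerate(row_r):
                    out[i][j] += lv * rv
    return out


def mat_vec(matrix, vector):
    """Multiply a dense matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def cross_recommend(views, purchases, h_v, h_p):
    """Compute r_p = B'A h_v + B'B h_p (A = views, B = purchases)."""
    bta = transpose_times(purchases, views)      # cross co-occurrence B'A
    btb = transpose_times(purchases, purchases)  # purchase co-occurrence B'B
    return [x + y for x, y in zip(mat_vec(bta, h_v), mat_vec(btb, h_p))]


# Hypothetical toy data: 3 users x 3 items.
views = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
purchases = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]

# One user's history: viewed items 0 and 2, purchased item 0.
r_p = cross_recommend(views, purchases, h_v=[1, 0, 1], h_p=[1, 0, 0])
```

The resulting r_p holds one raw score per item. A real recommender would typically also filter out items the user already purchased and keep only the top-k scores.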