Hi,

I am currently implementing a system of the same kind: LLR-sparsified
"term"-cooccurrence vectors in Lucene (not a day goes by without me seeing
Ted praise this).
There are not only views and purchases, but also search terms, facets, and a
lot more textual information to be included in the cooccurrence matrix (as
"input").
That's why I went with the feature hashing framework in Mahout. This gives
small (disk/memory) user profiles and allows reusing the vectors for click
prediction and/or clustering. The main difference is that there are only two
fields in Lucene, each with a lot of terms (numbers) representing the
features. Two fields, because I think predicting views (besides purchases)
might in some cases be better than predicting nothing.
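
For concreteness, the hashing step looks roughly like this - a minimal
sketch against the Mahout encoder API, with the encoder names, inputs, and
vector size purely illustrative:

    import org.apache.mahout.math.RandomAccessSparseVector;
    import org.apache.mahout.math.Vector;
    import org.apache.mahout.vectorizer.encoders.StaticWordValueEncoder;

    public class ProfileHashingSketch {
      public static void main(String[] args) {
        // One encoder per input type; the encoder name salts the hash, so
        // different input types tend to land on different indices.
        StaticWordValueEncoder searchTerms = new StaticWordValueEncoder("searchTerm");
        StaticWordValueEncoder facets = new StaticWordValueEncoder("facet");

        // Hash all of a user's textual signals into one small sparse profile.
        Vector profile = new RandomAccessSparseVector(1 << 18); // 2^18 features
        searchTerms.addToVector("red running shoes", profile);
        facets.addToVector("brand=acme", profile);
        System.out.println(profile.getNumNondefaultElements() + " features set");
      }
    }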
I don't think it should make a big difference in scoring, because in the
vector space model used by most engines it's just, well, a vector space. I
don't know whether the field norms still make sense after stripping values
from the term vectors with the LLR threshold, though.
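
If the norms turn out to be meaningless after sparsification, one option is
to simply drop them when defining the fields - a sketch against the Lucene
4.x field API, with the field name and value purely illustrative:

    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.FieldType;
    import org.apache.lucene.document.TextField;

    // Copy the standard indexed-text field type, but drop the length norm,
    // so scoring ignores how many terms survived the LLR threshold.
    FieldType cooccurrenceType = new FieldType(TextField.TYPE_NOT_STORED);
    cooccurrenceType.setOmitNorms(true);
    cooccurrenceType.freeze();

    Field purchaseField =
        new Field("purchaseCooccurrence", "4711 815 42", cooccurrenceType);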

@Ted
> It is handy to simply use the binary values of the sparsified versions of
>these and let the search engine handle the weighting of different
>components at query time.

Do you really want to omit the cooccurrence counts, which would become the
term frequencies? How would the engine then weight the different inputs
against each other?
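
For context, the query-time weighting I picture looks like the following
sketch (Lucene 4.x; field names, item ids, and boosts purely illustrative):

    import java.util.Arrays;
    import java.util.List;

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    // The user's history, split into the view part h_v and purchase part h_p.
    List<String> viewedItems = Arrays.asList("4711", "815");
    List<String> purchasedItems = Arrays.asList("42");

    BooleanQuery query = new BooleanQuery();
    for (String item : viewedItems) {
      TermQuery tq = new TermQuery(new Term("viewCooccurrence", item));
      tq.setBoost(0.3f); // views count for less...
      query.add(tq, BooleanClause.Occur.SHOULD);
    }
    for (String item : purchasedItems) {
      TermQuery tq = new TermQuery(new Term("purchaseCooccurrence", item));
      tq.setBoost(1.0f); // ...than purchases
      query.add(tq, BooleanClause.Occur.SHOULD);
    }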

And, does anyone know
1. a smarter way to index the cooccurrence counts in Lucene than a
TokenStream that emits a term k times for a cooccurrence count of k? (The
naive version I mean is sketched below.)
2. a way to avoid treating the (hashed) vector column indices as terms, and
to reuse them directly instead? It's a bit weird hashing to an int and then
having the Lucene term dictionary treat it as a string, mapping it to yet
another int.
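
Here is the naive version from 1., for reference - a sketch against the
Lucene 4.x TokenStream and Mahout Vector APIs, assuming positive integer
counts; the class name is made up:

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.mahout.math.Vector;

    // Emits the (hashed) column index of every nonzero entry k times, where
    // k is the cooccurrence count, so that tf(term) == count after indexing.
    public class CountRepeatingTokenStream extends TokenStream {
      private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
      private final Iterator<Vector.Element> nonZeros;
      private Vector.Element current;
      private int emitted;

      public CountRepeatingTokenStream(Vector cooccurrenceRow) {
        this.nonZeros = cooccurrenceRow.iterateNonZero();
      }

      @Override
      public boolean incrementToken() throws IOException {
        if (current == null || emitted >= (int) current.get()) {
          if (!nonZeros.hasNext()) {
            return false; // all counts emitted
          }
          current = nonZeros.next();
          emitted = 0;
        }
        clearAttributes();
        termAtt.setEmpty().append(Integer.toString(current.index()));
        emitted++;
        return true;
      }
    }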

I'd be thankful for any input.

On Sun, Feb 10, 2013 at 6:36 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Actually treating the different interactions separately can lead to very
> good recommendations.  The only issue is that the interactions are no
> longer dyadic.
>
> If you think about it, having two different kinds of interactions is like
> adjoining interaction matrices for the two different kinds of interaction.
>  Suppose that you have user x views in matrix A and you have user x
> purchases in matrix B.  The complete interaction matrix of user x (views +
> purchases) is [A | B].
>
> When you compute cooccurrence in this matrix, you get
>
>                          [ A' ]            [ A'A  A'B ]
>       [A | B]' [A | B] = [    ] [A | B]  = [          ]
>                          [ B' ]            [ B'A  B'B ]
>
> This matrix is (view + purchase) x (view + purchase).  But we don't care
> about predicting views, so we only really need a matrix that is
> purchase x (view + purchase).  This is just the bottom part of the matrix
> above, or [ B'A | B'B ].  When you produce purchase recommendations r_p by
> multiplying by a mixed view-and-purchase history vector h, which has a view
> part h_v and a purchase part h_p, you get
>
>       r_p = [ B' A  B' B ] h = B'A h_v + B'B h_p
>
> That is a prediction of purchases based on past views and past purchases.
>
> Note that this general form applies both to decomposition techniques such
> as SVD, ALS, and LLL, and to sparsification techniques such as LLR
> sparsification.  All that changes is the mechanics of how you do the
> multiplications.  Weighting of components works the same as well.
>
> What is very different here is that we have a component of cross
> recommendation.  That is the B'A in the formula above.  This is very
> different from a normal recommendation: it cannot be computed with the
> simple self-join that we normally have in the Mahout cooccurrence
> computation, and it is also very different from the decompositions that we
> normally do.  It isn't hard to adapt the Mahout computations, however.
>
> When implementing a recommendation using a search engine as the base, these
> same techniques also work extremely well in my experience.  What happens is
> that for each item that you would like to recommend, you would have one
> field that has components of B'A and one field that has components of B'B.
>  It is handy to simply use the binary values of the sparsified versions of
> these and let the search engine handle the weighting of different
> components at query time.  Having these components separated into different
> fields in the search index seems to help quite a lot, which makes a fair
> bit of sense.
>
> On Sun, Feb 10, 2013 at 9:55 AM, Sean Owen <sro...@gmail.com> wrote:
> >
> > I think you'd have to hack the code to not exclude previously-seen items,
> > or at least, not of the type you wish to consider. Yes, you would also
> > have to hack it to add rather than replace existing values. Or for test
> > purposes, just do the adding yourself before inputting the data.
> >
> > My hunch is that it will hurt non-trivially to treat different
> > interaction types as different items. You probably want to predict that
> > someone who viewed a product over and over is likely to buy it, but this
> > would only weakly tend to occur if the bought-item is not the same thing
> > as the viewed-item. You'd learn they go together, but not as strongly as
> > ought to be obvious from the fact that they're the same. Still,
> > interesting thought.
> >
> > There ought to be some 'signal' in this data, just a question of how much
> > vs. noise. A purchase means much more than a page view, of course; it's
> > not as subject to noise. Finding a means to use that info is probably
> > going to help.
> >
> > On Sat, Feb 9, 2013 at 7:50 PM, Pat Ferrel <pat.fer...@gmail.com> wrote:
> >
> > > I'd like to experiment with using several types of implicit preference
> > > values with recommenders. I have purchases as an implicit pref of high
> > > strength. I'd like to see if add-to-cart, view-product-details,
> > > impressions-seen, etc. can increase offline precision in holdout tests.
> > > These less-than-obvious implicit prefs will get a much lower value than
> > > purchase, and I'll experiment with different mixes. The problem is that
> > > some of these prefs will indicate that the user, for whom I'm getting
> > > recs, has expressed a preference.
> > >
> > > Using these implicit prefs seems reasonable in finding similarity of
> > > taste between users, but presents several problems. 1) How to encode
> > > the prefs? Each impression-seen will increase the strength of
> > > preference of a user for an item, but the recommender framework
> > > replaces the preference value for items preferred more than once,
> > > doesn't it? 2) AFAIK the current recommender framework will return recs
> > > only for items that the user in question has expressed no preference
> > > for. If I use something like view-product-details or impressions-seen,
> > > I will be removing anything the user has seen from the recs, which is
> > > not what I want in this experiment.
> > >
> > > Has anyone tried something like this? I'm not convinced that these
> > > other implicit preferences will add anything to the recommender, just
> > > trying to find out.
>
