Re: Shopping cart Recommender (was Item-set Recommender)

2014-04-27 Thread Ted Dunning
Yeah... this Amazon hack is just a way to do multi-modal recommendations
without actually having a framework capable of multi-modal operation.  It
is similar to taking a rating as equivalent to a fraction of a purchase.
 Neither has great standing and neither is necessary if you segregate the
action streams and use multi-modal methods.



On Sun, Apr 27, 2014 at 4:58 PM, Pat Ferrel  wrote:

> The literature has examples of building rules from item-sets, which seems
> pretty archaic. Also Amazon did a paper (2003?) on using individual items
> from the user’s cart then getting similar items and summing the weights to
> get ordering. Also seems wrong since the actions/user intent doesn’t really
> match. Notice that the method below does not make use of the userID; it is
> specific to an item-set ID, so the user intent is narrowed. In other words,
> it is not measuring taste (a long-lived user trait).
>
> Was wondering if anyone has used the method below. I don’t have data for
> shopping carts at the present. The last time I did, we used the Amazon
> method but it always seemed wrong. The one good thing about it is you have
> purchase data very early on but may not have enough shopping carts for some
> time and if you don’t have enough traffic you may never get timely enough
> carts to make this work. In other words the catalog may turn over too
> quickly.
>
>
> On Apr 27, 2014, at 3:37 AM, Ted Dunning  wrote:
>
> In general, any action that can be detected in a user history can be an
> item (column) in the user history matrix.  If you find that there are
> item-sets that seem to occur together, then appearance of the entire
> item-set can be a reasonable feature to be assigned a column.  Somewhat
> more plausible is that you start to offer small packages of multiple items
> in a single order and you count browsing, interacting and buying these
> packages as different actions to be recorded.
>
> The general rule of thumb is that anything is a reasonable behavior to
> analyze if it is plausibly evidence about the user's state of mind
> vis-a-vis the potential recommended actions.  Plausible initially means that a
> kinda sorta domain expert suggests the connection.  Plausible later means
> that the feature gets picked up as an indicator for some recommendations.
> If it never gets picked up, then it clearly isn't serving as a competitive
> piece of evidence about user intent.
>
>
>
>
> On Sat, Apr 26, 2014 at 6:46 PM, Pat Ferrel  wrote:
>
> > B = all item-sets gathered from user actions, actions like
> > purchased-together/shopping cart purchases, watchlists etc.
> > i = an item-set vector for a specific user
> >
> > B:
> > itemSetID, items
> > 1, iPad:iPad-case:stylus
> > 2, iPad:battery-booster:iPad-case
> >
> > [B’B]i = r_i, right?
> >
> > [B’B] would be an item-item cooccurrence similarity matrix taken from
> > item-set actions, calculated using LLR. The item-set IDs are not needed
> > anymore.
> >
> > This would imply that we could create an item-set indicator matrix, then
> > use a user’s item-set as the query to get back an ordered list taken from
> > cooccurrences in other item-sets, rather than preference cooccurrences.
> >
> > So instead of summing similar items to each separate item in a shopping
> > cart to get an ordering of items to recommend (the way some people do
> > shopping cart recs) we could use the cooccurrence recommender to get
> > these directly from the item-sets. If the item-set is generated in near
> > realtime we’d need Solr (or some search engine) for the queries.
> >
> > The intuition is that things purchased together at the same time will
> > give you better shopping cart recs than using user preferences generally.
> > The item-sets often have something in common that user history will not
> > lead you to. I suppose you’d have to have a good-sized chunk of item-sets
> > to make it work.
> >
> > Does this make sense?
> >
> >
> >
>
>


Shopping cart Recommender (was Item-set Recommender)

2014-04-27 Thread Pat Ferrel
The literature has examples of building rules from item-sets, which seems 
pretty archaic. Also Amazon did a paper (2003?) on using individual items from 
the user’s cart then getting similar items and summing the weights to get 
ordering. Also seems wrong since the actions/user intent doesn’t really match. 
Notice that the method below does not make use of the userID; it is specific to
an item-set ID, so the user intent is narrowed. In other words, it is not
measuring taste (a long-lived user trait).
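
For reference, the "sum the weights of similar items" approach described above can be sketched roughly as follows. This is only an illustration: the `similar` table, its weights, and the function name are invented here and are not taken from the Amazon paper or from Mahout.

```python
from collections import defaultdict

# Hypothetical precomputed item-item similarities: item -> {similar item: weight}.
similar = {
    "iPad": {"iPad-case": 0.9, "stylus": 0.4, "battery-booster": 0.3},
    "iPad-case": {"iPad": 0.9, "stylus": 0.5},
}

def cart_recs_by_summing(cart, similar, top_n=10):
    """For each cart item, look up its similar items and sum their weights."""
    scores = defaultdict(float)
    for item in cart:
        for other, weight in similar.get(item, {}).items():
            if other not in cart:  # skip items already in the cart
                scores[other] += weight
    # Highest summed weight first; ties broken by name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

recs = cart_recs_by_summing({"iPad", "iPad-case"}, similar)
```

Note that nothing here knows the items were carted together; each item is treated as an independent query, which is the mismatch in intent mentioned above.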

Was wondering if anyone has used the method below. I don’t have data for 
shopping carts at the present. The last time I did, we used the Amazon method 
but it always seemed wrong. The one good thing about it is you have purchase 
data very early on but may not have enough shopping carts for some time and if 
you don’t have enough traffic you may never get timely enough carts to make 
this work. In other words the catalog may turn over too quickly.


On Apr 27, 2014, at 3:37 AM, Ted Dunning  wrote:

In general, any action that can be detected in a user history can be an
item (column) in the user history matrix.  If you find that there are
item-sets that seem to occur together, then appearance of the entire
item-set can be a reasonable feature to be assigned a column.  Somewhat
more plausible is that you start to offer small packages of multiple items
in a single order and you count browsing, interacting and buying these
packages as different actions to be recorded.

The general rule of thumb is that anything is a reasonable behavior to
analyze if it is plausibly evidence about the user's state of mind vis-a-vis
the potential recommended actions.  Plausible initially means that a
kinda sorta domain expert suggests the connection.  Plausible later means
that the feature gets picked up as an indicator for some recommendations.
If it never gets picked up, then it clearly isn't serving as a competitive
piece of evidence about user intent.
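
A small sketch of the "any action can be a column" idea, assuming a made-up event log. The action names, the bundle identifier, and `history_matrix` are hypothetical and only meant to show the shape of the data.

```python
from collections import defaultdict

# Hypothetical event log: (user, action, thing acted on).
events = [
    ("u1", "purchase", "iPad"),
    ("u1", "view", "stylus"),
    ("u1", "purchase-bundle", "iPad+iPad-case"),  # a whole item-set as one feature
    ("u2", "view", "iPad"),
]

def history_matrix(events):
    """Sparse user history: every distinct (action, thing) pair is its own column."""
    rows = defaultdict(set)
    for user, action, thing in events:
        rows[user].add((action, thing))
    return dict(rows)

history = history_matrix(events)
```

Whether a column such as `("purchase-bundle", "iPad+iPad-case")` is worth keeping is then decided by whether it ever shows up as an indicator.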




On Sat, Apr 26, 2014 at 6:46 PM, Pat Ferrel  wrote:

> B = all item-sets gathered from user actions, actions like
> purchased-together/shopping cart purchases, watchlists etc.
> i = an item-set vector for a specific user
> 
> B:
> itemSetID, items
> 1, iPad:iPad-case:stylus
> 2, iPad:battery-booster:iPad-case
> 
> [B’B]i = r_i, right?
> 
> [B’B] would be an item-item cooccurrence similarity matrix taken from
> item-set actions, calculated using LLR. The item-set IDs are not needed
> anymore.
> 
> This would imply that we could create an item-set indicator matrix, then
> use a user’s item-set as the query to get back an ordered list taken from
> cooccurrences in other item-sets, rather than preference cooccurrences.
> 
> So instead of summing similar items to each separate item in a shopping
> cart to get an ordering of items to recommend (the way some people do
> shopping cart recs) we could use the cooccurrence recommender to get these
> directly from the item-sets. If the item-set is generated in near realtime
> we’d need Solr (or some search engine) for the queries.
> 
> The intuition is that things purchased together at the same time will
> give you better shopping cart recs than using user preferences generally.
> The item-sets often have something in common that user history will not
> lead you to. I suppose you’d have to have a good-sized chunk of item-sets
> to make it work.
> 
> Does this make sense?
> 
> 
> 
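
The quoted [B’B]i = r_i idea can be sketched in plain Python as below. This is a toy under stated assumptions, not Mahout code: the first two item-sets come from the example, the last two are invented so the counts are not degenerate, and the LLR is the usual entropy form of the log-likelihood ratio test on a 2x2 table.

```python
from collections import Counter
from itertools import combinations
from math import log

item_sets = [
    {"iPad", "iPad-case", "stylus"},
    {"iPad", "battery-booster", "iPad-case"},
    {"phone", "phone-case"},        # invented for illustration
    {"phone", "battery-booster"},   # invented for illustration
]

def x_log_x(x):
    return x * log(x) if x > 0 else 0.0

def entropy(*counts):
    """Unnormalized entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio score for a 2x2 cooccurrence table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))

def indicator_matrix(item_sets):
    """LLR-weighted stand-in for [B'B]; item-set IDs are no longer needed."""
    n = len(item_sets)
    item_count = Counter(i for s in item_sets for i in s)
    pair_count = Counter(p for s in item_sets for p in combinations(sorted(s), 2))
    scores = {}
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11   # sets with a but not b
        k21 = item_count[b] - k11   # sets with b but not a
        k22 = n - k11 - k12 - k21   # sets with neither
        score = llr(k11, k12, k21, k22)
        scores.setdefault(a, {})[b] = score
        scores.setdefault(b, {})[a] = score
    return scores

def recommend(query_items, scores, top_n=5):
    """r = [B'B] i, with the current item-set i as the query."""
    r = Counter()
    for item in query_items:
        for other, s in scores.get(item, {}).items():
            if other not in query_items:
                r[other] += s
    return sorted(r.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

scores = indicator_matrix(item_sets)
recs = recommend({"iPad"}, scores)
```

In practice the scored rows would usually be thresholded and indexed in a search engine rather than held in a dict.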



Re: Shopping cart

2013-02-15 Thread Ted Dunning
Ahh... I understand now.

To rephrase: as I understand it, you basically want to use the cart as the
user and recommend based on a history composed of items already in the
cart.

That should work just fine.

I would recommend combining that with user level recommendations by using
search abuse.  This would give you two levels of recommendations in one
step.  These levels would be cart level recommendations and (presumably
weaker) user level recommendations.
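
One way to picture the two-level query, reading "search abuse" as running recommendation queries against a search index. Everything below is a toy: the in-memory `index`, the field names `cart_indicators` and `user_indicators`, and the boost values are assumptions made for illustration, not an actual Solr or Lucene schema.

```python
# Hypothetical index: one document per item, with two indicator fields.
index = {
    "stylus": {"cart_indicators": {"iPad", "iPad-case"},
               "user_indicators": {"sketchbook"}},
    "battery-booster": {"cart_indicators": {"iPad"},
                        "user_indicators": {"phone"}},
    "phone-case": {"cart_indicators": {"phone"},
                   "user_indicators": {"phone"}},
}

def search(cart_items, user_history, index, cart_boost=2.0, user_boost=1.0):
    """Score items by boosted term overlap in both fields, as one combined query."""
    hits = []
    for item, fields in index.items():
        if item in cart_items:
            continue
        score = (cart_boost * len(fields["cart_indicators"] & cart_items)
                 + user_boost * len(fields["user_indicators"] & user_history))
        if score > 0:
            hits.append((item, score))
    return sorted(hits, key=lambda kv: (-kv[1], kv[0]))

hits = search({"iPad", "iPad-case"}, {"phone"}, index)
```

The cart-level field gets the larger boost here to reflect the expectation that user-level evidence is the weaker of the two.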

Similarly, you could cross-recommend from item meta-data to item purchases
or to purchased item meta-data.  This last would be tricky to do in a
single search engine request.

On Thu, Feb 14, 2013 at 8:41 PM, Pat Ferrel  wrote:

> Sure, we have cart/session IDs, items IDs, and user IDs when purchases are
> made or when asked for a recommendation from the cart page.
>
> We currently don't get the add-to or remove-from cart actions. We could
> get them.
>
> Are you thinking that we can use the add-to-cart user x item matrix and
> purchase user x item matrix to get purchase recs from add-to-cart actions?
> Interesting idea. This could be combined with the purchase recs from
> show-details to get even better recs given an item context.
>
> In this case I was looking for a way to use the purchases cart x item
> matrix to get recommendations by finding the cart in the training matrix
> most similar to the cart contents at runtime. In which case we have to use
> the current cart contents as the query to find the most similar cart in the
> matrix, then return recs for that cart. At least that was what I was
> thinking.
>
> On Feb 14, 2013, at 6:09 PM, Ted Dunning  wrote:
>
> Do you see the contents of the cart?
>
> Is the cart ID opaque?  Does it persist as a surrogate for a user?
>
> On Thu, Feb 14, 2013 at 10:30 AM, Pat Ferrel  wrote:
>
> > I thought you might say that but we don't have the add-to-cart action. We
> > have to calculate cart purchases by matching cart IDs or session IDs. So
> > we only have cart purchases with items.
> >
> > If we had the add-to-cart and the purchase we could use your cross-action
> > method for getting recs by training only on those two actions.
> >
> > Still without the add-to-cart the method below should work, right? The
> > main problem being finding a similar cart in the training set quickly.
> > Are there other problems?
> >
> > On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
> >
> > I think that this is an excellent use case for cross recommendation from
> > cart contents (items) to cart purchases (items).  The cross aspect is
> > that the recommendation is from two different kinds of actions, not two
> > kinds of things.  The first action is insertion into a cart and the second
> > is purchase of an item.
> >
> > On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
> >
> >> There are several methods for recommending things given a shopping cart's
> >> contents. At the risk of using the same tool for every problem I was
> >> thinking about a recommender's use here.
> >>
> >> I'd do something like train on shopping cart purchases so row = cartID,
> >> column = itemID.
> >> Given cart contents I could find the most similar cart in the training set
> >> by using a similarity measure then get recs for this closest matched cart.
> >>
> >> The search for similar carts may be slow if I have to check for pairwise
> >> similarity so I could cluster and find the best cluster then search it for
> >> the best cart. I could create a decision tree on all trained carts and walk
> >> as far as I can down the tree to find the cart with the most cooccurrences.
> >> There may be other cooccurrence based methods in Mahout??? With the id of
> >> the cart I can then get recs from the training set. I could also fold in
> >> the new cart contents to the training set and ask for recs based on it
> >> (this seems like it would take a long time to compute). This last would
> >> also pollute the trained matrix with partial carts over time.
> >>
> >> This seems like another place where Lucene might help but are there other
> >> Mahout methods to look at before diving into Lucene?
> >
> >
>
>


Re: Shopping cart

2013-02-14 Thread Pat Ferrel
Sure, we have cart/session IDs, items IDs, and user IDs when purchases are made 
or when asked for a recommendation from the cart page.

We currently don't get the add-to or remove-from cart actions. We could get 
them.

Are you thinking that we can use the add-to-cart user x item matrix and 
purchase user x item matrix to get purchase recs from add-to-cart actions? 
Interesting idea. This could be combined with the purchase recs from 
show-details to get even better recs given an item context.
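
A rough sketch of the cross-action idea, assuming add-to-cart events were available. The two per-user histories and the function names are made up, and raw cooccurrence counts stand in for the LLR-filtered indicators one would normally use.

```python
from collections import Counter

# Hypothetical per-user histories for the two action types.
add_to_cart = {"u1": {"iPad", "stylus"}, "u2": {"iPad", "iPad-case"}, "u3": {"phone"}}
purchase = {"u1": {"iPad-case"}, "u2": {"iPad-case"}, "u3": {"phone-case"}}

def cross_cooccurrence(secondary, primary):
    """Count users who did the secondary action on s and the primary action on p."""
    counts = {}
    for user, s_items in secondary.items():
        for s in s_items:
            for p in primary.get(user, ()):
                counts.setdefault(s, Counter())[p] += 1
    return counts

def purchase_recs(cart_adds, cross):
    """Rank purchasable items given only the current add-to-cart items."""
    r = Counter()
    for s in cart_adds:
        r.update(cross.get(s, {}))
    return r.most_common()

cross = cross_cooccurrence(add_to_cart, purchase)
recs = purchase_recs({"iPad"}, cross)
```

The query is made of add-to-cart items while the results are purchase items, which is what makes it a cross recommendation.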

In this case I was looking for a way to use the purchases cart x item matrix 
to get recommendations by finding the cart in the training matrix most similar 
to the cart contents at runtime. In which case we have to use the current cart 
contents as the query to find the most similar cart in the matrix, then return 
recs for that cart. At least that was what I was thinking.
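
The "most similar cart" lookup described above can be sketched as a brute-force scan. This assumes Jaccard similarity over item sets, which is my choice for the example and not something the thread specifies; the cart data is invented.

```python
# Hypothetical training carts: cartID -> purchased items.
training_carts = {
    1: {"iPad", "iPad-case", "stylus"},
    2: {"iPad", "battery-booster", "iPad-case"},
    3: {"phone", "phone-case"},
}

def jaccard(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def recs_from_most_similar_cart(current, training_carts):
    """Find the closest training cart, then recommend what it has that we lack."""
    # Scanning IDs in sorted order makes ties resolve to the lowest cart ID.
    best_id = max(sorted(training_carts),
                  key=lambda cid: jaccard(current, training_carts[cid]))
    return best_id, sorted(training_carts[best_id] - current)

best_id, recs = recs_from_most_similar_cart({"iPad", "stylus"}, training_carts)
```

This version recommends the leftover items of the matched cart directly; the approach in the message would instead ask the trained recommender for recs for `best_id`.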

On Feb 14, 2013, at 6:09 PM, Ted Dunning  wrote:

Do you see the contents of the cart?

Is the cart ID opaque?  Does it persist as a surrogate for a user?

On Thu, Feb 14, 2013 at 10:30 AM, Pat Ferrel  wrote:

> I thought you might say that but we don't have the add-to-cart action. We
> have to calculate cart purchases by matching cart IDs or session IDs. So we
> only have cart purchases with items.
> 
> If we had the add-to-cart and the purchase we could use your cross-action
> method for getting recs by training only on those two actions.
> 
> Still without the add-to-cart the method below should work, right? The
> main problem being finding a similar cart in the training set quickly. Are
> there other problems?
> 
> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
> 
> I think that this is an excellent use case for cross recommendation from
> cart contents (items) to cart purchases (items).  The cross aspect is that
> the recommendation is from two different kinds of actions, not two kinds of
> things.  The first action is insertion into a cart and the second is
> purchase of an item.
> 
> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
> 
>> There are several methods for recommending things given a shopping cart's
>> contents. At the risk of using the same tool for every problem I was
>> thinking about a recommender's use here.
>>
>> I'd do something like train on shopping cart purchases so row = cartID,
>> column = itemID.
>> Given cart contents I could find the most similar cart in the training set
>> by using a similarity measure then get recs for this closest matched cart.
>>
>> The search for similar carts may be slow if I have to check for pairwise
>> similarity so I could cluster and find the best cluster then search it for
>> the best cart. I could create a decision tree on all trained carts and walk
>> as far as I can down the tree to find the cart with the most cooccurrences.
>> There may be other cooccurrence based methods in Mahout??? With the id of
>> the cart I can then get recs from the training set. I could also fold in
>> the new cart contents to the training set and ask for recs based on it
>> (this seems like it would take a long time to compute). This last would
>> also pollute the trained matrix with partial carts over time.
>>
>> This seems like another place where Lucene might help but are there other
>> Mahout methods to look at before diving into Lucene?
> 
> 



Re: Shopping cart

2013-02-14 Thread Ted Dunning
Do you see the contents of the cart?

Is the cart ID opaque?  Does it persist as a surrogate for a user?

On Thu, Feb 14, 2013 at 10:30 AM, Pat Ferrel  wrote:

> I thought you might say that but we don't have the add-to-cart action. We
> have to calculate cart purchases by matching cart IDs or session IDs. So we
> only have cart purchases with items.
>
> If we had the add-to-cart and the purchase we could use your cross-action
> method for getting recs by training only on those two actions.
>
> Still without the add-to-cart the method below should work, right? The
> main problem being finding a similar cart in the training set quickly. Are
> there other problems?
>
> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
>
> I think that this is an excellent use case for cross recommendation from
> cart contents (items) to cart purchases (items).  The cross aspect is that
> the recommendation is from two different kinds of actions, not two kinds of
> things.  The first action is insertion into a cart and the second is
> purchase of an item.
>
> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
>
> > There are several methods for recommending things given a shopping cart's
> > contents. At the risk of using the same tool for every problem I was
> > thinking about a recommender's use here.
> >
> > I'd do something like train on shopping cart purchases so row = cartID,
> > column = itemID.
> > Given cart contents I could find the most similar cart in the training set
> > by using a similarity measure then get recs for this closest matched cart.
> >
> > The search for similar carts may be slow if I have to check for pairwise
> > similarity so I could cluster and find the best cluster then search it for
> > the best cart. I could create a decision tree on all trained carts and walk
> > as far as I can down the tree to find the cart with the most cooccurrences.
> > There may be other cooccurrence based methods in Mahout??? With the id of
> > the cart I can then get recs from the training set. I could also fold in
> > the new cart contents to the training set and ask for recs based on it
> > (this seems like it would take a long time to compute). This last would
> > also pollute the trained matrix with partial carts over time.
> >
> > This seems like another place where Lucene might help but are there other
> > Mahout methods to look at before diving into Lucene?
>
>


Re: Shopping cart

2013-02-14 Thread Sean Owen
Yes your only issue there, which I think you had touched on, was that you
have to put your current cart (which hasn't been purchased) into the model
in order to get an answer out of a recommender. I think we've talked about
the recommend-to-anonymous function in the context of another system, which
is exactly what you need here.

Yes, all you have to do then is reproduce the recommender computation. But
I understand that you were hoping to avoid rewriting it. It's really just a
loop though, so not much work to reproduce.
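
To make "it's really just a loop" concrete, here is a minimal sketch of an item-based recommendation for an arbitrary collection of items. It is not Mahout's implementation: the cart data is invented and Jaccard overlap of the carts containing each item is an assumed similarity.

```python
from collections import defaultdict

# Hypothetical purchased carts: cartID -> items.
purchased_carts = {
    "c1": {"iPad", "iPad-case", "stylus"},
    "c2": {"iPad", "battery-booster", "iPad-case"},
    "c3": {"phone", "phone-case"},
}

def carts_by_item(carts):
    """Invert cart -> items into item -> set of cart IDs."""
    index = defaultdict(set)
    for cart_id, items in carts.items():
        for item in items:
            index[item].add(cart_id)
    return index

def jaccard(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def recommend_for_items(cart, item_to_carts, top_n=5):
    """Score every candidate by its summed similarity to the items in the cart."""
    scores = {}
    for candidate, cand_carts in item_to_carts.items():
        if candidate in cart:
            continue
        total = sum(jaccard(cand_carts, item_to_carts.get(i, set())) for i in cart)
        if total > 0:
            scores[candidate] = total
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

item_to_carts = carts_by_item(purchased_carts)
recs = recommend_for_items({"iPad"}, item_to_carts)
```

The current, unpurchased cart never has to be added to the model; it is only used as the query.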

100K items x a few items in a cart is a few hundred thousand similarities.
This isn't trivial but not going to take seconds, I think. Yes this gets
much faster if you can precompute item-item similarity. Computing NxN pairs
is going to take a long time though when N=100,000. So yes something like
clustering is the nice way to scale that. Then your clusters greatly limit
the number of candidates to consider because you can round every other
inter-cluster similarity to 0.
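
The arithmetic behind those estimates, spelled out. The cart size of 3 and the cluster layout (1,000 equal clusters of 100 items) are assumptions picked only to make the numbers concrete.

```python
n_items = 100_000
cart_size = 3  # "a few items"; the exact value is assumed

# Similarities needed to answer one cart query without precomputation.
per_request = n_items * cart_size

# Distinct item pairs in a full item-item precomputation.
unordered_pairs = n_items * (n_items - 1) // 2

# Hypothetical clustering with inter-cluster similarities rounded to 0:
# only pairs inside the same cluster need to be computed.
n_clusters, cluster_size = 1_000, 100
clustered_pairs = n_clusters * cluster_size * (cluster_size - 1) // 2
```

Under these assumptions the clustered precomputation touches about a thousand times fewer pairs than the full one.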

By this point... I imagine it's about as hard to whip up a frequent itemset
implementation, or crib one and adapt it. This is in Mahout. That's
probably the right tool for the job.



On Thu, Feb 14, 2013 at 8:19 PM, Pat Ferrel  wrote:

> I'm creating a matrix of cart IDs and item IDs, so cart x items in cart.
> The 'preference' then is (cartID, itemID). This will create the correct
> matrix, I think.
>
> For any cart id I would get a ranked list of recommended items that was
> calculated from other carts. This seems like what is needed in a SC
> recommender. So doing this should give a "recommend to this collection of
> items", right?
>
> The only issue is finding the best cart to get the recs. I would be doing
> a pair-wise similarity comparison for N carts to the current cart contents
> and the result would have to come back in a very short amount of time, on
> the order of the time to get recs for 3M users and 100K items.
>
> Not sure what N is yet but the # of items is the same as in the purchase
> matrix. So finding the best cart to get recs for will be N similarity
> comparisons--worst case. Each cart is likely to have only a few items in it
> and I imagine this speeds the similarity calc.
>
> I guess I'll try it as described and optimize for speed if the precision
> is good compared to the apriori algo.
>
> On Feb 14, 2013, at 10:57 AM, Sean Owen  wrote:
>
> I don't think it's necessarily slow; this is how item-based recommenders
> work. The only thing stopping you from using Mahout directly is that I
> don't think there's an easy way to say "recommend to this collection of
> items". But that's what is happening inside when you recommend for a user.
>
> You can just roll your own version of it. Yes you are computing similarity
> for k carted items by all N items, but is N so large? Hundreds of
> thousands of products? This is still likely pretty fast even if the
> similarity is over millions of carts. Some smart precomputation and caching
> goes a long way too.
>
>
> On Thu, Feb 14, 2013 at 7:10 PM, Pat Ferrel  wrote:
>
> > Yes, one time-tested way to do this is the "apriori" algo which looks at
> > frequent item sets and creates rules.
> >
> > I was looking for a shortcut using a recommender, which would be super
> > easy to try. The rule builder is a little harder to implement but we can
> > also test precision on that and compare the two.
> >
> > The recommender method below should be reasonable AFAICT except for the
> > method(s) of retrieving recs, which seem likely to be slow.
> >
> > On Feb 14, 2013, at 9:45 AM, Sean Owen  wrote:
> >
> > This sounds like a job for frequent item set mining, which is kind of a
> > special case of the ideas you've mentioned here. Given N items in a cart,
> > which next item most frequently occurs in a purchased cart?
> >
> >
> > On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel  wrote:
> >
> >> I thought you might say that but we don't have the add-to-cart action.
> >> We have to calculate cart purchases by matching cart IDs or session IDs.
> >> So we only have cart purchases with items.
> >>
> >> If we had the add-to-cart and the purchase we could use your
> >> cross-action method for getting recs by training only on those two
> >> actions.
> >>
> >> Still without the add-to-cart the method below should work, right? The
> >> main problem being finding a similar cart in the training set quickly.
> >> Are there other problems?
> >>
> >> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
> >>
> >> 

Re: Shopping cart

2013-02-14 Thread Pat Ferrel
I'm creating a matrix of cart IDs and item IDs, so cart x items in cart. The
'preference' then is (cartID, itemID). This will create the correct matrix, I
think.

For any cart id I would get a ranked list of recommended items that was 
calculated from other carts. This seems like what is needed in a SC 
recommender. So doing this should give a "recommend to this collection of 
items", right?

The only issue is finding the best cart to get the recs. I would be doing a 
pair-wise similarity comparison for N carts to the current cart contents and 
the result would have to come back in a very short amount of time, on the order 
of the time to get recs for 3M users and 100K items.

Not sure what N is yet but the # of items is the same as in the purchase 
matrix. So finding the best cart to get recs for will be N similarity 
comparisons--worst case. Each cart is likely to have only a few items in it and 
I imagine this speeds the similarity calc.
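
Because each cart holds only a few items, one way to avoid all N comparisons is an inverted index from item to carts, so that only carts sharing at least one item with the current cart are ever scored. A minimal sketch with invented data and hypothetical function names:

```python
from collections import defaultdict

# Hypothetical training carts: cartID -> items.
training_carts = {
    1: {"iPad", "iPad-case", "stylus"},
    2: {"iPad", "battery-booster", "iPad-case"},
    3: {"phone", "phone-case"},
}

def build_inverted_index(carts):
    """Map each item to the set of cart IDs that contain it."""
    index = defaultdict(set)
    for cart_id, items in carts.items():
        for item in items:
            index[item].add(cart_id)
    return index

def candidate_carts(current, index):
    """Carts sharing at least one item with the current cart."""
    candidates = set()
    for item in current:
        candidates |= index.get(item, set())
    return candidates

index = build_inverted_index(training_carts)
candidates = candidate_carts({"iPad", "stylus"}, index)
```

This is essentially what a search engine does internally, which is one reason Lucene or Solr keeps coming up for this kind of query.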

I guess I'll try it as described and optimize for speed if the precision is 
good compared to the apriori algo.

On Feb 14, 2013, at 10:57 AM, Sean Owen  wrote:

I don't think it's necessarily slow; this is how item-based recommenders
work. The only thing stopping you from using Mahout directly is that I
don't think there's an easy way to say "recommend to this collection of
items". But that's what is happening inside when you recommend for a user.

You can just roll your own version of it. Yes you are computing similarity
for k carted items by all N items, but is N so large? Hundreds of
thousands of products? This is still likely pretty fast even if the
similarity is over millions of carts. Some smart precomputation and caching
goes a long way too.


On Thu, Feb 14, 2013 at 7:10 PM, Pat Ferrel  wrote:

> Yes, one time-tested way to do this is the "apriori" algo which looks at
> frequent item sets and creates rules.
> 
> I was looking for a shortcut using a recommender, which would be super
> easy to try. The rule builder is a little harder to implement but we can
> also test precision on that and compare the two.
> 
> The recommender method below should be reasonable AFAICT except for the
> method(s) of retrieving recs, which seem likely to be slow.
> 
> On Feb 14, 2013, at 9:45 AM, Sean Owen  wrote:
> 
> This sounds like a job for frequent item set mining, which is kind of a
> special case of the ideas you've mentioned here. Given N items in a cart,
> which next item most frequently occurs in a purchased cart?
> 
> 
> On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel  wrote:
> 
>> I thought you might say that but we don't have the add-to-cart action. We
>> have to calculate cart purchases by matching cart IDs or session IDs. So
>> we only have cart purchases with items.
>>
>> If we had the add-to-cart and the purchase we could use your cross-action
>> method for getting recs by training only on those two actions.
>>
>> Still without the add-to-cart the method below should work, right? The
>> main problem being finding a similar cart in the training set quickly.
>> Are there other problems?
>>
>> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
>>
>> I think that this is an excellent use case for cross recommendation from
>> cart contents (items) to cart purchases (items).  The cross aspect is
>> that the recommendation is from two different kinds of actions, not two
>> kinds of things.  The first action is insertion into a cart and the second
>> is purchase of an item.
>>
>> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
>> 
>>> There are several methods for recommending things given a shopping cart's
>>> contents. At the risk of using the same tool for every problem I was
>>> thinking about a recommender's use here.
>>>
>>> I'd do something like train on shopping cart purchases so row = cartID,
>>> column = itemID.
>>> Given cart contents I could find the most similar cart in the training set
>>> by using a similarity measure then get recs for this closest matched cart.
>>>
>>> The search for similar carts may be slow if I have to check for pairwise
>>> similarity so I could cluster and find the best cluster then search it for
>>> the best cart. I could create a decision tree on all trained carts and walk
>>> as far as I can down the tree to find the cart with the most cooccurrences.
>>> There may be other cooccurrence based methods in Mahout??? With the id of
>>> the cart I can then get recs from the training set. I could also fold in
>>> the new cart contents to the training set and ask for recs based on it
>>> (this seems like it would take a long time to compute). This last would
>>> also pollute the trained matrix with partial carts over time.
>>>
>>> This seems like another place where Lucene might help but are there other
>>> Mahout methods to look at before diving into Lucene?
>> 
>> 
> 
> 



Re: Shopping cart

2013-02-14 Thread Sean Owen
I don't think it's necessarily slow; this is how item-based recommenders
work. The only thing stopping you from using Mahout directly is that I
don't think there's an easy way to say "recommend to this collection of
items". But that's what is happening inside when you recommend for a user.

You can just roll your own version of it. Yes you are computing similarity
for k carted items by all N items, but is N so large? Hundreds of
thousands of products? This is still likely pretty fast even if the
similarity is over millions of carts. Some smart precomputation and caching
goes a long way too.


On Thu, Feb 14, 2013 at 7:10 PM, Pat Ferrel  wrote:

> Yes, one time-tested way to do this is the "apriori" algo which looks at
> frequent item sets and creates rules.
>
> I was looking for a shortcut using a recommender, which would be super
> easy to try. The rule builder is a little harder to implement but we can
> also test precision on that and compare the two.
>
> The recommender method below should be reasonable AFAICT except for the
> method(s) of retrieving recs, which seem likely to be slow.
>
> On Feb 14, 2013, at 9:45 AM, Sean Owen  wrote:
>
> This sounds like a job for frequent item set mining, which is kind of a
> special case of the ideas you've mentioned here. Given N items in a cart,
> which next item most frequently occurs in a purchased cart?
>
>
> On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel  wrote:
>
> > I thought you might say that but we don't have the add-to-cart action. We
> > have to calculate cart purchases by matching cart IDs or session IDs. So
> > we only have cart purchases with items.
> >
> > If we had the add-to-cart and the purchase we could use your cross-action
> > method for getting recs by training only on those two actions.
> >
> > Still without the add-to-cart the method below should work, right? The
> > main problem being finding a similar cart in the training set quickly.
> > Are there other problems?
> >
> > On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
> >
> > I think that this is an excellent use case for cross recommendation from
> > cart contents (items) to cart purchases (items).  The cross aspect is
> > that the recommendation is from two different kinds of actions, not two
> > kinds of things.  The first action is insertion into a cart and the second
> > is purchase of an item.
> >
> > On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
> >
> >> There are several methods for recommending things given a shopping cart's
> >> contents. At the risk of using the same tool for every problem I was
> >> thinking about a recommender's use here.
> >>
> >> I'd do something like train on shopping cart purchases so row = cartID,
> >> column = itemID.
> >> Given cart contents I could find the most similar cart in the training set
> >> by using a similarity measure then get recs for this closest matched cart.
> >>
> >> The search for similar carts may be slow if I have to check for pairwise
> >> similarity so I could cluster and find the best cluster then search it for
> >> the best cart. I could create a decision tree on all trained carts and walk
> >> as far as I can down the tree to find the cart with the most cooccurrences.
> >> There may be other cooccurrence based methods in Mahout??? With the id of
> >> the cart I can then get recs from the training set. I could also fold in
> >> the new cart contents to the training set and ask for recs based on it
> >> (this seems like it would take a long time to compute). This last would
> >> also pollute the trained matrix with partial carts over time.
> >>
> >> This seems like another place where Lucene might help but are there other
> >> Mahout methods to look at before diving into Lucene?
> >
> >
>
>


Re: Shopping cart

2013-02-14 Thread Pat Ferrel
Yes, one time-tested way to do this is the "apriori" algo which looks at 
frequent item sets and creates rules. 

I was looking for a shortcut using a recommender, which would be super easy to 
try. The rule builder is a little harder to implement but we can also test 
precision on that and compare the two.

The recommender method below should be reasonable AFAICT except for the 
method(s) of retrieving recs, which seem likely to be slow.

On Feb 14, 2013, at 9:45 AM, Sean Owen  wrote:

This sounds like a job for frequent item set mining, which is kind of a
special case of the ideas you've mentioned here. Given N items in a cart,
which next item most frequently occurs in a purchased cart?
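
The question posed above ("given N items in a cart, which next item most frequently occurs in a purchased cart?") can be answered by brute force on small data, as in the sketch below. This is only a counting illustration with invented carts; it is not Apriori or FP-growth, which exist to make the same counts feasible at scale.

```python
from collections import Counter

# Hypothetical purchased carts.
purchased_carts = [
    {"iPad", "iPad-case", "stylus"},
    {"iPad", "battery-booster", "iPad-case"},
    {"iPad", "iPad-case"},
]

def next_item_counts(cart, purchased_carts):
    """Count the extra items in purchased carts that contain the whole current cart."""
    cart = set(cart)
    counts = Counter()
    for purchased in purchased_carts:
        if cart <= purchased:              # purchased cart is a superset
            counts.update(purchased - cart)
    return counts.most_common()

counts = next_item_counts({"iPad"}, purchased_carts)
```

Requiring a full superset match is strict; with larger carts few purchased carts will qualify, which is where the recommender approach may have an edge.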


On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel  wrote:

> I thought you might say that but we don't have the add-to-cart action. We
> have to calculate cart purchases by matching cart IDs or session IDs. So we
> only have cart purchases with items.
> 
> If we had the add-to-cart and the purchase we could use your cross-action
> method for getting recs by training only on those two actions.
> 
> Still without the add-to-cart the method below should work, right? The
> main problem being finding a similar cart in the training set quickly. Are
> there other problems?
> 
> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
> 
> I think that this is an excellent use case for cross recommendation from
> cart contents (items) to cart purchases (items).  The cross aspect is that
> the recommendation is from two different kinds of actions, not two kinds of
> things.  The first action is insertion into a cart and the second is
> purchase of an item.
> 
> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
> 
>> There are several methods for recommending things given a shopping cart
>> contents. At the risk of using the same tool for every problem I was
>> thinking about a recommender's use here.
>> 
>> I'd do something like train on shopping cart purchases so row = cartID,
>> column = itemID.
>> Given cart contents I could find the most similar cart in the training set
>> by using a similarity measure then get recs for this closest matched cart.
>>
>> The search for similar carts may be slow if I have to check for pairwise
>> similarity so I could cluster and find the best cluster then search it for
>> the best cart. I could create a decision tree on all trained carts and walk
>> as far as I can down the tree to find the cart with the most cooccurrences.
>> There may be other cooccurrence based methods in mahout??? With the id of
>> the cart I can then get recs from the training set. I could also fold-in
>> the new cart contents to the training set and ask for recs based on it
>> (this seems like it would take a long time to compute). This last would
>> also pollute the trained matrix with partial carts over time.
>>
>> This seems like another place where Lucene might help but are there other
>> mahout methods to look at before I dive into Lucene?
> 
> 



Re: Shopping cart

2013-02-14 Thread Sean Owen
This sounds like a job for frequent item set mining, which is kind of a
special case of the ideas you've mentioned here. Given N items in a cart,
which next item most frequently occurs in a purchased cart?
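
A minimal sketch of that question, assuming the purchased carts fit in memory
as lists of item IDs. The function name next_item_counts is made up for the
example, and a real frequent item set miner would prune far more aggressively
than this linear scan.

```python
from collections import Counter


def next_item_counts(purchased_carts, current_items):
    """Count the other items in purchased carts that contain all current_items."""
    current = set(current_items)
    counts = Counter()
    for cart in purchased_carts:
        cart = set(cart)
        if current <= cart:
            # Only the items not already in the current cart are candidates.
            counts.update(cart - current)
    return counts
```

Calling most_common() on the result gives the candidate next items ordered by
how often they appeared alongside the current cart contents.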


On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel  wrote:

> I thought you might say that but we don't have the add-to-cart action. We
> have to calculate cart purchases by matching cart IDs or session IDs. So we
> only have cart purchases with items.
>
> If we had the add-to-cart and the purchase we could use your cross-action
> method for getting recs by training only on those two actions.
>
> Still without the add-to-cart the method below should work, right? The
> main problem being finding a similar cart in the training set quickly. Are
> there other problems?
>
> On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:
>
> I think that this is an excellent use case for cross recommendation from
> cart contents (items) to cart purchases (items).  The cross aspect is that
> the recommendation is from two different kinds of actions, not two kinds of
> things.  The first action is insertion into a cart and the second is
> purchase of an item.
>
> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:
>
> > There are several methods for recommending things given a shopping cart
> > contents. At the risk of using the same tool for every problem I was
> > thinking about a recommender's use here.
> >
> > I'd do something like train on shopping cart purchases so row = cartID,
> > column = itemID.
> > Given cart contents I could find the most similar cart in the training set
> > by using a similarity measure then get recs for this closest matched cart.
> >
> > The search for similar carts may be slow if I have to check for pairwise
> > similarity so I could cluster and find the best cluster then search it for
> > the best cart. I could create a decision tree on all trained carts and walk
> > as far as I can down the tree to find the cart with the most cooccurrences.
> > There may be other cooccurrence based methods in mahout??? With the id of
> > the cart I can then get recs from the training set. I could also fold-in
> > the new cart contents to the training set and ask for recs based on it
> > (this seems like it would take a long time to compute). This last would
> > also pollute the trained matrix with partial carts over time.
> >
> > This seems like another place where Lucene might help but are there other
> > mahout methods to look at before I dive into Lucene?
>
>


Re: Shopping cart

2013-02-14 Thread Pat Ferrel
I thought you might say that but we don't have the add-to-cart action. We have 
to calculate cart purchases by matching cart IDs or session IDs. So we only 
have cart purchases with items.

If we had the add-to-cart and the purchase we could use your cross-action 
method for getting recs by training only on those two actions.

Still, without the add-to-cart action the method below should work, right? The
main problem is finding a similar cart in the training set quickly. Are there
other problems?

On Feb 14, 2013, at 9:19 AM, Ted Dunning  wrote:

I think that this is an excellent use case for cross recommendation from
cart contents (items) to cart purchases (items).  The cross aspect is that
the recommendation is from two different kinds of actions, not two kinds of
things.  The first action is insertion into a cart and the second is
purchase of an item.

On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:

> There are several methods for recommending things given a shopping cart
> contents. At the risk of using the same tool for every problem I was
> thinking about a recommender's use here.
> 
> I'd do something like train on shopping cart purchases so row = cartID,
> column = itemID.
> Given cart contents I could find the most similar cart in the training set
> by using a similarity measure then get recs for this closest matched cart.
> 
> The search for similar carts may be slow if I have to check for pairwise
> similarity so I could cluster and find the best cluster then search it for
> the best cart. I could create a decision tree on all trained carts and walk
> as far as I can down the tree to find the cart with the most cooccurrences.
> There may be other cooccurrence based methods in mahout??? With the id of
> the cart I can then get recs from the training set. I could also fold-in
> the new cart contents to the training set and ask for recs based on it
> (this seems like it would take a long time to compute). This last would
> also pollute the trained matrix with partial carts over time.
> 
> This seems like another place where Lucene might help but are there other
> mahout methods to look at before I dive into Lucene?



Re: Shopping cart

2013-02-14 Thread Ted Dunning
I think that this is an excellent use case for cross recommendation from
cart contents (items) to cart purchases (items).  The cross aspect is that
the recommendation is from two different kinds of actions, not two kinds of
things.  The first action is insertion into a cart and the second is
purchase of an item.
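
A minimal sketch of the cross idea, assuming each cart yields two item lists,
one per action (added to cart, purchased). The names cross_cooccurrence and
recommend are invented for the example, and raw counts are used purely to
keep it short; a real system would likely apply some significance test
rather than trusting raw counts.

```python
from collections import Counter, defaultdict


def cross_cooccurrence(sessions):
    """sessions: iterable of (added_items, purchased_items), one pair per cart.

    Returns cross[a][p] = number of carts where item a was added to the cart
    and item p was purchased.
    """
    cross = defaultdict(Counter)
    for added, purchased in sessions:
        for a in set(added):
            for p in set(purchased):
                cross[a][p] += 1
    return cross


def recommend(cross, cart_items, n=3):
    """Score purchasable items by summing cross counts over the cart contents."""
    cart = set(cart_items)
    scores = Counter()
    for a in cart:
        for p, count in cross.get(a, {}).items():
            if p not in cart:  # do not recommend what is already in the cart
                scores[p] += count
    return [item for item, _ in scores.most_common(n)]
```

The rows are indexed by one action and the columns by the other, which is
what makes this a cross recommendation rather than plain item similarity.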

On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel  wrote:

> There are several methods for recommending things given a shopping cart
> contents. At the risk of using the same tool for every problem I was
> thinking about a recommender's use here.
>
> I'd do something like train on shopping cart purchases so row = cartID,
> column = itemID.
> Given cart contents I could find the most similar cart in the training set
> by using a similarity measure then get recs for this closest matched cart.
>
> The search for similar carts may be slow if I have to check for pairwise
> similarity so I could cluster and find the best cluster then search it for
> the best cart. I could create a decision tree on all trained carts and walk
> as far as I can down the tree to find the cart with the most cooccurrences.
> There may be other cooccurrence based methods in mahout??? With the id of
> the cart I can then get recs from the training set. I could also fold-in
> the new cart contents to the training set and ask for recs based on it
> (this seems like it would take a long time to compute). This last would
> also pollute the trained matrix with partial carts over time.
>
> This seems like another place where Lucene might help but are there other
> mahout methods to look at before I dive into Lucene?


Shopping cart

2013-02-14 Thread Pat Ferrel
There are several methods for recommending things given a shopping cart's
contents. At the risk of using the same tool for every problem, I was thinking
about using a recommender here.

I'd do something like train on shopping cart purchases so row = cartID, column 
= itemID.
Given cart contents I could find the most similar cart in the training set by 
using a similarity measure then get recs for this closest matched cart.
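
A minimal sketch of the lookup step, assuming the training set is a dict of
cartID to a set of itemIDs and using Jaccard similarity as one possible
measure. Both choices, and the function names, are assumptions for the
example rather than anything Mahout prescribes.

```python
def jaccard(a, b):
    """Jaccard similarity: shared items divided by all distinct items."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_cart(training_carts, cart_items):
    """Return (cartID, similarity) of the training cart most similar to cart_items.

    training_carts: dict mapping cartID -> set of itemIDs (must be non-empty).
    """
    best = max(training_carts,
               key=lambda cid: jaccard(training_carts[cid], cart_items))
    return best, jaccard(training_carts[best], cart_items)
```

This compares the new cart against every training cart, so it is exactly the
brute-force search whose cost is the concern here.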

The search for similar carts may be slow if I have to check pairwise
similarity, so I could cluster the carts, find the best cluster, then search it
for the best cart. Alternatively, I could create a decision tree on all trained
carts and walk as far as I can down the tree to find the cart with the most
cooccurrences. There may be other cooccurrence-based methods in Mahout? With
the ID of the cart I can then get recs from the training set. I could also fold
the new cart contents into the training set and ask for recs based on it (this
seems like it would take a long time to compute). This last option would also
pollute the trained matrix with partial carts over time.

This seems like another place where Lucene might help, but are there other
Mahout methods to look at before I dive into Lucene?
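
To make the Lucene idea concrete: a search engine of that kind is built around
an inverted index, which avoids comparing the new cart against every training
cart. The following is a minimal in-memory sketch of that idea only. It does
not use Lucene's API, and the names build_index and candidate_carts are
invented for the example.

```python
from collections import Counter, defaultdict


def build_index(training_carts):
    """Map each itemID to the set of cartIDs that contain it."""
    index = defaultdict(set)
    for cart_id, items in training_carts.items():
        for item in items:
            index[item].add(cart_id)
    return index


def candidate_carts(index, cart_items):
    """Count shared items per training cart.

    Only carts sharing at least one item with cart_items are ever touched.
    """
    overlap = Counter()
    for item in set(cart_items):
        for cart_id in index.get(item, ()):
            overlap[cart_id] += 1
    return overlap
```

The carts with the highest overlap can then be re-scored with whatever
similarity measure is preferred, so the expensive comparison runs on a short
candidate list instead of the whole training set.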