I'm creating a matrix of cart ids by item ids, so each row is a cart and each 
column is an item in that cart. The 'preference' is then just the pair 
(cartID, itemID). I think this will create the correct matrix.
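
A minimal sketch of that matrix, assuming the purchases arrive as (cartID, itemID) pairs. The function name and the sample data are made up for illustration, and each row is kept as a set of item ids since the preference is boolean:

```python
from collections import defaultdict

def build_cart_item_matrix(pairs):
    """Group (cartID, itemID) pairs into a sparse boolean matrix.

    Each row is a cart, stored as the set of item ids it contains.
    """
    carts = defaultdict(set)
    for cart_id, item_id in pairs:
        carts[cart_id].add(item_id)
    return dict(carts)

# Hypothetical purchase log: one (cartID, itemID) pair per purchased item.
purchases = [
    ("c1", "milk"), ("c1", "bread"),
    ("c2", "milk"), ("c2", "bread"), ("c2", "butter"),
    ("c3", "bread"), ("c3", "butter"),
]
cart_rows = build_cart_item_matrix(purchases)
```

Storing only the items that are present keeps the rows tiny, which matters when most carts hold a handful of the 100K items.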

For any cart id I would get a ranked list of recommended items calculated 
from other carts. This seems like what is needed in a shopping cart (SC) 
recommender. So doing this should give a "recommend to this collection of 
items", right?

The only issue is finding the best cart to get the recs from. I would be doing 
a pairwise similarity comparison of the current cart contents against N carts, 
and the result would have to come back very quickly, on the order of the time 
it takes to get recs with 3M users and 100K items.

Not sure what N is yet but the # of items is the same as in the purchase 
matrix. So finding the best cart to get recs for will be N similarity 
comparisons--worst case. Each cart is likely to have only a few items in it and 
I imagine this speeds the similarity calc.
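
A rough sketch of that lookup, under assumptions: Jaccard overlap stands in for whatever similarity measure ends up being used, and an inverted index (item id -> cart ids) restricts the comparison to carts that share at least one item with the current cart. All names and the sample carts are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(carts):
    """Map each item id to the set of cart ids that contain it."""
    index = defaultdict(set)
    for cart_id, items in carts.items():
        for item in items:
            index[item].add(cart_id)
    return index

def most_similar_cart(current_items, carts, index):
    """Return (cart_id, jaccard) for the closest training cart, or (None, 0.0)."""
    current = set(current_items)
    # Only carts sharing at least one item can have non-zero similarity.
    candidates = set()
    for item in current:
        candidates |= index.get(item, set())
    best_id, best_score = None, 0.0
    for cart_id in sorted(candidates):  # sorted so ties break deterministically
        other = carts[cart_id]
        score = len(current & other) / len(current | other)
        if score > best_score:
            best_id, best_score = cart_id, score
    return best_id, best_score

# Hypothetical training carts.
carts = {
    "c1": {"milk", "bread"},
    "c2": {"milk", "bread", "butter"},
    "c3": {"eggs", "flour"},
}
index = build_inverted_index(carts)
best = most_similar_cart({"milk", "bread"}, carts, index)
```

With small carts the candidate set is usually far smaller than N, so the worst case of N comparisons should be rare unless the current cart contains very popular items.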

I guess I'll try it as described and optimize for speed if the precision is 
good compared to the apriori algo.

On Feb 14, 2013, at 10:57 AM, Sean Owen <sro...@gmail.com> wrote:

I don't think it's necessarily slow; this is how item-based recommenders
work. The only thing stopping you from using Mahout directly is that I
don't think there's an easy way to say "recommend to this collection of
items". But that's what is happening inside when you recommend for a user.

You can just roll your own version of it. Yes, you are computing similarity
for k carted items by all N items, but is N so large? Hundreds of
thousands of products? This is still likely pretty fast even if the
similarity is over millions of carts. Some smart precomputation and caching
goes a long way too.
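
A hand-rolled version along these lines might look like the sketch below. It is only an illustration and not Mahout's API: raw co-occurrence counts stand in for a real item-item similarity measure, the counts are precomputed once from purchased carts, and every name and data value is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(carts):
    """Precompute, for every item, how often each other item shares a cart with it."""
    cooc = defaultdict(Counter)
    for items in carts:
        for a, b in combinations(sorted(items), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc

def recommend_for_items(cart_items, cooc, top_n=3):
    """Score candidates by summed co-occurrence with the items already carted."""
    in_cart = set(cart_items)
    scores = Counter()
    for item in in_cart:
        for other, count in cooc.get(item, {}).items():
            if other not in in_cart:
                scores[other] += count
    # Sort by score descending, then item id, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical purchased carts.
purchased_carts = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter", "jam"},
    {"milk", "eggs"},
]
cooc = build_cooccurrence(purchased_carts)
recs = recommend_for_items({"milk", "bread"}, cooc)
```

At query time the work is proportional to the k carted items and their neighbours rather than to the number of carts, which is where the precomputation pays off.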


On Thu, Feb 14, 2013 at 7:10 PM, Pat Ferrel <pat.fer...@gmail.com> wrote:

> Yes, one time tested way to do this is the "apriori" algo which looks at
> frequent item sets and creates rules.
> 
> I was looking for a shortcut using a recommender, which would be super
> easy to try. The rule builder is a little harder to implement but we can
> also test precision on that and compare the two.
> 
> The recommender method below should be reasonable AFAICT except for the
> method(s) of retrieving recs, which seem likely to be slow.
> 
> On Feb 14, 2013, at 9:45 AM, Sean Owen <sro...@gmail.com> wrote:
> 
> This sounds like a job for frequent item set mining, which is kind of a
> special case of the ideas you've mentioned here. Given N items in a cart,
> which next item most frequently occurs in a purchased cart?
> 
> 
> On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel <pat.fer...@gmail.com> wrote:
> 
>> I thought you might say that but we don't have the add-to-cart action. We
>> have to calculate cart purchases by matching cart IDs or session IDs. So we
>> only have cart purchases with items.
>> 
>> If we had the add-to-cart and the purchase we could use your cross-action
>> method for getting recs by training only on those two actions.
>> 
>> Still without the add-to-cart the method below should work, right? The
>> main problem being finding a similar cart in the training set quickly. Are
>> there other problems?
>> 
>> On Feb 14, 2013, at 9:19 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> 
>> I think that this is an excellent use case for cross recommendation from
>> cart contents (items) to cart purchases (items).  The cross aspect is that
>> the recommendation is from two different kinds of actions, not two kinds of
>> things.  The first action is insertion into a cart and the second is
>> purchase of an item.
>> 
>> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel <pat.fer...@gmail.com> wrote:
>> 
>>> There are several methods for recommending things given a shopping cart
>>> contents. At the risk of using the same tool for every problem I was
>>> thinking about a recommender's use here.
>>> 
>>> I'd do something like train on shopping cart purchases so row = cartID,
>>> column = itemID.
>>> Given cart contents I could find the most similar cart in the training set
>>> by using a similarity measure then get recs for this closest matched cart.
>>> 
>>> The search for similar carts may be slow if I have to check for pairwise
>>> similarity so I could cluster and find the best cluster then search it for
>>> the best cart. I could create a decision tree on all trained carts and walk
>>> as far as I can down the tree to find the cart with the most cooccurrences.
>>> There may be other cooccurrence based methods in mahout??? With the id of
>>> the cart I can then get recs from the training set. I could also fold-in
>>> the new cart contents to the training set and ask for recs based on it
>>> (this seems like it would take a long time to compute). This last would
>>> also pollute the trained matrix with partial carts over time.
>>> 
>>> This seems like another place where Lucene might help but are there other
>>> mahout methods to look at before I dive into Lucene?
>> 
>> 
> 
> 
