Is Mahout the right tool to recommend cross sales?

2013-04-11 Thread Billy
I am very new to Mahout and have just read up to chapter 5 of 'MIA',
but after reading about the various user-centric and item-centric
recommenders, they all still seem to need a userId, so I am still unsure
whether Mahout can help with a fairly common recommendation.

My requirement is to produce 'n' item recommendations based on a chosen
item.

E.g. if I've added item #1 to my order, then based on all the other
items in all the other orders for this site, what are the likely items
that I may also want to add to my order, based on the item-to-item
relationships in the order history of this site?

Most probably by ranking on how often the item I have chosen appears
together with each of the other items across all the other orders.

My data is not 'user' specific, and I don't think it should be; it is more
order-specific, as it's the pattern of items in each order that should
determine the recommendation.

I have no preference values so merely boolean preferences will be used.

If Mahout can perform these calculations then how must I present the data?

Will I need to shape the data in some way to feed it into Mahout? (I am
currently versed in using Hadoop via AWS EMR with Java.)

Thanks for the advice in advance,

Billy


Re: Is Mahout the right tool to recommend cross sales?

2013-04-11 Thread Billy
In the example data 'intro.csv' in MIA there are users 1-5, so if I ask
for recommendations for user 1 this works, but if I ask for
recommendations for user 6 (a new user yet to be added to the data model)
I get no recommendations. So if I substitute orders for users, I will
again get no recommendations, which I sort of understand.

Do I need to inject my 'new' active order, along with its attached
item(s), into the data model first and then ask for recommendations for
the order by offering up the new orderId? Or is there a way of offering
up just an 'item' and getting recommendations based purely on that item,
using the data already stored and its relationships with my item?

My assumptions:
#1
I am assuming the data model is a static island of data that has been
processed (flattened) overnight (most probably by a Hadoop process) due to
the size of this data, rather than a living document that is updated as
soon as new data is available.
#2
I'm also assuming that instead of reading in the data model and
providing recommendations 'on the fly', I will have to run through every
item in my catalogue and find the top 5 recommended items that are ordered
with each item (most probably via a Hadoop process), and then store this
output in DynamoDB or Lucene for quick access.
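
The batch step described in assumption #2 can be sketched in plain Java as below. This is only an illustration under stated assumptions: it counts raw co-occurrences in memory rather than using Mahout or Hadoop, and the class name `CoOccurrenceTopN`, its method names and the toy orders are all hypothetical. The resulting per-item lists are what would then be written to a key-value store or index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: precomputes "ordered together" lists for every item.
public class CoOccurrenceTopN {

    /** Counts, for every ordered pair of distinct items, how many orders contain both. */
    public static Map<Long, Map<Long, Integer>> countPairs(List<List<Long>> orders) {
        Map<Long, Map<Long, Integer>> counts = new TreeMap<>();
        for (List<Long> order : orders) {
            // De-duplicate so an item listed twice in one order counts once.
            Set<Long> items = new LinkedHashSet<>(order);
            for (Long a : items) {
                for (Long b : items) {
                    if (!a.equals(b)) {
                        counts.computeIfAbsent(a, k -> new HashMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    /** Returns, for each item, up to n items most often ordered with it, best first. */
    public static Map<Long, List<Long>> topN(Map<Long, Map<Long, Integer>> counts, int n) {
        Map<Long, List<Long>> result = new TreeMap<>();
        for (Map.Entry<Long, Map<Long, Integer>> e : counts.entrySet()) {
            List<Map.Entry<Long, Integer>> ranked = new ArrayList<>(e.getValue().entrySet());
            // Highest count first; ties broken by lower item ID for stable output.
            ranked.sort((x, y) -> {
                int byCount = Integer.compare(y.getValue(), x.getValue());
                return byCount != 0 ? byCount : Long.compare(x.getKey(), y.getKey());
            });
            List<Long> top = new ArrayList<>();
            for (int i = 0; i < Math.min(n, ranked.size()); i++) {
                top.add(ranked.get(i).getKey());
            }
            result.put(e.getKey(), top);
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy data: item 1 appears with item 9 in three orders and with item 4 in two.
        List<List<Long>> orders = Arrays.asList(
            Arrays.asList(1L, 9L),
            Arrays.asList(1L, 9L, 4L),
            Arrays.asList(1L, 4L),
            Arrays.asList(1L, 9L));
        System.out.println(topN(countPairs(orders), 5).get(1L)); // prints [9, 4]
    }
}
```

Raw counts favour items that are popular everywhere, so a real job would probably swap in a normalised similarity measure; the lookup-table shape of the output stays the same.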

Sorry for all the questions, but it is such an interesting subject.


On 11 April 2013 22:04, Ted Dunning ted.dunn...@gmail.com wrote:

 Actually, making this user based is a really good thing because you get
 recommendations from one session to the next.  These may be much more
 valuable for cross-sell than things in the same order.


 On Thu, Apr 11, 2013 at 12:50 PM, Sean Owen sro...@gmail.com wrote:

 You can try treating your orders as the 'users'. Then just compute
 item-item similarities per usual.
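
A minimal sketch of what treating orders as the 'users' could look like when shaping the input. It assumes a comma-separated file in the style of intro.csv, with the order ID in the user column and the preference value left out because the data is boolean; whether a given Mahout version accepts lines without a preference value should be checked against its documentation. The class name `OrdersAsUsers`, its method and the order IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flattens orders into "orderId,itemId" lines.
public class OrdersAsUsers {

    /** Emits one line per distinct (order, item) pair, with the order ID in the user column. */
    public static List<String> toPreferenceLines(Map<Long, List<Long>> orders) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> order : orders.entrySet()) {
            // A repeated item within one order is still a single boolean preference.
            for (Long itemId : new LinkedHashSet<>(order.getValue())) {
                lines.add(order.getKey() + "," + itemId);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> orders = new LinkedHashMap<>();
        orders.put(1001L, Arrays.asList(1L, 9L));
        orders.put(1002L, Arrays.asList(1L, 4L, 9L, 4L));
        for (String line : toPreferenceLines(orders)) {
            System.out.println(line);
        }
    }
}
```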

 On Thu, Apr 11, 2013 at 7:59 PM, Billy b...@ntlworld.com wrote:
  Thanks for replying,
 
 
  I don't have users, well I do :-) but in this case they should not
  influence the recommendations; these need to be based on the
  relationship between items ordered with other items in the 'same order'.
 
  E.g. if item #1 has been ordered with item #4 22 times and item #1 has
  been ordered with item #9 57 times, then if I added item #1 to my order
  these would both be recommended, but item #9 would be recommended above
  item #4 purely because the relationship between item #1 and item #9 is
  stronger than the relationship with item #4.
 
  What I don't want is this: if a user ordered items #A, #B and #C
  separately 'at some point in their order history', then recommend #A and
  #C to other users who order #B. I still don't want this if the items are
  similar and/or the users are similar.
 
  Cheers
 
  Billy
 
 
 
  On 11 Apr 2013 18:28, Sean Owen sro...@gmail.com wrote:
 
  This sounds like just a most-similar-items problem. That's good news
  because that's simpler. The only question is how you want to compute
  item-item similarities. That could be based on user-item interactions.
  If you're on Hadoop, try the RowSimilarityJob (where you will need
  rows to be items, columns the users).
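
To illustrate what a row similarity means when rows are items and columns are users (or orders), here is a small plain-Java sketch for boolean data. It is not RowSimilarityJob's implementation; the Tanimoto (Jaccard) coefficient is just one possible measure chosen for the example, and the class name `ItemRowSimilarity` and the order IDs are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: each item "row" is the set of order IDs that contain the item.
public class ItemRowSimilarity {

    /** Tanimoto coefficient: orders containing both items divided by orders containing either. */
    public static double tanimoto(Set<Long> ordersWithA, Set<Long> ordersWithB) {
        if (ordersWithA.isEmpty() && ordersWithB.isEmpty()) {
            return 0.0; // no evidence either way
        }
        Set<Long> both = new HashSet<>(ordersWithA);
        both.retainAll(ordersWithB);
        int union = ordersWithA.size() + ordersWithB.size() - both.size();
        return (double) both.size() / union;
    }

    public static void main(String[] args) {
        Set<Long> item1 = new HashSet<>(Arrays.asList(100L, 101L, 102L, 103L));
        Set<Long> item9 = new HashSet<>(Arrays.asList(100L, 101L, 103L));
        Set<Long> item4 = new HashSet<>(Arrays.asList(101L, 102L));
        System.out.println(tanimoto(item1, item9)); // prints 0.75
        System.out.println(tanimoto(item1, item4)); // prints 0.5
    }
}
```

Unlike a raw co-occurrence count, this kind of measure discounts items that simply appear in a large share of all orders.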
 





Problems reading solr index

2011-12-19 Thread Billy Newport
I have Mahout 0.5 and Solr 3.5.

I have term vectors turned on for the text field and I'm trying to
read those indexes into Mahout. I have updated all the Lucene jars for
Mahout from the original ones to 3.5. Mahout is still complaining
about format -3 when reading the indexes, however. It's as if it has
picked up an older version of the Lucene libraries from somewhere.




Re: Problems reading solr index

2011-12-19 Thread Billy Newport
Is there any place to download the current snapshot? I'm behind a
firewall here, so I can't get SVN access.


On Dec 19, 2011, at 10:01 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 Try with the trunk version.




Re: Problems reading solr index

2011-12-19 Thread Billy Newport
That worked, thanks


On Dec 19, 2011, at 10:01 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 Try with the trunk version.
