Re: Using Mahout for low-volume data

2013-07-15 Thread Ted Dunning
With such small data, this sounds (without thinking too much) like you are doing reasonably well with LLR similarity. Have you tried a factorizing recommender? On Sun, Jul 14, 2013 at 10:49 PM, Jayesh jayesh.sidhw...@gmail.com wrote: Hi Ted, Thanks for the reply. My training data could

Re: Using Mahout for low-volume data

2013-07-15 Thread Koobas
Is a factorizing recommender a better idea for low volume data in general? On Mon, Jul 15, 2013 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: With such small data, this sounds (without thinking too much) like you are doing reasonably well with LLR similarity. Have you tried a

Re: Using Mahout for low-volume data

2013-07-15 Thread Ted Dunning
I think so, but I cannot say that I know so. On Mon, Jul 15, 2013 at 8:37 AM, Koobas koo...@gmail.com wrote: Is a factorizing recommender a better idea for low volume data in general? On Mon, Jul 15, 2013 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: With such small data, this

Re: Using Mahout for low-volume data

2013-07-15 Thread Jayesh Sidhwani
Okay. I'll try that and get back with the results. Thank you. On Monday, July 15, 2013, Ted Dunning wrote: I think so, but I cannot say that I know so. On Mon, Jul 15, 2013 at 8:37 AM, Koobas koo...@gmail.com wrote: Is a factorizing recommender a better idea for low volume

Using Mahout for low-volume data

2013-07-14 Thread Jayesh
Hello, I am exploring the collaborative filtering algorithms in Mahout to build a recommendation engine. I recently attended a Big Data conference where the speakers suggested that using Mahout is overkill for anything that doesn't have some terabytes of training data. I tried to google

Re: Using Mahout for low-volume data

2013-07-14 Thread Ted Dunning
Mahout will work fine for smaller data sizes. Collaborative filtering can be difficult in general with small data, however. How many users and how many items? How many actions? On Sun, Jul 14, 2013 at 10:22 PM, Jayesh jayesh.sidhw...@gmail.com wrote: Hello, I am exploring the

Re: Using Mahout for low-volume data

2013-07-14 Thread Jayesh
Hi Ted, Thanks for the reply. My training data could have around 100k users and around 1k items. The data is sparse (I have a boolean affinity: the user either bought the item or did not). PS: I have been playing around with some sample code, using log-likelihood similarity to get 24% precision,
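
For readers unfamiliar with the LLR similarity discussed in this thread, here is a minimal sketch of the log-likelihood ratio statistic applied to boolean purchase data of the kind Jayesh describes. It is written in plain Python rather than against Mahout's API, and the function names (`x_log_x`, `entropy`, `llr`, `item_llr`) are hypothetical names chosen for illustration. It is not the code used by anyone in the thread.

```python
import math

def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: users who bought both items
    k12: users who bought item A but not item B
    k21: users who bought item B but not item A
    k22: users who bought neither item
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to guard against small negative rounding errors.
    return max(0.0, 2.0 * (row + col - mat))

def item_llr(buyers_a, buyers_b, n_users):
    """LLR between two items, given the sets of user ids who bought each.

    n_users is the total number of users (an assumed input, needed to
    count the users who bought neither item).
    """
    both = len(buyers_a & buyers_b)
    only_a = len(buyers_a) - both
    only_b = len(buyers_b) - both
    neither = n_users - both - only_a - only_b
    return llr(both, only_a, only_b, neither)
```

A larger score means stronger evidence that two items are bought together more often than chance would predict; a score near zero means the co-purchases look like what independence would give. With only boolean "bought / did not buy" data, these four counts are all the statistic needs.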