I seem to recall past discussions on where one hits the bottleneck w/ user-based
recommendation approaches in Mahout, but I can't seem to locate them
anymore. Anyone know offhand? Where do user-based approaches hit their
limits, more or less?
Thanks,
Grant
Limits in terms of scalability? If you mean how much you can fit on
one machine without Hadoop, I usually say 100M data points or so.
Beyond that you can go as big as you like, but on Hadoop.
On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll gsing...@apache.org wrote:
Sorry, I should have been more clear. I was referring to whether one is using a
user-based recommender (e.g. GenericUserBasedRecommender) vs. an item-based
recommender. Our general recommendation is that user-based approaches won't
scale; I was wondering what the general cutoff is on a single machine.
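For reference, the kind of user-based recommender being discussed is typically wired up like this with Mahout's Taste API. This is a minimal sketch: the data file name, neighborhood size, and user ID are placeholders, and it assumes a `userID,itemID,preference` CSV on disk.

```java
import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserBasedExample {
  public static void main(String[] args) throws Exception {
    // "prefs.csv" is a placeholder: lines of userID,itemID,preference
    DataModel model = new FileDataModel(new File("prefs.csv"));

    // User-user similarity is computed against other users at request
    // time; this is the part that stops scaling as the user count grows.
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    UserNeighborhood neighborhood =
        new NearestNUserNeighborhood(25, similarity, model);

    Recommender recommender =
        new GenericUserBasedRecommender(model, neighborhood, similarity);

    // Top 5 recommendations for user 1 (placeholder ID)
    List<RecommendedItem> items = recommender.recommend(1L, 5);
    for (RecommendedItem item : items) {
      System.out.println(item.getItemID() + " : " + item.getValue());
    }
  }
}
```

The cost that limits this on one machine is in the neighborhood computation: finding similar users means touching many users' preference vectors per request.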
Yes, I would still say so. You could still easily find this too slow
if you're using user-user similarities and there are a lot of users
and few items behind these 100M data points. Or vice versa. Past this
point it's almost certainly too slow; before this point it could also
be slow.
Item-based recommendations can also use more expensive offline computations,
which can make recommendations more accurate. SVD-based methods in
particular can be very useful, especially with smaller data sets.
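The SVD-style offline factorization mentioned above is also available in the Taste API via SVDRecommender. A sketch, assuming the same placeholder CSV; the factorizer parameters (feature count, regularization lambda, iteration count) are illustrative values, not recommendations:

```java
import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;

public class SVDExample {
  public static void main(String[] args) throws Exception {
    // "prefs.csv" is a placeholder: lines of userID,itemID,preference
    DataModel model = new FileDataModel(new File("prefs.csv"));

    // Factor the preference matrix offline into 10 latent features,
    // lambda = 0.05, 20 ALS iterations -- illustrative values only.
    // The expensive work happens here, up front, not per request.
    SVDRecommender recommender = new SVDRecommender(
        model, new ALSWRFactorizer(model, 10, 0.05, 20));

    // Recommending afterwards is cheap: a dot product per candidate item.
    List<RecommendedItem> items = recommender.recommend(1L, 5);
    for (RecommendedItem item : items) {
      System.out.println(item.getItemID() + " : " + item.getValue());
    }
  }
}
```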
On Wed, Oct 26, 2011 at 6:52 AM, Sean Owen sro...@gmail.com wrote: