On Sat, Dec 12, 2009 at 11:08 PM, Jake Mannix <[email protected]> wrote:
> You're not computing only one recommendation at a time, are you?
> I really need to read through the hadoop.item code, but in general, what
> is the procedure here?  If you're doing work on HDFS as a M/R job, you're
> doing a huge batch, right?  You're saying the aggregate performance is
> 10 seconds per recommendation across millions of recommendations, or
> doing a one-shot task?

Recommendations are computed for one user at a time, by multiplying
the co-occurrence matrix by that user's preference vector. And yes,
it's then one big job that invokes this computation for all users.
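
For concreteness, here's a minimal plain-Java sketch of that per-user
step (not the actual hadoop.item code; the class name, demo data, and
item indices are all illustrative): score each item the user hasn't
rated by the dot product of its co-occurrence row with the user's
preference vector.

import java.util.HashMap;
import java.util.Map;

public class CooccurrenceSketch {

  static Map<Integer, Double> recommend(int[][] cooccurrence, double[] prefs) {
    Map<Integer, Double> scores = new HashMap<>();
    for (int item = 0; item < cooccurrence.length; item++) {
      if (prefs[item] != 0.0) {
        continue; // skip items the user already has a preference for
      }
      double score = 0.0;
      for (int other = 0; other < prefs.length; other++) {
        // dot product of this item's co-occurrence row with the
        // user's preference vector
        score += cooccurrence[item][other] * prefs[other];
      }
      if (score > 0.0) {
        scores.put(item, score);
      }
    }
    return scores;
  }

  public static void main(String[] args) {
    int[][] cooccurrence = {   // 3 items; symmetric co-occurrence counts
        {2, 1, 0},
        {1, 3, 1},
        {0, 1, 2}
    };
    double[] prefs = {4.0, 0.0, 5.0}; // user has rated items 0 and 2
    System.out.println(recommend(cooccurrence, prefs)); // prints {1=9.0}
  }
}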

I'm running this all on one machine (my laptop), so it's effectively
serialized anyway. Yes, it was 10 seconds to compute all recs for one
user; it's down to a couple of seconds now after some more work.
That's still rough but not awful.


> offline).  Can you give a quick review of which part of this is supposed
> to be on Hadoop, which parts are done live, a kind of big picture
> description of what's going on?

All of it is on Hadoop here. It's pretty simple -- make the user
vectors, make the co-occurrence matrix (all that is quite fast), then
multiply the two to make recommendations.
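
In plain-Java form (the real jobs run as M/R over HDFS; the class
name and sample data below are mine, not Mahout's), the first two
steps amount to something like this:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PipelineSketch {

  public static void main(String[] args) {
    // Step 1: user vectors -- userID -> set of itemIDs the user prefers
    int[][] prefs = {{1, 10}, {1, 20}, {2, 10}, {2, 30}, {3, 10}, {3, 20}};
    Map<Integer, Set<Integer>> userVectors = new HashMap<>();
    for (int[] p : prefs) {
      userVectors.computeIfAbsent(p[0], k -> new HashSet<>()).add(p[1]);
    }

    // Step 2: co-occurrence matrix -- each pair of items appearing in
    // the same user vector co-occurs once
    Map<Long, Integer> cooccurrence = new HashMap<>();
    for (Set<Integer> items : userVectors.values()) {
      for (int a : items) {
        for (int b : items) {
          if (a != b) {
            long pair = ((long) a << 32) | (b & 0xFFFFFFFFL); // pack pair
            cooccurrence.merge(pair, 1, Integer::sum);
          }
        }
      }
    }

    // Step 3 multiplies this matrix by each user vector (see the
    // earlier sketch) to produce that user's recommendation scores.
    System.out.println(cooccurrence.size() + " co-occurring item pairs");
  }
}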
