Hi, two solutions have been suggested so far, taking advantage of either (a) vector x matrix (your CF / Mahout example) or (b) small matrix x large matrix (an earlier suggestion of putting the small matrix into the Distributed Cache). I am not yet clear on good approaches for (c) large matrix x large matrix.
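For what it's worth, here is a rough sketch of how I understand case (b). It is plain Python that simulates the map, shuffle and reduce steps locally. It is not Hadoop API code, and the names `mapper`, `reducer` and `run_job` are made up for illustration. It assumes the large matrix B is stored one row per input record, and that the small matrix A fits in memory on every mapper (which is what the Distributed Cache would give you). Case (a) is the same thing with B having a single column.

```python
from collections import defaultdict

# Small matrix A (2 x 3). Assumption: in a real job each mapper would load
# this from the Distributed Cache; here it is just a module-level list.
A = [[1, 2, 3],
     [4, 5, 6]]

def mapper(j, b_row, small=A):
    """Input record: row j of the large matrix B.
    Emits (i, A[i][j] * B[j, :]) for every row i of the small matrix."""
    for i, a_row in enumerate(small):
        yield i, [a_row[j] * b for b in b_row]

def reducer(i, partial_rows):
    """Sums the partial rows that share key i, giving row i of C = A x B."""
    return i, [sum(col) for col in zip(*partial_rows)]

def run_job(b_records):
    """Local stand-in for the shuffle: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for j, b_row in b_records:
        for key, value in mapper(j, b_row):
            groups[key].append(value)
    return dict(reducer(i, rows) for i, rows in sorted(groups.items()))

# Large matrix B (3 x 2), one (row_index, row_values) record per row.
B = [(0, [1, 0]), (1, [0, 1]), (2, [1, 1])]
C = run_job(B)
print(C)  # {0: [4, 5], 1: [10, 11]}
```

Note that this only works because every mapper can see all of A. For case (c) neither operand fits in memory, so this sketch says nothing about it.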

2011/11/19 <bejoy.had...@gmail.com>

> Hey Mike
> In mahout one place where matrix multiplication is used is in
> Collaborative Filtering distributed implementation. The recommendations
> here are generated by the multiplication of a cooccurence matrix with a
> user vector. This user vector is treated as a single column matrix and then
> the matrix multiplication takes place in there.
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: Mike Spreitzer <mspre...@us.ibm.com>
> Date: Fri, 18 Nov 2011 14:52:05
> To: <common-user@hadoop.apache.org>
> Reply-To: common-user@hadoop.apache.org
> Subject: RE: Matrix multiplication in Hadoop
>
> Well, this mismatch may tell me something interesting about Hadoop. Matrix
> multiplication has a lot of inherent parallelism, so from very crude
> considerations it is not obvious that there should be a mismatch. Why is
> matrix multiplication ill-suited for Hadoop?
>
> BTW, I looked into the Mahout documentation some, and did not find matrix
> multiplication there. It might be hidden inside one of the advertised
> algorithms; I looked at the documentation for a few, but did not notice
> mention of MM.
>
> Thanks,
> Mike
>
> From: Michael Segel <michael_se...@hotmail.com>
> To: <common-user@hadoop.apache.org>
> Date: 11/18/2011 01:49 PM
> Subject: RE: Matrix multiplication in Hadoop
>
> Ok Mike,
>
> First I admire that you are studying Hadoop.
>
> To answer your question... not well.
>
> Might I suggest that if you want to learn Hadoop, you try and find a
> problem which can easily be broken in to a series of parallel tasks where
> there is minimal communication requirements between each task?
>
> No offense, but if I could make a parallel... what you're asking is akin
> to taking a normalized relational model and trying to run it as is in
> HBase.
> Yes it can be done. But not the best use of resources.
>
> > To: common-user@hadoop.apache.org
> > CC: common-user@hadoop.apache.org
> > Subject: Re: Matrix multiplication in Hadoop
> > From: mspre...@us.ibm.com
> > Date: Fri, 18 Nov 2011 12:39:00 -0500
> >
> > That's also an interesting question, but right now I am studying Hadoop
> > and want to know how well dense MM can be done in Hadoop.
> >
> > Thanks,
> > Mike
> >
> > From: Michel Segel <michael_se...@hotmail.com>
> > To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
> > Date: 11/18/2011 12:34 PM
> > Subject: Re: Matrix multiplication in Hadoop
> >
> > Is Hadoop the best tool for doing large matrix math.
> > Sure you can do it, but, aren't there better tools for these types of
> > problems?
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel