See here: http://arxiv.org/abs/0909.4061v1
On Tue, Dec 8, 2009 at 6:47 PM, Zhengguo 'Mike' SUN <[email protected]>wrote: > Hi Jake, > > I am implementing the classical multiplicative update rule of NMF. The > matrix to be factorized is really big and sparse. Are you suggesting that I > can use some specialised algorithms for sparse matrix instead of the > standard multiplication algorithm? But what algorithms are you referring to? > Could you please provide some pointers? > > Thanks > > > > > ________________________________ > From: Jake Mannix <[email protected]> > To: [email protected] > Sent: Tue, December 8, 2009 8:53:39 PM > Subject: Re: Fwd: A MapReduce Algorithm for Matrix Multiplication > > Hi Mike, > > In the NMF work you're doing, the matrices you're multiplying are very > thin, right? They're a handful of dense vectors on the left, an equal > number of dense vectors on the right, and you're only computing the values > in the outer product which coincide with the nonzero entries of the > (sparse) > input matrix? In which case you're not really doing the full matrix > multiplication, and doing this a slightly specialized way should provide > some serious speedup. > > Unless you're not dealing with the "really big, but sparse" case, and > minimizing error only over the nonzero inputs? > > -jake > > On Tue, Dec 8, 2009 at 5:43 PM, Zhengguo 'Mike' SUN > <[email protected]>wrote: > > > Hi Ted, John, > > > > I am also interested in matrix multiplication. Actually, I am > implementing > > Non-negative Matrix Factorization, which basically is iterations of > matrix > > multiplication. What I did is exactly as what Ted said: doing an outer > > product using one column from left matrix and one row from right matrix > and > > summing up all the results. Idealy, this could be done using only one > job: > > mappers doing multiplication and reducers doing summing. But the thing is > > that how do you match a column from left matrix and a row from right > matrix. > > Assuming that two input matrices are give in column major order and row > > major order, it seems to me that you have to implement a customized > > InputFormat to do the matching. Or maybe you could use one matrix as > input > > and read the other inside mappers, but this also has problem when you > have a > > lot of mappers. The straightforward way seems to be using two jobs. The > > mappers of the first job emit the id and the content of the row or column > it > > reads and the reducers do the multiplication. The second job is to do > the > > summing. This is kind of similar to what John did. But it is inefficient > if > > you have many iterations. Any comments? > > > > > > > > > > ________________________________ > > From: Ted Dunning <[email protected]> > > To: [email protected]; mahout-user < > [email protected] > > > > > Sent: Tue, December 8, 2009 2:59:06 PM > > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication > > > > John, > > > > Your comment about matrix multiplication was forwarded to the mahout-user > > emailing list. > > > > I have a question for you about your approach. Have you considered doing > > the multiplication in a single step by storing the the first matrix in > > column major order and the second in row major order? The idea would be > to > > do a block-wise outer product and use the combiner to sum elements ahead > of > > the reducer. This could be done by reading a (block) column from the > left > > matrix and a (block) row from the second and emitting all of the (block) > > elements of the outer product of this column and row. The key emitted > with > > these blocks would be the i,j index of the final location of each block > in > > the final product. The combiner and reducer could add all of the blocks > > together and the use of the combiner would serve to substantially > minimize > > the amount of network traffic. > > > > This alternative is equivalent to inverting the loop nesting in a > > conventional matrix multiplication algorithm. It is also very closely > > related to the way that cooccurrence counting is typically done in > Hadoop. > > > > Is your project an ongoing effort? > > > > ---------- Forwarded message ---------- > > From: Robin Anil <[email protected]> > > Date: Tue, Dec 8, 2009 at 11:09 AM > > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication > > To: [email protected] > > Cc: John Norstad <[email protected]> > > > > > > ---------- Forwarded message ---------- > > From: John Norstad <[email protected]> > > Date: Tue, Dec 8, 2009 at 10:14 PM > > Subject: A MapReduce Algorithm for Matrix Multiplication > > To: MapReduce <[email protected]> > > > > > > As an exercise while learning MapReduce, I developed an algorithm for > > matrix multiplication and wrote it up on my web site. If you're > > interested, it's at: > > > > http://homepage.mac.com/j.norstad/matrix-multiply > > > > I present the algorithm in English, as pseudo-code, and as Java source > > code for Hadoop. > > > > -- > > John Norstad > > Academic and Research Technologies > > Northwestern University > > > > > > > > -- > > Ted Dunning, CTO > > DeepDyve > > > > > > > > > > > > > > -- Ted Dunning, CTO DeepDyve
