Try using the Apache Mahout code that solves exactly this problem.

Mahout has a distributed row-wise matrix that is read one row at a time.
The dot product of each row with the vector is computed, and the results are
collected into the output vector. This capability is used extensively in
Mahout's large-scale SVD implementations.
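
To make the row-wise idea concrete, here is a minimal local sketch in plain
Java. It is not Mahout's API and it does not use Hadoop. The class name
RowWiseMatVec and the line format "rowIndex<TAB>space-separated values" are
assumptions made for illustration. Each input line plays the role of one map
input record, and the vector is assumed to be small enough to hold in memory
in every task.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Mahout or Hadoop.
class RowWiseMatVec {

    // Dot product of one matrix row with the shared vector.
    static double dot(double[] row, double[] v) {
        if (row.length != v.length) {
            throw new IllegalArgumentException(
                "row has " + row.length + " entries, vector has " + v.length);
        }
        double sum = 0.0;
        for (int j = 0; j < row.length; j++) {
            sum += row[j] * v[j];
        }
        return sum;
    }

    // Processes the matrix one row (one line) at a time.
    // Assumed line format: "<rowIndex>\t<value> <value> ...".
    // Returns a map from row index i to the i-th element of the result.
    static Map<Integer, Double> multiply(Iterable<String> lines, double[] v) {
        Map<Integer, Double> result = new TreeMap<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t", 2);
            int i = Integer.parseInt(parts[0].trim());
            String[] cells = parts[1].trim().split("\\s+");
            double[] row = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                row[j] = Double.parseDouble(cells[j]);
            }
            // One output per row: (i, row . v). Because a whole row is
            // handled at once, no per-key summing step is needed afterwards.
            result.put(i, dot(row, v));
        }
        return result;
    }
}
```

Storing the matrix as one row per record is the design choice that matters
here: the per-element products m[i,j]*v(j) are summed inside a single record,
so the work can be done in the map phase alone, provided every task can read
the whole vector.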

On Tue, May 17, 2011 at 1:13 PM, Alexandra Anghelescu <
axanghele...@gmail.com> wrote:

> Hi all,
>
> I was wondering how to go about doing a matrix-vector multiplication using
> hadoop. I have my matrix in one file and my vector in another. All the map
> tasks will need the vector file... basically they need to share it.
>
> Basically I want my map function to output key-value pairs (i,m[i,j]*v(j)),
> where i is the row number, and j the column number; v(j) is the jth element
> in v. And the reduce function will sum up all the values with the same key
> - i, and that will be the ith element of my result vector.
>
> I don't know how to format the input to do this.. even if I do it in 2 MR
> iterations, first formatting the input and second the actual
> matrix-vector-multiply, I don't have a clear idea.
>
> If you have any ideas/suggestions I would appreciate it!
>
> Thanks in advance,
> Alexandra
>
