If you have a very large and very sparse matrix represented as (i, j, value) entries, then you can try the algorithms mentioned in the post <https://groups.google.com/forum/#!topic/spark-users/CGfEafqiTsA> brought up earlier.
Reza

On Fri, Nov 7, 2014 at 8:31 AM, Duy Huynh <duy.huynh....@gmail.com> wrote:

> thanks reza. i'm not familiar with the "block matrix multiplication", but
> is it a good fit for "very large dimension, but extremely sparse" matrix?
>
> if not, what is your recommendation on implementing matrix multiplication
> in spark on "very large dimension, but extremely sparse" matrix?
>
> On Thu, Nov 6, 2014 at 5:50 PM, Reza Zadeh <r...@databricks.com> wrote:
>
>> See this thread for examples of sparse matrix x sparse matrix:
>> https://groups.google.com/forum/#!topic/spark-users/CGfEafqiTsA
>>
>> We thought about providing matrix multiplies on CoordinateMatrix,
>> however, the matrices have to be very dense for the overhead of having many
>> little (i, j, value) objects to be worth it. For this reason, we are
>> focused on doing block matrix multiplication first. The goal is version 1.3.
>>
>> Best,
>> Reza
>>
>> On Wed, Nov 5, 2014 at 11:48 PM, Wei Tan <w...@us.ibm.com> wrote:
>>
>>> I think Xiangrui's ALS code implement certain aspect of it. You may want
>>> to check it out.
>>> Best regards,
>>> Wei
>>>
>>> ---------------------------------
>>> Wei Tan, PhD
>>> Research Staff Member
>>> IBM T. J. Watson Research Center
>>>
>>> From: Xiangrui Meng <men...@gmail.com>
>>> To: Duy Huynh <duy.huynh....@gmail.com>
>>> Cc: user <u...@spark.incubator.apache.org>
>>> Date: 11/05/2014 01:13 PM
>>> Subject: Re: sparse x sparse matrix multiplication
>>> ------------------------------
>>>
>>> You can use breeze for local sparse-sparse matrix multiplication and
>>> then define an RDD of sub-matrices
>>>
>>> RDD[(Int, Int, CSCMatrix[Double])] (blockRowId, blockColId, sub-matrix)
>>>
>>> and then use join and aggregateByKey to implement this feature, which
>>> is the same as in MapReduce.
>>>
>>> -Xiangrui
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
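
For anyone following the thread, here is a rough local sketch of the block approach Xiangrui describes: key the sub-matrices by (blockRowId, blockColId), join A's blocks with B's blocks on the shared inner block index, multiply each pair locally, and sum the partial products per output block. This is plain Python, not Spark or breeze code. The dict-based sub-matrices stand in for CSCMatrix[Double], the grouping loop stands in for join, and the per-key sum stands in for aggregateByKey. The function names (local_multiply, add_blocks, block_multiply) are made up for illustration.

```python
from collections import defaultdict

# A sub-matrix is a dict {(local_row, local_col): value} (stand-in for a
# sparse CSC matrix). A blocked matrix is a dict keyed by
# (block_row_id, block_col_id), mirroring (blockRowId, blockColId, sub-matrix).

def local_multiply(a, b):
    """Sparse x sparse product of two sub-matrices, done locally."""
    b_by_row = defaultdict(list)
    for (k, j), w in b.items():
        b_by_row[k].append((j, w))
    out = defaultdict(float)
    for (i, k), v in a.items():
        for j, w in b_by_row.get(k, ()):
            out[(i, j)] += v * w
    return dict(out)

def add_blocks(x, y):
    """Element-wise sum of two sub-matrices."""
    out = dict(x)
    for key, v in y.items():
        out[key] = out.get(key, 0.0) + v
    return out

def block_multiply(a_blocks, b_blocks):
    # "join" step: pair A(bi, k) with every B(k, bj) sharing the inner index k.
    b_by_inner = defaultdict(list)
    for (k, bj), sub in b_blocks.items():
        b_by_inner[k].append((bj, sub))
    result = {}
    for (bi, k), a_sub in a_blocks.items():
        for bj, b_sub in b_by_inner.get(k, ()):
            partial = local_multiply(a_sub, b_sub)
            # "aggregateByKey" step: sum partial products per output block.
            result[(bi, bj)] = add_blocks(result.get((bi, bj), {}), partial)
    return result

# 4x4 matrices split into 2x2 blocks; only non-empty blocks are stored.
A = {(0, 0): {(0, 0): 1.0, (1, 1): 2.0}, (0, 1): {(0, 1): 3.0}}
B = {(0, 0): {(0, 0): 4.0}, (1, 0): {(1, 0): 5.0}}
C = block_multiply(A, B)
print(C)  # {(0, 0): {(0, 0): 19.0}}
```

In a real Spark job the same shape would presumably be: re-key A's blocks by blockColId and B's by blockRowId, join, map each pair to ((blockRowId, blockColId), partial product), then aggregate by key. Empty blocks never appear in the join, which is where the sparsity savings come from.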