If you have a very large and very sparse matrix represented as (i, j,
value) entries, then you can try the algorithms mentioned in the post
<https://groups.google.com/forum/#!topic/spark-users/CGfEafqiTsA> brought
up earlier.

Reza

On Fri, Nov 7, 2014 at 8:31 AM, Duy Huynh <duy.huynh....@gmail.com> wrote:

> thanks reza.  i'm not familiar with the "block matrix multiplication", but
> is it a good fit for "very large dimension, but extremely sparse" matrix?
>
> if not, what is your recommendation on implementing matrix multiplication
> in spark on "very large dimension, but extremely sparse" matrix?
>
>
>
>
> On Thu, Nov 6, 2014 at 5:50 PM, Reza Zadeh <r...@databricks.com> wrote:
>
>> See this thread for examples of sparse matrix x sparse matrix:
>> https://groups.google.com/forum/#!topic/spark-users/CGfEafqiTsA
>>
>> We thought about providing matrix multiplies on CoordinateMatrix;
>> however, the matrices have to be very sparse for the overhead of having many
>> little (i, j, value) objects to be worth it. For this reason, we are
>> focused on doing block matrix multiplication first. The goal is version 1.3.
>>
>> Best,
>> Reza
>>
>> On Wed, Nov 5, 2014 at 11:48 PM, Wei Tan <w...@us.ibm.com> wrote:
>>
>>> I think Xiangrui's ALS code implements certain aspects of it. You may want
>>> to check it out.
>>> Best regards,
>>> Wei
>>>
>>> ---------------------------------
>>> Wei Tan, PhD
>>> Research Staff Member
>>> IBM T. J. Watson Research Center
>>>
>>>
>>>
>>> From: Xiangrui Meng <men...@gmail.com>
>>> To: Duy Huynh <duy.huynh....@gmail.com>
>>> Cc: user <u...@spark.incubator.apache.org>
>>> Date: 11/05/2014 01:13 PM
>>> Subject: Re: sparse x sparse matrix multiplication
>>> ------------------------------
>>>
>>>
>>>
>>> You can use breeze for local sparse-sparse matrix multiplication and
>>> then define an RDD of sub-matrices
>>>
>>> RDD[(Int, Int, CSCMatrix[Double])] (blockRowId, blockColId, sub-matrix)
>>>
>>> and then use join and aggregateByKey to implement this feature, which
>>> is the same as in MapReduce.
>>>
>>> -Xiangrui
>>>
>>>
>>>
>>>
>>
>
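
The block scheme Xiangrui describes above (sub-matrices keyed by
(blockRowId, blockColId), a join on the shared inner block index, then a
sum per output block) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not code from MLlib or from the
linked thread: it uses plain Scala collections in place of RDDs so that it
runs without Spark, and a Map-based Block type in place of breeze's
CSCMatrix[Double]. The names BlockSparseMultiply, Block, BlockEntry,
multiplyLocal, addLocal and multiply are all hypothetical.

```scala
object BlockSparseMultiply {
  // Assumption: a local sparse block stored as (rowInBlock, colInBlock) -> value.
  // In the thread this role is played by breeze's CSCMatrix[Double].
  type Block = Map[(Int, Int), Double]

  // (blockRowId, blockColId, sub-matrix), mirroring the element type in the thread.
  type BlockEntry = (Int, Int, Block)

  // Local sparse x sparse product of two blocks.
  def multiplyLocal(a: Block, b: Block): Block = {
    // Index b by its row so each entry of a only meets matching entries of b.
    val bByRow = b.groupBy { case ((r, _), _) => r }
    val products = for {
      ((i, k), av) <- a.toSeq
      ((_, j), bv) <- bByRow.getOrElse(k, Map.empty[(Int, Int), Double]).toSeq
    } yield ((i, j), av * bv)
    // Sum the contributions that land on the same (i, j).
    products.groupBy(_._1).map { case (ij, vs) => ij -> vs.map(_._2).sum }
  }

  // Entry-wise sum of two blocks.
  def addLocal(x: Block, y: Block): Block =
    (x.keySet ++ y.keySet)
      .map(k => k -> (x.getOrElse(k, 0.0) + y.getOrElse(k, 0.0)))
      .toMap

  // C = A * B over blocks. The grouping and the nested loop stand in for a
  // join on the inner block index k; the final groupBy and reduce stand in
  // for a per-key aggregation over (blockRowId, blockColId).
  def multiply(a: Seq[BlockEntry], b: Seq[BlockEntry]): Map[(Int, Int), Block] = {
    val bByBlockRow = b.groupBy(_._1)
    val partials = for {
      (i, k, aBlock) <- a
      (_, j, bBlock) <- bByBlockRow.getOrElse(k, Seq.empty)
    } yield ((i, j), multiplyLocal(aBlock, bBlock))
    partials.groupBy(_._1).map { case (ij, ps) => ij -> ps.map(_._2).reduce(addLocal) }
  }
}
```

With Spark, one would presumably key A's blocks by blockColId and B's
blocks by blockRowId, join the two pair RDDs, and sum the partial products
per (blockRowId, blockColId) with aggregateByKey or reduceByKey. Block
pairs where either side is missing never meet in the join, which is where
the savings for extremely sparse inputs would come from.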
