[ 
https://issues.apache.org/jira/browse/SPARK-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Burak Yavuz updated SPARK-10599:
--------------------------------
    Description: 
BlockMatrix multiply sends each block to all the corresponding columns of 
the right BlockMatrix, even when the right BlockMatrix has no corresponding 
block to multiply it with.

Some optimizations we can perform are:
 - Simulate the multiplication on the driver, and figure out which blocks 
actually need to be shuffled
 - Send the block once to a partition, and join inside the partition rather 
than sending multiple copies to the same partition
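
The two ideas above can be sketched in plain Python, without Spark. This is
only an illustration under stated assumptions, not the BlockMatrix
implementation: the function name simulate_multiply, the set-of-indices
representation of each matrix, and the partition_of callback are all
hypothetical. Given only the indices of the blocks that exist, the driver
can work out which result partitions each block has to reach. A block with
no partner gets an empty destination set, and a block needed by several
result blocks in the same partition is listed for that partition only once.

```python
from collections import defaultdict

def simulate_multiply(left_blocks, right_blocks, partition_of):
    """Return the destination partitions for every block of A and B.

    left_blocks:  set of (i, k) indices of blocks present in A (hypothetical input)
    right_blocks: set of (k, j) indices of blocks present in B (hypothetical input)
    partition_of: function (i, j) -> partition id of result block C(i, j)
    """
    # Index B's column-block ids by row, and A's row-block ids by column.
    right_cols_by_row = defaultdict(set)
    for k, j in right_blocks:
        right_cols_by_row[k].add(j)
    left_rows_by_col = defaultdict(set)
    for i, k in left_blocks:
        left_rows_by_col[k].add(i)

    # A(i, k) is only needed where some B(k, j) exists. Using a set means
    # each partition appears once, even if several C(i, j) live there.
    left_dest = {
        (i, k): {partition_of(i, j) for j in right_cols_by_row.get(k, ())}
        for (i, k) in left_blocks
    }
    # Symmetrically, B(k, j) is only needed where some A(i, k) exists.
    right_dest = {
        (k, j): {partition_of(i, j) for i in left_rows_by_col.get(k, ())}
        for (k, j) in right_blocks
    }
    return left_dest, right_dest

# Example: A has blocks (0,0) and (1,1); B has blocks (0,0) and (0,1).
# Assumed partitioner: every result block in row i lives in partition i.
left_dest, right_dest = simulate_multiply(
    {(0, 0), (1, 1)}, {(0, 0), (0, 1)}, lambda i, j: i
)
```

In this example A(0,0) pairs with two blocks of B, but both results land in
the same partition, so it is shipped there once and joined locally. A(1,1)
has no matching row of blocks in B, so its destination set is empty and it
is not shuffled at all.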

> Decrease communication in BlockMatrix multiply and increase performance
> -----------------------------------------------------------------------
>
>                 Key: SPARK-10599
>                 URL: https://issues.apache.org/jira/browse/SPARK-10599
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Burak Yavuz


