[ 
https://issues.apache.org/jira/browse/SPARK-12869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107533#comment-15107533
 ] 

Fokko Driesprong commented on SPARK-12869:
------------------------------------------

Hi guys,

I've implemented an improved version of the toIndexedRowMatrix function on 
BlockMatrix. I needed this for a project, but would like to share it with the 
rest of the community. For dense matrices, it can speed up the conversion by 
up to 19 times:
https://github.com/Fokko/BlockMatrixToIndexedRowMatrix

The pull request on GitHub:
https://github.com/apache/spark/pull/10839

> Optimize conversion from BlockMatrix to IndexedRowMatrix
> --------------------------------------------------------
>
>                 Key: SPARK-12869
>                 URL: https://issues.apache.org/jira/browse/SPARK-12869
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Fokko Driesprong
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In the current implementation of BlockMatrix, the conversion to 
> IndexedRowMatrix is done by converting it to a CoordinateMatrix first. This 
> is acceptable when the matrix is very sparse, but for dense matrices it is 
> very inefficient.
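
To illustrate why going through a coordinate representation is costly for dense data, here is a minimal sketch in plain Python. It is not Spark's implementation and not the code from the pull request: nested lists stand in for the distributed matrix types, and the function names (rows_via_coordinates, rows_direct) and the block layout are assumptions made for this example only.

```python
from collections import defaultdict

def rows_via_coordinates(blocks, rows_per_block, cols_per_block, num_cols):
    """Hypothetical helper: route every element through an (i, j, value)
    entry, the way a coordinate format stores a matrix."""
    entries = []
    for (bi, bj), block in blocks.items():
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                entries.append((bi * rows_per_block + r,
                                bj * cols_per_block + c,
                                value))
    rows = defaultdict(lambda: [0.0] * num_cols)
    for i, j, value in entries:
        rows[i][j] = value
    return dict(rows)

def rows_direct(blocks, rows_per_block, cols_per_block, num_cols):
    """Hypothetical helper: copy whole row slices of each block into the
    target row, without creating one entry per element."""
    rows = defaultdict(lambda: [0.0] * num_cols)
    for (bi, bj), block in blocks.items():
        start = bj * cols_per_block
        for r, row_slice in enumerate(block):
            rows[bi * rows_per_block + r][start:start + len(row_slice)] = row_slice
    return dict(rows)

# A dense 4x4 matrix split into four 2x2 blocks, keyed by (block row, block column).
blocks = {
    (0, 0): [[1.0, 2.0], [5.0, 6.0]],
    (0, 1): [[3.0, 4.0], [7.0, 8.0]],
    (1, 0): [[9.0, 10.0], [13.0, 14.0]],
    (1, 1): [[11.0, 12.0], [15.0, 16.0]],
}

via_coordinates = rows_via_coordinates(blocks, 2, 2, 4)
direct = rows_direct(blocks, 2, 2, 4)
```

Both functions produce the same rows, but the coordinate path materializes one tuple per element (16 here), while the direct path moves one slice per block row (8 here). For a dense matrix that gap grows with the block width, whereas for a very sparse matrix the coordinate path only touches the few non-zero entries.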


