[jira] [Commented] (SPARK-2361) Decide whether to broadcast or serialize the weights directly in MLlib algorithms

Xiangrui Meng (JIRA) Tue, 15 Jul 2014 20:04:18 -0700

    [ https://issues.apache.org/jira/browse/SPARK-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063040#comment-14063040 ]


Xiangrui Meng commented on SPARK-2361:
--------------------------------------

PR that uses broadcast for both training and prediction: 
https://github.com/apache/spark/pull/1427

> Decide whether to broadcast or serialize the weights directly in MLlib 
> algorithms
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-2361
>                 URL: https://issues.apache.org/jira/browse/SPARK-2361
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Xiangrui Meng
>
> In the current implementation, MLlib serializes the weights directly into 
> the task closure. This is fine for small feature dimensions, but it is 
> inefficient for feature dimensions beyond 1M, especially since the default 
> akka.frameSize is 10 MB. We should use broadcast when the serialized task 
> is going to be large.
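
To put rough numbers on the description above, here is a minimal back-of-the-envelope sketch in plain Python (no Spark required). It assumes dense weights stored as 8-byte doubles and reads "10m" as 10 MB. The helper names and the 50% safety margin are hypothetical, chosen only for illustration; they are not taken from MLlib or from the linked PR.

```python
import struct

# Default frame size cited in the issue, read here as 10 MB (an assumption).
FRAME_SIZE_BYTES = 10 * 1024 * 1024

# Hypothetical safety margin: a task carries more than the weights alone,
# so this sketch switches to broadcast well before the frame limit.
BROADCAST_FRACTION = 0.5


def dense_weights_bytes(num_features: int) -> int:
    """Raw payload of a dense weight vector of doubles, ignoring any
    serialization overhead."""
    return num_features * struct.calcsize("d")  # 8 bytes per double


def should_broadcast(num_features: int,
                     frame_size: int = FRAME_SIZE_BYTES,
                     fraction: float = BROADCAST_FRACTION) -> bool:
    """Illustrative heuristic: broadcast once the weights alone would
    consume more than `fraction` of the frame size."""
    return dense_weights_bytes(num_features) > frame_size * fraction
```

Under these assumptions, 1M features already amount to 8,000,000 bytes of raw weights, most of a 10 MB frame before any other task data is counted, whereas a model with 1,000 features needs only 8,000 bytes and fits comfortably in the closure.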



--
This message was sent by Atlassian JIRA
(v6.2#6252)
