[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

Joseph K. Bradley (JIRA) Tue, 22 Aug 2017 17:56:15 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137697#comment-16137697
 ]


Joseph K. Bradley commented on SPARK-21535:
-------------------------------------------

[~yuhaoyan] Parallel training of models can be beneficial; we've done tests 
showing decent speedups (2-3x).  But the benefits are generally limited to 
small models or small data, where training a single model does not generate 
enough work to keep the whole cluster busy.  For larger problems, parallel 
training does not help as much.

I agree with you that parallel training & this fix should not conflict too 
much: The memory efficiency issue is a problem for big models; parallel 
training is more useful with smaller models.
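
To illustrate why the two ideas can coexist, here is a minimal sketch in plain Python (not Spark code). It assumes each worker fits a model, scores it, and returns only the score, so at most `parallelism` models are alive at once. The names `fit_and_score`, `parallel_grid_search`, `regParam`, and the toy scoring rule are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence

ParamMap = Dict[str, float]


def fit_and_score(params: ParamMap) -> float:
    """Fit a (toy) model, score it, and return only the score."""
    # Hypothetical "training": the model is just a small list of weights.
    model = [params["regParam"]] * 4
    # Hypothetical "evaluation": the score peaks when regParam is 0.5.
    score = 1.0 - abs(model[0] - 0.5)
    # The model is local to this call, so it becomes garbage once we return.
    return score


def parallel_grid_search(param_maps: Sequence[ParamMap],
                         parallelism: int) -> List[float]:
    """Score every parameter map with at most `parallelism` fits in flight."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() returns the scores in the same order as param_maps.
        return list(pool.map(fit_and_score, param_maps))


GRID = [{"regParam": i / 10} for i in range(11)]
scores = parallel_grid_search(GRID, parallelism=3)
best = GRID[scores.index(max(scores))]
```

With `parallelism=1` this degenerates to the sequential, one-model-at-a-time behavior; larger values trade memory for throughput.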

> Reduce memory requirement for CrossValidator and TrainValidationSplit 
> ----------------------------------------------------------------------
>
>                 Key: SPARK-21535
>                 URL: https://issues.apache.org/jira/browse/SPARK-21535
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: yuhao yang
>
> CrossValidator and TrainValidationSplit both use 
> {code}models = est.fit(trainingDataset, epm) {code} to fit the models, where 
> epm is Array[ParamMap].
> Even though the training process is sequential, the current implementation 
> keeps every trained model in driver memory, which is unnecessary and often 
> leads to out-of-memory (OOM) exceptions in both CrossValidator and 
> TrainValidationSplit. My proposal is to optimize the training implementation 
> so that each model can be collected by GC once it has been used, avoiding 
> the unnecessary OOM exceptions.
> E.g., when the grid search space has 12 parameter combinations, the old 
> implementation needs to hold all 12 trained models in driver memory at the 
> same time, while the new implementation only needs to hold 1 trained model 
> at a time, and the previous model can be cleared by GC.
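
The difference between the two strategies described above can be sketched in plain Python (this is not the Spark implementation). `ToyModel`, `fit`, `evaluate`, and `regParam` are hypothetical stand-ins. The instance counter relies on CPython's reference counting, which frees an object as soon as its last reference goes away; other runtimes may reclaim models later.

```python
from typing import Callable, Dict, List, Sequence

ParamMap = Dict[str, float]


class ToyModel:
    """Hypothetical stand-in for a trained model; counts live instances."""
    live = 0
    peak = 0

    def __init__(self, params: ParamMap) -> None:
        self.params = params
        ToyModel.live += 1
        ToyModel.peak = max(ToyModel.peak, ToyModel.live)

    def __del__(self) -> None:
        ToyModel.live -= 1


def fit(params: ParamMap) -> ToyModel:
    # Hypothetical "training" step.
    return ToyModel(params)


def evaluate(model: ToyModel) -> float:
    # Hypothetical metric: higher is better, best at regParam == 0.5.
    return -abs(model.params["regParam"] - 0.5)


def fit_all_then_evaluate(param_maps: Sequence[ParamMap]) -> List[float]:
    """Old pattern: every trained model stays referenced until the end."""
    models = [fit(p) for p in param_maps]
    return [evaluate(m) for m in models]


def fit_one_at_a_time(param_maps: Sequence[ParamMap]) -> List[float]:
    """Proposed pattern: keep only the metric, release each model at once."""
    metrics = []
    for p in param_maps:
        model = fit(p)
        metrics.append(evaluate(model))
        del model  # drop the last reference before the next fit
    return metrics


def peak_live_models(run: Callable[[Sequence[ParamMap]], List[float]],
                     param_maps: Sequence[ParamMap]) -> int:
    """Return the largest number of models alive at once during `run`."""
    ToyModel.live = 0
    ToyModel.peak = 0
    run(param_maps)
    return ToyModel.peak


# A grid of 12 parameter maps, matching the example in the description.
GRID = [{"regParam": i / 11} for i in range(12)]
```

Calling `peak_live_models(fit_all_then_evaluate, GRID)` and `peak_live_models(fit_one_at_a_time, GRID)` compares the two patterns on the same grid; both return identical metrics, so only the memory profile changes.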



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
