Xiangrui Meng created SPARK-14084:
-------------------------------------

             Summary: Parallel training jobs in model selection
                 Key: SPARK-14084
                 URL: https://issues.apache.org/jira/browse/SPARK-14084
             Project: Spark
          Issue Type: New Feature
          Components: ML
    Affects Versions: 2.0.0
            Reporter: Xiangrui Meng


In CrossValidator and TrainValidationSplit, we run training jobs one by one. If 
users have a big cluster, they might see speed-ups if we parallelize the jobs. 
The trade-off is that we might need to make multiple copies of the training 
data, which could be expensive. It is worth testing to figure out the best way 
to implement it.
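
To make the idea concrete, here is a minimal sketch in plain Python. It is not
Spark code and not a proposed implementation. The names fit_and_score,
select_model, and the parallelism argument are hypothetical, and the "model" is
a toy one-parameter regression standing in for a real training job. It only
shows the shape of the change: the same grid is evaluated either one by one
(parallelism=1) or with a bounded number of concurrent jobs, while all jobs
read the same in-memory training data instead of each getting its own copy.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_and_score(params, train, validation):
    """Hypothetical stand-in for one training job.

    Fits y ~ w * x with an L2 penalty `reg` and returns the squared
    error on the validation points (lower is better).
    """
    reg = params["reg"]
    num = sum(x * y for x, y in train)
    den = sum(x * x for x, _ in train) + reg
    w = num / den
    return sum((y - w * x) ** 2 for x, y in validation)


def select_model(param_grid, train, validation, parallelism=1):
    """Evaluate every parameter setting and return (best_params, scores).

    `parallelism` bounds how many training jobs run at once; 1 reproduces
    the one-by-one behaviour. All jobs share the same `train` object, so
    no per-job copy of the data is made in this sketch.
    """
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map preserves the order of param_grid in the results.
        scores = list(
            pool.map(lambda p: fit_and_score(p, train, validation), param_grid)
        )
    best = min(range(len(param_grid)), key=scores.__getitem__)
    return param_grid[best], scores


if __name__ == "__main__":
    train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    validation = [(4.0, 8.0)]
    grid = [{"reg": 0.0}, {"reg": 1.0}, {"reg": 10.0}]
    print(select_model(grid, train, validation, parallelism=3))
```

Bounding the number of concurrent jobs is one way to limit the extra memory
pressure mentioned above. Whether sharing the data across jobs works as
cheaply on a real cluster is exactly what would need testing.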



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)