Xiangrui Meng created SPARK-14084:
-------------------------------------

             Summary: Parallel training jobs in model selection
                 Key: SPARK-14084
                 URL: https://issues.apache.org/jira/browse/SPARK-14084
             Project: Spark
          Issue Type: New Feature
          Components: ML
    Affects Versions: 2.0.0
            Reporter: Xiangrui Meng

In CrossValidator and TrainValidationSplit, we run training jobs one by one. On a big cluster, users might see speed-ups if we ran these jobs in parallel. The trade-off is that we might need to make multiple copies of the training data, which could be expensive. It is worth testing to figure out the best way to implement this.
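
To make the idea concrete, here is a minimal sketch in plain Python of evaluating a parameter grid with a configurable number of concurrent training jobs. It is not Spark code and does not use the CrossValidator or TrainValidationSplit APIs. The names `build_param_grid`, `fit_and_score`, `select_model`, and the `parallelism` argument are all hypothetical, and the "model" is a toy one-dimensional regularized slope fit standing in for a real training job. The worker threads share one copy of the training data by reference, which is one possible way to sidestep the copying cost mentioned above; whether that carries over to a cluster setting is exactly what would need testing.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_param_grid(grid):
    """Expand {"name": [values, ...]} into a list of parameter dicts."""
    names = sorted(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def fit_and_score(params, train, validation):
    """Hypothetical stand-in for one training job.

    Fits y = slope * x with an L2 penalty `reg` on the training pairs and
    returns the mean squared error on the validation pairs (lower is better).
    """
    sum_xy = sum(x * y for x, y in train)
    sum_xx = sum(x * x for x, _ in train)
    slope = sum_xy / (sum_xx + params["reg"])
    return sum((y - slope * x) ** 2 for x, y in validation) / len(validation)


def select_model(grid, train, validation, parallelism=1):
    """Score every parameter combination and return (best_params, best_score).

    With parallelism=1 the jobs run one by one; larger values run up to that
    many jobs at once. All jobs read the same `train` and `validation`
    objects, so no copies of the data are made.
    """
    candidates = build_param_grid(grid)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() yields results in the order of `candidates`, so the outcome
        # does not depend on which job happens to finish first.
        scores = list(pool.map(lambda p: fit_and_score(p, train, validation), candidates))
    best_index = min(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


# Toy data for illustration only.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
validation = [(4.0, 8.0)]
grid = {"reg": [0.0, 1.0, 10.0]}

best_params, best_score = select_model(grid, train, validation, parallelism=3)
```

Because results are collected in submission order, the sequential and parallel runs select the same parameters, so a setting like `parallelism` only affects how long the search takes.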