GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/20124
[WIP][SPARK-22126][ML] Fix model-specific optimization support for ML tuning.

## What changes were proposed in this pull request?

Support model-specific optimizations for CrossValidator and TrainValidationSplit by grouping `ParamMap`s so that param groups can fit models in parallel, but still allow `Estimator`s to optimally fit a sequence of models themselves. This PR adds a new API to `Estimator` that can be overridden to indicate optimized params, and additional functions in `ParamGridBuilder` to group `ParamMap` arrays that can then be used by the meta-algorithms.

## How was this patch tested?

WIP, need to add tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark wip-model-specific-tuning-SPARK-22126

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20124.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20124

----
commit c4ff7ab016f440a6f1684f79fdfe677507fca279
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-01T00:24:55Z

    added model specific optimization to parallel TVS

commit 47a40399250af2f777e53475b5dee812bf244788
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-01T17:52:10Z

    remove unused import

commit 4d113386a2ae20bae0c4e54860386103c82ae627
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-14T19:01:06Z

    moved splitting of param maps to ParamGridBuilder

commit 6599cbac79375686b78792ff7c50c85749e4a6cf
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T00:58:41Z

    got param map split working

commit 47781a15cd4d2307a6268d86cd693394e227d842
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T07:03:54Z

    added pipeline getOptimizedParams

commit 0a887bc656e9485d247add1f6de34c299da4c19d
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T07:47:56Z

    moved param grouping to ParamGridBuilder.groupByParam

commit f7256e649fb6aa1e63baca5159e919fbde30dd24
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-18T05:56:57Z

    remove unused import

commit 7a53f57403ef17753e13cb099ac4866edabc5778
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:18:46Z

    fix CrossValidator to use grouped params

commit 994accd402d87639ed70d3cd594f883633a0d849
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:44:34Z

    fixed style checks and added docs

commit 53521cac9d39bf9682d67d94d46adde357db1b43
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:46:27Z

    added doc

----
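
For readers skimming this notification, the grouping idea from the pull request summary can be illustrated roughly as follows. This is a minimal sketch in plain Python, not the Scala code in the patch and not Spark's API: `group_by_param`, `fit_group`, and the example grid are hypothetical names chosen for illustration, and the assumption is that param maps sharing every non-optimized value form one group that an estimator could fit as a sequence on its own.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_param(param_maps, optimized_params):
    """Group param maps that differ only in the optimized params.

    Hypothetical helper: plain dicts stand in for Spark ParamMaps.
    """
    groups = defaultdict(list)
    for pm in param_maps:
        # The key is every (name, value) pair that is NOT an optimized param.
        key = tuple(sorted((k, v) for k, v in pm.items()
                           if k not in optimized_params))
        groups[key].append(pm)
    return list(groups.values())


def fit_group(group):
    """Stand-in for an estimator fitting a sequence of models itself."""
    return [("model", tuple(sorted(pm.items()))) for pm in group]


# Assumed example grid: 2 values of regParam x 3 values of maxIter.
grid = [{"regParam": r, "maxIter": m}
        for r in (0.1, 0.01) for m in (5, 10, 20)]

# Assume the estimator reports maxIter as a param it can optimize over.
groups = group_by_param(grid, optimized_params={"maxIter"})

# Groups are fitted in parallel; models within a group are fitted together.
with ThreadPoolExecutor(max_workers=2) as pool:
    models = [m for fitted in pool.map(fit_group, groups) for m in fitted]
```

With this grid the six param maps collapse into two groups, one per `regParam` value, so parallelism is applied across groups while each group is handed to the estimator as a whole.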