GitHub user BryanCutler opened a pull request:

    https://github.com/apache/spark/pull/20124

    [WIP][SPARK-22126][ML] Fix model-specific optimization support for ML tuning.

    ## What changes were proposed in this pull request?
    
    Support model-specific optimizations in `CrossValidator` and
    `TrainValidationSplit` by grouping `ParamMap`s: separate groups can fit
    models in parallel, while within each group the `Estimator` can still fit
    a sequence of models itself in an optimized way. This PR adds a new API to
    `Estimator` that can be overridden to indicate which params are optimized,
    and additional functions in `ParamGridBuilder` that group `ParamMap` arrays
    for use by the meta-algorithms.
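
The grouping idea described above can be illustrated with a minimal, Spark-free sketch. This is not the code in this PR: the helper names `build_param_grid` and `group_by_optimized_params` are hypothetical, param maps are modeled as plain dicts, and treating `maxIter` as the optimized param is only an assumption for the example. Param maps that differ only in the optimized params land in the same group, so each group could be handed to one estimator fit while separate groups run in parallel.

```python
from collections import defaultdict
from itertools import product


def build_param_grid(grid):
    """Expand {param_name: [values]} into a list of param maps (plain dicts)."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def group_by_optimized_params(param_maps, optimized_params):
    """Group param maps that differ only in the optimized params.

    Hypothetical helper: the key of each group is the set of
    (name, value) pairs for every param that is NOT optimized.
    """
    groups = defaultdict(list)
    for pm in param_maps:
        key = tuple(sorted((k, v) for k, v in pm.items() if k not in optimized_params))
        groups[key].append(pm)
    return list(groups.values())


# Assumption for illustration: the estimator can fit all `maxIter`
# values in one pass, so `maxIter` is the optimized param.
grid = build_param_grid({"regParam": [0.1, 0.01], "maxIter": [5, 10, 20]})
groups = group_by_optimized_params(grid, optimized_params={"maxIter"})

for group in groups:
    print(group)
```

With this grid, the six param maps fall into two groups of three, one per `regParam` value. A meta-algorithm could then submit each group as one parallel task and let the estimator decide how to fit the models inside it.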
    
    ## How was this patch tested?
    
    WIP; tests still need to be added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark wip-model-specific-tuning-SPARK-22126

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20124.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20124
    
----
commit c4ff7ab016f440a6f1684f79fdfe677507fca279
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-01T00:24:55Z

    added model specific optimization to parallel TVS

commit 47a40399250af2f777e53475b5dee812bf244788
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-01T17:52:10Z

    remove unused import

commit 4d113386a2ae20bae0c4e54860386103c82ae627
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-14T19:01:06Z

    moved splitting of param maps to ParamGridBuilder

commit 6599cbac79375686b78792ff7c50c85749e4a6cf
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T00:58:41Z

    got param map split working

commit 47781a15cd4d2307a6268d86cd693394e227d842
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T07:03:54Z

    added pipeline getOptimizedParams

commit 0a887bc656e9485d247add1f6de34c299da4c19d
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-15T07:47:56Z

    moved param grouping to ParamGridBuilder.groupByParam

commit f7256e649fb6aa1e63baca5159e919fbde30dd24
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-18T05:56:57Z

    remove unused import

commit 7a53f57403ef17753e13cb099ac4866edabc5778
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:18:46Z

    fix CrossValidator to use grouped params

commit 994accd402d87639ed70d3cd594f883633a0d849
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:44:34Z

    fixed style checks and added docs

commit 53521cac9d39bf9682d67d94d46adde357db1b43
Author: Bryan Cutler <cutlerb@...>
Date:   2017-12-31T07:46:27Z

    added doc

----

