[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

Joseph K. Bradley (JIRA) Tue, 22 Aug 2017 17:49:28 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137690#comment-16137690
 ]


Joseph K. Bradley commented on SPARK-21086:
-------------------------------------------

My understanding is that users actually want these models, and that their 
reasons vary.  Some reasons I've heard include:
* You may decide you want to use a different cross-val score later on, or you 
may want to compute it on a new dataset.
* You may want to do analysis on the model coefficients/data to understand what 
tuning is doing.
* There is also a related issue, which could alternatively be solved by 
[SPARK-18704].
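
To make the first reason concrete, here is a minimal, Spark-free sketch in 
plain Python.  It is only an illustration under stated assumptions: the names 
cross_validate, fit_mean_model and sub_models are hypothetical and are not 
Spark APIs, and the "model" is a toy constant predictor.  The point is that 
once every fitted model is kept, a different metric can be computed later, on 
a new dataset, without refitting anything.

```python
from statistics import mean

def fit_mean_model(train, shrink):
    """Toy estimator (hypothetical): predicts a shrunken mean of the labels."""
    m = mean(y for _, y in train) * (1.0 - shrink)
    return lambda x: m

def mse(model, rows):
    return mean((model(x) - y) ** 2 for x, y in rows)

def mae(model, rows):
    return mean(abs(model(x) - y) for x, y in rows)

def cross_validate(data, param_grid, num_folds, metric):
    """Return (best_index, avg_scores, sub_models).

    sub_models[k][i] is the model fitted with param_grid[i] while fold k
    was held out.  Lower metric values are treated as better.
    """
    folds = [data[k::num_folds] for k in range(num_folds)]
    sub_models, scores = [], []
    for k in range(num_folds):
        valid = folds[k]
        train = [row for j, f in enumerate(folds) if j != k for row in f]
        models = [fit_mean_model(train, p) for p in param_grid]
        sub_models.append(models)  # keep every fitted model, not just the best
        scores.append([metric(m, valid) for m in models])
    avg = [mean(col) for col in zip(*scores)]
    best = min(range(len(param_grid)), key=avg.__getitem__)
    return best, avg, sub_models

# Tune with MSE, keeping all fitted models.
data = [(i, float(i % 5)) for i in range(20)]
grid = [0.0, 0.5]
best, avg_mse, sub_models = cross_validate(data, grid, num_folds=4, metric=mse)

# Later: score the same models with a different metric on a new dataset.
new_data = [(i, 2.0) for i in range(5)]
rescored = [
    mean(mae(fold_models[i], new_data) for fold_models in sub_models)
    for i in range(len(grid))
]
```

If only the best model were kept, the second step would require rerunning the 
whole tuning job.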

> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-21086
>                 URL: https://issues.apache.org/jira/browse/SPARK-21086
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and 
> TrainValidationSplitModel preserve the full list of fitted models.  This 
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the 
> models in ML persistence?  That could blow up the size of a saved Pipeline if 
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if 
> specified.  We could specify whether to save the model as an extra Param for 
> CrossValidatorModelWriter, but we would have to make sure to expose 
> CrossValidatorModelWriter as a public API and modify the return type of 
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not 
> be a breaking change).
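
The opt-in persistence idea in the description above could look roughly like 
the following plain-Python sketch.  Everything here is hypothetical 
(TunedModel, TunedModelWriter, persist_sub_models, and the JSON layout are 
made up for illustration and are not Spark's CrossValidatorModelWriter); it 
only shows a writer that skips the fitted sub-models by default and saves 
them when asked.

```python
import json
import os
import tempfile

class TunedModelWriter:
    """Hypothetical writer: sub-models are saved only when requested."""

    def __init__(self, model):
        self._model = model
        self._persist_sub_models = False  # default keeps the saved output small

    def persist_sub_models(self, value):
        self._persist_sub_models = bool(value)
        return self  # return self so calls can be chained

    def save(self, path):
        payload = {"best_model": self._model.best_model}
        if self._persist_sub_models:
            payload["sub_models"] = self._model.sub_models
        with open(path, "w") as f:
            json.dump(payload, f)

class TunedModel:
    """Hypothetical stand-in for a fitted tuning result."""

    def __init__(self, best_model, sub_models):
        self.best_model = best_model
        self.sub_models = sub_models

    def write(self):
        # Returning the specific writer type exposes the extra option.
        return TunedModelWriter(self)

model = TunedModel({"coef": 1.0}, [[{"coef": 0.9}, {"coef": 1.1}]])
tmp_dir = tempfile.mkdtemp()
small_path = os.path.join(tmp_dir, "small.json")
full_path = os.path.join(tmp_dir, "full.json")

model.write().save(small_path)                          # default: best model only
model.write().persist_sub_models(True).save(full_path)  # opt in to all models
```

Callers that never touch the option get the same output as before, which is 
why narrowing the return type of write is not expected to break existing code.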



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
