[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137690#comment-16137690 ]
Joseph K. Bradley commented on SPARK-21086:
-------------------------------------------

My understanding is that they actually want these models, and that the reasons vary. Some reasons I've heard include:
* You may decide you want to use a different cross-val score later on, or you may want to compute it on a new dataset.
* You may want to do analysis on the model coefficients/data to understand what tuning is doing.
* (There's also the issue which can alternatively be solved by [SPARK-18704].)

> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-21086
>                 URL: https://issues.apache.org/jira/browse/SPARK-21086
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and
> TrainValidationSplitModel preserve the full list of fitted models. This
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the
> models in ML persistence? That could blow up the size of a saved Pipeline if
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if
> specified. We could specify whether to save the model as an extra Param for
> CrossValidatorModelWriter, but we would have to make sure to expose
> CrossValidatorModelWriter as a public API and modify the return type of
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not
> be a breaking change).
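
To make the opt-in persistence idea from the description concrete, here is a minimal, Spark-free sketch in Python. Everything in it is an assumption for illustration: the classes only borrow the names CrossValidatorModel and CrossValidatorModelWriter, the `persist_sub_models` method is hypothetical, and fitted models are stood in for by plain dicts saved as JSON. It is not Spark's API or implementation. It only shows `write` returning a dedicated writer whose flag defaults to *not* saving the sub-models.

```python
import json
import tempfile
from pathlib import Path


class CrossValidatorModelWriter:
    """Hypothetical writer with an opt-in flag; not Spark's actual class."""

    def __init__(self, model):
        self._model = model
        # Default is off, so large sub-models do not inflate the saved output.
        self._persist_sub_models = False

    def persist_sub_models(self, value):
        # Hypothetical setter standing in for the "extra Param" in the description.
        self._persist_sub_models = bool(value)
        return self

    def save(self, path):
        payload = {"best_model": self._model.best_model}
        if self._persist_sub_models:
            payload["sub_models"] = self._model.sub_models
        Path(path).write_text(json.dumps(payload))


class CrossValidatorModel:
    """Stand-in model that keeps every fitted sub-model in memory."""

    def __init__(self, best_model, sub_models):
        self.best_model = best_model
        self.sub_models = sub_models

    def write(self):
        # Returning the specific writer type is what exposes the extra flag.
        return CrossValidatorModelWriter(self)


# Usage: the default save omits the sub-models; opting in includes them.
model = CrossValidatorModel(
    best_model={"coef": [0.5]},
    sub_models=[{"coef": [0.4]}, {"coef": [0.5]}],
)
out_dir = Path(tempfile.mkdtemp())
model.write().save(out_dir / "default.json")
model.write().persist_sub_models(True).save(out_dir / "full.json")

default_saved = json.loads((out_dir / "default.json").read_text())
full_saved = json.loads((out_dir / "full.json").read_text())
```

In this sketch the sub-models stay available on the in-memory model either way; only the writer decides whether they reach disk, which keeps the default saved output small.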