[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048647#comment-16048647 ]
yuhao yang edited comment on SPARK-21086 at 6/14/17 5:22 AM:
-------------------------------------------------------------

Sounds good.

About the default path for saving different models, how about we use the flattened parameters as the file name, e.g.
LogisticRegressionModel-maxIter-100-regParam-0.1

And I would not implement it with the ML Persistence Framework, simply because caching the models in memory would be expensive (especially impractical for driver memory) and would impact the existing usage of CrossValidator (slower or OOM). I would recommend adding an expert param and saving the models during training.


was (Author: yuhaoyan):
Sounds good.

About the default path for saving different models, how about we use the flattened parameters as the file name, e.g.
LogisticRegressionModel-maxIter-100-regParam-0.1

And I would not implement it with the ML Persistence Framework, simply because caching the models in memory would be expensive and would impact the existing usage of CrossValidator (slower or OOM). I would recommend adding an expert param and saving the models during training.


> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-21086
>                 URL: https://issues.apache.org/jira/browse/SPARK-21086
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and
> TrainValidationSplitModel preserve the full list of fitted models. This
> sounds very valuable.
> One decision should be made before we do this: Should we save and load the
> models in ML persistence? That could blow up the size of a saved Pipeline if
> the models are large.
> * I suggest *not* saving the models by default but allowing saving if
> specified.
> We could specify whether to save the model as an extra Param for
> CrossValidatorModelWriter, but we would have to make sure to expose
> CrossValidatorModelWriter as a public API and modify the return type of
> CrossValidatorModel.write to be CrossValidatorModelWriter (but this will not
> be a breaking change).
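
For illustration only, the naming and save-during-training idea from the comment above might look roughly like the following plain-Python sketch. It does not use Spark, and nothing here is the Spark API: `flattened_name`, `fit_and_save_each`, the `fit` callable, and the use of `pickle` in place of a model's own save method are all hypothetical stand-ins. Sorting the parameter names to get a stable file name is likewise an assumption, not something the comment specifies.

```python
import os
import pickle
from typing import Any, Callable, Dict, Iterable, List


def flattened_name(model_class: str, params: Dict[str, Any]) -> str:
    """Build a name such as 'LogisticRegressionModel-maxIter-100-regParam-0.1'."""
    # Assumption: sort by parameter name so the same param map always
    # yields the same file name.
    parts = [f"{key}-{params[key]}" for key in sorted(params)]
    return "-".join([model_class] + parts)


def fit_and_save_each(
    fit: Callable[[Dict[str, Any]], Any],
    model_class: str,
    param_grid: Iterable[Dict[str, Any]],
    out_dir: str,
) -> List[str]:
    """Fit one model per param map and write it out immediately.

    Only the saved paths are collected, so the fitted models are not
    accumulated in a list in memory.
    """
    paths = []
    for params in param_grid:
        model = fit(params)  # hypothetical stand-in for fitting an estimator
        path = os.path.join(out_dir, flattened_name(model_class, params))
        with open(path, "wb") as handle:
            pickle.dump(model, handle)  # hypothetical stand-in for a model save call
        paths.append(path)
    return paths
```

The point of the sketch is the trade-off raised in the comment: writing each model as soon as it is fitted avoids holding every fitted model at once, at the cost of extra I/O during training.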