[ https://issues.apache.org/jira/browse/SPARK-21088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165867#comment-16165867 ]
Weichen Xu commented on SPARK-21088:
------------------------------------

[~dimberman] Because [~ajaysaini] is busy, I will take this over. I will create a PR once SPARK-21911 is merged. Thanks!

> CrossValidator, TrainValidationSplit should collect all models when fitting: Python API
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-21088
>                 URL: https://issues.apache.org/jira/browse/SPARK-21088
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, PySpark
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> In PySpark:
> Add a parameter that controls whether to collect the full model list during CrossValidator/TrainValidationSplit training (default: off, so that the change does not cause OOM).
> Add a method to CrossValidatorModel/TrainValidationSplitModel that lets the user get the model list.
> Add an "option" to CrossValidatorModelWriter that lets the user control whether to persist the model list to disk.
> Note: when persisting the model list, use indices as the sub-model path.
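
The three pieces of the proposal (an opt-in collection flag, an accessor for the collected models, and a writer option that stores sub-models under index-based paths) can be sketched in plain Python. This is only an illustration of the design, not the PySpark implementation: the classes `ToyCrossValidator` and `ToyCrossValidatorModel`, the names `collect_sub_models` and `persist_sub_models`, the `subModels/fold<i>/<j>.json` layout, and the toy "model" (a shrunken mean) are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


class ToyCrossValidatorModel:
    """Hypothetical stand-in for CrossValidatorModel (not the PySpark class)."""

    def __init__(self, best_model, sub_models=None):
        self.best_model = best_model
        self.sub_models = sub_models  # None unless collection was requested

    def save(self, path, persist_sub_models=False):
        root = Path(path)
        root.mkdir(parents=True, exist_ok=True)
        (root / "bestModel.json").write_text(json.dumps(self.best_model))
        if persist_sub_models:
            if self.sub_models is None:
                raise ValueError("sub-models were not collected during fit")
            # Indices, not parameter values, name each sub-model path.
            for fold, models in enumerate(self.sub_models):
                for j, model in enumerate(models):
                    target = root / "subModels" / f"fold{fold}" / f"{j}.json"
                    target.parent.mkdir(parents=True, exist_ok=True)
                    target.write_text(json.dumps(model))


class ToyCrossValidator:
    """Hypothetical stand-in for CrossValidator (not the PySpark class)."""

    def __init__(self, param_grid, num_folds=3, collect_sub_models=False):
        self.param_grid = param_grid
        self.num_folds = num_folds
        # Off by default: keeping every fitted model can exhaust memory.
        self.collect_sub_models = collect_sub_models

    def fit(self, data):
        sub_models = None
        if self.collect_sub_models:
            # sub_models[fold][param_index] holds one model per combination.
            sub_models = [[None] * len(self.param_grid)
                          for _ in range(self.num_folds)]
        errors = [0.0] * len(self.param_grid)
        for fold in range(self.num_folds):
            train = [x for i, x in enumerate(data) if i % self.num_folds != fold]
            valid = [x for i, x in enumerate(data) if i % self.num_folds == fold]
            for j, shrink in enumerate(self.param_grid):
                model = {"prediction": shrink * mean(train)}  # toy model
                errors[j] += mean((x - model["prediction"]) ** 2 for x in valid)
                if sub_models is not None:
                    sub_models[fold][j] = model
        best = min(range(len(errors)), key=errors.__getitem__)
        best_model = {"prediction": self.param_grid[best] * mean(data)}
        return ToyCrossValidatorModel(best_model, sub_models)


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cv_model = ToyCrossValidator([0.5, 1.0], num_folds=3,
                             collect_sub_models=True).fit(data)
default_model = ToyCrossValidator([0.5, 1.0], num_folds=3).fit(data)

out_dir = Path(tempfile.mkdtemp())
cv_model.save(out_dir, persist_sub_models=True)
saved = sorted(p.relative_to(out_dir).as_posix() for p in out_dir.rglob("*.json"))
```

With three folds and two parameter settings, `cv_model.sub_models` is a 3 x 2 nested list, while `default_model.sub_models` stays `None`. Making persistence a separate writer-side switch lets a user collect sub-models for inspection without also paying the disk cost.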