GitHub user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/18313

@jkbradley I think this is simple.

When the persist-model-list param is `false`, we keep the code logic unchanged, so **it won't increase the memory cost** (this is the default case).

When the persist-model-list param is `true`, the `fit` method can collect all the models and pass them to the `CrossValidatorModel`/`TrainValidationSplitModel`. This will increase the memory cost, but that is acceptable because it is not the default case.

@hhbyyh I think your new PR #18733 is meaningful, since it makes the memory cost O(1), but we also need to consider the parallelism case in #16774. Maybe we should wait until #16774 is merged and then consider how to optimize the memory usage.
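The two cases described above can be sketched in plain Python. This is only a toy illustration of the control flow, not Spark's implementation: `TuningResult`, `fit_with_optional_sub_models`, `train_fn`, `eval_fn`, and the `persist_sub_models` flag are hypothetical names, and the sketch assumes a larger metric is better.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence


@dataclass
class TuningResult:
    """Hypothetical stand-in for CrossValidatorModel / TrainValidationSplitModel."""
    best_model: Any
    best_metric: float
    # Stays None unless the caller explicitly asked to keep every model.
    sub_models: Optional[List[Any]] = None


def fit_with_optional_sub_models(
    train_fn: Callable[[Any], Any],
    eval_fn: Callable[[Any], float],
    param_grid: Sequence[Any],
    persist_sub_models: bool = False,  # hypothetical flag; default keeps memory flat
) -> TuningResult:
    if not param_grid:
        raise ValueError("param_grid must not be empty")

    best_model, best_metric = None, float("-inf")
    # Only allocate the list when the flag is on, so the default path
    # holds at most the current model and the best model so far.
    sub_models: Optional[List[Any]] = [] if persist_sub_models else None

    for params in param_grid:
        model = train_fn(params)
        metric = eval_fn(model)
        if sub_models is not None:
            sub_models.append(model)  # memory grows with the grid size
        if metric > best_metric:
            best_model, best_metric = model, metric

    return TuningResult(best_model, best_metric, sub_models)
```

With the flag left at its default, the result carries no model list, so memory use matches the existing behaviour; with the flag set, every fitted model is retained and memory grows linearly with the number of parameter combinations.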