[GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...

WeichenXu123 Tue, 01 Aug 2017 17:29:16 -0700

Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/18313
  
    @jkbradley 
    I think this is simple.
    When the persist-model-list param is `false`, we keep the code logic the 
same and **it won't increase the memory cost** (this is the default case).
    When the persist-model-list param is `true`, the `fit` method can collect 
all the models and pass them to the 
`CrossValidatorModel`/`TrainValidationSplitModel`. This will increase the 
memory cost, but that is acceptable because it is not the default case.
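
    A minimal sketch of the two cases, written in plain Python rather than 
against the actual Spark code. The names `fit_with_grid`, `train`, `evaluate`, 
and `collect_sub_models` are hypothetical and only illustrate the idea:

```python
from typing import Callable, List, Optional, Sequence, Tuple


def fit_with_grid(
    train: Callable[[dict], object],
    evaluate: Callable[[object], float],
    param_grid: Sequence[dict],
    collect_sub_models: bool = False,  # hypothetical flag; off by default
) -> Tuple[dict, float, Optional[List[object]]]:
    """Pick the best params; retain every fitted model only when asked to."""
    if not param_grid:
        raise ValueError("param_grid must not be empty")

    # Only allocate the list when the caller opted in.
    sub_models: Optional[List[object]] = [] if collect_sub_models else None
    best_params, best_metric = param_grid[0], float("-inf")

    for params in param_grid:
        model = train(params)
        metric = evaluate(model)
        if metric > best_metric:
            best_params, best_metric = params, metric
        if sub_models is not None:
            # Opt-in case: memory grows with the number of param settings.
            sub_models.append(model)
        # Default case: `model` is rebound on the next iteration, so nothing
        # beyond the current model is kept alive by this loop.

    return best_params, best_metric, sub_models
```

    With the flag off the function returns `None` in place of the model 
list, so the default path does no extra bookkeeping. With the flag on, the 
caller gets one model per parameter setting and pays for it in memory.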
    
    @hhbyyh I think your new PR #18733 is meaningful, since it makes the 
memory cost O(1), but we also need to consider the parallelism case in #16774. 
Maybe we should wait until #16774 is merged and then consider how to optimize 
the memory usage.




