GitHub user mengxr commented on the pull request:
https://github.com/apache/incubator-spark/pull/572#issuecomment-34668194
@holdenk, the PartitionwiseSampledRDD was designed with this use case in
mind. Both the folded RDD and its complement can be represented by
PartitionwiseSampledRDD with BernoulliSamplers. Do you mind modifying your code
to use it? Also, cross-validation is a machine-learning-specific operation.
spark.rdd.RDD may not be a good place for it.
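
To illustrate the suggestion in the comment above, here is a minimal sketch in plain Python (not Spark code) of how a fold and its complement can both come from the same per-partition Bernoulli-style sampling. It assumes each element gets one uniform draw from a generator seeded per partition, that fold i keeps draws in [i/k, (i+1)/k), and that the complement keeps everything else. The names `sample_partition` and `k_fold` and the seeding scheme are hypothetical and are not taken from the pull request or from Spark's API.

```python
import random


def sample_partition(items, partition_index, seed, lb, ub, complement=False):
    """Keep items whose uniform draw lies in [lb, ub), or outside it if complement."""
    # Hypothetical seeding scheme: the same (seed, partition_index) pair always
    # replays the same sequence of draws, so a fold and its complement agree.
    rng = random.Random(seed * 1000003 + partition_index)
    kept = []
    for item in items:
        x = rng.random()  # exactly one draw per item, in [0.0, 1.0)
        in_range = lb <= x < ub
        if in_range != complement:
            kept.append(item)
    return kept


def k_fold(partitions, num_folds, seed):
    """Return a list of (training, validation) pairs, one pair per fold."""
    folds = []
    for fold in range(num_folds):
        lb = fold / num_folds
        ub = (fold + 1) / num_folds
        validation = [
            sample_partition(p, i, seed, lb, ub)
            for i, p in enumerate(partitions)
        ]
        training = [
            sample_partition(p, i, seed, lb, ub, complement=True)
            for i, p in enumerate(partitions)
        ]
        folds.append((training, validation))
    return folds
```

Because the draws are replayed from the same seed, every element lands in exactly one validation fold, and each training set is exactly the complement of its validation set. Fold sizes are only approximately equal, since membership is decided by independent random draws.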
