[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-26 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 Let's back off this for now; see instead https://github.com/apache/spark/pull/14826 for just the 'fix' --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 I probably need some oversight here since I'm not familiar with pyspark in this regard, but it looks like PythonMLLibAPI treated seeds a little differently in a few cases which meant they couldn't

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-23 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/14524 Just to clarify what we are talking about on the Python side, pyspark-mllib given `seed=None` will generate a random seed based on system time, while pyspark-ml given `seed=None` will use a
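A minimal sketch of the two default-seed policies being contrasted here, in plain Python with no Spark dependency. The helper names and the constant `FIXED_DEFAULT_SEED` are hypothetical and chosen only for illustration. What pyspark-ml actually falls back to is cut off in the snippet above and is not assumed here.

```python
import random
import time

# Hypothetical constant, used only to illustrate a "fixed default" policy.
FIXED_DEFAULT_SEED = 42


def resolve_seed_time_based(seed=None):
    """Return the given seed, or derive one from the system clock if seed is None."""
    if seed is None:
        # Keep the low 32 bits of the nanosecond clock as the seed.
        return time.time_ns() & 0xFFFFFFFF
    return seed


def resolve_seed_fixed(seed=None):
    """Return the given seed, or a fixed constant if seed is None."""
    if seed is None:
        return FIXED_DEFAULT_SEED
    return seed


def split_indices(n, seed):
    """Shuffle range(n) with a generator seeded by `seed` and return the ordering."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return idx
```

Under the fixed policy, repeated calls with `seed=None` produce the same shuffle, which keeps runs reproducible but means every run sees the same split. Under the time-based policy the split generally differs between runs unless the caller passes an explicit seed.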

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 Yes, isn't that why it's possible to fix a seed? I can understand an argument that the default should be non-random, but every API I've ever seen (including Spark's) defaults to a random seed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14524 Sorry for the late response! I'm against this change since it introduces nondeterministic behavior and makes applications hard to debug. For example, I want to cross validate some estimator that accepts

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64193/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64193/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14524 + @yanboliang for R `spark.gaussianMixture`

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64193/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64165/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64165 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64165/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64165 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64165/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64064/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64064/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64064/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64060/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64060/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #64060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64060/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-19 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 Jenkins retest this please

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 OK, this sounds like it could be a good change. I'd like to ask @jkbradley again in case there was a strong reason I'm missing to fix the seed by default. Also CC @davies if possible to consider the

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63781/ Test PASSed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test PASSed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63781/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63781/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63743 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63743/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63743/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63743/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-11 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 @mengxr do you happen to have an opinion on this?

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/14524 +1 for changing this. If left as is, JIRAs like this one will be a recurring theme. A lot of users would probably not even notice, then wind up getting degraded results and not know why.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63386/ Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63386/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test FAILed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63386/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 Yes, it makes sense to try to fix seeds in tests for reproducibility, and most of the code does that already (hence tests pass), but right now it doesn't seem possible to get random behavior unless

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14524 I share some of @MLnick's concern; we've looked at similar changes that use system random seeds for the defaults in other places and had difficulty making reproducible tests - in general we

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-08 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/14524 I tend to agree the default should be a new random seed, and it is more consistent with other libs. But @jkbradley seemed to explicitly want things to default to reproducible behavior in

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63325/ Test PASSed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14524 Merged build finished. Test PASSed.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63325/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14524 **[Test build #63325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63325/consoleFull)** for PR 14524 at commit

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-07 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14524 Want to check with @jkbradley on this before proceeding, but seems to pass tests