[GitHub] spark issue #19979: [SPARK-22881][ML][TEST] ML regression package testsuite ...

2017-12-29 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19979 > > test statistics (such as min/max) on global transformer output > This is also used in some tests, such as the "predictRaw and predictProbability" test case in `DecisionTreeClassifierSuite`

2017-12-28 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19979 @jkbradley > When there has been a shuffle, it is likely the Rows will not follow a fixed order. Agreed. But we can make sure it generates a fixed order from the last shuffle position

2017-12-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19979 Since we are under time pressure, I'm OK with keeping it for now, but let's plan on refactoring the tests later to get rid of the use of globalCheckFunction wherever possible.

2017-12-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19979 > assume that Dataset.collect() returns the Rows in a fixed order. I'm quite sure that: * When the Dataset has been constructed without any shuffles or repartitions, then Rows are alwa

2017-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85469/

2017-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Merged build finished. Test PASSed.

2017-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85469/testReport)** for PR 19979 at commit [`de345dc`](https://github.com/apache/spark/commit/d

2017-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85469/testReport)** for PR 19979 at commit [`de345dc`](https://github.com/apache/spark/commit/de

2017-12-28 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19979 @MrBago Merged your code suggestion. Thanks!

2017-12-28 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19979 @jkbradley There are two cases which can use `globalCheckFunction`: - test statistics (such as min/max) on global transformer output - get global result array and compare it with hardc

2017-12-27 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19979 Actually, going further than what Bago said: All of the places which use globalCheckFunction assume that Dataset.collect() returns the Rows in a fixed order. We should really fix those unit tests
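
The ordering concern raised here can be illustrated without Spark: a check that compares collected rows position by position fails when the same rows arrive in a different order, while a multiset comparison does not. The following is only a minimal sketch in Python, assuming rows are hashable tuples; `rows_match_ignoring_order` and the sample rows are hypothetical and are not part of Spark or of this PR.

```python
from collections import Counter


def rows_match_ignoring_order(actual, expected):
    """Return True when both lists hold the same rows with the same multiplicities."""
    # Counter treats each list as a multiset, so row position is ignored
    # but duplicate rows are still counted.
    return Counter(actual) == Counter(expected)


# Hypothetical collected output: the same rows in a different arrival order.
expected_rows = [(0, 0.1), (1, 0.9), (2, 0.5)]
collected_rows = [(2, 0.5), (0, 0.1), (1, 0.9)]

# A position-by-position comparison depends on the row order.
order_sensitive = collected_rows == expected_rows

# The multiset comparison does not.
order_insensitive = rows_match_ignoring_order(collected_rows, expected_rows)
```

Sorting both sides by a key column before comparing would be another option; the multiset form is used here only because it needs no assumption about which column is a key.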

2017-12-27 Thread MrBago
Github user MrBago commented on the issue: https://github.com/apache/spark/pull/19979 @WeichenXu123 it looks like `testTransformer` is a special case of `testTransformerByGlobalCheckFunc`. I think it's cleaner to structure the tests in this way instead of passing around nulls, because

2017-12-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Merged build finished. Test PASSed.

2017-12-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85374/

2017-12-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Merged build finished. Test PASSed.

2017-12-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85376/

2017-12-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85374 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85374/testReport)** for PR 19979 at commit [`f7a54ae`](https://github.com/apache/spark/commit/f

2017-12-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85376 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85376/testReport)** for PR 19979 at commit [`7bc588a`](https://github.com/apache/spark/commit/7

2017-12-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85376/testReport)** for PR 19979 at commit [`7bc588a`](https://github.com/apache/spark/commit/7b

2017-12-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19979 **[Test build #85374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85374/testReport)** for PR 19979 at commit [`f7a54ae`](https://github.com/apache/spark/commit/f7