[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94704/testReport)** for PR 22079 at commit [`bab8e68`](https://github.com/apache/spark/commit/bab8e68bc292a3a71ee378a839fd540dcf0a72bd). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94701/testReport)** for PR 22079 at commit [`81f57fe`](https://github.com/apache/spark/commit/81f57febfa5f81cd41ac7803eb9d8931df88fc5c).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94665/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94665/testReport)** for PR 22079 at commit [`8d2d558`](https://github.com/apache/spark/commit/8d2d5585b2c2832cd4d88b3851607ce15180cca5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94665/testReport)** for PR 22079 at commit [`8d2d558`](https://github.com/apache/spark/commit/8d2d5585b2c2832cd4d88b3851607ce15180cca5).
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 Hmmm... I somehow managed to break the SparkR tests by fixing a comment. It seems to have auto-retried and broken the second time too.
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 retest this please
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94656/ Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94656/testReport)** for PR 22079 at commit [`8d2d558`](https://github.com/apache/spark/commit/8d2d5585b2c2832cd4d88b3851607ce15180cca5).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22079 Both seem fine to me; it's just a minor improvement. Normally we don't backport an improvement, but since it's a simple and small change I'm confident it is safe to also include the change in a backport PR.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94656/testReport)** for PR 22079 at commit [`8d2d558`](https://github.com/apache/spark/commit/8d2d5585b2c2832cd4d88b3851607ce15180cca5).
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079

@jiangxb1987

> We shall also include #20088 in this backport PR.

I did that shortly after commenting, which allowed the tests to pass. I squashed it into the first commit, so it wasn't obvious that I had done it. Should I also include #20426 in this PR, or treat that separately?
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22079 We shall also include #20088 in this backport PR.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94633/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94633 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94633/testReport)** for PR 22079 at commit [`495cba5`](https://github.com/apache/spark/commit/495cba55aee0223daef089fc8513962997468f77).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class RecordBinaryComparator extends RecordComparator`
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94633/testReport)** for PR 22079 at commit [`495cba5`](https://github.com/apache/spark/commit/495cba55aee0223daef089fc8513962997468f77).
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079

The test "model load / save" in ChiSqSelectorSuite fails because of this line in [ChiSqSelector.scala](https://github.com/apache/spark/blob/branch-2.2/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala#L147):

    spark.createDataFrame(dataArray).repartition(1).write.parquet(Loader.dataPath(path))

In 2.4, the line is:

    spark.createDataFrame(sc.makeRDD(dataArray, 1)).write.parquet(Loader.dataPath(path))

If you change 2.4 to use the 2.2 version of that line, and also remove the follow-up PR (#20426), which avoids sorting when there is one partition, the test fails on 2.4 in the same way.

So I am not sure which way to go: update ChiSqSelector.scala to match 2.4 (a one-line change), or make the test accept the new order.
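
To make the ordering issue above concrete, here is a minimal plain-Scala sketch with no Spark dependency. It assumes, based on the mention of sorting in #20426, that the fix sorts rows locally before dealing them out round-robin. The `Row` alias, `roundRobin`, and the sort key are hypothetical stand-ins for illustration and are not Spark APIs.

```scala
object RepartitionOrderSketch {
  // Hypothetical stand-in for a row; not Spark's internal row type.
  type Row = (Int, String)

  /** Deal rows out to partitions round-robin, optionally sorting them first. */
  def roundRobin(rows: Seq[Row], numPartitions: Int, sortFirst: Boolean): Vector[Vector[Row]] = {
    // Assumption: the local sort is modeled here as a sort on the first field.
    val input = if (sortFirst) rows.sortBy(_._1) else rows
    val empty = Vector.fill(numPartitions)(Vector.empty[Row])
    input.zipWithIndex.foldLeft(empty) { case (parts, (row, i)) =>
      val p = i % numPartitions
      parts.updated(p, parts(p) :+ row)
    }
  }

  def main(args: Array[String]): Unit = {
    val data: Seq[Row] = Seq((3, "c"), (1, "a"), (2, "b"))
    // With a sort, a single output partition no longer preserves input order.
    println(roundRobin(data, 1, sortFirst = true).head)
    // Without the sort (the one-partition shortcut), input order is kept.
    println(roundRobin(data, 1, sortFirst = false).head)
  }
}
```

Under this model, a test that compares saved rows positionally would see a different order once the sort is in place, which is consistent with the two options described above: avoid `repartition(1)` in the writer, or make the test order-insensitive.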
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94619/ Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94619/testReport)** for PR 22079 at commit [`efccc02`](https://github.com/apache/spark/commit/efccc028bce64bf4754ce81ee16533c19b4384b2).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public final class RecordBinaryComparator extends RecordComparator`
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 The original fix https://github.com/apache/spark/pull/22079/commits/efccc028bce64bf4754ce81ee16533c19b4384b2 has been merged into Spark 2.3. After 5+ months, we have not received any report of a correctness regression caused by this fix, although the fix definitely introduces a performance regression. I think we should merge it. The general risk is not very high, and it resolves a serious correctness bug.
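
The trade-off described above can be illustrated with a small plain-Scala sketch. This is an assumed reading of the class of bug involved, not something spelled out in this thread: if a retried task sees the same rows in a different order, round-robin assignment can send rows to different partitions than the first attempt did, and mixing partitions from both attempts loses or duplicates rows. Sorting first costs extra work but makes the assignment repeatable. All names below are hypothetical.

```scala
object RetrySketch {
  /** Round-robin assignment: the row at index i goes to partition i % n. */
  def roundRobin(rows: Seq[Int], n: Int): Vector[Vector[Int]] =
    Vector.tabulate(n) { p =>
      rows.zipWithIndex.collect { case (r, i) if i % n == p => r }.toVector
    }

  def main(args: Array[String]): Unit = {
    val attempt1 = Seq(1, 2, 3, 4)
    val attempt2 = Seq(2, 1, 4, 3) // same rows, different arrival order on retry

    // Partition 0 kept from the first attempt, partition 1 recomputed on retry.
    val mixed = roundRobin(attempt1, 2)(0) ++ roundRobin(attempt2, 2)(1)
    println(mixed) // rows 2 and 4 are missing, rows 1 and 3 appear twice

    // Sorting before distribution makes both attempts agree.
    val mixedSorted = roundRobin(attempt1.sorted, 2)(0) ++ roundRobin(attempt2.sorted, 2)(1)
    println(mixedSorted) // every row appears exactly once
  }
}
```

The extra sort in the second variant is the kind of cost that a performance regression would come from, while the first variant shows the wrong-answer risk it removes.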
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 cc @jiangxb1987
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22079 **[Test build #94619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94619/testReport)** for PR 22079 at commit [`efccc02`](https://github.com/apache/spark/commit/efccc028bce64bf4754ce81ee16533c19b4384b2).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22079 Can one of the admins verify this patch?