[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21443 @gatorsmile ok, I will (so, I reopened https://issues.apache.org/jira/browse/SPARK-24369) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21443 I will revert this PR now. @maropu Could you submit a new fix to resolve the above issue?
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21443

For the following query, we can see a performance regression:

```SQL
SELECT sum(DISTINCT x), avg(DISTINCT x) FROM (VALUES (1, 1), (2, 2), (2, 2)) t(x, y)
```

Before this PR:

```
== Optimized Logical Plan ==
Aggregate [sum(distinct cast(x#189 as bigint)) AS sum(DISTINCT x)#193L, avg(distinct cast(x#189 as bigint)) AS avg(DISTINCT x)#194]
+- LocalRelation [x#189]
```

After this PR:

```
== Optimized Logical Plan ==
Aggregate [sum(if ((gid#195 = 1)) CAST(`x` AS BIGINT)#196L else null) AS sum(DISTINCT x)#193L, avg(if ((gid#195 = 1)) CAST(`x` AS BIGINT)#196L else null) AS avg(DISTINCT x)#194]
+- Aggregate [CAST(`x` AS BIGINT)#196L, gid#195], [CAST(`x` AS BIGINT)#196L, gid#195]
   +- Expand [List(cast(x#189 as bigint), 1)], [CAST(`x` AS BIGINT)#196L, gid#195]
      +- LocalRelation [x#189]
```
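As a rough illustration of why the second plan is a regression and not a correctness problem, here is a minimal Python sketch that mimics the two plan shapes on the three rows from the query. It is not Spark code: the function names `single_aggregate` and `expand_then_aggregate` are hypothetical, and the sketch only assumes what the plans above show, namely that both aggregates are DISTINCT over the same column `x` and that the new plan adds an Expand and an extra Aggregate keyed on `(x, gid)`.

```python
# Rows of t(x, y) from the query: VALUES (1, 1), (2, 2), (2, 2)
rows = [(1, 1), (2, 2), (2, 2)]


def single_aggregate(rows):
    """'Before' shape: one aggregate de-duplicates x once for both functions."""
    distinct_x = {x for x, _ in rows}
    return sum(distinct_x), sum(distinct_x) / len(distinct_x)


def expand_then_aggregate(rows):
    """'After' shape: Expand -> Aggregate on (x, gid) -> final Aggregate."""
    # Expand: project each row to (x, gid) with the single group id 1.
    expanded = [(x, 1) for x, _ in rows]
    # Inner aggregate: grouping on (x, gid) removes duplicate x values.
    grouped = set(expanded)
    # Outer aggregate: apply sum/avg only to rows where gid == 1.
    xs = [x for x, gid in grouped if gid == 1]
    return sum(xs), sum(xs) / len(xs)


print(single_aggregate(rows))
print(expand_then_aggregate(rows))
```

Both functions return the same pair for this input. The second one just takes an extra pass over the data, which is the extra work the Expand and inner Aggregate nodes represent when there is only one distinct column.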
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21443 good catch! LGTM, merging to master!
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91245/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91245 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91245/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3659/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3658/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91245 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91245/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21443 retest this please
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91244/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91244/ Test FAILed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test FAILed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91244/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21443 retest this please
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91241/ Test FAILed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test FAILed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91241/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3655/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91241/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21443 cc @hvanhovell
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91230/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91230/testReport)** for PR 21443 at commit [`00f6ad9`](https://github.com/apache/spark/commit/00f6ad9547f462fd0cc3377cdd3aee44be19ffaf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Merged build finished. Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3645/ Test PASSed.
[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21443 **[Test build #91230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91230/testReport)** for PR 21443 at commit [`00f6ad9`](https://github.com/apache/spark/commit/00f6ad9547f462fd0cc3377cdd3aee44be19ffaf).