[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-06-01 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21443
  
@gatorsmile ok, I will (so, I reopened 
https://issues.apache.org/jira/browse/SPARK-24369)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-06-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21443
  
I will revert this PR now. @maropu Could you submit a new fix to resolve 
the above issue?


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-06-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21443
  
For the following query, we can see a performance regression:
```SQL
SELECT sum(DISTINCT x), avg(DISTINCT x)
FROM (VALUES (1, 1), (2, 2), (2, 2)) t(x, y)
```

Before this PR:
```
== Optimized Logical Plan ==
Aggregate [sum(distinct cast(x#189 as bigint)) AS sum(DISTINCT x)#193L, avg(distinct cast(x#189 as bigint)) AS avg(DISTINCT x)#194]
+- LocalRelation [x#189]
```

After this PR:
```
== Optimized Logical Plan ==
Aggregate [sum(if ((gid#195 = 1)) CAST(`x` AS BIGINT)#196L else null) AS sum(DISTINCT x)#193L, avg(if ((gid#195 = 1)) CAST(`x` AS BIGINT)#196L else null) AS avg(DISTINCT x)#194]
+- Aggregate [CAST(`x` AS BIGINT)#196L, gid#195], [CAST(`x` AS BIGINT)#196L, gid#195]
   +- Expand [List(cast(x#189 as bigint), 1)], [CAST(`x` AS BIGINT)#196L, gid#195]
      +- LocalRelation [x#189]
```
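
To make the difference between the two plans concrete, here is a minimal plain-Python sketch (not Spark code) that mimics each plan shape on the three input rows. The function names `plan_before` and `plan_after` are hypothetical and exist only for illustration. It is meant to show that both shapes produce the same answer, while the second one routes every row through an extra Expand and an extra grouping step.

```python
# Rows of t(x, y) from the query above.
rows = [(1, 1), (2, 2), (2, 2)]


def plan_before(rows):
    """Single Aggregate: the DISTINCT aggregates deduplicate x directly."""
    distinct_x = {int(x) for x, _ in rows}
    return sum(distinct_x), sum(distinct_x) / len(distinct_x)


def plan_after(rows):
    """Expand -> Aggregate(group by x, gid) -> Aggregate, as in the second plan."""
    # Expand: project every row to (cast(x as bigint), gid = 1).
    expanded = [(int(x), 1) for x, _ in rows]
    # Inner Aggregate: grouping by (x, gid) removes the duplicate rows.
    grouped = sorted(set(expanded))
    # Outer Aggregate: "if (gid = 1) x else null"; sum/avg ignore nulls.
    values = [x if gid == 1 else None for x, gid in grouped]
    non_null = [v for v in values if v is not None]
    return sum(non_null), sum(non_null) / len(non_null)


result_before = plan_before(rows)
result_after = plan_after(rows)
print(result_before, result_after)
```

Both functions return the same pair of values here; the sketch says nothing about actual Spark execution cost, only about the extra operators the rewritten plan introduces.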


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-30 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21443
  
good catch! LGTM, merging to master!


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91245/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91245 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91245/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3659/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3658/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91245 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91245/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21443
  
retest this please


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91244/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91244/
Test FAILed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91244/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21443
  
retest this please


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91241/
Test FAILed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91241/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3655/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91241/testReport)** for PR 21443 at commit [`29e6485`](https://github.com/apache/spark/commit/29e64851f51aad5d79b2722e7ee2f8aeb7d8bf8a).


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21443
  
cc @hvanhovell 


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91230/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91230/testReport)** for PR 21443 at commit [`00f6ad9`](https://github.com/apache/spark/commit/00f6ad9547f462fd0cc3377cdd3aee44be19ffaf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21443
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3645/
Test PASSed.


---




[GitHub] spark issue #21443: [SPARK-24369][SQL] Correct handling for multiple distinc...

2018-05-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21443
  
**[Test build #91230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91230/testReport)** for PR 21443 at commit [`00f6ad9`](https://github.com/apache/spark/commit/00f6ad9547f462fd0cc3377cdd3aee44be19ffaf).


---
