[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14296


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-28 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72579088
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

I updated this. Please take a look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72396129
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

what about we do some improve for next case:

1. add one more check: the expression should not exist in the chid output
2. if the expression is a `NamedExpression` already, don't alias it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-26 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72387057
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

Since we do a transform, the inner col of `Grouping` will match this case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72386739
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

but in the next case, we will check: 1) if it's a grouping column. 2) if 
it's not grouping function. So how can this happen?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-26 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72386503
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

When there is a `Grouping(col#1)` in the HAVING clause. If we push down it 
to Aggregate and use an Alias instead, the grouping function become 
`Grouping(col#1#2)`. Then in the rule `ResolveGroupingAnalytics` we try to find 
the index of col in group by expressions, we can't find it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72380502
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
--- End diff --

I don't quite understand this comment, can you give a concrete example?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72177317
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
+  case ne: NamedExpression => ne
+  // Grouping functions are handled in the rule 
[[ResolveGroupingAnalytics]].
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

Yes. SQLQuerySuite has test for with grouping function in HAVING.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72175501
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
+  case ne: NamedExpression => ne
+  // Grouping functions are handled in the rule 
[[ResolveGroupingAnalytics]].
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

I mean with this check. Can we resolve grouping function inside Filter?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72070208
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
+  case ne: NamedExpression => ne
+  // Grouping functions are handled in the rule 
[[ResolveGroupingAnalytics]].
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

You mean if we don't add this check? Yes. As the grouping function is 
pushed down to Aggregation, the rule `ResolveGroupingAnalytics` can't perform 
the replacement of grouping function.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72034466
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,17 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  // Replacing [[NamedExpression]] causes the error on 
[[Grouping]] because the
+  // grouping column will be new attribute created by adding 
additional [[Alias]].
+  // So we can't find the grouping column and replace it in 
the rule
+  // [[ResolveGroupingAnalytics]].
+  case ne: NamedExpression => ne
+  // Grouping functions are handled in the rule 
[[ResolveGroupingAnalytics]].
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

so what will happen if we use grouping function in HAVING? will the query 
fail?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72018862
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

yah. `ResolveGroupingAnalytics` performs grouping id and grouping function 
replacement. We should skip pushdown them here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72018679
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
--- End diff --

Removing it will cause error in `replaceGroupingFunc`. As we create a new 
attribute based on the grouping column with additional alias, we can't find it 
anymore in aggregate's group by expressions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72011203
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
--- End diff --

but `case e: Expression if grouping.exists(_.semanticEquals(e))` will 
filter them out right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-24 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72009959
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
--- End diff --

I think we don't need to replace named expr (attributes, alias)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72009643
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

It will be greate if we can figure it out and add comment here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72009599
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -46,6 +46,14 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   Row("one", 6) :: Row("three", 3) :: Nil)
   }
 
+  test("having condition contains grouping column") {
+Seq(("one", 1), ("two", 2), ("three", 3), ("one", 5)).toDF("k", "v")
--- End diff --

wrap it with `withTempView`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r72009571
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
--- End diff --

why this case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-22 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71965620
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

I am not near my laptop. But pushing grouping function causes test failed 
in SQLQuerySuite. I remember there is another rule taking care of grouping 
function.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-22 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71912313
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
+  !ResolveGroupingAnalytics.hasGroupingFunction(e) =>
--- End diff --

Why `!ResolveGroupingAnalytics.hasGroupingFunction(e)` is needed? Do we 
want a test case for this check?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-22 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71842274
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1207,6 +1207,12 @@ class Analyzer(
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression if grouping.exists(_.semanticEquals(e)) &&
--- End diff --

Only push down the group by expressions or aggregate expressions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71832244
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1202,11 +1203,16 @@ class Analyzer(
   if (resolvedOperator.resolved) {
 // Try to replace all aggregate expressions in the filter by 
an alias.
 val aggregateExpressions = ArrayBuffer.empty[NamedExpression]
-val transformedAggregateFilter = 
resolvedAggregateFilter.transform {
+val transformedAggregateFilter = 
resolvedAggregateFilter.transformDown {
   case ae: AggregateExpression =>
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression =>
+val alias = Alias(e, e.toString)()
+aggregateExpressions += alias
+alias.toAttribute
--- End diff --

Looks like pushing the whole filter condition brings more problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71698067
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1202,11 +1203,16 @@ class Analyzer(
   if (resolvedOperator.resolved) {
 // Try to replace all aggregate expressions in the filter by 
an alias.
 val aggregateExpressions = ArrayBuffer.empty[NamedExpression]
-val transformedAggregateFilter = 
resolvedAggregateFilter.transform {
+val transformedAggregateFilter = 
resolvedAggregateFilter.transformDown {
   case ae: AggregateExpression =>
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression =>
+val alias = Alias(e, e.toString)()
+aggregateExpressions += alias
+alias.toAttribute
--- End diff --

yup


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71676064
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1202,11 +1203,16 @@ class Analyzer(
   if (resolvedOperator.resolved) {
 // Try to replace all aggregate expressions in the filter by 
an alias.
 val aggregateExpressions = ArrayBuffer.empty[NamedExpression]
-val transformedAggregateFilter = 
resolvedAggregateFilter.transform {
+val transformedAggregateFilter = 
resolvedAggregateFilter.transformDown {
   case ae: AggregateExpression =>
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression =>
+val alias = Alias(e, e.toString)()
+aggregateExpressions += alias
+alias.toAttribute
--- End diff --

You mean we don't replace and push the expressions, but just push the whole 
filter condition?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14296#discussion_r71647958
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1202,11 +1203,16 @@ class Analyzer(
   if (resolvedOperator.resolved) {
 // Try to replace all aggregate expressions in the filter by 
an alias.
 val aggregateExpressions = ArrayBuffer.empty[NamedExpression]
-val transformedAggregateFilter = 
resolvedAggregateFilter.transform {
+val transformedAggregateFilter = 
resolvedAggregateFilter.transformDown {
   case ae: AggregateExpression =>
 val alias = Alias(ae, ae.toString)()
 aggregateExpressions += alias
 alias.toAttribute
+  case ne: NamedExpression => ne
+  case e: Expression =>
+val alias = Alias(e, e.toString)()
+aggregateExpressions += alias
+alias.toAttribute
--- End diff --

looks like we are blindly pushing all expression into `Aggregate`, how 
about we just push the whole filter condition?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...

2016-07-20 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/14296

[SPARK-16639][SQL] The query with having condition that contains grouping 
by column should work

## What changes were proposed in this pull request?

The query with having condition that contains grouping by column will be 
failed during analysis. E.g.,

create table tbl(a int, b string);
select count(b) from tbl group by a + 1 having a + 1 = 2;

Having condition should be able to use grouping by column.

## How was this patch tested?

Jenkins tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 having-contains-grouping-column

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14296.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14296


commit 5704709fcddf90aa62810d82e4546eadd274e630
Author: Liang-Chi Hsieh 
Date:   2016-07-21T05:03:01Z

The query with having condition that contains grouping by column should 
work.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org