[GitHub] [spark] cloud-fan commented on a diff in pull request #39509: [SPARK-41635][SQL] Fix group by all error reporting

2023-01-11 Thread GitBox



cloud-fan commented on code in PR #39509:
URL: https://github.com/apache/spark/pull/39509#discussion_r1067633765


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByAll.scala:
##
@@ -93,8 +93,9 @@ object ResolveGroupByAll extends Rule[LogicalPlan] {
    * end of analysis, so we can tell users that we fail to infer the grouping columns.
    */
   def checkAnalysis(operator: LogicalPlan): Unit = operator match {
-    case a: Aggregate if matchToken(a) =>
-      if (a.aggregateExpressions.exists(_.exists(_.isInstanceOf[Attribute]))) {
+    case a: Aggregate if a.aggregateExpressions.forall(_.resolved) && matchToken(a) =>

Review Comment:
   The check code here is a partial version of the `apply` code. I can't think
of a simple refactor that would further reduce the code duplication. Let's
revisit this later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on a diff in pull request #39509: [SPARK-41635][SQL] Fix group by all error reporting

2023-01-11 Thread GitBox



cloud-fan commented on code in PR #39509:
URL: https://github.com/apache/spark/pull/39509#discussion_r1066909968


##
sql/core/src/test/resources/sql-tests/inputs/group-by-all.sql:
##
@@ -75,8 +75,11 @@ select (id + id) / 2 + count(*) * 2 from data group by all;
 select country, (select count(*) from data) as cnt, count(id) as cnt_id from data group by all;
 
 -- correlated subquery should also work
-select (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;
+select country, (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;
 
--- make sure we report the right error when there's an attribute in correlated subquery
--- that is, to report UNRESOLVED_ALL_IN_GROUP_BY, rather than some random subquery error.
-select coutnry, (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;

Review Comment:
   `coutnry` is a non-existent column, so we should report the normal
missing-column error.


