[GitHub] [spark] cloud-fan commented on a diff in pull request #39509: [SPARK-41635][SQL] Fix group by all error reporting
cloud-fan commented on code in PR #39509:
URL: https://github.com/apache/spark/pull/39509#discussion_r1067633765

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveGroupByAll.scala:
##
@@ -93,8 +93,9 @@ object ResolveGroupByAll extends Rule[LogicalPlan] {
    * end of analysis, so we can tell users that we fail to infer the grouping columns.
    */
   def checkAnalysis(operator: LogicalPlan): Unit = operator match {
-    case a: Aggregate if matchToken(a) =>
-      if (a.aggregateExpressions.exists(_.exists(_.isInstanceOf[Attribute]))) {
+    case a: Aggregate if a.aggregateExpressions.forall(_.resolved) && matchToken(a) =>

Review Comment:
   The check code here is a partial version of the `apply` code. I can't think of a simple refactor that would reduce the code duplication further. Let's think about this later.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a diff in pull request #39509: [SPARK-41635][SQL] Fix group by all error reporting
cloud-fan commented on code in PR #39509:
URL: https://github.com/apache/spark/pull/39509#discussion_r1066909968

## sql/core/src/test/resources/sql-tests/inputs/group-by-all.sql:
##
@@ -75,8 +75,11 @@ select (id + id) / 2 + count(*) * 2 from data group by all;
 select country, (select count(*) from data) as cnt, count(id) as cnt_id from data group by all;

 -- correlated subquery should also work
-select (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;
+select country, (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;

--- make sure we report the right error when there's an attribute in correlated subquery
--- that is, to report UNRESOLVED_ALL_IN_GROUP_BY, rather than some random subquery error.
-select coutnry, (select count(*) from data d1 where d1.country = d2.country), count(id) from data d2 group by all;

Review Comment:
   `coutnry` is a non-existent column, so we should report the normal missing-column error.
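
To illustrate the point about the misspelled column, here is a minimal toy sketch of how a resolved-only guard changes which error surfaces first. It does not use Spark's classes: `Expr`, `Column`, `AggCall`, `Agg`, `GroupByAllCheck`, and the `MISSING_COLUMN` label are all hypothetical stand-ins, and the GROUP BY ALL check is simplified (it ignores the attribute test in the original code).

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
sealed trait Expr { def resolved: Boolean }
final case class Column(name: String, resolved: Boolean) extends Expr
final case class AggCall(fn: String) extends Expr { val resolved: Boolean = true }

// groupByAllPending = true means GROUP BY ALL was never expanded
// into concrete grouping columns.
final case class Agg(selectList: Seq[Expr], groupByAllPending: Boolean)

object GroupByAllCheck {
  // Simplified GROUP BY ALL check. With guardOnResolved = true it only
  // fires when every select-list item is already resolved.
  def groupByAllError(a: Agg, guardOnResolved: Boolean): Option[String] = {
    val eligible = !guardOnResolved || a.selectList.forall(_.resolved)
    if (eligible && a.groupByAllPending) Some("UNRESOLVED_ALL_IN_GROUP_BY") else None
  }

  // Generic check for a column that could not be resolved.
  // "MISSING_COLUMN" is a placeholder label, not a real Spark error class.
  def missingColumnError(a: Agg): Option[String] =
    a.selectList.collectFirst { case Column(name, false) => s"MISSING_COLUMN: $name" }

  // The GROUP BY ALL check runs first; the generic check is the fallback.
  def firstError(a: Agg, guardOnResolved: Boolean): Option[String] =
    groupByAllError(a, guardOnResolved).orElse(missingColumnError(a))
}
```

In this sketch, an `Agg` whose select list contains `Column("coutnry", resolved = false)` yields the missing-column error when `guardOnResolved` is true, and `UNRESOLVED_ALL_IN_GROUP_BY` when it is false.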