[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-09-25 Thread GitBox


cloud-fan commented on code in PR #36265:
URL: https://github.com/apache/spark/pull/36265#discussion_r979562710


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -2594,6 +2601,31 @@ class Analyzer(override val catalogManager: CatalogManager)
 })
 }
 
+private def resolveTemp(expr: Expression, agg: Aggregate): Expression = {

Review Comment:
   This is very similar to the `resolveCol` method inside 
`resolveExprsWithAggregate`. Can we have a common method to share code?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-05-06 Thread GitBox


cloud-fan commented on code in PR #36265:
URL: https://github.com/apache/spark/pull/36265#discussion_r866863003


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -2563,7 +2563,23 @@ class Analyzer(override val catalogManager: CatalogManager)
 
   case Sort(sortOrder, global, agg: Aggregate) if agg.resolved =>
     // We should resolve the references normally based on child (agg.output) first.
-    val maybeResolved = sortOrder.map(_.child).map(resolveExpressionByPlanOutput(_, agg))

Review Comment:
   e.g.
   ```
   val maybeResolved = sortOrder.map(_.child).map { expr =>
     val resolved = resolveTemp(expr, agg)
     if (resolved.exists(_.isInstanceOf[AggregateFunction])) {
       unresolve(resolved)
     } else {
       resolved
     }
   }
   resolveOperatorWithAggregate...
   ```






[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-05-06 Thread GitBox


cloud-fan commented on code in PR #36265:
URL: https://github.com/apache/spark/pull/36265#discussion_r866859656


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -2563,7 +2563,23 @@ class Analyzer(override val catalogManager: CatalogManager)
 
   case Sort(sortOrder, global, agg: Aggregate) if agg.resolved =>
     // We should resolve the references normally based on child (agg.output) first.
-    val maybeResolved = sortOrder.map(_.child).map(resolveExpressionByPlanOutput(_, agg))

Review Comment:
   My proposal is: we still try to resolve the column against `agg.output` 
first, but only temporarily (using `TempResolvedColumn`). If the ORDER BY 
expression is then resolved but contains aggregate functions, we unresolve the 
column and call `resolveOperatorWithAggregate`.
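
   To make the rule concrete, here is a toy model of it. This is not Catalyst 
code: `Unresolved`, `TempResolved`, `AggOutput`, `TableCol`, `Sum`, `Abs` and 
`resolveOrderBy` are hypothetical stand-ins, and the second resolution pass only 
approximates what `resolveOperatorWithAggregate` would do. It is a sketch under 
those assumptions, just to check that the two cases (`order by sum(id)` and 
`order by abs(id)`) come out differently.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst classes.
sealed trait Expr
case class Unresolved(name: String) extends Expr    // a bare name such as `id`
case class TempResolved(name: String) extends Expr  // tentatively bound to an Aggregate output
case class AggOutput(name: String) extends Expr     // finally bound to an Aggregate output
case class TableCol(name: String) extends Expr      // bound to a column of the Aggregate's child
case class Sum(child: Expr) extends Expr            // an aggregate function
case class Abs(child: Expr) extends Expr            // a scalar function

object OrderByResolution {
  // Rewrite the tree bottom-up, applying `f` wherever it is defined.
  def transform(e: Expr)(f: PartialFunction[Expr, Expr]): Expr = {
    val withNewChildren = e match {
      case Sum(c) => Sum(transform(c)(f))
      case Abs(c) => Abs(transform(c)(f))
      case leaf   => leaf
    }
    f.applyOrElse(withNewChildren, (x: Expr) => x)
  }

  // True if the ORDER BY expression itself contains an aggregate function.
  def containsAggregate(e: Expr): Boolean = e match {
    case Sum(_) => true
    case Abs(c) => containsAggregate(c)
    case _      => false
  }

  def resolveOrderBy(e: Expr, aggOutput: Set[String], childOutput: Set[String]): Expr = {
    // Step 1: resolve names against the Aggregate output, but only temporarily.
    val temp = transform(e) { case Unresolved(n) if aggOutput(n) => TempResolved(n) }
    if (containsAggregate(temp)) {
      // Step 2a: aggregate function present, so undo the temporary resolution
      // and resolve against the Aggregate's child instead.
      val unresolved = transform(temp) { case TempResolved(n) => Unresolved(n) }
      transform(unresolved) { case Unresolved(n) if childOutput(n) => TableCol(n) }
    } else {
      // Step 2b: no aggregate function, so keep the Aggregate output binding.
      transform(temp) { case TempResolved(n) => AggOutput(n) }
    }
  }
}
```

   For `select sum(id) as id from range(10) group by id`, both name sets contain 
`id`. In this model `Sum(Unresolved("id"))` resolves to the table column, while 
`Abs(Unresolved("id"))` resolves to the aggregate output.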






[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-05-06 Thread GitBox


cloud-fan commented on code in PR #36265:
URL: https://github.com/apache/spark/pull/36265#discussion_r866856072


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -2563,7 +2563,23 @@ class Analyzer(override val catalogManager: CatalogManager)
 
   case Sort(sortOrder, global, agg: Aggregate) if agg.resolved =>
     // We should resolve the references normally based on child (agg.output) first.
-    val maybeResolved = sortOrder.map(_.child).map(resolveExpressionByPlanOutput(_, agg))

Review Comment:
   I think this bug is much more complicated. Think about `order by sum(id)` 
and `order by abs(id)`. For the first case, we want to resolve `id` to the 
table column and push `sum(id)` to the Aggregate. For the second case, we want 
to resolve `id` to `sum(id) as id`.
   
   How do we define a clear rule here?






[GitHub] [spark] cloud-fan commented on a diff in pull request #36265: [SPARK-38951][SQL] Aggregate aliases override field names in ResolveAggregateFunctions

2022-05-06 Thread GitBox


cloud-fan commented on code in PR #36265:
URL: https://github.com/apache/spark/pull/36265#discussion_r866842272


##
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala:
##
@@ -1176,4 +1176,13 @@ class AnalysisSuite extends AnalysisTest with Matchers {
 false)
 }
   }
+
+  test("SPARK-38951: Aggregate aliases override field names in ResolveAggregateFunctions") {
+    assertAnalysisSuccess(parsePlan(
+      s"""
+         |select sum(id) as id
+         |from range(10)
+         |group by id
+         |order by sum(id)""".stripMargin))

Review Comment:
   Does this query work on other databases?





