Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

via GitHub Thu, 14 Dec 2023 21:35:18 -0800


pan3793 commented on code in PR #44352:
URL: https://github.com/apache/spark/pull/44352#discussion_r1427566432



##########
sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out:
##########
@@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x]
 SELECT udf(b), udf(c) FROM test_having
        GROUP BY b, c HAVING udf(b) = 3 ORDER BY udf(b), udf(c)
 -- !query analysis
-Project [udf(b)#x, udf(c)#x]

Review Comment:
   Although this query has both `HAVING` and `ORDER BY` clauses, only a scalar 
function is present in the `ORDER BY` clause, so I think the previous resolution 
matches item 4 of the `ResolveReferencesInSort` comments:
   
   > 4. Resolves the column to [[AttributeReference]] with the output of a descendant plan node.
   >    Spark will propagate the missing attributes from the descendant plan node to the Sort node.
   >    This is to allow users to ORDER BY columns that are not in the SELECT clause, which is
   >    widely supported in other SQL dialects. For example, `SELECT a FROM t ORDER BY b`.
   
   With this patch, it should match item 3:
   
   > 3. If the child plan is Aggregate or Filter(_, Aggregate), resolves the column to
   >    [[TempResolvedColumn]] with the output of Aggregate's child plan.
   >    This is to allow Sort to host grouping expressions and aggregate functions, which can
   >    be pushed down to the Aggregate later. For example,
   >    `SELECT max(a) FROM t GROUP BY b HAVING max(a) > 1 ORDER BY min(a)`.
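
   For reference, the two example queries quoted above can be tried against any SQL engine that accepts them. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in engine rather than Spark, the table `t` and its rows are made up, and a tie-breaking `a` is added to the first `ORDER BY` so the output is deterministic. It shows what the two query shapes return; it says nothing about how Spark's analyzer resolves them.

```python
import sqlite3

# Stand-in engine (SQLite, NOT Spark); table `t` and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 30), (2, 10), (3, 20), (5, 10), (4, 20)],
)

# Item 4 shape: ORDER BY a column (b) that is not in the SELECT list.
# `a` is appended as a tie-breaker so the row order is deterministic.
item4_rows = [row[0] for row in conn.execute("SELECT a FROM t ORDER BY b, a")]

# Item 3 shape: ORDER BY an aggregate function on top of GROUP BY + HAVING.
item3_rows = [
    row[0]
    for row in conn.execute(
        "SELECT max(a) FROM t GROUP BY b HAVING max(a) > 1 ORDER BY min(a)"
    )
]

print(item4_rows)
print(item3_rows)
conn.close()
```

   In the item 3 shape, `min(a)` appears only in `ORDER BY`, so it has to be evaluated per group alongside `max(a)`; in the item 4 shape, `b` is a plain column of the underlying table.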



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


