[spark] branch branch-3.3 updated: [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF

wenchen Wed, 01 Feb 2023 01:41:57 -0800

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 80e8df11d7e [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF
80e8df11d7e is described below

commit 80e8df11d7e2c135ef707c1c1626b976a8dc09a0
Author: Wenchen Fan <wenc...@databricks.com>
AuthorDate: Wed Feb 1 17:36:14 2023 +0800

    [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF
    
    This is a long-standing correctness issue with Python UDAF and grouping analytics. The rule `ResolveGroupingAnalytics` should take care of Python UDAF when matching aggregate expressions.
    
    bug fix
    
    Yes, the query result was wrong before
    
    existing tests
    
    Closes #39824 from cloud-fan/python.
    
    Authored-by: Wenchen Fan <wenc...@databricks.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
    (cherry picked from commit 1219c8492376e038894111cd5d922229260482e7)
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 .../main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala    | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 84aa06baaff..881f2cc2078 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -617,7 +617,7 @@ class Analyzer(override val catalogManager: CatalogManager)
         // AggregateExpression should be computed on the unmodified value of its argument
         // expressions, so we should not replace any references to grouping expression
         // inside it.
-        case e: AggregateExpression =>
+        case e if AggregateExpression.isAggregate(e) =>
           aggsBuffer += e
           e
         case e if isPartOfAggregation(e) => e
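
To illustrate why the one-line change matters, here is a minimal, self-contained sketch of the pattern-matching difference. It is a toy model, not Spark code: `Expr`, `Col`, `Agg`, `PyUdaf`, `rewriteOld`, `rewriteNew`, and the `k_grouping` column name are all hypothetical stand-ins, and the local `isAggregate` only mimics the intent of `AggregateExpression.isAggregate` as used in the diff (assumed here to accept both built-in aggregates and Python UDAFs).

```scala
// Toy model only: none of these types are Spark's Catalyst classes.
object GroupingAnalyticsSketch {
  sealed trait Expr
  final case class Col(name: String) extends Expr
  // Stand-in for a built-in aggregate wrapped in AggregateExpression.
  final case class Agg(fn: String, arg: Expr) extends Expr
  // Stand-in for a Python UDAF, which is NOT an AggregateExpression subtype.
  final case class PyUdaf(name: String, arg: Expr) extends Expr

  // Hypothetical analogue of AggregateExpression.isAggregate(e).
  def isAggregate(e: Expr): Boolean = e match {
    case _: Agg | _: PyUdaf => true
    case _                  => false
  }

  // Old behavior: only the Agg type is protected, so the argument of a
  // Python UDAF gets its grouping-column reference replaced.
  def rewriteOld(e: Expr): Expr = e match {
    case a: Agg         => a
    case PyUdaf(n, arg) => PyUdaf(n, rewriteOld(arg))
    case Col("k")       => Col("k_grouping")
    case other          => other
  }

  // New behavior: anything recognized as an aggregate is left untouched.
  def rewriteNew(e: Expr): Expr = e match {
    case a if isAggregate(a) => a
    case Col("k")            => Col("k_grouping")
    case other               => other
  }

  def main(args: Array[String]): Unit = {
    val udaf = PyUdaf("py_mean", Col("k"))
    println(rewriteOld(udaf)) // argument rewritten: the wrong input reaches the UDAF
    println(rewriteNew(udaf)) // argument preserved
  }
}
```

In this sketch the type-based match lets the grouping-column replacement leak into the Python UDAF's argument, while the predicate-based match keeps the aggregate computing on the unmodified value, which is the invariant the code comment in the hunk describes.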


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
