Mihailo Timotic created SPARK-52531:
---------------------------------------

             Summary: `OuterReference` in subquery aggregate is incorrectly 
tied to outer query aggregate
                 Key: SPARK-52531
                 URL: https://issues.apache.org/jira/browse/SPARK-52531
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.1.0
            Reporter: Mihailo Timotic


For a query like:

 

{{}}
{code:java}

{code}
{{SELECT MAX(a.col1) }}
{{FROM VALUES (7) AS a(col1) }}
{{GROUP BY a.col1 }}
{{HAVING COUNT(*) = ( }}
{{    SELECT COUNT(*) }}
{{    FROM VALUES (7),(7),(8),(8) AS c(col1) }}
{{    WHERE c.col1 >= a.col1 }}
{{    GROUP BY c.col1 }}
{{    LIMIT 1 }}
{{) }}
{{ORDER BY a.col1 DESC;}}

 

 

Spark outputs result: `7`

{{}}

However, this is an incorrect result and both MySQL and DuckDB correctly output 
0 rows
 
!image-2025-06-19-10-08-02-417.png!
!image-2025-06-19-10-08-15-034.png!

This is because count(*) from the subquery under Having is incorrectly tied as 
an outer reference to the count(*) from the outer query:

 

{{}}
{code:java}
Project [max(col1)#3081747] +- Sort [col1#3081725 DESC NULLS LAST], true    +- 
Project [max(col1)#3081747, col1#3081725]       +- Filter (count(1)#3081750L = 
scalar-subquery#3081727 [col1#3081725])          :  +- GlobalLimit 1          : 
    +- LocalLimit 1          :        +- Aggregate [col1#3081726], 
[outer(count(1)#3081750L) AS count(1)#3081749L]          :           +- Filter 
(col1#3081726 >= outer(col1#3081725))          :              +- SubqueryAlias 
c          :                 +- LocalRelation [col1#3081726]          +- 
Aggregate [col1#3081725], [max(col1#3081725) AS max(col1)#3081747, count(1) AS 
count(1)#3081750L, col1#3081725]             +- SubqueryAlias a                
+- LocalRelation [col1#3081725]{code}
 

 

The outer reference on count(*) is introduced in UpdateOuterReferences rule.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to