mihailoale-db opened a new pull request, #50870:
URL: https://github.com/apache/spark/pull/50870
### What changes were proposed in this pull request?
In the following query, we would have `min(outer(t2.t2a))` as the name of the
`min(t2a)` expression:
```
SELECT t1a
FROM t1
WHERE t1a IN (SELECT t2a
              FROM t2
              WHERE EXISTS (SELECT min(t2a)
                            FROM t3))
```
This is a compatibility problem between the single-pass resolver and the
fixed-point analyzer. In single-pass, names are generated after we finish
resolving the aggregate expression `min(t2a)` (in a bottom-up manner), and at
that point the `OuterReference` is wrapped around the whole aggregate
expression, so the name looks like `outer(min(t2a))`.
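
To make the mismatch concrete, here is a minimal toy model in Python. It is not Spark code: the classes `Attr`, `Min`, `Outer` and the `sql()` method are hypothetical stand-ins for Catalyst expressions, and the only assumption is that a name is produced by printing the expression tree as it exists when the name is generated. The two trees below reproduce the two names quoted above.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for Catalyst expressions (not Spark's real classes).
@dataclass(frozen=True)
class Attr:
    name: str

    def sql(self) -> str:
        return self.name


@dataclass(frozen=True)
class Min:
    child: object

    def sql(self) -> str:
        return f"min({self.child.sql()})"


@dataclass(frozen=True)
class Outer:
    child: object

    def sql(self) -> str:
        return f"outer({self.child.sql()})"


# Shape behind the first name: the outer wrapper sits on the attribute,
# inside the aggregate, when the name is generated.
inner_wrapped = Min(Outer(Attr("t2.t2a")))

# Shape behind the second name: the aggregate is resolved first and the
# outer wrapper is then placed around the whole aggregate expression.
whole_wrapped = Outer(Min(Attr("t2a")))

fixed_point_name = inner_wrapped.sql()
single_pass_name = whole_wrapped.sql()

print(fixed_point_name)  # min(outer(t2.t2a))
print(single_pass_name)  # outer(min(t2a))
```

The same logical expression ends up with two different generated names purely because of where the wrapper sits at naming time, which is why the names need to be normalized before plans from the two analyzers can be compared.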
I propose that we fix this by normalizing `Alias` names for correlated
aggregate functions in the HAVING condition.
### Why are the changes needed?
To ease development of the single-pass analyzer.
### Does this PR introduce _any_ user-facing change?
The `EXPLAIN EXTENDED` output of affected plans would be different.
### How was this patch tested?
Existing tests (regenerated golden files).
### Was this patch authored or co-authored using generative AI tooling?
No.