nuno-faria opened a new issue, #17360:
URL: https://github.com/apache/datafusion/issues/17360
### Describe the bug
The unparsing of Window functions on optimized plans is generating incorrect
queries. The reason for this is that the projection on Window columns is
optimized away, meaning the final query will use the entire function as the
column name.
### To Reproduce
Example:
```sql
-- original (the window function is used both in the filter and in the
projection)
select k, v, r
from (
select *, rank() over(partition by k order by v) as r
from t
) t
where r = 1
-- unoptimized plan (the window function is projected as "r", which is used
by the filter and the projection)
Projection: t.k, t.v, t.r
Filter: t.r = Int64(1)
SubqueryAlias: t
Projection: t.k, t.v, rank() PARTITION BY [t.k] ORDER BY [t.v ASC
NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS r
WindowAggr: windowExpr=[[rank() PARTITION BY [t.k] ORDER BY [t.v ASC
NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]]
TableScan: t
-- optimized plan (the projection is optimized out, the filter and
projection refer to the window function directly by the original code)
SubqueryAlias: t
Projection: t.k, t.v, rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS
LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS r
Filter: rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS LAST] RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW = UInt64(1)
WindowAggr: windowExpr=[[rank() PARTITION BY [t.k] ORDER BY [t.v ASC
NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]]
TableScan: t projection=[k, v]
-- invalid filter
SELECT * FROM (
SELECT t.k, t.v, rank() OVER (PARTITION BY t.k ORDER BY t.v ASC NULLS LAST
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS r
FROM t
WHERE ("rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS LAST] RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" = 1)
) AS t
```
### Expected behavior
Return valid SQL.
### Additional context
Also failing with the `QUALIFY` statement:
```sql
-- original
select *, rank() over(partition by k order by v) as r
from t
qualify r = 1;
-- optimized plan
Projection: t.k, t.v, rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS
LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW AS r
Filter: rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS LAST] RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW = UInt64(1)
WindowAggr: windowExpr=[[rank() PARTITION BY [t.k] ORDER BY [t.v ASC
NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW]]
TableScan: t projection=[k, v]
-- output
SELECT t.k, t.v, rank() OVER (PARTITION BY t.k ORDER BY t.v ASC NULLS LAST
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS r
FROM t
WHERE ("rank() PARTITION BY [t.k] ORDER BY [t.v ASC NULLS LAST] RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" = 1)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]