kosiew opened a new issue, #22034:
URL: https://github.com/apache/datafusion/issues/22034
## Summary
#21912 introduced a SQL behavior change: an existing recursive CTE SLT had
to be amended from `0 AS level` to `SUM(0) AS level` to pass.
A follow-up fix is required because recursive CTE physical alignment must
preserve the CTE's declared/static schema, including nullability, rather than
widening the recursive CTE output schema and forcing SQL rewrites.
## Background
Issue #21910 is about preventing recursive CTE batches from leaking
recursive-term field names when the recursive term is planned independently
from the anchor/static term.
#21912 adds a reusable execution-layer schema helper and wires recursive CTE
execution to avoid ad hoc schema rebinding. During review/testing, another
problem appeared:
- The anchor/static term can declare a non-null field, e.g. `0 AS level`.
- The recursive term can compute the same column through an expression that
is nullable, e.g. `MIN(rs.level) + 1`.
- If `RecursiveQueryExec` declares a widened schema using `static_nullable
|| recursive_nullable`, downstream planning sees a physical schema that no
longer matches the logical/static CTE schema.
- Changing SQL from `0 AS level` to `SUM(0) AS level` makes the logical
anchor nullable too, but only hides the mismatch.
## Evidence
The SQL behavior change is visible in the amended SLT test:
```diff
diff --git a/datafusion/sqllogictest/test_files/cte.slt
b/datafusion/sqllogictest/test_files/cte.slt
index e9c1c0245..d13e0d4f0 100644
--- a/datafusion/sqllogictest/test_files/cte.slt
+++ b/datafusion/sqllogictest/test_files/cte.slt
@@ -699,7 +699,7 @@ WITH RECURSIVE region_sales AS (
SELECT
s.salesperson_id AS salesperson_id,
SUM(s.sale_amount) AS amount,
- 0 as level
+ SUM(0) as level
FROM
sales s
GROUP BY
```
Reverting this SLT query change from:
```sql
SUM(0) AS level
```
to the original:
```sql
0 AS level
```
fails with:
```text
Physical input schema should be the same as the one converted from logical
input schema.
Differences:
- field nullability at index 2 [level]: (physical) true vs (logical) false.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]