neilconway commented on PR #22037:
URL: https://github.com/apache/datafusion/pull/22037#issuecomment-4412797480
Please let me know if I'm understanding this correctly:
* The PR aims to address a situation where there is a schema mismatch
between the anchor and recursive cases in a CTE
* In particular, we might infer different nullability properties between the
anchor vs the recursive query -- e.g., if we have `0` in the anchor and
`min(...)` in the recursive case, `0` is non-nullable and `min(...)` is
nullable (as an aside, the latter is conservative: `min(x)` without `FILTER` in
a grouped query is non-nullable if `x` is non-nullable, but I suppose this is a
separate planner shortcoming...)
* The proposed behavior is to apply the anchor schema to the recursive CTE
branch schemas. So we would effectively be requiring that a nullable `min`
expression never returns a NULL, in the example above
* If the recursive query *does* return a NULL, we produce a runtime error
If that is accurate, then the proposed behavior would result in this query
producing an error:
```sql
SET datafusion.execution.enable_recursive_ctes = true;
WITH RECURSIVE t AS (
SELECT 0 AS n
UNION ALL
SELECT CAST(NULL AS INT) AS n FROM t WHERE n IS NOT NULL
)
SELECT * FROM t;
```
(`Column 'n' is declared as non-nullable but contains null values`) -- but
this query seems pretty reasonable to me and is allowed by other SQL
implementations (e.g., Postgres, DuckDB, MariaDB, SQLite).
Instead, shouldn't we be computing the CTE's logical schema by widening the
anchor and the recursive schemas? This is conceptually similar to what we do
for `UNION`. That is, if the anchor expression is non-nullable and the
recursive expression is nullable, the output schema should be nullable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]