nuno-faria opened a new issue, #19427:
URL: https://github.com/apache/datafusion/issues/19427
### Describe the bug
When two recursive CTEs are joined, the query (with a `LIMIT`) never ends.
It fails with both hash and nested loop joins.
I guess what is happening is that the join waits for one side to be fully
materialized to start generating batches for the `CoalesceBatchesExec` (that
applies the `LIMIT`), which never happens if the CTE is unbounded.
### To Reproduce
- Hash join (infinite):
```sql
with recursive t(i) as (
select 1 as i
union all
select *
from t
)
select *
from t t1
join t t2 on t1.i = t2.i
limit 1;
```
- Cross join (infinite):
```sql
with recursive t as (
select 1
union all
select *
from t
)
select *
from t t1, t t2
limit 1;
```
- Limiting each side (infinite, the inner limits are optimized away):
```sql
with recursive t(i) as (
select 1 as i
union all
select *
from t
)
select *
from (select * from t limit 1) t1
join (select * from t limit 1) t2 on t1.i = t2.i
limit 1;
```
- Limiting each side but removing the outer limit (works):
```sql
with recursive t(i) as (
select 1 as i
union all
select *
from t
)
select *
from (select * from t limit 1) t1
join (select * from t limit 1) t2 on t1.i = t2.i;
```
### Expected behavior
Not hanging.
### Additional context
Tested on main.
Works in PostgreSQL, but also hangs in DuckDB (curiously the third example
works in DuckDB while the fourth hangs).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]