zclllyybb commented on issue #64368:
URL: https://github.com/apache/doris/issues/64368#issuecomment-4668344568
Breakwater-GitHub-Analysis-Slot: slot_e2a93862adf5
Initial read: this looks like a real FE planner/coordinator bug, not a SQL
syntax issue.
The issue does not provide an actual Doris version; the "Version" field
contains the CTAS SQL. I checked current `apache/doris` upstream master
(`e8c06f265a2`) for the relevant path. The exact error text comes from
`Coordinator.findMaxParallelFragmentIndex()`, which asserts that a fragment
with no leftmost `ScanNode` must have at least one child fragment:
```text
Preconditions.checkState(!fragment.getChildren().isEmpty(), "fragment has no
children")
```
The reporter's recursive CTE has a special shape:
```sql
WITH RECURSIVE t(id, a) AS (
SELECT CAST(1 AS BIGINT), array(CAST(1 AS BIGINT))
UNION ALL
SELECT id + 1, array_pushback(a, id) FROM t WHERE id < 5
)
```
There is no real base table scan in either the anchor side or recursive
side. In Nereids translation, the anchor constant select is translated through
a one-row/union fragment, and the recursive reference is translated to
`RecursiveCteScanNode`, which extends `PlanNode`, not `ScanNode`.
`PhysicalPlanTranslator.visitPhysicalRecursiveUnion()` then sets the recursive
CTE fragment root to `RecursiveCteNode` and marks the fragment as
`UNPARTITIONED`.
That can produce a recursive CTE fragment where:
1. the leftmost node is not a `ScanNode`, because it is under
`RecursiveCteNode` / `RecursiveCteScanNode`; and
2. there is no child `PlanFragment`, because this minimal recursive CTE does
not read a base table or introduce an exchange-producing child.
The legacy `Coordinator` assignment code treats this as "Case B" and calls
`findMaxParallelFragmentIndex()`, but that helper assumes at least one child
fragment. That matches the reported `fragment has no children` error exactly.
Existing upstream regression coverage seems incomplete for this shape.
`regression-test/suites/recursive_cte/create_and_insert_select_test.groovy`
covers CTAS and INSERT with recursive CTE over a real OLAP table and join.
`recursive_100_number_test.groovy` covers a constant recursive CTE as a plain
SELECT. I did not find coverage for CTAS / INSERT INTO using a pure constant +
worktable recursive CTE, especially with an `ARRAY<BIGINT>` output column.
Suggested maintainer next steps:
1. Add a regression case based on the reporter's SQL for both CTAS and
`INSERT INTO` with an explicit target schema.
2. Check whether the same CTE succeeds as a plain `SELECT`. If plain SELECT
succeeds but CTAS/INSERT fails, the bug is isolated to the insert/load
coordinator path.
3. Fix the FE fragment assignment path so recursive CTE fragments without
base-table scan children are handled as valid serial/no-scan fragments, or
ensure the Nereids distributed coordinator path is used consistently for
recursive CTE CTAS/INSERT. The suspicious public code points are
`PhysicalPlanTranslator.visitPhysicalRecursiveUnion()` and
`Coordinator.findMaxParallelFragmentIndex()`.
Missing information from the reporter:
1. Actual Doris version or commit hash.
2. Session variable values for `enable_nereids_planner`,
`enable_nereids_distribute_planner`, `enable_fallback_to_original_planner`, and
local-shuffle related settings.
3. The FE stack trace around the failed statement.
4. Whether the same `WITH RECURSIVE ... SELECT * FROM t` fails outside
CTAS/INSERT.
I would not recommend disabling Nereids as a reliable workaround here.
Recursive CTE is a Nereids feature path, and the current code explicitly
requires `enable_nereids_distribute_planner=true` during recursive CTE
analysis. A safer temporary workaround is to avoid this pure constant recursive
CTE inside CTAS/INSERT, or materialize the rows through another method until
the FE fragment assignment bug is fixed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]