morningman opened a new pull request, #64799:
URL: https://github.com/apache/doris/pull/64799
### What problem does this PR solve?
Issue Number: close #62259
Related PR: #64797
Problem Summary:
Arrow Flight SQL queries against Iceberg (and other external) tables in
batch split mode either crashed the BE or failed with `Split source X is released`.
Arrow Flight executes a query in two phases: `GetFlightInfo` (plan + submit
to BE) and `DoGet` (the client pulls results from the BE). For an external
table scan in batch split mode, the BE keeps scanning during `DoGet` and lazily
fetches file splits from the FE via the `fetchSplitBatch` RPC, using an async
`SplitSource` that the FE coordinator holds (through its scan nodes).
The FE closed the coordinator at the end of `GetFlightInfo`
(`StmtExecutor.executeAndSendResult`'s `finally` → `Coordinator.close()` →
`ScanNode.stop()` → `SplitSourceManager.removeSplitSource()`) and also
unregistered it (`FlightSqlConnectProcessor.close()` →
`StmtExecutor.finalizeQuery()`). So by the time the BE called `fetchSplitBatch`
during `DoGet`, the `SplitSource` was already gone. The MySQL protocol is
unaffected because plan + execute share one request, so the coordinator stays
alive until all results are consumed.
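The ordering problem described above can be reproduced in a toy model. This is a minimal sketch, not Doris code: `ToySplitSourceManager` and `runQuery` are hypothetical stand-ins for the FE-side `SplitSourceManager` and the two Arrow Flight phases, and the error string only imitates the message quoted above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the FE-side split source registry. All names here are
// hypothetical stand-ins; this is not the Doris implementation.
public class SplitSourceLifecycleSketch {

    public static class ToySplitSourceManager {
        private final Map<Long, List<String>> sources = new HashMap<>();

        public void register(long id, List<String> splits) {
            sources.put(id, splits);
        }

        // Imitates the effect of Coordinator.close() -> ScanNode.stop().
        public void remove(long id) {
            sources.remove(id);
        }

        // Imitates the BE's lazy fetchSplitBatch RPC issued during DoGet.
        public List<String> fetchSplitBatch(long id) {
            List<String> splits = sources.get(id);
            if (splits == null) {
                throw new IllegalStateException("Split source " + id + " is released");
            }
            return splits;
        }
    }

    // Runs both phases and returns either a success summary or the error message.
    public static String runQuery(boolean closeAtEndOfGetFlightInfo) {
        ToySplitSourceManager mgr = new ToySplitSourceManager();
        long id = 1L;
        // Phase 1 (GetFlightInfo): plan the query and register the split source.
        mgr.register(id, List.of("file-0.orc", "file-1.orc"));
        if (closeAtEndOfGetFlightInfo) {
            mgr.remove(id); // old behavior: eager close before DoGet starts
        }
        // Phase 2 (DoGet): the BE asks for splits only now.
        try {
            return "fetched " + mgr.fetchSplitBatch(id).size() + " splits";
        } catch (IllegalStateException e) {
            return e.getMessage();
        } finally {
            mgr.remove(id); // deferred cleanup; a no-op if already removed
        }
    }

    public static void main(String[] args) {
        System.out.println(runQuery(true));  // eager close: the fetch fails
        System.out.println(runQuery(false)); // deferred close: the fetch succeeds
    }
}
```

In the sketch, the only difference between the two runs is whether the source is removed before or after the second phase, which mirrors why the MySQL protocol (one request for both phases) never hits the error.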
This PR keeps the coordinator (and its `SplitSource`) alive across the two
phases and cleans it up reliably:
- **StmtExecutor**: for an Arrow Flight query that produces results on the
BE (`coordBase == coord`), mark it deferred, register the executor on the
`ConnectContext`, and skip the eager `Coordinator.close()` in the `finally`. A
failed query (whose `exec()` threw) is not deferred and is closed as before.
- **ConnectContext**: hold the deferred executors and add
`closeFlightSqlDeferredExecutors()`, which closes their coordinators (releasing
the `SplitSource` and the query queue slot) and unregisters the queries.
- **FlightSqlConnectProcessor.close()**: do not finalize deferred executors.
- **DorisFlightSqlProducer**: finalize the previous query's deferred
coordinator when the next query starts on the connection.
- **FlightSqlConnectPoolMgr.unregisterConnection()**: finalize deferred
coordinators when the connection is torn down. All teardown paths (idle/query
timeout, bearer token expiry, explicit `CloseSession`) reach here, so an
abandoned connection cannot leak the coordinator.
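The cleanup points listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `ToyConnectContext`, `ToyExecutor`, `startQuery`, and `teardown` are hypothetical names, and the sketch is single-threaded, whereas the real `ConnectContext` may need synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for ConnectContext and StmtExecutor.
public class DeferredCleanupSketch {

    public static class ToyExecutor {
        public final String queryId;
        public boolean closed = false;

        public ToyExecutor(String queryId) {
            this.queryId = queryId;
        }

        // Stands in for closing the coordinator and releasing its SplitSource.
        public void closeCoordinator() {
            closed = true;
        }
    }

    public static class ToyConnectContext {
        private final List<ToyExecutor> deferred = new ArrayList<>();

        public void addDeferred(ToyExecutor executor) {
            deferred.add(executor);
        }

        // Closes every deferred executor and returns how many were closed.
        // Calling it again is harmless because the list is cleared.
        public int closeDeferredExecutors() {
            int closedCount = 0;
            for (ToyExecutor executor : deferred) {
                executor.closeCoordinator();
                closedCount++;
            }
            deferred.clear();
            return closedCount;
        }

        // Cleanup point 1: the next query on the connection finalizes the previous one.
        public ToyExecutor startQuery(String queryId) {
            closeDeferredExecutors();
            ToyExecutor executor = new ToyExecutor(queryId);
            addDeferred(executor);
            return executor;
        }

        // Cleanup point 2: connection teardown (timeout, token expiry, CloseSession).
        public void teardown() {
            closeDeferredExecutors();
        }
    }

    public static void main(String[] args) {
        ToyConnectContext ctx = new ToyConnectContext();
        ToyExecutor q1 = ctx.startQuery("q1");
        System.out.println("q1 closed after start: " + q1.closed);
        ToyExecutor q2 = ctx.startQuery("q2");
        System.out.println("q1 closed after q2 starts: " + q1.closed);
        ctx.teardown();
        System.out.println("q2 closed after teardown: " + q2.closed);
    }
}
```

Routing both cleanup points through one idempotent method means a connection that is torn down right after starting a new query does not close anything twice.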
Non-Arrow-Flight paths (MySQL, internal tables, point queries) are
unchanged: `deferredForArrowFlight` can only become true for `ARROW_FLIGHT_SQL`.
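The deferral condition can be read as a simple conjunction. The sketch below is an assumption about its shape based on this description only: `shouldDefer`, the `Protocol` enum, and the parameter names are hypothetical, and the real check in `StmtExecutor` may be structured differently.

```java
// Hypothetical predicate summarizing when a coordinator close is deferred.
public class DeferralDecisionSketch {

    public enum Protocol { MYSQL, ARROW_FLIGHT_SQL }

    // resultsProducedOnBe stands in for the `coordBase == coord` check;
    // execThrew stands in for a query whose exec() failed.
    public static boolean shouldDefer(Protocol protocol,
                                      boolean resultsProducedOnBe,
                                      boolean execThrew) {
        return protocol == Protocol.ARROW_FLIGHT_SQL
                && resultsProducedOnBe
                && !execThrew;
    }

    public static void main(String[] args) {
        System.out.println(shouldDefer(Protocol.ARROW_FLIGHT_SQL, true, false));
        System.out.println(shouldDefer(Protocol.MYSQL, true, false));
        System.out.println(shouldDefer(Protocol.ARROW_FLIGHT_SQL, true, true));
    }
}
```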
The BE-side error-path hardening (so any `fetchSplitBatch` failure fails
gracefully instead of crashing the BE) is handled separately in #64797.
### Release note
Fix Arrow Flight SQL queries against external tables (e.g. Iceberg) failing
with `Split source X is released` or crashing the BE in batch split mode.
### Check List (For Author)
- Test
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
Added
`regression-test/suites/external_table_p0/iceberg/test_iceberg_arrow_flight_split_source.groovy`.
It forces batch split mode on the Arrow Flight session
(`num_files_in_batch_mode=1`), asserts via `explain` that the scan really uses
the batch `SplitSource` path (the plan contains `approximate`), so it cannot
silently pass on the non-batch path, then scans `format_v2.sample_cow_orc` over Arrow Flight and
checks all rows come back. The test runs in the external (docker) pipeline and
is skipped when the Iceberg env or the Arrow Flight endpoint is not configured.
- Behavior changed:
    - [x] Yes. An Arrow Flight query's coordinator (and its external-table
      batch `SplitSource`) is now kept alive until the next query starts on the
      connection or the connection is torn down, instead of being closed at the end
      of `GetFlightInfo`.
- Does this need documentation?
    - [x] No.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]