github-actions[bot] commented on code in PR #64799:
URL: https://github.com/apache/doris/pull/64799#discussion_r3480792992
##########
fe/fe-core/src/main/java/org/apache/doris/service/arrowflight/FlightSqlConnectProcessor.java:
##########
@@ -196,11 +196,20 @@ public void fetchArrowFlightSchema(int timeoutMs) {
@Override
public void close() throws Exception {
ctx.setCommand(MysqlCommand.COM_SLEEP);
+ // Executors whose results are pulled from the BE keep their
coordinator alive past
+ // GetFlightInfo (registered as deferred executors on the
ConnectContext) so the BE can
+ // still fetch external-table splits during DoGet. Do NOT finalize
those here; they are
+ // finalized when the next query starts or the connection is torn
down. Executors that are
+ // not deferred (local results, or a query that already failed) are
finalized now. See #62259.
for (StmtExecutor asynExecutor : returnResultFromRemoteExecutor) {
- asynExecutor.finalizeQuery();
+ if (!asynExecutor.isDeferredForArrowFlight()) {
+ asynExecutor.finalizeQuery();
+ }
Review Comment:
This guard also skips cleanup when `GetFlightInfo` fails after a remote
executor has already been marked deferred. In that flow,
`StmtExecutor.executeAndSendResult()` adds the executor to the deferred list
after `coordBase.exec()`, then `executeQueryStatement()` still has to fetch the
Arrow schema. If that schema fetch times out, returns non-OK/empty/mismatched
schema, or hits an RPC error, try-with-resources calls this `close()`, but both
deferred executors are skipped and the outer catch only rethrows. No
`FlightInfo` is returned, so no `DoGet` can need the coordinator, but the query
registration, query queue slot, and split sources stay alive until a later
query or session/token teardown. Please explicitly close the deferred executors
on this error path, and add a test that fails schema fetch after deferral.
##########
fe/fe-core/src/main/java/org/apache/doris/service/arrowflight/DorisFlightSqlProducer.java:
##########
@@ -189,6 +189,10 @@ private FlightInfo executeQueryStatement(String
peerIdentity, ConnectContext con
try {
Preconditions.checkState(null != connectContext);
Preconditions.checkState(!query.isEmpty());
+ // Finalize the previous query's coordinator on this connection
whose close was
+ // deferred (Arrow Flight keeps it alive across GetFlightInfo ->
DoGet so the BE can
+ // fetch external-table splits during DoGet). By now the previous
DoGet is done. #62259
+ connectContext.closeFlightSqlDeferredExecutors();
// After the previous query was executed, there was no
getStreamStatement to take away the result.
Review Comment:
The new cleanup assumes that starting the next statement means the previous
`DoGet` has finished, but Flight RPCs are independent and this method has no
completion signal from the BE result stream. A client can get `FlightInfo` for
q1, keep reading q1's BE endpoint, and issue `GetFlightInfo` for q2 on the same
session; this call will close q1's deferred coordinator. Closing the
coordinator stops the scan nodes and removes the batch split sources, while
q1's BE scan can still call `fetchSplitBatch`, which then fails with `Split
source X is released`. This reintroduces the issue for pipelined/interleaved
clients. Please keep the deferred coordinator until the result stream is
actually complete or the session is closed, or reject/serialize a new statement
while a remote Arrow Flight result is still live.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]