github-actions[bot] commented on code in PR #64799:
URL: https://github.com/apache/doris/pull/64799#discussion_r3480792992


##########
fe/fe-core/src/main/java/org/apache/doris/service/arrowflight/FlightSqlConnectProcessor.java:
##########
@@ -196,11 +196,20 @@ public void fetchArrowFlightSchema(int timeoutMs) {
     @Override
     public void close() throws Exception {
         ctx.setCommand(MysqlCommand.COM_SLEEP);
+        // Executors whose results are pulled from the BE keep their 
coordinator alive past
+        // GetFlightInfo (registered as deferred executors on the 
ConnectContext) so the BE can
+        // still fetch external-table splits during DoGet. Do NOT finalize 
those here; they are
+        // finalized when the next query starts or the connection is torn 
down. Executors that are
+        // not deferred (local results, or a query that already failed) are 
finalized now. See #62259.
         for (StmtExecutor asynExecutor : returnResultFromRemoteExecutor) {
-            asynExecutor.finalizeQuery();
+            if (!asynExecutor.isDeferredForArrowFlight()) {
+                asynExecutor.finalizeQuery();
+            }

Review Comment:
   This guard also skips cleanup when `GetFlightInfo` fails after a remote 
executor has already been marked deferred. In that flow, 
`StmtExecutor.executeAndSendResult()` adds the executor to the deferred list 
after `coordBase.exec()`, then `executeQueryStatement()` still has to fetch the 
Arrow schema. If that schema fetch times out, returns non-OK/empty/mismatched 
schema, or hits an RPC error, try-with-resources calls this `close()`, but both 
deferred executors are skipped and the outer catch only rethrows. No 
`FlightInfo` is returned, so no `DoGet` can need the coordinator, but the query 
registration, query queue slot, and split sources stay alive until a later 
query or session/token teardown. Please explicitly close the deferred executors 
on this error path, and add a test that fails schema fetch after deferral.



##########
fe/fe-core/src/main/java/org/apache/doris/service/arrowflight/DorisFlightSqlProducer.java:
##########
@@ -189,6 +189,10 @@ private FlightInfo executeQueryStatement(String 
peerIdentity, ConnectContext con
         try {
             Preconditions.checkState(null != connectContext);
             Preconditions.checkState(!query.isEmpty());
+            // Finalize the previous query's coordinator on this connection 
whose close was
+            // deferred (Arrow Flight keeps it alive across GetFlightInfo -> 
DoGet so the BE can
+            // fetch external-table splits during DoGet). By now the previous 
DoGet is done. #62259
+            connectContext.closeFlightSqlDeferredExecutors();
             // After the previous query was executed, there was no 
getStreamStatement to take away the result.

Review Comment:
   The new cleanup assumes that starting the next statement means the previous 
`DoGet` has finished, but Flight RPCs are independent and this method has no 
completion signal from the BE result stream. A client can get `FlightInfo` for 
q1, keep reading q1's BE endpoint, and issue `GetFlightInfo` for q2 on the same 
session; this call will close q1's deferred coordinator. Closing the 
coordinator stops the scan nodes and removes the batch split sources, while 
q1's BE scan can still call `fetchSplitBatch`, which then fails with `Split 
source X is released`. This reintroduces the issue for pipelined/interleaved 
clients. Please keep the deferred coordinator until the result stream is 
actually complete or the session is closed, or reject/serialize a new statement 
while a remote Arrow Flight result is still live.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to