*Compatibility chart of driver implementations for APIs that support queries and updates in the same function*
As Martin Prammer requested in the previous biweekly meeting, I analyzed the Arrow JDBC Flight SQL driver, the ODBC Flight SQL driver, and the ADBC Flight SQL drivers in Python and Go to see how current implementations choose between CommandPreparedStatementQuery and CommandPreparedStatementUpdate for API methods that allow both. All drivers except the JDBC driver use CommandPreparedStatementQuery for these methods, indicating a lack of DML support.

Below are links to the implementations:

1. ODBC Flight SQL driver: For ODBC SQLExecute(), the driver always uses CommandPreparedStatementQuery:
   https://github.com/apache/arrow/blob/ca6845248b014db7131ba6dccec5f91b04b4543d/cpp/src/arrow/flight/sql/client.cc#L644
2. JDBC Flight SQL driver: For Statement.execute(), the driver uses CommandPreparedStatementQuery if the dataset_schema of the ActionCreatePreparedStatementResult is not empty, otherwise CommandPreparedStatementUpdate:
   https://github.com/apache/arrow-java/blob/7390f551267798d4670eae6b2894c527dbc90403/flight/flight-sql-jdbc-core/src/main/java/org/apache/arrow/driver/jdbc/client/ArrowFlightSqlClientHandler.java#L458
3. JDBC Flight SQL driver: For PreparedStatement.execute(), the latest released driver version, 18.3.0, always uses CommandPreparedStatementQuery. A merged PR applies the same heuristic as Statement.execute(), but it has not yet been released:
   https://github.com/apache/arrow-java/pull/811
4. Python ADBC Flight SQL driver: DB-API 2.0 allows both in cursor.execute() (see https://peps.python.org/pep-0249/#id20). The driver always uses CommandPreparedStatementQuery:
   https://github.com/apache/arrow-adbc/blob/b0611a123166b1e3778e26258e75c8a46b0e903b/python/adbc_driver_manager/adbc_driver_manager/dbapi.py#L817
5. Go ADBC Flight SQL driver: I don't think there is any API method that allows both result-set-generating queries and update counts. Based on the docs at https://pkg.go.dev/github.com/apache/arrow-adbc/go/adbc#Statement, it seems that Statement.ExecuteQuery() is only for result-set-generating queries and Statement.ExecuteUpdate() is for updates.

Let me know if I should expand this list with other implementations; I only checked the ones I am aware of.

*Backward compatibility of the proposed change*

The proposed change (adding a boolean field to ActionCreatePreparedStatementResult to determine the network flow used by Flight SQL clients) is fully backward compatible. This follows directly from using a new proto3 optional field. See the section "Adding new fields is safe" in https://protobuf.dev/programming-guides/editions/#wire-safe-changes

For ease of understanding, I will outline the two scenarios below:

New client <-> Old server
If a server does not set the new field that the client expects, the client can detect the field's absence (directly via the protobuf-generated code) and follow the logic it previously used to determine the network flow. I have a draft PR in the JDBC driver that exemplifies this; I updated the driver but didn't change the server. All tests pass locally, and I tested it successfully end to end with a backend server that wasn't updated:
https://github.com/apache/arrow-java/pull/1064

Old client <-> New server
A client implementation that receives an unknown field will simply ignore it during parsing.

Best,
Pedro

On Tue, Mar 3, 2026 at 10:50 PM David Li <[email protected]> wrote:

> Sounds good. I think it would also be reasonable to raise a PR with the
> spec change for discussion as well.
>
> I would much prefer to not cram more things into existing endpoints, but I
> suppose it's not clear to me if it's possible to fix that at this point.
>
> On Wed, Mar 4, 2026, at 04:38, Pedro Matias wrote:
> > I agree with consolidating the two execution modes.
> > I don't think these approaches are mutually exclusive: we can fix the
> > current execution split for correctness (which should be an easier and
> > quicker fix) and introduce a new consolidated endpoint to include row
> > counts in query cases as well.
> >
> > Having the same endpoint allows us to use it for ad hoc queries, which
> > reduces the number of roundtrips per query in Statement.execute().
> >
> > Do you intend to use DoExchange for this new endpoint?
> >
> > In the meantime I'm working on some action items raised in the last sync
> > regarding my proposed fix. I will send an email highlighting backward
> > compatibility concerns and the current status of the different drivers
> > before the next meeting.
> >
> > Pedro
> >
> > On Thu, Feb 26, 2026 at 2:24 AM David Li <[email protected]> wrote:
> >
> >> It seems reasonable to me if you want to raise a pull request to discuss,
> >> but we could consider consolidating the two execution modes? API-wise I
> >> feel it would be better to just have one endpoint and let the server
> >> return what is appropriate. (Also because interfaces like PEP 249 and
> >> protocols like Postgres's allow for a row count in both query and update
> >> cases, albeit JDBC does not.)
> >>
> >> On Mon, Feb 23, 2026, at 10:17, Pedro Matias wrote:
> >> > Hello all,
> >> >
> >> > @Hélder Gregório <[email protected]> and I identified a gap
> >> > between common database API execution patterns and Arrow Flight SQL
> >> > prepared statements. To address this, we propose adding an optional
> >> > boolean field to ActionCreatePreparedStatementResult.
> >> >
> >> > Background
> >> >
> >> > A common pattern in database APIs is:
> >> >
> >> > 1. Create a prepared statement
> >> > 2. Execute the prepared statement, returning either a result set or an
> >> >    update count
> >> >
> >> > This pattern exists in:
> >> >
> >> > - *JDBC* (Connection.prepareStatement() + PreparedStatement.execute())
> >> > - *Python PEP 249* (both steps condensed in cursor.execute())
> >> > - *ODBC* (SQLPrepare() + SQLExecute())
> >> >
> >> > In Arrow Flight SQL, there are two mutually exclusive communication
> >> > paths for executing prepared statements. Both begin with
> >> > ActionCreatePreparedStatementRequest, after which the client must
> >> > choose between:
> >> >
> >> > - CommandPreparedStatementQuery (returns a result set), or
> >> > - CommandPreparedStatementUpdate (returns an update count).
> >> >
> >> > (For simplicity, we ignore parameter binding here.)
> >> >
> >> > The issue is that ActionCreatePreparedStatementResult, returned by the
> >> > server in the first call, does not contain information indicating which
> >> > execution path the client should take.
> >> >
> >> > *Proposal*
> >> >
> >> > We propose adding the following field to
> >> > ActionCreatePreparedStatementResult:
> >> >
> >> > optional bool is_update = 4;
> >> >
> >> > - true → clients should use CommandPreparedStatementUpdate
> >> > - false → clients should use CommandPreparedStatementQuery
> >> >
> >> > This makes the intended execution path explicit.
> >> >
> >> > The behavior of clients when the server does not set this field is
> >> > outside the scope of this proposal, though discussion is welcome. We
> >> > would be happy to open follow-up PRs to standardize client behavior
> >> > across drivers if desired.
> >> >
> >> > Current state of driver implementations
> >> >
> >> > - The Arrow Flight SQL JDBC driver uses a heuristic to choose the
> >> >   execution path:
> >> >   https://github.com/apache/arrow-java/issues/797
> >> > - The PEP 249 Python Flight SQL driver (in ADBC) always uses
> >> >   CommandPreparedStatementQuery in cursor.execute().
> >> >
> >> > We believe making the execution path explicit improves protocol
> >> > completeness and alignment with widely used database APIs.
> >> >
> >> > Let us know your thoughts.
> >> >
> >> > Best,
> >> > Pedro Matias
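For illustration, the flow selection discussed in this thread can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from any driver: PreparedStatementResult is a hypothetical stand-in for ActionCreatePreparedStatementResult, is_update is the proposed field (None models a server that does not set it), and the fallback mirrors the dataset_schema heuristic described for the JDBC driver's Statement.execute().

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionPath(Enum):
    QUERY = "CommandPreparedStatementQuery"
    UPDATE = "CommandPreparedStatementUpdate"


@dataclass
class PreparedStatementResult:
    """Hypothetical stand-in for ActionCreatePreparedStatementResult."""
    dataset_schema: bytes = b""
    # Proposed field; None models "not set by the server" (assumption).
    is_update: Optional[bool] = None


def choose_execution_path(result: PreparedStatementResult) -> ExecutionPath:
    # New server: the field is present, so follow it explicitly.
    if result.is_update is not None:
        return ExecutionPath.UPDATE if result.is_update else ExecutionPath.QUERY
    # Old server: the field is absent, so fall back to the existing heuristic
    # (a non-empty dataset_schema means the statement returns a result set).
    if result.dataset_schema:
        return ExecutionPath.QUERY
    return ExecutionPath.UPDATE


# Old server, SELECT-like statement: the heuristic picks the query flow.
old_select = choose_execution_path(PreparedStatementResult(dataset_schema=b"schema"))
# Old server, no schema: the heuristic picks the update flow.
old_update = choose_execution_path(PreparedStatementResult())
# New server: the explicit field wins even when a schema is present.
new_update = choose_execution_path(
    PreparedStatementResult(dataset_schema=b"schema", is_update=True)
)
```

Because the fallback branch is only reached when the field is absent, a client written this way behaves exactly as before against a server that has not been updated.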
