jackyhu-db opened a new pull request, #3171:
URL: https://github.com/apache/arrow-adbc/pull/3171

   ## Motivation
   
   Databricks adds a `RunAsync` option (default `false`) to all Thrift 
operations. When it is set to `true`, the operation runs asynchronously on the 
backend and the client polls its status by calling getStatus. This helps the 
Databricks backend better manage capacity and load from client requests.
   
   It also avoids unnecessary retries of a Thrift operation (e.g. 
`TExecuteStatementReq`) when the warehouse is stopped or unavailable. With 
`RunAsync` set to `false`, the backend returns a 503 error and the client has 
to retry the operation until the warehouse is up. This generates many queries 
with 503 errors in the Databricks query history and consumes more resources. 
With `RunAsync=true`, the server returns 200 with query state `PENDING` and the 
client polls the status until the warehouse is up, which generates only one 
query in the query history.
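
The difference between the two modes can be illustrated with a small simulation. This is a minimal sketch, not the driver's code: `FakeBackend`, `run_with_retry`, `run_with_polling`, and the `FINISHED` state name are all hypothetical, chosen only to show why retrying logs one query per attempt while polling logs a single query.

```python
from dataclasses import dataclass


@dataclass
class FakeBackend:
    """Hypothetical stand-in for a warehouse that needs a few ticks to start."""
    ticks_until_ready: int
    queries_logged: int = 0  # entries that would show up in query history

    def execute_sync(self) -> int:
        # RunAsync=false: every attempt is a new query; 503 until the warehouse is up.
        self.queries_logged += 1
        if self.ticks_until_ready > 0:
            self.ticks_until_ready -= 1
            return 503
        return 200

    def execute_async(self) -> str:
        # RunAsync=true: the request is accepted once and reports its state.
        self.queries_logged += 1
        return "PENDING" if self.ticks_until_ready > 0 else "FINISHED"

    def get_status(self) -> str:
        # Polling does not create a new query; it only advances the simulated clock.
        if self.ticks_until_ready > 0:
            self.ticks_until_ready -= 1
            return "PENDING"
        return "FINISHED"


def run_with_retry(backend: FakeBackend, max_attempts: int = 10) -> int:
    """Retry the whole operation until it succeeds; return queries logged."""
    for _ in range(max_attempts):
        if backend.execute_sync() == 200:
            return backend.queries_logged
    raise TimeoutError("warehouse did not start in time")


def run_with_polling(backend: FakeBackend, max_polls: int = 10) -> int:
    """Submit once, then poll the status; return queries logged."""
    state = backend.execute_async()
    polls = 0
    while state == "PENDING":
        if polls >= max_polls:
            raise TimeoutError("warehouse did not start in time")
        state = backend.get_status()
        polls += 1
    return backend.queries_logged
```

In this toy model, a warehouse that needs three ticks to start records four queries under the retry path and one under the polling path.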
   
   ## Change
   - Add a connection parameter `adbc.databricks.enable_run_async_thrift`, 
default `false` (it will be changed to `true` later)
   - Set `RunAsync` on `TExecuteStatementReq` from the above connection 
parameter (`RunAsyncInThrift`) in `DatabricksStatement:SetStatementProperties`
   - Fix a bug in `BaseDatabricksReader` by adding a `null` check on 
`statement.DirectResults.ResultSet`
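
The wiring described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the C# implementation: only the parameter name `adbc.databricks.enable_run_async_thrift` and its `false` default come from this PR; the helper names, the dictionary-shaped request, and the `resultSet` key are hypothetical.

```python
from typing import Mapping, Optional

# Parameter name taken from this PR; everything else below is hypothetical.
ENABLE_RUN_ASYNC_KEY = "adbc.databricks.enable_run_async_thrift"


def parse_run_async(properties: Mapping[str, str]) -> bool:
    """Read the connection parameter, defaulting to false when it is absent."""
    value = properties.get(ENABLE_RUN_ASYNC_KEY, "false").strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"{ENABLE_RUN_ASYNC_KEY} must be 'true' or 'false'")
    return value == "true"


def build_execute_request(sql: str, properties: Mapping[str, str]) -> dict:
    """Copy the connection-level flag onto a (simplified) execute request."""
    return {"statement": sql, "runAsync": parse_run_async(properties)}


def has_direct_result_set(direct_results: Optional[Mapping[str, object]]) -> bool:
    """Guard against a missing result set before a reader touches it."""
    return direct_results is not None and direct_results.get("resultSet") is not None
```

A connection opened with `{"adbc.databricks.enable_run_async_thrift": "true"}` would then produce requests with `runAsync` set, and the reader would check `has_direct_result_set(...)` before using any direct results.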
   
   
   ## Test
   - Run all the E2E tests under `csharp/test/Drivers/Databricks/E2E` with 
`Connection.RunAsyncInThrift` both on and off

