eric-wang-1990 opened a new pull request, #3489: URL: https://github.com/apache/arrow-adbc/pull/3489
The directResults field controls how many rows/bytes can be returned in one Arrow batch. Before this change, a bug caused the Databricks driver to use the value from the base class SparkConnection, which has maxRows=1000; that is too small. ODBC can get all results in a single ExecuteStatement call, while ADBC needs one ExecuteStatement plus multiple FetchResults calls, which makes ADBC slower on small queries.

For ADBC: <img width="614" height="136" alt="image" src="https://github.com/user-attachments/assets/64faa63c-9bc6-4dd1-8d71-66af09e95df4" />

For ODBC: <img width="611" height="27" alt="image" src="https://github.com/user-attachments/assets/52817f46-412a-41fc-9f0b-17d7ae02d91d" />

This PR updates DefaultMaxBytes to 10MB, which is the same limit the Databricks backend applies to an Arrow row set. MaxRows is set to 500K, assuming a minimum column size of 20 bytes (10MB / 20 bytes is about 500K rows).
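As a rough illustration of the sizing above, the arithmetic can be sketched in Python. This is not driver code: the constant names, the decimal reading of 10MB, the 50,000-row query, and the `batches_needed` helper are all assumptions made for the example, and the model ignores the byte cap and any server-side limits.

```python
import math

# Values taken from the PR description; names are hypothetical.
MAX_BYTES = 10 * 1000 * 1000   # 10MB cap (decimal megabytes assumed)
MIN_ROW_BYTES = 20             # minimum size assumed in the PR
OLD_MAX_ROWS = 1000            # SparkConnection default cited in the PR

# Largest row count that can fit under the byte cap at the minimum size.
new_max_rows = MAX_BYTES // MIN_ROW_BYTES


def batches_needed(total_rows, max_rows):
    """Simplified model: row-capped batches needed to return total_rows."""
    return max(1, math.ceil(total_rows / max_rows))


print(new_max_rows)                           # 500000
print(batches_needed(50_000, OLD_MAX_ROWS))   # 50
print(batches_needed(50_000, new_max_rows))   # 1
```

Under these assumptions, a 50,000-row result needs 50 row-capped batches with the old limit and fits in one with the new limit, which is consistent with the extra FetchResults round trips described above.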
