Re: [I] discussion: ConnectionGetObjects vs ConnectionGetTableSchema [arrow-adbc]

via GitHub Thu, 18 Jun 2026 10:15:15 -0700


CurtHagenlocher commented on issue #1704:
URL: https://github.com/apache/arrow-adbc/issues/1704#issuecomment-4744418211


   **Modern catalogs have arbitrary depth**: this concern is reflected in #46 
and #320. An open question is whether the depth is fixed (at either the driver 
or the connection level) or whether it can vary -- e.g. as in a filesystem 
hierarchy. I think #4400 is mostly orthogonal to this concern.
   
   **The API is vague when it comes to filtering**: yes, this has been raised a 
few times, e.g. in #1321, #1508 and #3220.
   
   **Data lake catalogs can be huge**, **Data lake catalogs are slow**: Nothing 
in the existing API says that all the data needs to be returned in a single 
`RecordBatch`. The driver could start enumerating values against the back end 
and then use a time- or size-based trigger to return everything it has 
collected so far as the next batch. This would allow a UI to be populated 
incrementally.
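
To make the batching idea concrete, here is a minimal sketch in plain Python. It is not ADBC driver code: `batch_incrementally`, `max_rows`, `max_seconds` and `clock` are hypothetical names chosen for illustration, and a plain list stands in for a `RecordBatch`.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_incrementally(
    items: Iterable[T],
    max_rows: int = 1000,
    max_seconds: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[T]]:
    """Group a slow stream of catalog entries into batches.

    A batch is emitted as soon as it holds `max_rows` entries or once
    `max_seconds` have elapsed since the batch was started, whichever
    comes first. All names here are illustrative assumptions.
    """
    batch: List[T] = []
    deadline = clock() + max_seconds
    for item in items:
        batch.append(item)
        # Size trigger first; the clock is only consulted if it did not fire.
        if len(batch) >= max_rows or clock() >= deadline:
            yield batch
            batch = []
            deadline = clock() + max_seconds
    # Flush whatever is left once the back end is exhausted.
    if batch:
        yield batch


if __name__ == "__main__":
    # Stand-in for a slow catalog enumeration.
    for rows in batch_incrementally(range(7), max_rows=3, max_seconds=60.0):
        print(rows)
```

One limitation of this sketch: the time trigger is only checked when an item arrives, so a back end that blocks for a long time between items would still delay the batch. A real driver would probably need a background thread or an asynchronous read to enforce the deadline strictly.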
   
   Above all, I agree with your implicit point that this area would benefit 
from a holistic rethink.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
