alamb commented on issue #2445:
URL: https://github.com/apache/arrow-datafusion/issues/2445#issuecomment-1120413298

   > From my perspective it might be beneficial to push information about the data 
source from TableProvider to ObjectStore. Then an ObjectStore for a local file 
system would combine the data (table) location and the strategy for listing that kind 
of storage. As a result, the listing methods present in ObjectStore could drop the 
concept of a path as a way to access data.
   
   I really like the idea of providing an extensible storage interface that 
allows APIs such as those suggested by @Cheappie and @timvw. 
   
   Given these APIs seem to be adding semantics to the list of files on 
ObjectStorage, perhaps we could add an extra layer specifically for these APIs rather 
than trying to extend `ObjectStore` or adding more logic to `ListingTable`. 
Perhaps something like the `StorageCatalog` in:
   
   
   ```text
   ┌───────────────────────────────────┐
   │                                   │
   │           ListingTable            │
   │                                   │
   └───────────────────────────────────┘
   ┌───────────────────────────────────┐
   │          StorageCatalog           │
   │  (e.g. figure out which files on  │
   │     object store to process)      │
   └───────────────────────────────────┘
   ┌────────────────┐ ┌────────────────┐
   │  ObjectStore   │ │  File Format   │
   │(e.g. S3, HDFS) │ │ (e.g. parquet) │
   │                │ │                │
   └────────────────┘ └────────────────┘
   ```
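
   To make the layering in the diagram concrete, here is a minimal sketch in 
Rust. Everything in it is hypothetical: the traits below are simplified, 
synchronous stand-ins that only borrow the names from the diagram, not the 
actual DataFusion `ObjectStore` or `FileFormat` traits, and `PrefixCatalog`, 
`InMemoryStore`, `files_to_scan` and `demo_files` are made up for illustration. 
The point is only that the decision about which files to process lives in its 
own layer between `ListingTable` and the store.

   ```rust
   /// Hypothetical stand-in for an object store: it can only list paths under a prefix.
   trait ObjectStore {
       fn list(&self, prefix: &str) -> Vec<String>;
   }

   /// Hypothetical stand-in for a file format: it only reports its file extension.
   trait FileFormat {
       fn extension(&self) -> &str;
   }

   /// Hypothetical middle layer: decides which files on the store belong to a table.
   trait StorageCatalog {
       fn files_to_scan(&self, store: &dyn ObjectStore, format: &dyn FileFormat) -> Vec<String>;
   }

   /// Toy store backed by a fixed list of paths (assumption, for the demo only).
   struct InMemoryStore {
       paths: Vec<String>,
   }

   impl ObjectStore for InMemoryStore {
       fn list(&self, prefix: &str) -> Vec<String> {
           self.paths
               .iter()
               .filter(|p| p.starts_with(prefix))
               .cloned()
               .collect()
       }
   }

   struct Parquet;

   impl FileFormat for Parquet {
       fn extension(&self) -> &str {
           ".parquet"
       }
   }

   /// One possible catalog: keep files under a prefix that match the format's extension.
   struct PrefixCatalog {
       prefix: String,
   }

   impl StorageCatalog for PrefixCatalog {
       fn files_to_scan(&self, store: &dyn ObjectStore, format: &dyn FileFormat) -> Vec<String> {
           store
               .list(&self.prefix)
               .into_iter()
               .filter(|p| p.ends_with(format.extension()))
               .collect()
       }
   }

   /// The table only talks to the catalog; it holds no listing logic of its own.
   struct ListingTable {
       store: Box<dyn ObjectStore>,
       format: Box<dyn FileFormat>,
       catalog: Box<dyn StorageCatalog>,
   }

   impl ListingTable {
       fn files(&self) -> Vec<String> {
           self.catalog
               .files_to_scan(self.store.as_ref(), self.format.as_ref())
       }
   }

   /// Wires the three layers together over a few made-up paths.
   fn demo_files() -> Vec<String> {
       let table = ListingTable {
           store: Box::new(InMemoryStore {
               paths: vec![
                   "data/t1/a.parquet".to_string(),
                   "data/t1/_SUCCESS".to_string(),
                   "data/t1/b.parquet".to_string(),
                   "data/t2/c.parquet".to_string(),
               ],
           }),
           format: Box::new(Parquet),
           catalog: Box::new(PrefixCatalog {
               prefix: "data/t1/".to_string(),
           }),
       };
       table.files()
   }

   fn main() {
       for file in demo_files() {
           println!("{file}");
       }
   }
   ```

   With this shape, a different listing strategy (for example one driven by a 
manifest or a partition index) would be another `StorageCatalog` 
implementation, and neither `ObjectStore` nor `ListingTable` would need to change.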


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.