alamb commented on issue #16991:
URL: https://github.com/apache/datafusion/issues/16991#issuecomment-3141439401
Thank you for bringing this up @EmilyMatt
I think there is currently an assumption in the `ListingTable` (and all the
way down to the `DataSourceExec`) that all the files are read from the same
underlying `ObjectStore` instance, which I think is the root cause of the
challenge.
I think it would be pretty disruptive (aka a big and complicated PR) to try
and wire in support for multiple object stores.
Another idea that might work for you would be to make a "virtual"
`ObjectStore` wrapper.
Something like this (a sketch, not compiling):
```rust
struct VirtualObjectStore {
    // Maps the first element of each path to a different ObjectStore
    stores: HashMap<String, Arc<dyn ObjectStore>>,
}

impl ObjectStore for VirtualObjectStore {
    // delegates to the correct store
    // for example,
    // get '/store1/my_data/1.parquet'
    // would be mapped to a get call to `store1` at path `/my_data/1.parquet`
    fn get(&self, path: Path) -> ... {
        let store = path[0]; // first part of the path
        let real_path = path[1..]; // remainder of the path
        // delegate to the inner store
        self.stores.get(store).get(real_path)
    }
    ...
}
```
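
To make the routing idea concrete, here is a small std-only sketch that compiles on its own. It does not use the real `object_store` crate: `SimpleStore`, `MemStore`, `VirtualStore`, and `mem_store` are hypothetical stand-ins invented for illustration, and the real `ObjectStore` trait is async, has many more methods, and takes a `Path` rather than a `&str`. Only the prefix-splitting and delegation logic is meant to carry over.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the real `ObjectStore` trait: a single
/// synchronous `get` that returns the bytes stored at `path`, if any.
trait SimpleStore {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// Toy in-memory store, used only to exercise the routing.
struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl SimpleStore for MemStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Builds a `MemStore` from (path, contents) pairs.
fn mem_store(entries: &[(&str, &str)]) -> Arc<dyn SimpleStore> {
    let objects = entries
        .iter()
        .map(|(path, data)| (path.to_string(), data.as_bytes().to_vec()))
        .collect();
    Arc::new(MemStore { objects })
}

/// Routes each request to an inner store chosen by the first path segment.
struct VirtualStore {
    stores: HashMap<String, Arc<dyn SimpleStore>>,
}

impl VirtualStore {
    /// Splits "/store1/my_data/1.parquet" into ("store1", "my_data/1.parquet").
    /// Returns `None` when there is no remainder after the first segment.
    fn split(path: &str) -> Option<(&str, &str)> {
        let trimmed = path.trim_start_matches('/');
        let (prefix, rest) = trimmed.split_once('/')?;
        if rest.is_empty() {
            None
        } else {
            Some((prefix, rest))
        }
    }
}

impl SimpleStore for VirtualStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        let (prefix, rest) = Self::split(path)?;
        // Unknown prefixes yield `None`; a real wrapper would return an error.
        self.stores.get(prefix)?.get(rest)
    }
}
```

With two inner stores registered under `store1` and `store2`, a `get` for `/store1/my_data/1.parquet` is answered by the first store at `my_data/1.parquet`, and a path whose prefix is not registered returns `None`.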