alamb commented on issue #16991: URL: https://github.com/apache/datafusion/issues/16991#issuecomment-3141439401
Thank you for bringing this up @EmilyMatt

I think there is currently an assumption in the `ListingTable` (and all the way down to the `DataSourceExec`) that all the files are read from the same underlying `ObjectStore` instance, which I think is the root cause of the challenge.

I think it would be pretty disruptive (aka a big and complicated PR) to try and wire in support for multiple object stores.

Another idea that might work for you would be to make a "virtual" `ObjectStore` wrapper.

Something like this (a sketch, not compiling):

```rust
use std::collections::HashMap;
use std::sync::Arc;

use object_store::{path::Path, ObjectStore};

struct VirtualObjectStore {
    // Maps the first element of each path to a different ObjectStore
    stores: HashMap<String, Arc<dyn ObjectStore>>,
}

impl ObjectStore for VirtualObjectStore {
    // delegates to the correct store
    // for example,
    //   get '/store1/my_data/1.parquet'
    // would be mapped to a get call to `store1` at path `/my_data/1.parquet`
    fn get(&self, path: Path) -> ... {
        let store = path[0]; // first part of the path (pseudocode indexing)
        let real_path = path[1..]; // remainder of the path
        // delegate to the inner store
        self.stores.get(store).get(real_path)
    }
    ...
}
```
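To make the routing idea concrete, here is a minimal, self-contained sketch that compiles with only the standard library. It is not the real `ObjectStore` API: `SimpleStore` is a hypothetical, synchronous stand-in with a single `get` method and string paths, whereas the real trait is async and has many more methods. `MemoryStore`, `VirtualStore`, `split_first_segment`, and `example_store` are likewise names made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for `ObjectStore`:
/// synchronous, string paths, and only a `get` method.
trait SimpleStore: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory store used only to exercise the routing logic.
struct MemoryStore {
    objects: HashMap<String, Vec<u8>>,
}

impl MemoryStore {
    fn new() -> Self {
        Self { objects: HashMap::new() }
    }

    /// Builder-style helper that adds one object at `path`.
    fn with_object(mut self, path: &str, data: &str) -> Self {
        self.objects.insert(path.to_string(), data.as_bytes().to_vec());
        self
    }
}

impl SimpleStore for MemoryStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Splits "/store1/my_data/1.parquet" into ("store1", "/my_data/1.parquet").
/// Returns `None` if there is no store segment or no remaining path.
fn split_first_segment(path: &str) -> Option<(&str, &str)> {
    let trimmed = path.strip_prefix('/').unwrap_or(path);
    let idx = trimmed.find('/')?;
    let (store, rest) = trimmed.split_at(idx);
    if store.is_empty() {
        return None;
    }
    Some((store, rest))
}

/// Routes each request to an inner store chosen by the first path segment.
struct VirtualStore {
    stores: HashMap<String, Arc<dyn SimpleStore>>,
}

impl SimpleStore for VirtualStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        let (store, real_path) = split_first_segment(path)?;
        // Delegate to the inner store with the prefix stripped.
        self.stores.get(store)?.get(real_path)
    }
}

/// Builds a virtual store with two inner stores holding the same path.
fn example_store() -> VirtualStore {
    let store1 = MemoryStore::new().with_object("/my_data/1.parquet", "from store1");
    let store2 = MemoryStore::new().with_object("/my_data/1.parquet", "from store2");
    let mut stores: HashMap<String, Arc<dyn SimpleStore>> = HashMap::new();
    stores.insert("store1".to_string(), Arc::new(store1));
    stores.insert("store2".to_string(), Arc::new(store2));
    VirtualStore { stores }
}
```

With this setup, a `get` on `/store1/my_data/1.parquet` and one on `/store2/my_data/1.parquet` reach different inner stores even though the remaining path is identical, and an unknown first segment yields `None`. A real wrapper would need to do the same prefix handling for every trait method, and add the prefix back on paths returned from listing calls.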