matthewmturner edited a comment on issue #1836:
URL:
https://github.com/apache/arrow-datafusion/issues/1836#issuecomment-1043725945
@alamb thanks very much for the explanations. I've been giving this some
thought and came up with a potentially simple first step that would make it
easier to add tables to a schema from an `ObjectStore`.
What if we update `MemorySchemaProvider` to look like the following:
```
pub struct MemorySchemaProvider {
    tables: RwLock<HashMap<String, Arc<dyn TableProvider>>>,
    object_store: Option<Arc<dyn ObjectStore>>,
}
```
Then we add the following methods (either to the `SchemaProvider` trait or
just to `MemorySchemaProvider`):
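```
impl SchemaProvider for MemorySchemaProvider {
    ...
    // No `pub` here: methods in a trait impl take the trait's visibility.
    // `&mut self` because this sets the `object_store` field.
    fn register_store(&mut self, object_store: Arc<dyn ObjectStore>) {
        ...
    }
    // `&self` is enough because `tables` sits behind an `RwLock`.
    fn register_listing_table(&self, uri: &str, config: Option<ListingTableConfig>) {
        ...
    }
}
```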
Then, to make further use of the `ObjectStore` implementation, I'm wondering
whether the `list_dir` method could be used to achieve my objective (if this
is what `list_dir` is intended for; if not, maybe we could add a method for
this). I think it would look roughly like the following:
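```
let object_store = Arc::new(S3FileSystem::default());
let mut schema = MemorySchemaProvider::new();
// Clone the Arc so the store can still be used for listing below.
schema.register_store(object_store.clone());
let tables = object_store.list_dir("s3://active/schema1");
// A `for` loop runs eagerly; a bare `iter().map(..)` is lazy and would never run.
for file in tables {
    let config = ListingTableConfig::new(..);
    schema.register_listing_table(&file, Some(config));
}
```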
This could all be done within DataFusion and would work with an S3 object
store, which would achieve my short-term objective. Then I could look into
whether a more sophisticated catalog or schema provider, like you outlined,
is needed.
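
To make the idea concrete, here is a minimal, self-contained sketch of the flow using stand-in types. Everything in it is an assumption for illustration: the `ObjectStore` and `TableProvider` traits are simplified local stand-ins (with a synchronous `list_dir` that returns plain strings), not DataFusion's real traits, and `InMemoryStore`, `ListingTable`, `register_all` and the name-from-last-path-segment rule are hypothetical.

```
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Stand-in for an object store; assumed synchronous for simplicity.
trait ObjectStore: Send + Sync {
    // Returns every stored path under `prefix`.
    fn list_dir(&self, prefix: &str) -> Vec<String>;
}

// Stand-in for a table provider; it only remembers where its data lives.
trait TableProvider: Send + Sync {
    fn location(&self) -> &str;
}

// Hypothetical table backed by a single location in the object store.
struct ListingTable {
    uri: String,
}

impl TableProvider for ListingTable {
    fn location(&self) -> &str {
        &self.uri
    }
}

// Hypothetical store that keeps a fixed list of paths in memory.
struct InMemoryStore {
    paths: Vec<String>,
}

impl ObjectStore for InMemoryStore {
    fn list_dir(&self, prefix: &str) -> Vec<String> {
        // Normalise to a trailing '/' so "schema1" does not match "schema10".
        let prefix = format!("{}/", prefix.trim_end_matches('/'));
        self.paths
            .iter()
            .filter(|p| p.starts_with(&prefix))
            .cloned()
            .collect()
    }
}

#[derive(Default)]
struct MemorySchemaProvider {
    tables: RwLock<HashMap<String, Arc<dyn TableProvider>>>,
    object_store: Option<Arc<dyn ObjectStore>>,
}

impl MemorySchemaProvider {
    fn register_store(&mut self, object_store: Arc<dyn ObjectStore>) {
        self.object_store = Some(object_store);
    }

    // Registers one table for `uri`, named after the last path segment.
    // Returns None if no object store has been registered yet.
    fn register_listing_table(&self, uri: &str) -> Option<String> {
        self.object_store.as_ref()?;
        let name = uri.trim_end_matches('/').rsplit('/').next()?.to_string();
        let table: Arc<dyn TableProvider> = Arc::new(ListingTable {
            uri: uri.to_string(),
        });
        self.tables.write().unwrap().insert(name.clone(), table);
        Some(name)
    }

    // Lists `prefix` in the registered store and registers each entry.
    fn register_all(&self, prefix: &str) -> Vec<String> {
        let entries = match self.object_store.as_ref() {
            Some(store) => store.list_dir(prefix),
            None => return Vec::new(),
        };
        entries
            .iter()
            .filter_map(|uri| self.register_listing_table(uri))
            .collect()
    }

    fn table(&self, name: &str) -> Option<Arc<dyn TableProvider>> {
        self.tables.read().unwrap().get(name).cloned()
    }
}
```

With an `InMemoryStore` registered through `register_store`, calling `register_all("s3://active/schema1")` registers one table per path under that prefix and leaves paths under other prefixes alone. The real version would need to handle the async listing and build a proper `ListingTableConfig` for each entry.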