houqp commented on a change in pull request #1062:
URL: https://github.com/apache/arrow-datafusion/pull/1062#discussion_r718131881
##########
File path: datafusion/src/datasource/object_store/mod.rs
##########
@@ -75,8 +83,15 @@ pub type ListEntryStream =
/// It maps strings (e.g. URLs, filesystem paths, etc) to sources of bytes
#[async_trait]
pub trait ObjectStore: Sync + Send + Debug {
+ /// Get file system scheme
+ fn get_schema(&self) -> &'static str;
Review comment:
An object store could have multiple schemes, for example s3/s3a or
file/fs/filesystem, so it would be better to return a slice of `&str` here.
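
A rough sketch of what returning a slice could look like. This is only an illustration: the trait below is a simplified stand-in for the one in this PR (async methods omitted), and `get_schemes`, `LocalFileSystem` as written here, and `find_store` are hypothetical names, not existing DataFusion APIs.

```rust
use std::fmt::Debug;

/// Simplified stand-in for the trait under review (async methods omitted).
pub trait ObjectStore: Sync + Send + Debug {
    /// Hypothetical replacement for `get_schema`: every URI scheme
    /// this store answers to.
    fn get_schemes(&self) -> &[&'static str];
}

/// Illustrative local store; the scheme list is taken from the comment above.
#[derive(Debug)]
pub struct LocalFileSystem;

impl ObjectStore for LocalFileSystem {
    fn get_schemes(&self) -> &[&'static str] {
        &["file", "fs", "filesystem"]
    }
}

/// Hypothetical helper: return the first registered store that
/// handles `scheme`, or `None` if no store claims it.
pub fn find_store<'a>(
    stores: &'a [Box<dyn ObjectStore>],
    scheme: &str,
) -> Option<&'a dyn ObjectStore> {
    for store in stores {
        if store.get_schemes().iter().any(|known| *known == scheme) {
            // Reborrow through the Box to get a plain trait object reference.
            return Some(&**store);
        }
    }
    None
}
```

With this shape, a registry can resolve aliases such as `fs` and `file` to the same store without each caller knowing about the alias list.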
##########
File path: datafusion/src/datasource/object_store/mod.rs
##########
@@ -75,8 +83,15 @@ pub type ListEntryStream =
/// It maps strings (e.g. URLs, filesystem paths, etc) to sources of bytes
#[async_trait]
pub trait ObjectStore: Sync + Send + Debug {
+ /// Get file system scheme
+ fn get_schema(&self) -> &'static str;
Review comment:
Also, I think the name should be `get_scheme` instead?
##########
File path: datafusion/src/datasource/object_store/mod.rs
##########
@@ -75,8 +83,15 @@ pub type ListEntryStream =
/// It maps strings (e.g. URLs, filesystem paths, etc) to sources of bytes
#[async_trait]
pub trait ObjectStore: Sync + Send + Debug {
+ /// Get file system scheme
+ fn get_schema(&self) -> &'static str;
Review comment:
Hmm... after taking a closer look at this, it looks like this is mainly
used in `get_chunk_reader` to build object-store-specific chunk readers based on
the file scheme. I think the ideal abstraction would be to make file format
modules agnostic to object stores instead of implementing object-store-specific
format readers like `HadoopParquetFileReader`.
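
To make the idea concrete, here is a minimal sketch of format code that never looks at the scheme. It assumes a synchronous, simplified `ObjectStore` with a hypothetical `open` method; `InMemoryStore` and `has_parquet_magic` are also made up for illustration and are not part of this PR or of DataFusion.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::io::{self, Cursor, Read};

/// Simplified, synchronous stand-in for the trait under review.
pub trait ObjectStore: Sync + Send + Debug {
    /// Hypothetical method: open a byte reader for `path`.
    fn open(&self, path: &str) -> io::Result<Box<dyn Read + Send>>;
}

/// Illustrative store that keeps objects in memory.
#[derive(Debug, Default)]
pub struct InMemoryStore {
    objects: HashMap<String, Vec<u8>>,
}

impl InMemoryStore {
    pub fn put(&mut self, path: &str, bytes: &[u8]) {
        self.objects.insert(path.to_string(), bytes.to_vec());
    }
}

impl ObjectStore for InMemoryStore {
    fn open(&self, path: &str) -> io::Result<Box<dyn Read + Send>> {
        let bytes = self
            .objects
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, path.to_string()))?;
        Ok(Box::new(Cursor::new(bytes)))
    }
}

/// Stand-in for a file format module. It only depends on the trait,
/// so it works with any store and never branches on a scheme.
/// Checks whether the object starts with the 4-byte Parquet magic "PAR1".
pub fn has_parquet_magic(store: &dyn ObjectStore, path: &str) -> io::Result<bool> {
    let mut magic = [0u8; 4];
    store.open(path)?.read_exact(&mut magic)?;
    Ok(&magic == b"PAR1")
}
```

The point of the sketch is the dependency direction: the store hands out a generic reader, and the format side consumes it, so something like `HadoopParquetFileReader` would not be needed per store.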