chitralverma opened a new issue, #4047:
URL: https://github.com/apache/arrow-rs/issues/4047

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   Currently, projects that use `object_store` (datafusion, delta-rs, pola-rs, 
etc.) have to create a `dyn ObjectStore` manually by parsing the provided URL, 
checking the scheme, and supplying the options.
   
   It would be great to have this capability directly provided by the crate.
   
   **Describe the solution you'd like**
   My proposal is to standardize the implementation and bring it into this 
crate, exposed through a simple function like the one below:
   
   ```
   #[derive(Clone, Debug, Serialize, Deserialize)]
   pub struct StorageOptions(pub HashMap<String, String>);
   
   #[derive(Debug, Clone)]
   pub struct AObjectStore {
       storage: Arc<dyn ObjectStore>,
       location: Url,
       options: StorageOptions,
   }
   
   /// Try creating a new instance of [`AObjectStore`]
   pub fn get_object_store(location: Url, options: impl Into<StorageOptions> + Clone) -> Result<AObjectStore> {
       let prefix = Path::from(location.path());
      
       // parse URL to a kind (s3/ aws/ ... )
       let kind = ObjectStoreKind::parse_url(&location)?;
   
       // instantiate object store
       let store = kind.into_impl( .... );
   
       // return (a free function has no `Self`, and the field is named `storage`)
       Ok(AObjectStore {
           storage: store,
           location,
           options: options.into(),
       })
   }
   ```
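
The `ObjectStoreKind::parse_url` step above is left unspecified in the proposal. As a rough illustration only, it could be a plain scheme lookup along the following lines. This is a minimal sketch under stated assumptions: the enum variants, the scheme table, the `&str` input (instead of `Url`, to stay dependency-free), and the `String` error type are all hypothetical and not part of `object_store`.

```rust
/// Hypothetical stand-in for the `ObjectStoreKind` named in the proposal.
/// The variants are assumptions chosen for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjectStoreKind {
    Local,
    InMemory,
    S3,
    Gcs,
    Azure,
}

impl ObjectStoreKind {
    /// Map a URL string to a store kind by inspecting only its scheme.
    /// The scheme table below is an assumption, not an agreed standard.
    pub fn parse_url(url: &str) -> Result<Self, String> {
        // Treat everything before the first "://" as the scheme.
        let (scheme, _rest) = url
            .split_once("://")
            .ok_or_else(|| format!("missing scheme in URL: {url}"))?;

        // Schemes are matched case-insensitively.
        match scheme.to_ascii_lowercase().as_str() {
            "file" => Ok(Self::Local),
            "memory" => Ok(Self::InMemory),
            "s3" | "s3a" => Ok(Self::S3),
            "gs" | "gcs" => Ok(Self::Gcs),
            "az" | "abfs" | "abfss" => Ok(Self::Azure),
            other => Err(format!("unsupported scheme: {other}")),
        }
    }
}
```

With something like this in the crate, callers would no longer need their own scheme matching; an unknown or missing scheme surfaces as an error instead of being handled differently in each project.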
   
   **Describe alternatives you've considered**
   Without this, each lib using `object_store` has to implement its own parsing.
   
   
   Examples: 
   - See datafusion registry 
[here](https://github.com/apache/arrow-datafusion/blob/52fa2285b43ad6712e9b8bf6c05b4b8ff93f44f9/datafusion/execution/src/object_store.rs#L186-L217).
   - See delta-rs implementation 
[here](https://github.com/delta-io/delta-rs/blob/c8371b38fdf22802f0f91b4ddc2a47da6be97c68/rust/src/storage/config.rs#L138-L196)
   
   
   **Additional context**
   This idea is also implemented by:
   
   - PyArrow FileSystem API 
[fs.FileSystem.from_uri(uri)](https://arrow.apache.org/docs/python/generated/pyarrow.fs.FileSystem.html#pyarrow.fs.FileSystem.from_uri)
   - Fsspec
   - Hadoop FileSystem API [org.apache.hadoop.fs.FileSystem.get(uri, 
conf)](https://hadoop.apache.org/docs/r3.0.0/api/org/apache/hadoop/fs/FileSystem.html#get-java.net.URI-org.apache.hadoop.conf.Configuration-)

