crepererum opened a new issue, #385:
URL: https://github.com/apache/arrow-rs-object-store/issues/385
# Abstract
The `ObjectStore` trait, as currently designed, is a compromise between
somewhat competing design goals. I think we can do better than that.
# Requirements
The trait serves two groups of API users:
## Object Store Users
Humans and machines that want to use an object store through a unified
interface.
They usually like to write only the relevant parts, e.g.:
- "get object `foo/bar.txt`" ==> `store.get("foo/bar.txt").await?`
- "get first 10 bytes of `foo/bar.txt`" ==>
`store.get("foo/bar.txt").with_range(..10).await?`
## Object Store Implementations
Parties that implement a new object store or wrap an existing one. They
usually want to implement the methods that determine the behavior of the object
store, ideally without surprises (like opinionated default implementations).
# Status Quo
Due to this dual nature, the trait has accumulated a growing number of
methods, many of them with default implementations. Let's have a look at the
"get" methods:
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L633-L662
All except for `get_ranges` basically map to `get_opts`, and there is no
reason for a store to override any of the defaults. Even for `get_ranges`
we could come up with a sensible mapping to `get_opts` if the range parameter
supported multiple ranges, similar to HTTP.
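To illustrate the multi-range idea, here is a minimal sketch of how `get_ranges` could be expressed purely in terms of `get_opts`. It is a toy model, not the crate's API: `ToyStore`, `MultiRangeOptions`, and its `ranges` field are hypothetical names, and the methods are synchronous and use `String` errors only to keep the example free of an async runtime.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical options type: unlike the crate's `GetOptions`, it can carry
/// several byte ranges at once. `None` means "the whole object".
#[derive(Debug, Default, Clone)]
pub struct MultiRangeOptions {
    pub ranges: Option<Vec<Range<usize>>>,
}

/// Toy in-memory store used only for this sketch.
#[derive(Debug, Default)]
pub struct ToyStore {
    objects: HashMap<String, Vec<u8>>,
}

impl ToyStore {
    pub fn put(&mut self, location: &str, data: &[u8]) {
        self.objects.insert(location.to_string(), data.to_vec());
    }

    /// The single core method: returns one buffer per requested range,
    /// or one buffer with the whole object if no ranges were given.
    pub fn get_opts(
        &self,
        location: &str,
        options: MultiRangeOptions,
    ) -> Result<Vec<Vec<u8>>, String> {
        let data = self
            .objects
            .get(location)
            .ok_or_else(|| format!("not found: {location}"))?;
        match options.ranges {
            None => Ok(vec![data.clone()]),
            Some(ranges) => ranges
                .into_iter()
                .map(|r| {
                    data.get(r.clone())
                        .map(|slice| slice.to_vec())
                        .ok_or_else(|| format!("range {r:?} out of bounds"))
                })
                .collect(),
        }
    }

    /// `get_ranges` needs no logic of its own once the options support
    /// multiple ranges: it only fills in the options and delegates.
    pub fn get_ranges(
        &self,
        location: &str,
        ranges: &[Range<usize>],
    ) -> Result<Vec<Vec<u8>>, String> {
        self.get_opts(
            location,
            MultiRangeOptions {
                ranges: Some(ranges.to_vec()),
            },
        )
    }
}
```

With this shape, a store that can coalesce or parallelize range requests would do so inside `get_opts`, and no caller-facing helper would need to be overridden.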
Now let's look at "rename":
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L773-L782
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L793-L799
I think it is beyond question that a store implementation should override
these if there is any way to rename keys without a full "get + put".
# Proposal
I propose to remove all default implementations from the trait and only have
a single method per operation, so "get" would look like this:
```rust
async fn get(&self, location: &Path, options: GetOptions) -> Result<GetResult>;
```
Note that the `location` is NOT part of `options` so that `GetOptions` can
still implement `Default` and a user only needs to specify the options of
interest:
```rust
// get bytes of file
store.get(
    &location,
    Default::default(),
).await?.bytes().await?;

// get range
store.get(
    &location,
    GetOptions {
        range: (..10).into(),
        ..Default::default()
    },
).await?.bytes().await?;
```
A similar mapping can be done for "rename".
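As a rough sketch of what that could look like, assuming the two linked methods are `rename` and `rename_if_not_exists`, both could fold into one required method that takes an options struct. The names `RenameOptions`, `RenameTargetMode`, `RenameStore`, and `MemStore` are hypothetical and not part of the crate; the example is synchronous and in-memory only to stay self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical: what to do when the destination already exists.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub enum RenameTargetMode {
    /// Replace an existing destination (plain "rename").
    #[default]
    Overwrite,
    /// Fail if the destination exists ("rename if not exists").
    Create,
}

/// Hypothetical options struct; `Default` gives plain overwrite semantics.
#[derive(Debug, Default, Clone)]
pub struct RenameOptions {
    pub target_mode: RenameTargetMode,
}

/// One required method per operation, with no default implementation.
pub trait RenameStore {
    fn rename(&mut self, from: &str, to: &str, options: RenameOptions) -> Result<(), String>;
}

/// Toy in-memory store used only for this sketch.
#[derive(Debug, Default)]
pub struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl MemStore {
    pub fn put(&mut self, location: &str, data: &[u8]) {
        self.objects.insert(location.to_string(), data.to_vec());
    }

    pub fn get(&self, location: &str) -> Option<&[u8]> {
        self.objects.get(location).map(|v| v.as_slice())
    }

    pub fn contains(&self, location: &str) -> bool {
        self.objects.contains_key(location)
    }
}

impl RenameStore for MemStore {
    fn rename(&mut self, from: &str, to: &str, options: RenameOptions) -> Result<(), String> {
        if !self.objects.contains_key(from) {
            return Err(format!("not found: {from}"));
        }
        if options.target_mode == RenameTargetMode::Create && self.objects.contains_key(to) {
            return Err(format!("already exists: {to}"));
        }
        // Move the value directly; no "get + put" round trip is needed here.
        let data = self.objects.remove(from).expect("presence checked above");
        self.objects.insert(to.to_string(), data);
        Ok(())
    }
}
```

Because the trait method has no default, each implementation has to decide explicitly how a rename is carried out, which is the "no surprises" property described under the implementer requirements.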
I think that strikes a good balance between boilerplate / verbosity of the
user API and the clarity of the interface.