yjshen edited a comment on pull request #950: URL: https://github.com/apache/arrow-datafusion/pull/950#issuecomment-906652660
@rdettai Thanks for reviewing 👍

> an URI (string just like the prefix currently), which could be a sort of path (bucket+prefix) for a plain object store like S3, but could also be something a bit more evolved:
> - an S3 location with hive partitioning (URI=bucket/prefix?partition=year&partition=month)
> - a delta table (URI=bucket/prefix?versionAsOf=v2)

I think this could be achieved inside the S3 object store implementation, with another PR building on the `PartitionedFile` abstraction (#932). `list` could return a stream of `PartitionedFile` instead of the current `FileMeta` (`PartitionedFile` could have a `FileMeta` field).

> an expression so that we can pushdown the filter to the generation of the file list. This is VERY important for very large datasets with lots of files where listing all files is too long.

I think the current, non-filtering version of the listing was chosen here for simplicity. See the further discussion of this in the doc [here](https://docs.google.com/document/d/1ZEZqvdohrot0ewtTNeaBtqczOIJ1Q0OnX9PqMMxpOF8/edit?disco=AAAANwU9MzE#heading=h.358nvuimx7yr).
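
To make the two ideas above concrete, here is a rough sketch of what a `list` that yields `PartitionedFile` values wrapping `FileMeta` might look like, plus a filtered variant. Everything here is an assumption for illustration: the field names (`path`, `size`, `partition_values`), the helper `parse_hive_partitions`, and the functions `list_partitioned` and `list_with_filter` are hypothetical and are not the definitions from this PR or from #932. A plain `Iterator` stands in for the async stream so the sketch has no dependencies.

```rust
// Hypothetical shapes for illustration only; not DataFusion's actual definitions.
#[derive(Debug, Clone, PartialEq)]
pub struct FileMeta {
    pub path: String,
    pub size: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PartitionedFile {
    /// The plain file metadata, kept as a field as suggested in the comment.
    pub file_meta: FileMeta,
    /// Partition values parsed from the path, e.g. [("year", "2021"), ("month", "08")].
    pub partition_values: Vec<(String, String)>,
}

/// Extract hive-style `key=value` directory segments from a path.
/// The last segment is treated as the file name and ignored.
pub fn parse_hive_partitions(path: &str) -> Vec<(String, String)> {
    let mut segments: Vec<&str> = path.split('/').collect();
    segments.pop(); // drop the file name
    segments
        .into_iter()
        .filter_map(|seg| seg.split_once('='))
        .filter(|(k, v)| !k.is_empty() && !v.is_empty())
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Stand-in for `list`: wraps each `FileMeta` in a `PartitionedFile`.
/// An iterator is used here in place of an async stream.
pub fn list_partitioned<I>(files: I) -> impl Iterator<Item = PartitionedFile>
where
    I: IntoIterator<Item = FileMeta>,
{
    files.into_iter().map(|file_meta| {
        let partition_values = parse_hive_partitions(&file_meta.path);
        PartitionedFile {
            file_meta,
            partition_values,
        }
    })
}

/// Filtered variant: keeps only files whose partition values satisfy `predicate`.
/// Note that this filters after listing; a real pushdown would prune
/// directories before they are listed at all.
pub fn list_with_filter<I, P>(files: I, predicate: P) -> impl Iterator<Item = PartitionedFile>
where
    I: IntoIterator<Item = FileMeta>,
    P: Fn(&[(String, String)]) -> bool,
{
    list_partitioned(files).filter(move |f| predicate(&f.partition_values))
}
```

With this shape, a caller could pass a predicate such as "year equals 2021" and receive only the matching files. The interesting design question raised in the review is where that predicate is evaluated: applying it inside the object store implementation lets the store skip whole prefixes, which this sketch does not attempt.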