yjshen edited a comment on pull request #1905:
URL: https://github.com/apache/arrow-datafusion/pull/1905#issuecomment-1062686947
Thanks @tustvold for the detailed analysis. ❤️
We already have a workaround for the repeated-open issue in the HDFS object
store, and I'm changing the object reader API here so that future object
reader implementations don't unintentionally fall into the repeated-open
pitfall.
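Just to make the pitfall concrete, here is a rough sketch (the types are hypothetical, not the actual HDFS reader): a naive implementation re-opens the file on every chunk request, while a handle-reusing one opens it once and only seeks afterwards.

```rust
use std::fs::File;
use std::io::{Read, Result, Seek, SeekFrom};
use std::sync::Mutex;

/// Naive reader: every chunk request opens the file again -- the
/// repeated-open pitfall the API change is meant to prevent.
struct NaiveFileReader {
    path: String,
}

impl NaiveFileReader {
    fn sync_chunk_reader(&self, start: u64, length: usize) -> Result<Box<dyn Read + Send + Sync>> {
        let mut file = File::open(&self.path)?; // a fresh open per chunk
        file.seek(SeekFrom::Start(start))?;
        Ok(Box::new(file.take(length as u64)))
    }
}

/// Reusing reader: the file is opened once and each chunk request just
/// seeks within the shared handle. It returns a buffer here, since handing
/// out a `Box<dyn Read>` from behind a lock is part of what makes the
/// current API easy to misuse.
struct ReusingFileReader {
    file: Mutex<File>,
}

impl ReusingFileReader {
    fn open(path: &str) -> Result<Self> {
        Ok(Self {
            file: Mutex::new(File::open(path)?),
        })
    }

    fn read_chunk(&self, start: u64, length: usize) -> Result<Vec<u8>> {
        let mut file = self.file.lock().unwrap();
        file.seek(SeekFrom::Start(start))?;
        let mut buf = vec![0u8; length];
        file.read_exact(&mut buf)?;
        Ok(buf)
    }
}
```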
I really like the idea of getting rid of the `ChunkReader` APIs and using an
async Parquet exec. I expect we could also achieve file-handle reuse on the
async reading path on top of tokio's async I/O. And I think we could remove
this API:
```rust
/// Get reader for a part [start, start + length] in the file
fn sync_chunk_reader(
    &self,
    start: u64,
    length: usize,
) -> Result<Box<dyn Read + Send + Sync>>;
```
entirely, since it's misuse-prone and only used by the Parquet exec.
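On the async side, here is a rough sketch of what handle reuse on top of tokio could look like (the `AsyncFileReader` type and method names are illustrative only, not a proposal for the eventual API): the file is opened once, and every range read seeks within the same handle instead of issuing a new open per chunk.

```rust
use std::io::SeekFrom;
use tokio::fs::File;
use tokio::io::{AsyncReadExt, AsyncSeekExt};
use tokio::sync::Mutex;

/// Hypothetical async reader that opens the file once and serves all
/// range reads from the same handle.
struct AsyncFileReader {
    file: Mutex<File>,
}

impl AsyncFileReader {
    async fn open(path: &str) -> std::io::Result<Self> {
        Ok(Self {
            file: Mutex::new(File::open(path).await?),
        })
    }

    /// Read `length` bytes starting at `start` without reopening the file.
    async fn read_chunk(&self, start: u64, length: usize) -> std::io::Result<Vec<u8>> {
        let mut file = self.file.lock().await;
        file.seek(SeekFrom::Start(start)).await?;
        let mut buf = vec![0u8; length];
        file.read_exact(&mut buf).await?;
        Ok(buf)
    }
}
```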