alamb commented on a change in pull request #1905:
URL: https://github.com/apache/arrow-datafusion/pull/1905#discussion_r820111433
##########
File path: datafusion/src/datasource/object_store/local.rs
##########
@@ -82,23 +112,12 @@ impl ObjectReader for LocalFileReader {
)
}
- fn sync_chunk_reader(
- &self,
- start: u64,
- length: usize,
- ) -> Result<Box<dyn Read + Send + Sync>> {
- // A new file descriptor is opened for each chunk reader.
- // This okay because chunks are usually fairly large.
- let mut file = File::open(&self.file.path)?;
Review comment:
I am probably misunderstanding something here, and I am sorry that I don't
quite follow all the comments on this PR.
If the issue you are trying to solve is that `File::open` is called too
often, would it be possible to "memoize" the open here with a mutex inside
the `LocalFileReader`?
Something like:
```rust
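struct LocalFileReader {
    ...
    /// Keep the open file handle to avoid reopening it on every call.
    /// (`dyn Read + Clone` is not a valid trait object, so store the `File`.)
    cache: Mutex<Option<File>>,
}

impl LocalFileReader {
    ...
    fn sync_chunk_reader(
        &self,
        start: u64,
        length: usize,
    ) -> Result<Box<dyn Read + Send + Sync>> {
        let mut cache = self.cache.lock().unwrap();
        // Open the file on first use only; later calls reuse the cached handle.
        if cache.is_none() {
            *cache = Some(File::open(...)?);
        }
        // `try_clone` duplicates the handle without reopening the path.
        // Note: the clones share one file cursor.
        // (Seeking to `start` and limiting to `length` omitted.)
        let file = cache.as_ref().unwrap().try_clone()?;
        Ok(Box::new(file))
    }
}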
```
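To make the idea concrete, here is a minimal, self-contained sketch using only the standard library. It is an illustration under stated assumptions, not the DataFusion implementation: `CachedFileReader` is a hypothetical stand-in for `LocalFileReader`, and it returns `std::io::Result` rather than DataFusion's `Result`. Because handles produced by `File::try_clone` share one cursor, this version seeks and reads the chunk while holding the lock and returns it as an in-memory `Cursor`, trading an eager read for safety under concurrent callers.

```rust
use std::fs::File;
use std::io::{Cursor, Read, Result, Seek, SeekFrom};
use std::path::PathBuf;
use std::sync::Mutex;

/// Hypothetical stand-in for `LocalFileReader`, reduced to what the sketch needs.
struct CachedFileReader {
    path: PathBuf,
    /// Opened lazily on first use, then reused by every later call.
    cache: Mutex<Option<File>>,
}

impl CachedFileReader {
    fn new(path: PathBuf) -> Self {
        Self { path, cache: Mutex::new(None) }
    }

    /// Returns a reader over at most `length` bytes starting at offset `start`.
    fn sync_chunk_reader(
        &self,
        start: u64,
        length: usize,
    ) -> Result<Box<dyn Read + Send + Sync>> {
        let mut cache = self.cache.lock().expect("cache mutex poisoned");

        // Memoize the open: only the first call pays for `File::open`.
        if cache.is_none() {
            *cache = Some(File::open(&self.path)?);
        }
        let file = cache.as_mut().expect("file was cached above");

        // Seek and read while the lock is held, so concurrent callers
        // cannot move the shared cursor underneath this read.
        file.seek(SeekFrom::Start(start))?;
        let mut buf = Vec::with_capacity(length);
        file.by_ref().take(length as u64).read_to_end(&mut buf)?;

        Ok(Box::new(Cursor::new(buf)))
    }
}
```

With a file containing `0123456789`, calling `sync_chunk_reader(2, 3)` yields a reader over `234`, and any number of further calls reuse the single cached handle. If chunks are too large to buffer, an alternative is a positional read that does not touch the cursor (for example `read_at` from `std::os::unix::fs::FileExt` on Unix), at the cost of platform-specific code.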