pitrou commented on issue #46935: URL: https://github.com/apache/arrow/issues/46935#issuecomment-3026843605
> After looking at this more closely, I see that read ranges are [coalesced](https://github.com/apache/arrow/blob/3b3684bb7d400b1f93d9aa17ff8f6c98641abea4/cpp/src/arrow/io/caching.cc#L178) before being stored in the ReadRangeCache, and reads usually only use a [slice](https://github.com/apache/arrow/blob/3b3684bb7d400b1f93d9aa17ff8f6c98641abea4/cpp/src/arrow/io/caching.cc#L227) of a cached range.
>
> This probably makes it very difficult to track when a read buffer is no longer needed.

I wouldn't call it "very difficult", but it would require either:

1. using a simple ref count mechanism on each physical range (initialize it to the number of associated logical ranges, and decrement it each time a logical range is requested)
2. if that's too simplistic, tracking the association between logical ranges and physical ranges
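
A minimal sketch of the ref count idea (option 1), assuming a simplified cache in which each coalesced physical range knows how many logical ranges were merged into it. The names `Range`, `RefCountedRangeCache`, `AddPhysicalRange`, `Read` and `live_ranges` are hypothetical and are not part of Arrow's `ReadRangeCache`; the sketch only illustrates the bookkeeping, not actual I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a byte range; not Arrow's ReadRange.
struct Range {
  int64_t offset;
  int64_t length;
};

// Hypothetical cache that frees a physical (coalesced) range once every
// logical range associated with it has been requested.
class RefCountedRangeCache {
 public:
  // Register a physical range. The count is initialized to the number of
  // logical ranges that were coalesced into it.
  void AddPhysicalRange(Range physical, std::size_t num_logical_ranges) {
    assert(num_logical_ranges > 0);
    Entry entry;
    entry.range = physical;
    entry.remaining = num_logical_ranges;
    // Placeholder buffer standing in for the bytes read from the file.
    entry.data.resize(static_cast<std::size_t>(physical.length));
    entries_[physical.offset] = entry;
  }

  // Serve a logical read. Returns false if no live physical range fully
  // covers it. Each successful read decrements the count; the buffer is
  // released when the count reaches zero.
  bool Read(Range logical) {
    auto it = entries_.upper_bound(logical.offset);
    if (it == entries_.begin()) return false;
    --it;  // last physical range starting at or before logical.offset
    Entry& entry = it->second;
    const int64_t physical_end = entry.range.offset + entry.range.length;
    if (logical.offset + logical.length > physical_end) return false;
    // A real implementation would return a slice of entry.data here.
    if (--entry.remaining == 0) entries_.erase(it);
    return true;
  }

  // Number of physical ranges still held in memory.
  std::size_t live_ranges() const { return entries_.size(); }

 private:
  struct Entry {
    Range range;
    std::size_t remaining;
    std::vector<uint8_t> data;
  };
  std::map<int64_t, Entry> entries_;  // keyed by physical offset
};
```

This version assumes each logical range is requested exactly once. If the same logical range can be read twice, the count reaches zero too early, which is presumably the case where option 2 (tracking which logical ranges map to which physical range) would be needed instead.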