Re: [PR] fix prefetch of page index [arrow-rs]

via GitHub Mon, 20 Jan 2025 05:40:09 -0800


adriangb commented on PR #6999:
URL: https://github.com/apache/arrow-rs/pull/6999#issuecomment-2602456064


   As a side note, I think latency tends to be one of the biggest bottlenecks 
in systems that work from object storage, so it's important to minimize it 
(this is well known, and the comments/docstrings in this file already say so).
   
   Would it be beneficial to have the right APIs to make it possible to 
pre-fetch the entire file? E.g. if I'm going to load a <1 MB Parquet file, I 
might want to make a single request to object storage and know I have 
everything I need, instead of loading the metadata and then making another 
request to load the data. This would be especially beneficial when you don't 
know the metadata size but do know the file size: then you make 1 request 
instead of potentially 3+.
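
   To make the request counts concrete, here is a rough, std-only sketch. It 
does not use any arrow-rs or object_store API: `CountingStore`, `staged_fetch`, 
`prefetch_whole` and `sample_file` are hypothetical names made up for 
illustration. The only format detail it relies on is that a Parquet file ends 
with a 4-byte little-endian metadata length followed by the `PAR1` magic.

```rust
use std::ops::Range;

const FOOTER_LEN: usize = 8; // 4-byte metadata length + 4-byte magic
const MAGIC: &[u8] = b"PAR1";

/// Hypothetical stand-in for an object store that counts range requests.
struct CountingStore {
    data: Vec<u8>,
    requests: usize,
}

impl CountingStore {
    fn new(data: Vec<u8>) -> Self {
        CountingStore { data, requests: 0 }
    }

    /// Each call models one round trip to object storage.
    fn get_range(&mut self, range: Range<usize>) -> Vec<u8> {
        self.requests += 1;
        self.data[range].to_vec()
    }
}

/// Locate the metadata bytes from the trailing 8-byte footer.
fn metadata_range(footer: &[u8], file_len: usize) -> Option<Range<usize>> {
    if footer.len() != FOOTER_LEN || &footer[4..] != MAGIC {
        return None;
    }
    let meta_len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as usize;
    let end = file_len.checked_sub(FOOTER_LEN)?;
    let start = end.checked_sub(meta_len)?;
    Some(start..end)
}

/// Staged read: footer, then metadata, then data (three round trips).
fn staged_fetch(store: &mut CountingStore, file_len: usize) -> Option<Vec<u8>> {
    let footer_start = file_len.checked_sub(FOOTER_LEN)?;
    let footer = store.get_range(footer_start..file_len);
    let meta = metadata_range(&footer, file_len)?;
    let _metadata = store.get_range(meta.clone());
    // A real reader would fetch only the column chunks it needs.
    Some(store.get_range(0..meta.start))
}

/// Whole-file prefetch: one round trip, then everything is sliced locally.
fn prefetch_whole(store: &mut CountingStore, file_len: usize) -> Option<(Vec<u8>, Range<usize>)> {
    let footer_start = file_len.checked_sub(FOOTER_LEN)?;
    let file = store.get_range(0..file_len);
    let meta = metadata_range(&file[footer_start..], file_len)?;
    Some((file, meta))
}

/// Build a fake file with the Parquet framing; the payloads are placeholders,
/// not real Thrift-encoded metadata or column data.
fn sample_file(data: &[u8], metadata: &[u8]) -> Vec<u8> {
    let mut file = MAGIC.to_vec();
    file.extend_from_slice(data);
    file.extend_from_slice(metadata);
    file.extend_from_slice(&(metadata.len() as u32).to_le_bytes());
    file.extend_from_slice(MAGIC);
    file
}
```

   With a known file size, `prefetch_whole` issues a single `get_range` call, 
while `staged_fetch` issues three. For small files the extra bytes 
transferred are likely cheaper than the extra round trips, but that trade-off 
is an assumption that would need measuring against a real store.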


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
