lidavidm commented on pull request #9482: URL: https://github.com/apache/arrow/pull/9482#issuecomment-781635820
> Ok, I think I get it now. Let's pretend ~~we are outside of S3,~~ (nvm, S3 not relevant) there is 1 file, 3 row groups, and 3 columns. Three scan tasks will be generated. Task 1 needs RG1C1, RG1C2, RG1C3. Task 2 needs RG2C1, RG2C2, RG2C3. Task 3 needs RG3C1, RG3C2, RG3C3.
>
> Prebuffer will be called asking for all 9 blocks and it will then issue three reads in parallel (instead of the 9 reads that would otherwise be issued): RG1, RG2, and RG3.

The number of reads that actually gets issued depends on the parameters, but it could be anywhere from 1 to 9. (If you had a filesystem with high bandwidth but a very large time-to-first-byte, you'd issue one; if you had a filesystem with very low latency and high bandwidth, you'd issue all 9.)

> If there are only 3 columns in the file, is it possible Prebuffer would coalesce this all into one read? In that case wouldn't all three tasks be blocked until the entire file is read, preventing task 1 from starting to run as soon as RG1 is read?

Yes, it's possible they'd all be coalesced into one read. You're trading throughput for latency. The point is that for some filesystems, this can be faster than issuing separate reads. Picking numbers out of thin air: if it takes 10ms to establish a connection and 5ms to read one column, it's better to make one request at (10ms + 3 × 5ms) than three requests at (10ms + 15ms) each (15ms because you're splitting the available bandwidth 3 ways).

One thing I should mention is that just because two ranges are adjacent doesn't mean that PreBuffer will always coalesce them (and conversely, just because two ranges are *not* adjacent doesn't mean that PreBuffer won't coalesce them anyway, paying the penalty of reading the extra data between them). Another is that the Parquet reader decodes one column at a time, and each column decoder will block until its chunk is read.
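To make the "anywhere from 1 to 9 reads" behavior concrete, here is a minimal sketch of hole-tolerant range coalescing. This is illustrative only, not Arrow's actual implementation; the parameter names `hole_size_limit` and `range_size_limit` are borrowed from `arrow::io::CacheOptions`, but the merging logic and the byte layout below are assumptions for the example.

```python
# Illustrative sketch of read-range coalescing (NOT Arrow's real code).
# Parameter names mirror arrow::io::CacheOptions; logic is simplified.

def coalesce_ranges(ranges, hole_size_limit, range_size_limit):
    """Merge sorted (offset, length) ranges, tolerating small holes.

    Two ranges are merged when the gap between them is at most
    hole_size_limit and the merged range stays within range_size_limit;
    the bytes in the hole are read and thrown away.
    """
    if not ranges:
        return []
    ranges = sorted(ranges)
    merged = [ranges[0]]
    for offset, length in ranges[1:]:
        prev_offset, prev_length = merged[-1]
        gap = offset - (prev_offset + prev_length)
        new_length = offset + length - prev_offset
        if gap <= hole_size_limit and new_length <= range_size_limit:
            merged[-1] = (prev_offset, new_length)  # coalesce into one read
        else:
            merged.append((offset, length))  # start a new read
    return merged

# Nine column chunks (3 row groups x 3 columns), 100 bytes each, laid
# out back to back. A generous range limit coalesces them into 1 read;
# a tight range limit keeps all 9 reads separate.
chunks = [(i * 100, 100) for i in range(9)]
print(len(coalesce_ranges(chunks, hole_size_limit=1024, range_size_limit=10_000)))  # 1
print(len(coalesce_ranges(chunks, hole_size_limit=0, range_size_limit=150)))        # 9
```

Note how the same nine ranges become one read or nine reads purely from the parameters, which is exactly the tuning knob being discussed: a high-latency filesystem wants large limits (fewer, bigger reads), a low-latency one wants small limits.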
So you're moving the blocking up front and hopefully consolidating it, instead of sequentially blocking several times.

----------------------------------------------------------------

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
