westonpace opened a new pull request #9533: URL: https://github.com/apache/arrow/pull/9533
The `AsyncThreadedTableReader` previously created a dedicated thread pool of size 1 for readahead (similar to the way the serial reader behaves). Parallel readahead was safe in that setup because the pool could never work on more than one future at a time anyway. However, it introduced a thread outside of both the CPU context and the I/O context.

This PR adds a new AsyncGenerator utility, `MakeSerialReadaheadGenerator`, which still provides readahead but, unlike `MakeReadaheadGenerator`, does not pull from its source in an async-reentrant fashion. This allows the I/O thread pool to be used, because the serial logic ensures that no more than one request is ever active.

Aside: while implementing this feature we encountered an issue with futures. The logic in `SerialReadaheadGenerator` relies on `Future` running its callbacks in a reliable order. Some changes were made to `Future` to ensure this.
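
To illustrate the idea of serial (non-reentrant) readahead, here is a minimal, single-threaded sketch. It is not the Arrow implementation and does not use Arrow's `Future` or `AsyncGenerator` types; `ManualExecutor`, `SerialReadahead`, `DemoResult` and `RunDemo` are hypothetical names invented for this example. The wrapper buffers up to `max_readahead` items but only asks the source for the next item after the previous request has completed, so the source never sees more than one outstanding request.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an I/O executor: tasks are queued and run later.
class ManualExecutor {
 public:
  void Submit(std::function<void()> task) { tasks_.push_back(std::move(task)); }
  // Runs one queued task; returns false if the queue was empty.
  bool RunOne() {
    if (tasks_.empty()) return false;
    auto task = std::move(tasks_.front());
    tasks_.pop_front();
    task();
    return true;
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

using Item = std::optional<int>;  // an empty Item marks end of stream
using Callback = std::function<void(Item)>;
using AsyncSource = std::function<void(Callback)>;  // not safe to call reentrantly

// Hypothetical serial readahead wrapper (an assumption, not Arrow's class).
class SerialReadahead {
 public:
  SerialReadahead(AsyncSource source, std::size_t max_readahead)
      : source_(std::move(source)), max_readahead_(max_readahead) {}

  // Returns a buffered item if one is ready, otherwise std::nullopt.
  std::optional<Item> TryNext() {
    if (buffer_.empty()) {
      Pump();
      return std::nullopt;
    }
    Item item = buffer_.front();
    buffer_.pop_front();
    Pump();  // room was freed, so reading ahead may resume
    return item;
  }

 private:
  // Issues the next request only if none is in flight and the buffer has room.
  void Pump() {
    if (in_flight_ || finished_ || buffer_.size() >= max_readahead_) return;
    in_flight_ = true;
    source_([this](Item item) {
      in_flight_ = false;
      if (!item.has_value()) finished_ = true;
      buffer_.push_back(item);
      Pump();  // chain the next request from the completion callback
    });
  }

  AsyncSource source_;
  std::size_t max_readahead_;
  std::deque<Item> buffer_;
  bool in_flight_ = false;
  bool finished_ = false;
};

struct DemoResult {
  std::vector<int> items;
  int max_outstanding;
};

// Drives the wrapper over a counting source and records the peak number of
// requests the source had outstanding at any one time.
DemoResult RunDemo(int count, std::size_t max_readahead) {
  ManualExecutor executor;
  int next = 0, outstanding = 0, max_outstanding = 0;
  AsyncSource source = [&](Callback cb) {
    ++outstanding;
    max_outstanding = std::max(max_outstanding, outstanding);
    executor.Submit([&, cb = std::move(cb)] {
      --outstanding;
      cb(next < count ? Item(next++) : Item());
    });
  };
  SerialReadahead readahead(source, max_readahead);
  DemoResult result{{}, 0};
  while (true) {
    std::optional<Item> got = readahead.TryNext();
    if (!got.has_value()) {
      if (!executor.RunOne()) break;  // nothing buffered and nothing pending
      continue;
    }
    if (!got->has_value()) break;  // end-of-stream marker
    result.items.push_back(**got);
  }
  result.max_outstanding = max_outstanding;
  return result;
}
```

Calling `RunDemo(5, 2)` drains five items through a readahead buffer of two. A reentrant readahead would instead issue several requests up front, which is only safe when something else (such as a size-1 thread pool) serializes them.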