It seems the Arrow dataset API already has an async I/O layer.
However, I want to use the low-level Parquet API with async I/O. That
is, the decoded values should be consumed by a user-defined function
rather than converted to an Arrow table, similar to ScanFileContents:
https://github.com/apache/arrow/blob/master/cpp/src/parquet/file_reader.cc#L818

The current async I/O interface inside ParquetFileReader seems to be
designed to serve the Arrow dataset API. Is there a code snippet that
implements an async version of ScanFileContents? If there is none,
one way for me to approach this is to use
ParquetFileReader::PreBuffer and ParquetFileReader::WhenBuffered,
using the dataset API implementation as a reference.
