I believe the Arrow bindings are built on top of the low-level Parquet reader, so if you want the raw bytes you should be able to use the low-level APIs directly and parallelize them in a similar way.
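
To make "in a similar multithreaded manner" a bit more concrete, here is a rough sketch of one way to fan the work out by row group. It uses only the C++ standard library; `ForEachRowGroupParallel` and the `read_row_group` callback are names I made up for illustration, not part of Arrow or Parquet. The assumption is that inside the callback each thread would open its own `parquet::ParquetFileReader` (for example with `ParquetFileReader::OpenFile`), call `RowGroup(i)`, and pull values with the typed column readers (for example `ByteArrayReader::ReadBatch`). One reader per thread is a conservative choice on my part; I have not checked whether a single reader can safely be shared.

```cpp
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet API): calls read_row_group(i)
// exactly once for every i in [0, num_row_groups), spread over num_threads
// worker threads. Row-group indices are handed out through an atomic counter,
// so threads that finish early simply pick up the next unread row group.
//
// Assumption: read_row_group does the actual low-level Parquet work for one
// row group and catches its own exceptions; an exception escaping a worker
// thread would terminate the program.
void ForEachRowGroupParallel(int num_row_groups, int num_threads,
                             const std::function<void(int)>& read_row_group) {
  if (num_threads < 1) num_threads = 1;

  std::atomic<int> next{0};
  std::vector<std::thread> workers;
  workers.reserve(static_cast<std::size_t>(num_threads));

  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&] {
      // fetch_add returns the previous value, so every index is claimed by
      // exactly one thread.
      for (int i = next.fetch_add(1); i < num_row_groups;
           i = next.fetch_add(1)) {
        read_row_group(i);
      }
    });
  }

  // Wait for all workers before returning, so the references captured by the
  // lambdas stay valid for the whole run.
  for (auto& w : workers) w.join();
}
```

You would take `num_row_groups` from the file metadata (`reader->metadata()->num_row_groups()`) and have the callback write its results into a per-row-group slot, so the threads never touch the same output element.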
On Sun, Apr 2, 2023 at 10:30 PM Алексей Рябов <[email protected]> wrote:

> Hello Team.
>
> As far as i could find in documentation/samples, there are 2 ways for
> reading parquet files:
> - using FileReader from parquet::arrow namespace.
> - using low-level ParquetFileReader from parquet namespace.
>
> 1st one reads to arrow tables, transforming parquet data to arrow
> types according to logical data in parquet schema. 2nd one reads raw
> parquet bytes w/o any transform.
> When reading to arrow tabes I can use threads to speedup process, but
> get only arrow types, not raw bytes. The question is if there is a way
> to read parquet file using threads but w/o converting to arrow types,
> thus, getting an arrow table where each raw bytes are not converted to
> arrow types, such as Utf8, Decimal128 and so on (except primitives)?
>
> Please advise.
>
