Re: [C++] ways to read parquet

Micah Kornfield Fri, 14 Apr 2023 22:02:24 -0700

I believe the Arrow bindings are built on top of the low-level Parquet
reader, so you should be able to use the low-level APIs in a similar
multithreaded manner if you want raw bytes.
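
A minimal sketch of that idea, not taken from the Arrow sources: fan the
row groups out across threads and let each task pull raw values through a
caller-supplied callback. The names RawColumn, ReadChunkFn and ReadAllChunks
are hypothetical. In a real program the callback would presumably open the
file with parquet::ParquetFileReader::OpenFile and walk
RowGroup(i)->Column(j). Opening a separate reader per task is a conservative
assumption on my part, since I have not checked what the library guarantees
about sharing one reader between threads.

```cpp
#include <functional>
#include <future>
#include <string>
#include <vector>

// Raw, unconverted values of one column chunk (hypothetical representation).
using RawColumn = std::vector<std::string>;

// Hypothetical callback: returns the raw values of one column in one row group.
// A Parquet-backed version would use the low-level reader here.
using ReadChunkFn = std::function<RawColumn(int row_group, int column)>;

// Reads all row groups concurrently, one task per row group.
// result[row_group][column] holds the raw values of that chunk.
std::vector<std::vector<RawColumn>> ReadAllChunks(int num_row_groups,
                                                  int num_columns,
                                                  const ReadChunkFn& read_chunk) {
  std::vector<std::future<std::vector<RawColumn>>> pending;
  pending.reserve(num_row_groups);

  for (int rg = 0; rg < num_row_groups; ++rg) {
    // Each task gets its own copy of the callback and reads its columns in order.
    pending.push_back(std::async(std::launch::async, [rg, num_columns, read_chunk] {
      std::vector<RawColumn> columns;
      columns.reserve(num_columns);
      for (int col = 0; col < num_columns; ++col) {
        columns.push_back(read_chunk(rg, col));
      }
      return columns;
    }));
  }

  // Collect results in row-group order; get() rethrows any exception from a task.
  std::vector<std::vector<RawColumn>> result;
  result.reserve(num_row_groups);
  for (auto& task : pending) {
    result.push_back(task.get());
  }
  return result;
}
```

This starts one thread per row group, which is fine for a sketch. For files
with many row groups a bounded thread pool would be the more sensible choice.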


On Sun, Apr 2, 2023 at 10:30 PM Алексей Рябов <[email protected]> wrote:

> Hello Team.
>
> As far as I could find in the documentation/samples, there are 2 ways of
> reading Parquet files:
> - using FileReader from the parquet::arrow namespace.
> - using the low-level ParquetFileReader from the parquet namespace.
>
> The 1st one reads into Arrow tables, converting Parquet data to Arrow
> types according to the logical types in the Parquet schema. The 2nd one
> reads raw Parquet bytes w/o any transformation.
> When reading into Arrow tables I can use threads to speed up the process,
> but I get only Arrow types, not raw bytes. The question is whether there
> is a way to read a Parquet file using threads but w/o converting to Arrow
> types, i.e. getting an Arrow table where the raw bytes are not converted
> to Arrow types such as Utf8, Decimal128 and so on (except primitives)?
>
> Please advise.
>
