Ty!

On Fri, Nov 17, 2023 at 9:12 PM wish maple <maplewish...@gmail.com> wrote:

> Hi,
>
> The Parquet code is divided into an Arrow part and a Parquet part.
>
> 1. The lowest layer of the Parquet part is the Parquet decoder, in [1].
>     Floating-point columns may use the PLAIN, RLE_DICTIONARY, or
>     BYTE_STREAM_SPLIT encoding.
> 2. parquet::ColumnReader sits on top of the decoder. Each row group may
>     use one or two encodings for a column (when dictionary encoding is
>     chosen and falls back to plain, there are two encodings in a row
>     group for that column). This is in [2].
>
> Other modules are mentioned by Bryce.
>
> Best,
> Xuwei Fu
>
> [1] https://github.com/apache/arrow/blob/main/cpp/src/parquet/encoding.cc
> [2] https://github.com/apache/arrow/blob/main/cpp/src/parquet/column_reader.cc
>
> On Sat, Nov 18, 2023 at 05:27, Li Jin <ice.xell...@gmail.com> wrote:
>
> > Hi,
> >
> > I have recently been investigating a null/NaN issue with Parquet and
> > Arrow, and I wonder if someone can point me to the code that decodes a
> > Parquet row group into Arrow float/double arrays?
> >
> > Thanks,
> > Li
> >
>
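
For the encoding step in point 1 of the quoted reply, here is a minimal Python sketch (not the Arrow C++ code) of how nulls and NaNs take different paths. It assumes a flat optional float32 column, where definition level 1 means "present" and 0 means "null", and it uses BYTE_STREAM_SPLIT as the value encoding. The function names split_nulls, byte_stream_split_encode and byte_stream_split_decode are hypothetical helpers written for this illustration.

```python
import math
import struct


def split_nulls(values):
    """Separate validity from values for a flat optional column.

    Assumption: definition level 1 = value present, 0 = null.
    Only present values reach the value encoder; NaN counts as present.
    """
    def_levels = [0 if v is None else 1 for v in values]
    present = [v for v in values if v is not None]
    return def_levels, present


def byte_stream_split_encode(values):
    """Encode float32 values: byte k of every value goes into stream k."""
    raw = [struct.pack("<f", v) for v in values]
    # Four streams (one per byte of a float32), concatenated in order.
    return b"".join(bytes(r[k] for r in raw) for k in range(4))


def byte_stream_split_decode(data):
    """Reassemble float32 values from the four concatenated byte streams."""
    count, remainder = divmod(len(data), 4)
    if remainder:
        raise ValueError("encoded length must be a multiple of 4")
    return [
        struct.unpack("<f", bytes(data[k * count + i] for k in range(4)))[0]
        for i in range(count)
    ]


column = [1.5, None, float("nan"), -2.0]
def_levels, present = split_nulls(column)
encoded = byte_stream_split_encode(present)
decoded = byte_stream_split_decode(encoded)

# The null never enters the value stream; the NaN does and comes back as NaN.
nan_survived = math.isnan(decoded[1])
```

In this sketch the null is visible only in def_levels, while the NaN is an ordinary encoded value, so a null/NaN mix-up would have to happen where levels and values are combined again rather than in the value decoder.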
