I agree with Ed on this: the first page of a repeated field in any
row group must start with a repetition level of 0 to be valid.
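
To make the rule concrete, here is a minimal Python sketch of the check
as I understand it. It assumes the repetition levels of each page have
already been decoded into plain lists of integers; the function names are
hypothetical and not part of any Parquet library. Only the first page of
the row group is checked, since later pages may still start mid-row when
page-spanning rows are allowed.

```python
from typing import Sequence


def starts_on_row_boundary(rep_levels: Sequence[int]) -> bool:
    """Return True if the first value begins a new row (repetition level 0)."""
    # An empty run of levels has no partial row at its start.
    return len(rep_levels) == 0 or rep_levels[0] == 0


def count_rows_started(rep_levels: Sequence[int]) -> int:
    """Count the rows that begin in this run: one per repetition level 0."""
    return sum(1 for r in rep_levels if r == 0)


def validate_row_group(pages: Sequence[Sequence[int]]) -> None:
    """Raise ValueError if the first page of a column chunk starts mid-row.

    `pages` holds the decoded repetition levels of each page of one
    repeated column within one row group (an assumption of this sketch).
    """
    if pages and not starts_on_row_boundary(pages[0]):
        raise ValueError("first page of the row group starts with r > 0")
```

For example, validate_row_group([[0, 1, 1, 0, 1], [1, 0, 1]]) passes even
though the second page starts with r == 1, because that page merely
continues a row begun on the first page. validate_row_group([[1, 0]])
raises, because the row group itself would begin in the middle of a row.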

On Fri, May 10, 2024 at 4:46 PM Ed Seidl <[email protected]> wrote:

> Fun stuff...have felt the pain ;-)
>
> Given that the glossary defines a row group as "[a] logical horizontal
> partitioning of the data into rows", emphasis "rows" and not "records",
> I think that pretty strongly implies that row groups, at least, must
> start on a row boundary.
>
> I too would be in support of explicitly stating pages start with r==0 in
> the page index or V2 page header cases to clear up any remaining
> confusion. Although I find page-spanning rows to be a nuisance, I do see
> the value in continuing to allow them as this can lead to more uniform
> page sizes when nested schemas are involved.
>
> Cheers,
> Ed
>
> On 5/10/24 3:45 PM, Jan Finis wrote:
> > Interesting, thanks for the input so far. Since the spec doesn't say this
> > exactly, let me spin this one step further:
> >
> > May *row groups* start with an R-Level > 0? Intuitively, I would say "hell
> > no", but there is nothing in the Parquet spec that would say that this is
> > forbidden.
> >
> > On Fri, May 10, 2024 at 12:21 PM Micah Kornfield <
> > [email protected]> wrote:
> >
>
>
