I think you can use the example binary data here [1] to answer
these questions, specifically [2] and [3].

I don't think the keys are concatenated with parent key names.
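
To make that concrete, here is a rough Python sketch of how I read the
thread: each key string goes into the metadata dictionary once, with no
parent path prepended, and a key repeated inside the same object is rejected.
The function names are made up for illustration, and this only models the
key dictionary as a plain mapping; it does not produce the binary metadata
layout from the spec.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook for json.loads: fail if one object repeats a key."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key in the same object: {key!r}")
        obj[key] = value
    return obj


def collect_keys(value, dictionary):
    """Add every object key to `dictionary` once, in first-seen order.

    Keys are stored as-is; they are NOT joined with parent key names.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            if key not in dictionary:
                dictionary[key] = len(dictionary)  # assign the next field id
            collect_keys(child, dictionary)
    elif isinstance(value, list):
        for child in value:
            collect_keys(child, dictionary)
    return dictionary


def build_key_dictionary(text):
    """Parse JSON text and return a {key: field id} mapping (hypothetical helper)."""
    parsed = json.loads(text, object_pairs_hook=reject_duplicate_keys)
    return collect_keys(parsed, {})
```

With this sketch, {"a": {"a": 1}} yields a dictionary with the single entry
"a" at field id 0, while {"a": 1, "a": 2} raises a ValueError.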

Andrew

[1]: https://github.com/apache/parquet-testing/tree/master/variant
[2]: https://github.com/apache/parquet-testing/blob/master/variant/object_nested.metadata
[3]: https://github.com/apache/parquet-testing/blob/master/variant/object_nested.value


https://github.com/apache/parquet-testing/issues/75

On Tue, May 13, 2025 at 4:37 AM Gang Wu <[email protected]> wrote:

> Quick question: how are keys in nested objects serialized? Do we need to
> concatenate each key with its parent key, like a JSON path?
>
> On Tue, May 13, 2025 at 3:19 PM wish maple <[email protected]> wrote:
>
> > Just to confirm whether this is OK or should be forbidden, since it
> > affects how readers and writers handle this.
> >
> > Best,
> > Xuwei Fu
> >
> > On Tue, May 13, 2025 at 14:32, Aihua Xu <[email protected]> wrote:
> >
> > > It should be just a single ‘a’, to reduce storage by reusing the same
> > > key. Is there any reason to keep both ‘a’ entries there?
> > >
> > >
> > >
> > > > > On May 12, 2025, at 7:43 PM, wish maple <[email protected]> wrote:
> > > >
> > > > > Thanks! So, in the nested object scenario, would the metadata be
> > > > > field 0: "a", field 1: "a" or just field 0: "a"?
> > > > > Are both acceptable for readers and writers, or do we need to
> > > > > restrict the metadata implementation?
> > > >
> > > > Best,
> > > > Xuwei Fu
> > > >
> > > > On Tue, May 13, 2025 at 04:05, Ryan Blue <[email protected]> wrote:
> > > >
> > > > >> Keys may appear in nested objects, but cannot appear in the same
> > > > >> object. So the first example, {"a": {"a": 1}} is allowed. The second
> > > > >> example, {"a": 1, "a": 2} is not allowed.
> > > >>
> > > >> Ryan
> > > >>
> > > > >>> On Sun, May 11, 2025 at 11:47 PM wish maple <[email protected]> wrote:
> > > >>>
> > > >>> In the Parquet variant spec, metadata part says that
> > > >>>
> > > > >>>> Object: An unordered collection of string/Variant pairs (i.e.
> > > > >>>> key/value pairs). An object may not contain duplicate keys. [1]
> > > >>>
> > > > >>> Considering a nested JSON object like {"a": {"a": 1}}, would the
> > > > >>> metadata be field 0: "a", field 1: "a", or just field 0: "a"? Or are
> > > > >>> both acceptable for readers and writers?
> > > >>>
> > > > >>> Also, would duplicate keys be allowed in the same object, like
> > > > >>> {"a": 1, "a": 2}?
> > > >>>
> > > >>> Best, Xuwei Fu
> > > >>>
> > > >>> [1]
> > > >>>
> > >
> https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> > > >>>
> > > >>
> > >
> >
>
