I think you can use the example binary data here [1] to answer these questions, specifically [2] and [3].
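
As a rough illustration of what to look for in [2], here is a minimal Python sketch of a metadata dictionary round trip. It assumes the layout as I read it in VariantEncoding.md: a 1-byte header (version in the low 4 bits, sorted_strings in bit 4, offset_size_minus_one in the top 2 bits), then dictionary_size, dictionary_size + 1 offsets, and the string bytes. The function names are made up for this example, and the bytes are built by hand rather than copied from the parquet-testing files.

```python
def collect_keys(value, keys=None):
    """Collect distinct object keys from a JSON-like value, in first-seen order.

    Hypothetical helper: a key is stored once no matter how deeply it is
    nested, and it is never prefixed with its parent's key.
    """
    if keys is None:
        keys = []
    if isinstance(value, dict):
        for name, child in value.items():
            if name not in keys:
                keys.append(name)
            collect_keys(child, keys)
    elif isinstance(value, list):
        for child in value:
            collect_keys(child, keys)
    return keys


def encode_metadata(keys):
    """Encode a key dictionary using 1-byte offsets (a simplification)."""
    strings = [k.encode("utf-8") for k in keys]
    offsets = [0]
    for s in strings:
        offsets.append(offsets[-1] + len(s))
    if max(offsets[-1], len(strings)) > 0xFF:
        raise ValueError("this sketch only handles 1-byte offsets")
    header = 0x01  # version = 1, sorted_strings = 0, offset_size_minus_one = 0
    return bytes([header, len(strings), *offsets]) + b"".join(strings)


def decode_metadata(buf):
    """Decode the key dictionary from a metadata buffer."""
    header = buf[0]
    version = header & 0x0F
    if version != 1:
        raise ValueError(f"unexpected version {version}")
    offset_size = ((header >> 6) & 0x03) + 1

    def read_uint(pos):
        # dictionary_size and offsets are little-endian unsigned integers
        return int.from_bytes(buf[pos:pos + offset_size], "little")

    dict_size = read_uint(1)
    offsets_start = 1 + offset_size
    offsets = [read_uint(offsets_start + i * offset_size)
               for i in range(dict_size + 1)]
    data_start = offsets_start + (dict_size + 1) * offset_size
    return [buf[data_start + offsets[i]:data_start + offsets[i + 1]].decode("utf-8")
            for i in range(dict_size)]


if __name__ == "__main__":
    nested = {"a": {"a": 1}}
    metadata = encode_metadata(collect_keys(nested))
    print(metadata)                    # b'\x01\x01\x00\x01a'
    print(decode_metadata(metadata))   # ['a']
    print(collect_keys({"a": {"b": 1}}))  # ['a', 'b']
```

Running decode_metadata over the bytes of [2] instead of the hand-built buffer should list the keys actually stored there, which would show directly whether any of them carry a parent prefix.
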
I don't think the keys are concatenated with parent key names.

Andrew

[1]: https://github.com/apache/parquet-testing/tree/master/variant
[2]: https://github.com/apache/parquet-testing/blob/master/variant/object_nested.metadata
[3]: https://github.com/apache/parquet-testing/blob/master/variant/object_nested.value

https://github.com/apache/parquet-testing/issues/75

On Tue, May 13, 2025 at 4:37 AM Gang Wu <[email protected]> wrote:

> quick question: how to serialize keys in the nested objects? Do we need to
> concatenate its parent key like the json path?
>
> On Tue, May 13, 2025 at 3:19 PM wish maple <[email protected]> wrote:
>
> > Just to make sure if it's ok or this should be forbidden. Since it
> > affect how reader/writer handles this
> >
> > Best,
> > Xuwei Fu
> >
> > On Tue, May 13, 2025 at 14:32, Aihua Xu <[email protected]> wrote:
> >
> > > It should be just single ‘a’ to reduce the storage by reusing the same
> > > key. Any reason that we want to keep both ‘a’ there?
> > >
> > > > On May 12, 2025, at 7:43 PM, wish maple <[email protected]> wrote:
> > > >
> > > > Thanks! So, in the nested object scenario, would the metadata be
> > > > field 0: "a", field 1: "a" or just field 0: "a"
> > > > do the both way is ok for reader/writer, or we need limit the
> > > > metadata implementation?
> > > >
> > > > Best,
> > > > Xuwei Fu
> > > >
> > > > On Tue, May 13, 2025 at 04:05, Ryan Blue <[email protected]> wrote:
> > > >
> > > >> Keys may appear in nested objects, but cannot appear in the same
> > > >> object. So the first example, {"a": {"a": 1}} is allowed. The second
> > > >> example, {"a": 1, "a": 2} is not allowed.
> > > >>
> > > >> Ryan
> > > >>
> > > >>> On Sun, May 11, 2025 at 11:47 PM wish maple <[email protected]>
> > > >>> wrote:
> > > >>>
> > > >>> In the Parquet variant spec, metadata part says that
> > > >>>
> > > >>>> Object: An unordered collection of string/Variant pairs (i.e.
> > > >>>> key/value pairs). An object may not contain duplicate keys. [1]
> > > >>>
> > > >>> Considering a nested json object like {"a": {"a": 1}}, would the
> > > >>> metadata like field 0: "a", field 1: "a" or just field 0: "a" , or
> > > >>> both of them is ok for reader/writer?
> > > >>>
> > > >>> And besides, would duplicate keys be allowed in the same object?
> > > >>> Like {"a": 1, "a": 2}?
> > > >>>
> > > >>> Best, Xuwei Fu
> > > >>>
> > > >>> [1]
> > > >>> https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
