sdf-jkl commented on issue #8153: URL: https://github.com/apache/arrow-rs/issues/8153#issuecomment-4092994173
> I think two kinds of struct-based extractions are actually possible, and can actually both be used at the same time to produce a third flavor?
>
> 1. User provides a plain old ordinary struct schema whose non-leaf fields are all normal struct/list and whose leaf fields are all strongly-typed.
>
>    * The top-level does _not_ have a `metadata` column, for example. This really is just a plain ordinary struct.
>    * A `variant_get` call should attempt to extract exactly that type, and either produce NULL or an error if any of the casting fails.
>    * The result is a `StructArray` whose leaves are e.g. `PrimitiveArray`

We do have support for shredded `Object`s in `variant_get` via a _plain old ordinary struct schema_.

> 2. User provides an official shredding schema -- a variant-tagged struct schema where the top-level has a `metadata` column, and each field (both leaf and non-leaf) is a struct with the `value`/`typed_value` pair that is characteristic of shredding.
>
>    * A `variant_get` call should attempt to shred the data at each level into its respective `typed_value` field, falling back to `value` when necessary.
>    * The result is a `VariantArray` that may have an arbitrarily complex shredding schema.
> 3. And, just for fun, the user can do both at the same time -- providing a struct schema where some of the leaf fields are shredded variants.
>
>    * The result is a `StructArray` where some of the leaves are `VariantArray`
>    * Each of those variant leaves needs its own `metadata` column and has its own shredding schema, because it's now a stand-alone variant value (internal detail that they all can use a (shallow) copy of the original `metadata` column without having to re-encode anything).

These two are not supported yet.
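To make the three flavors and their current status concrete, here is a rough, self-contained sketch. Everything in it (`Requested`, `Flavor`, `classify`, `is_supported`) is a hypothetical toy model invented for illustration; it is not the arrow-rs API and does not reflect how `variant_get` inspects a requested type internally.

```rust
/// Hypothetical, simplified model of the type a caller asks `variant_get` for.
/// This is NOT an arrow-rs type; it only mirrors the shapes discussed above.
#[derive(Debug, Clone, PartialEq)]
enum Requested {
    /// A strongly-typed leaf, identified here by a type name such as "Int64".
    Primitive(&'static str),
    /// An ordinary list with a single element type.
    List(Box<Requested>),
    /// An ordinary struct with named fields and no `metadata` column.
    Struct(Vec<(&'static str, Requested)>),
    /// A variant-tagged shredding schema (`metadata` plus `value`/`typed_value`).
    ShreddedVariant,
}

/// The three flavors from the quoted comment.
#[derive(Debug, PartialEq)]
enum Flavor {
    /// 1. Plain schema with strongly-typed leaves (result: `StructArray`).
    PlainTyped,
    /// 2. Official shredding schema at the top level (result: `VariantArray`).
    ShreddedVariant,
    /// 3. Plain struct where some leaves are shredded variants.
    Mixed,
}

/// Returns true if any node of the requested type is a shredded variant.
fn contains_variant(r: &Requested) -> bool {
    match r {
        Requested::Primitive(_) => false,
        Requested::ShreddedVariant => true,
        Requested::List(inner) => contains_variant(inner),
        Requested::Struct(fields) => fields.iter().any(|(_, f)| contains_variant(f)),
    }
}

/// Classifies a requested type into one of the three flavors.
fn classify(r: &Requested) -> Flavor {
    match r {
        Requested::ShreddedVariant => Flavor::ShreddedVariant,
        other if contains_variant(other) => Flavor::Mixed,
        _ => Flavor::PlainTyped,
    }
}

/// Status as described in this comment: only flavor 1 is supported so far.
fn is_supported(flavor: &Flavor) -> bool {
    matches!(flavor, Flavor::PlainTyped)
}
```

Under this model, a struct of `Int64` and list-of-`Utf8` fields classifies as `PlainTyped` and is reported as supported, while a struct with even one `ShreddedVariant` leaf classifies as `Mixed` and is not.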
