scovich commented on issue #7715: URL: https://github.com/apache/arrow-rs/issues/7715#issuecomment-3062418674
> > > Do we need an `unshred_variant` kernel
> >
> > Yes.
> >
> > If nothing else, we need a way for engines that don't support shredding to correctly consume shredded variant. Or who do support shredding to some degree, but don't want the high complexity of propagating shredded variant all through the query plan above the scan. Or who fully support variant but want to write it back out with a different shredding schema (see below).
>
> I also agree with this [@scovich](https://github.com/scovich) -- however, I am not quite sure what the API would look like yet so I am not sure yet what ticket to file

The public API seems simple enough? A shredded variant column would (physically) be a `StructArray` with `typed_value` alongside its `metadata` and `value` fields. I would expect an `unshred_variant` kernel to take such an input, and produce an output that does _not_ have a `typed_value` column any more. The spec requires that the `metadata` column already contain every needed variant path name, so it's really just a matter of rewriting the `value` column under the hood.

The internal API (for low-level variant operations) would leverage a variant builder with some tweaks:

* Wraps a `VariantMetadata` instead of a `VariantMetadataBuilder`, and field insertions fail if the key is not present
* Ability to manually inject the bytes of an existing variant object as the value of an object field or array element, so we can bring across existing variant-encoded fields. With optional full validation, in case the input variant bytes are untrusted.

That's what immediately comes to mind; some pathfinding would probably be required to go any further.
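
To make the shape of the idea concrete, here is a rough, std-only Rust sketch of the per-row logic. Everything in it is hypothetical: `ToyVariant`, `ToyTyped`, `ToyShredded`, `ToyMetadata`, `unshred` and `example_row` are illustrative names, not arrow-rs or parquet-variant APIs, and the toy enum stands in for the real binary variant encoding. It assumes my reading of the shredding spec for how `value` and `typed_value` combine, and it only models the two builder tweaks loosely: a read-only metadata dictionary whose lookups fail for unknown keys, and carrying existing residual fields across untouched.

```rust
use std::collections::BTreeMap;

/// Toy variant value (hypothetical; NOT the real binary variant encoding).
/// Object fields are keyed by their index in the metadata dictionary.
#[derive(Debug, Clone, PartialEq)]
enum ToyVariant {
    Int(i64),
    Str(String),
    Object(BTreeMap<usize, ToyVariant>),
}

/// Toy `typed_value` for one row (hypothetical).
#[derive(Debug, Clone)]
enum ToyTyped {
    Int(i64),
    Str(String),
    /// Shredded object: every field is itself a (value, typed_value) pair.
    Object(Vec<(String, ToyShredded)>),
}

/// One row of a shredded column: residual `value` plus optional `typed_value`.
#[derive(Debug, Clone)]
struct ToyShredded {
    value: Option<ToyVariant>,
    typed_value: Option<ToyTyped>,
}

/// Read-only metadata dictionary: there is deliberately no way to add names.
struct ToyMetadata {
    names: Vec<String>,
}

impl ToyMetadata {
    /// Look up an existing field name; fail instead of inserting a new one.
    fn field_id(&self, name: &str) -> Result<usize, String> {
        self.names
            .iter()
            .position(|n| n == name)
            .ok_or_else(|| format!("field `{}` is not in the metadata", name))
    }
}

/// Rebuild the plain `value` for one row. `Ok(None)` means "field is missing".
fn unshred(meta: &ToyMetadata, row: &ToyShredded) -> Result<Option<ToyVariant>, String> {
    match (&row.typed_value, &row.value) {
        (None, None) => Ok(None),
        // Nothing was shredded: carry the existing value across as-is.
        (None, Some(v)) => Ok(Some(v.clone())),
        (Some(ToyTyped::Int(i)), None) => Ok(Some(ToyVariant::Int(*i))),
        (Some(ToyTyped::Str(s)), None) => Ok(Some(ToyVariant::Str(s.clone()))),
        (Some(ToyTyped::Object(fields)), residual) => {
            // Start from the unshredded residual fields, if there are any.
            let mut out = match residual {
                None => BTreeMap::new(),
                Some(ToyVariant::Object(rest)) => rest.clone(),
                Some(_) => return Err("residual of a shredded object must be an object".into()),
            };
            for (name, field) in fields {
                if let Some(v) = unshred(meta, field)? {
                    out.insert(meta.field_id(name)?, v);
                }
            }
            Ok(Some(ToyVariant::Object(out)))
        }
        (Some(_), Some(_)) => Err("primitive typed_value with a non-null value".into()),
    }
}

/// Example row: fields "a" and "c" are shredded, field "b" stays in `value`.
fn example_row() -> (ToyMetadata, ToyShredded) {
    let meta = ToyMetadata { names: vec!["a".into(), "b".into(), "c".into()] };
    let leaf = |t| ToyShredded { value: None, typed_value: Some(t) };
    let mut residual = BTreeMap::new();
    residual.insert(1, ToyVariant::Int(1)); // field "b" was never shredded
    let row = ToyShredded {
        value: Some(ToyVariant::Object(residual)),
        typed_value: Some(ToyTyped::Object(vec![
            ("a".to_string(), leaf(ToyTyped::Int(7))),
            ("c".to_string(), leaf(ToyTyped::Str("hi".to_string()))),
        ])),
    };
    (meta, row)
}
```

Calling `unshred` on the row from `example_row()` yields a single object holding all three fields, keyed by their existing metadata ids, while the same row against a metadata dictionary that lacks `"c"` returns an error rather than growing the dictionary. A real kernel would presumably do this column-wise over the `StructArray` and work on variant bytes directly; that part is exactly the pathfinding mentioned above.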