alamb opened a new issue, #7895: URL: https://github.com/apache/arrow-rs/issues/7895
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
- Part of https://github.com/apache/arrow-rs/issues/6736
- This is a follow-up to https://github.com/apache/arrow-rs/issues/7715

As we begin to contemplate how to read and write shredded variants, we will need some way to construct Arrow arrays that contain shredded variants.

Physically, these will be Arrow `StructArray`s with two or three fields:
* Non-shredded (2 fields): `STRUCT { "metadata": Binary, "value": Binary }`
* Shredded (3 fields): `STRUCT { "metadata": Binary, "value": Binary, typed_value: STRUCT { ... } }`

More information on how to represent Variants as Arrow arrays can be found in the proposal:
- https://github.com/apache/arrow/issues/46908
- Google Document: https://docs.google.com/document/d/1pw0AWoMQY3SjD7R4LgbPvMjG_xSCtXp3rZHkVp9jpZ4/edit?usp=sharing

**Describe the solution you'd like**
I would like some way to construct such shredded arrays easily and efficiently in idiomatic Rust style.

**Describe alternatives you've considered**
One idea from @zeroshade (thank you!) is to create a `VariantArrayBuilder` that is responsible for building the correct `StructArray`s from variants, including shredding out any columns.
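To make the builder idea a bit more concrete, here is a minimal, std-only sketch of the shredding step such a builder would perform. Everything in it (`ToyVariant`, `ToyShreddedColumns`, `ToyVariantArrayBuilder`, `shred_two_rows`) is hypothetical and is not arrow-rs API. It makes several simplifying assumptions: values are modeled as a plain enum rather than the binary Variant encoding, the `metadata` column is omitted, only integer fields are shredded, and plain `Vec`s stand in for Arrow arrays.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a Variant value (the real encoding is binary).
#[derive(Debug, Clone, PartialEq)]
enum ToyVariant {
    Int(i64),
    Str(String),
    Object(BTreeMap<String, ToyVariant>),
}

/// Toy columnar output: a residual `value` column plus one typed column
/// per shredded field name.
#[derive(Debug, Default)]
struct ToyShreddedColumns {
    /// Whatever was not shredded out of each row (`None` if nothing is left).
    value: Vec<Option<ToyVariant>>,
    /// Shredded integer columns; `None` where the row has no matching field.
    typed_value: BTreeMap<String, Vec<Option<i64>>>,
}

/// Hypothetical builder: the shredded field names are supplied up front.
struct ToyVariantArrayBuilder {
    shredded_fields: Vec<String>,
    columns: ToyShreddedColumns,
}

impl ToyVariantArrayBuilder {
    fn new(shredded_fields: &[&str]) -> Self {
        let mut columns = ToyShreddedColumns::default();
        for name in shredded_fields {
            columns.typed_value.insert(name.to_string(), Vec::new());
        }
        Self {
            shredded_fields: shredded_fields.iter().map(|s| s.to_string()).collect(),
            columns,
        }
    }

    /// Append one row, moving matching integer fields into typed columns.
    fn append_value(&mut self, v: ToyVariant) {
        let mut residual = match v {
            ToyVariant::Object(fields) => fields,
            other => {
                // Non-objects have no fields to shred: keep them whole.
                self.columns.value.push(Some(other));
                for name in &self.shredded_fields {
                    self.columns.typed_value.get_mut(name).unwrap().push(None);
                }
                return;
            }
        };
        for name in &self.shredded_fields {
            let typed = match residual.get(name) {
                Some(ToyVariant::Int(i)) => Some(*i),
                _ => None, // missing or wrong type: stays in `value`
            };
            if typed.is_some() {
                residual.remove(name);
            }
            self.columns.typed_value.get_mut(name).unwrap().push(typed);
        }
        let rest = if residual.is_empty() {
            None
        } else {
            Some(ToyVariant::Object(residual))
        };
        self.columns.value.push(rest);
    }

    fn build(self) -> ToyShreddedColumns {
        self.columns
    }
}

/// Usage: row 0 is an object with `foo` and an unshredded `baz`; row 1 is 43.
fn shred_two_rows() -> ToyShreddedColumns {
    let mut builder = ToyVariantArrayBuilder::new(&["foo", "bar"]);
    let mut obj = BTreeMap::new();
    obj.insert("foo".to_string(), ToyVariant::Int(1));
    obj.insert("baz".to_string(), ToyVariant::Str("x".to_string()));
    builder.append_value(ToyVariant::Object(obj));
    builder.append_value(ToyVariant::Int(43));
    builder.build()
}
```

In this toy model, `foo` from row 0 lands in its typed column, `baz` stays behind in the residual `value`, and the second row is stored whole in `value` with nulls in both typed columns. A real implementation would also have to handle the `metadata` dictionary, nested `value`/`typed_value` pairs per shredded field, and types other than integers.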
In order to create a shredded output, you would provide the shredded schema up front.

For example (based on the Go implementation), to create a shredded Arrow array that shreds out columns "foo" and "bar" from any variant objects, we would need this schema:

```text
STRUCT {
  metadata: BinaryView,
  value: BinaryView,
  typed_value: STRUCT {
    foo: Int64,
    bar: Int32
  }
}
```

The code would look like this:

```rust
use arrow::array::StructArray;
use arrow::datatypes::{DataType, Field};

// Create an arrow Field that describes the desired shredded output schema
// ("variant" is a placeholder column name; the nullability flags are illustrative)
let shredded_schema = Field::new_struct(
    "variant",
    vec![
        Field::new("metadata", DataType::BinaryView, false),
        Field::new("value", DataType::BinaryView, true),
        Field::new_struct(
            "typed_value",
            vec![
                Field::new("foo", DataType::Int64, true),
                Field::new("bar", DataType::Int32, true),
            ],
            true,
        ),
    ],
    false,
);

// Create a builder for an array (batch) of Variant values
// (`VariantArrayBuilder` is the API proposed in this issue)
let mut array_builder = VariantArrayBuilder::new(shredded_schema);

// Append a row to the builder
let mut object = array_builder.new_object();
// ... add appropriate fields ... (use like a normal ObjectBuilder??)
object.finish();

// Append a second row (has no foo or bar fields)
array_builder.append_value(43);
// ...

// Finalize the builder
let variant_array: StructArray = array_builder.build()?;
// variant_array is a shredded variant
```

I think a `VariantArrayBuilder` will be helpful for use cases other than Variant, and @harshmotw-db has created some version of one here:
- https://github.com/apache/arrow-rs/issues/7883

### Prior Art

Golang implementation:
- https://github.com/apache/arrow-go/blob/main/arrow/extensions/variant_test.go
- https://github.com/apache/arrow-go/blob/main/arrow/extensions/variant.go
- Here are some examples of it being used: https://github.com/apache/arrow-go/blob/b196d3b316d09f63786f021d4f1baa1fdd7620d2/arrow/extensions/variant_test.go#L363-L391
- Spark variant code: https://github.com/apache/spark/tree/master/common/variant/src/main/java/org/apache/spark/types/variant

**Additional context**
<!-- Add any other context or screenshots about the feature request here.
--> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org