scovich commented on code in PR #7915:
URL: https://github.com/apache/arrow-rs/pull/7915#discussion_r2205755479


##########
parquet-variant/src/builder.rs:
##########
@@ -350,14 +378,59 @@ impl<S: AsRef<str>> FromIterator<S> for MetadataBuilder {
     }
 }
 
-impl<S: AsRef<str>> Extend<S> for MetadataBuilder {
+impl<S: AsRef<str>> Extend<S> for DefaultMetadataBuilder {
     fn extend<T: IntoIterator<Item = S>>(&mut self, iter: T) {
         for field_name in iter {
             self.upsert_field_name(field_name.as_ref());
         }
     }
 }
 
+/// Read-only metadata builder that validates field names against an existing metadata dictionary

Review Comment:
   > > What do we gain by using views at this low level?
   > 
   > The major benefit in my mind is that it means any row level reading of `Variant` is the same for shredded and unshredded Variants.
   
   Hmm. So for the following shredded variant:
   ```
   v: STRUCT {
       value: BINARY, /* variant object containing unshredded siblings of `a` */
       typed_value: STRUCT {
           a: {
               value: BINARY, /* variant object containing unshredded siblings of `b` */
               typed_value: STRUCT {
                   b: {
                       value: BINARY, /* variant object containing unshredded siblings of `c` */
                       typed_value: STRUCT {
                           c: {
                               value: BINARY, /* variant with whatever didn't shred as int32 */
                               typed_value: INT32,
                           }
                       }
                   }
               }
           }
       }
   }
   ```
   We would have the following?
   ```
   Variant::ShreddedObject {
       value: VariantObject {
           /* unshredded siblings of `a` */
       },
       typed_value: IndexMap {
           "a" : Variant::ShreddedObject {
               value: VariantObject {
                   /* unshredded siblings of `b` */
               },
               typed_value: IndexMap {
                   "b": Variant::ShreddedObject {
                       value: VariantObject {
                           /* unshredded siblings of `c` */
                       },
                       typed_value: IndexMap {
                           "c": Variant::Int32, /* or whatever unshredded value it was instead */
                       },
                   }
               },
           }
       },
   }
   ```
   Very interesting! I think I get it now.
   
   (I took the liberty of renaming the members of `Variant::ShreddedObject` to match the spec... seems a bit less confusing that way?)
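
   To make the row-level reading concrete, here is a minimal sketch of how a lookup over that nested shape might work. Everything in it is hypothetical: `SketchVariant`, `shredded`, `get`, and `get_path` are illustration-only names, not the crate's API, and a std `BTreeMap` stands in for both `VariantObject` and `IndexMap` so the snippet has no dependencies. It also assumes a field name never appears in both `value` and `typed_value`, so checking `typed_value` first is only a convention.

   ```rust
   use std::collections::BTreeMap;

   /// Hypothetical stand-in for the row-level value sketched above
   /// (NOT the real `parquet_variant::Variant`).
   #[derive(Debug, Clone, PartialEq)]
   enum SketchVariant {
       Int32(i32),
       /// Fully unshredded object (stands in for `VariantObject`).
       Object(BTreeMap<String, SketchVariant>),
       /// Partially shredded object: `value` holds the unshredded siblings,
       /// `typed_value` holds the shredded fields.
       ShreddedObject {
           value: BTreeMap<String, SketchVariant>,
           typed_value: BTreeMap<String, SketchVariant>,
       },
   }

   impl SketchVariant {
       /// Look up one field, regardless of whether it was shredded.
       fn get(&self, name: &str) -> Option<&SketchVariant> {
           match self {
               SketchVariant::Int32(_) => None,
               SketchVariant::Object(fields) => fields.get(name),
               SketchVariant::ShreddedObject { value, typed_value } => {
                   // Assumption: a name lives in at most one of the two maps.
                   typed_value.get(name).or_else(|| value.get(name))
               }
           }
       }

       /// Walk a path such as ["a", "b", "c"] one field at a time.
       fn get_path(&self, path: &[&str]) -> Option<&SketchVariant> {
           let mut current = self;
           for name in path {
               current = current.get(name)?;
           }
           Some(current)
       }
   }

   /// Collect (name, value) pairs into a map.
   fn to_map(fields: Vec<(&str, SketchVariant)>) -> BTreeMap<String, SketchVariant> {
       fields.into_iter().map(|(k, v)| (k.to_string(), v)).collect()
   }

   /// Hypothetical helper that builds a `ShreddedObject` from two field lists.
   fn shredded(
       value: Vec<(&str, SketchVariant)>,
       typed_value: Vec<(&str, SketchVariant)>,
   ) -> SketchVariant {
       SketchVariant::ShreddedObject {
           value: to_map(value),
           typed_value: to_map(typed_value),
       }
   }

   /// Builds the `a.b.c` nesting from the example, with made-up sibling fields.
   fn example() -> SketchVariant {
       let meta = SketchVariant::Object(to_map(vec![("k", SketchVariant::Int32(3))]));
       let b = shredded(vec![("c_sibling", meta)], vec![("c", SketchVariant::Int32(42))]);
       let a = shredded(vec![("b_sibling", SketchVariant::Int32(2))], vec![("b", b)]);
       shredded(vec![("a_sibling", SketchVariant::Int32(1))], vec![("a", a)])
   }
   ```

   With this shape, `example().get_path(&["a", "b", "c"])` and `example().get_path(&["a", "b_sibling"])` go through the same call, which I take to be the benefit described above: the caller does not need to know which fields were shredded.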


