scovich commented on code in PR #7888:
URL: https://github.com/apache/arrow-rs/pull/7888#discussion_r2195965956


##########
parquet-variant/src/variant/metadata.rs:
##########
@@ -128,15 +128,20 @@ impl VariantMetadataHeader {
 ///
 /// [`Variant`]: crate::Variant
 /// [Variant Spec]: 
https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#metadata-encoding
-#[derive(Debug, Clone, Copy, PartialEq)]
+#[derive(Debug, Clone, PartialEq)]
 pub struct VariantMetadata<'m> {
     bytes: &'m [u8],
     header: VariantMetadataHeader,
-    dictionary_size: usize,
-    first_value_byte: usize,
+    dictionary_size: u32,
+    first_value_byte: u32,
     validated: bool,
 }
 
+#[cfg(test)]
+const _: () = if std::mem::size_of::<VariantMetadata>() != 32 {
+    panic!("VariantMetadata changed size, which will impact VariantList and 
VariantObject");
+};

Review Comment:
   This snippet will produce a compilation failure if the struct size changes.
   Seemed more robust/immediate than a doc test or unit test.



##########
parquet-variant/src/variant/list.rs:
##########
@@ -158,7 +163,7 @@ impl<'m, 'v> VariantList<'m, 'v> {
         let num_elements =
             header
                 .num_elements_size
-                .unpack_usize_at_offset(value, NUM_HEADER_BYTES, 0)?;
+                .unpack_u32_at_offset(value, NUM_HEADER_BYTES as _, 0)?;

Review Comment:
   This is an annoying side effect of using a named constant... the literal `1` 
would "just work" for both `u32` and `usize`.



##########
parquet-variant/src/variant/list.rs:
##########
@@ -186,10 +191,10 @@ impl<'m, 'v> VariantList<'m, 'v> {
 
         // Use the last offset to upper-bound the value buffer
         let last_offset = new_self
-            .get_offset(num_elements)?
+            .get_offset(num_elements as _)?
             .checked_add(first_value_byte)
             .ok_or_else(|| overflow_error("variant array size"))?;
-        new_self.value = slice_from_slice(value, ..last_offset)?;
+        new_self.value = slice_from_slice(value, ..last_offset as _)?;

Review Comment:
   Unfortunately the `SliceIndex` trait _only_ works for `usize`, so we have to 
widen the `u32` values whenever we create one.



##########
parquet-variant/src/variant/list.rs:
##########
@@ -186,10 +191,10 @@ impl<'m, 'v> VariantList<'m, 'v> {
 
         // Use the last offset to upper-bound the value buffer
         let last_offset = new_self
-            .get_offset(num_elements)?
+            .get_offset(num_elements as _)?

Review Comment:
   Rather than do a bunch of `try_into().map_err(...)` calls, just admit that 
converting `u32` to `usize` is infallible for all practical purposes -- I 
seriously doubt arrow-rs can run on 16-bit hardware where usize might be only 
16 bits.
   
   (I don't love blind `as _` casting in general -- too easy to cast to 
something unexpected or ignore the implications of the cast -- but it seems ok 
in this specific set of cases)



##########
parquet-variant/src/variant/list.rs:
##########
@@ -247,10 +252,10 @@ impl<'m, 'v> VariantList<'m, 'v> {
     fn try_get_with_shallow_validation(&self, index: usize) -> 
Result<Variant<'m, 'v>, ArrowError> {
         // Fetch the value bytes between the two offsets for this index, from 
the value array region
         // of the byte buffer
-        let byte_range = self.get_offset(index)?..self.get_offset(index + 1)?;
+        let byte_range = self.get_offset(index)? as _..self.get_offset(index + 
1)? as _;
         let value_bytes =
-            slice_from_slice_at_offset(self.value, self.first_value_byte, 
byte_range)?;
-        Variant::try_new_with_metadata_and_shallow_validation(self.metadata, 
value_bytes)
+            slice_from_slice_at_offset(self.value, self.first_value_byte as _, 
byte_range)?;
+        
Variant::try_new_with_metadata_and_shallow_validation(self.metadata.clone(), 
value_bytes)

Review Comment:
   `VariantMetadata` is no longer `Copy`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to