scovich commented on code in PR #7888: URL: https://github.com/apache/arrow-rs/pull/7888#discussion_r2195965956
########## parquet-variant/src/variant/metadata.rs: ########## @@ -128,15 +128,20 @@ impl VariantMetadataHeader { /// /// [`Variant`]: crate::Variant /// [Variant Spec]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#metadata-encoding -#[derive(Debug, Clone, Copy, PartialEq)] +#[derive(Debug, Clone, PartialEq)] pub struct VariantMetadata<'m> { bytes: &'m [u8], header: VariantMetadataHeader, - dictionary_size: usize, - first_value_byte: usize, + dictionary_size: u32, + first_value_byte: u32, validated: bool, } +#[cfg(test)] +const _: () = if std::mem::size_of::<VariantMetadata>() != 32 { + panic!("VariantMetadata changed size, which will impact VariantList and VariantObject"); +}; Review Comment: This snippet will produce a compilation failure if the struct size changes. Seemed more robust/immediate than a doc test or unit test. ########## parquet-variant/src/variant/list.rs: ########## @@ -158,7 +163,7 @@ impl<'m, 'v> VariantList<'m, 'v> { let num_elements = header .num_elements_size - .unpack_usize_at_offset(value, NUM_HEADER_BYTES, 0)?; + .unpack_u32_at_offset(value, NUM_HEADER_BYTES as _, 0)?; Review Comment: This is an annoying side effect of using a named constant... the literal `1` would "just work" for both `u32` and `usize`. ########## parquet-variant/src/variant/list.rs: ########## @@ -186,10 +191,10 @@ impl<'m, 'v> VariantList<'m, 'v> { // Use the last offset to upper-bound the value buffer let last_offset = new_self - .get_offset(num_elements)? + .get_offset(num_elements as _)? .checked_add(first_value_byte) .ok_or_else(|| overflow_error("variant array size"))?; - new_self.value = slice_from_slice(value, ..last_offset)?; + new_self.value = slice_from_slice(value, ..last_offset as _)?; Review Comment: Unfortunately the `SliceIndex` trait _only_ works for `usize`, so we have to widen the `u32` values whenever we create one. ########## parquet-variant/src/variant/list.rs: ########## @@ -186,10 +191,10 @@ impl<'m, 'v> VariantList<'m, 'v> { // Use the last offset to upper-bound the value buffer let last_offset = new_self - .get_offset(num_elements)? + .get_offset(num_elements as _)? Review Comment: Rather than do a bunch of `try_into().map_err(...)` calls, just admit that converting `u32` to `usize` is infallible for all practical purposes -- I seriously doubt arrow-rs can run on 16-bit hardware where usize might be only 16 bits. (I don't love blind `as _` casting in general -- too easy to cast to something unexpected or ignore the implications of the cast -- but it seems ok in this specific set of cases) ########## parquet-variant/src/variant/list.rs: ########## @@ -247,10 +252,10 @@ impl<'m, 'v> VariantList<'m, 'v> { fn try_get_with_shallow_validation(&self, index: usize) -> Result<Variant<'m, 'v>, ArrowError> { // Fetch the value bytes between the two offsets for this index, from the value array region // of the byte buffer - let byte_range = self.get_offset(index)?..self.get_offset(index + 1)?; + let byte_range = self.get_offset(index)? as _..self.get_offset(index + 1)? as _; let value_bytes = - slice_from_slice_at_offset(self.value, self.first_value_byte, byte_range)?; - Variant::try_new_with_metadata_and_shallow_validation(self.metadata, value_bytes) + slice_from_slice_at_offset(self.value, self.first_value_byte as _, byte_range)?; + Variant::try_new_with_metadata_and_shallow_validation(self.metadata.clone(), value_bytes) Review Comment: `VariantMetadata` is no longer `Copy` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org