friendlymatthew commented on code in PR #7888: URL: https://github.com/apache/arrow-rs/pull/7888#discussion_r2198302382
##########
parquet-variant/src/variant/metadata.rs:
##########
@@ -37,16 +37,16 @@ pub(crate) struct VariantMetadataHeader {
 const CORRECT_VERSION_VALUE: u8 = 1;

 // The metadata header occupies one byte; use a named constant for readability
-const NUM_HEADER_BYTES: usize = 1;
+const NUM_HEADER_BYTES: u32 = 1;

 impl VariantMetadataHeader {
     // Hide the cast
-    const fn offset_size(&self) -> usize {
-        self.offset_size as usize
+    const fn offset_size(&self) -> u32 {
+        self.offset_size as u32
     }

     // Avoid materializing this offset, since it's cheaply and safely computable
-    const fn first_offset_byte(&self) -> usize {
+    const fn first_offset_byte(&self) -> u32 {
         NUM_HEADER_BYTES + self.offset_size()
     }

Review Comment:
   Hi! `VariantMetadataHeader` currently represents a single `u8` that encodes 3 pieces of information. I'm wondering whether, instead of storing each piece as a separate field, we could store just the `u8` itself and extract the individual components with bitmasking when needed.

   If we are aiming to minimize the byte footprint, it's a bit unfortunate that we store three times as many bytes as necessary for this data. Deriving the values from the byte is also computationally cheap: each accessor is a mask or a shift.
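
   A rough sketch of what I have in mind, not a drop-in replacement. It assumes the header layout from the Variant encoding spec (version in bits 0-3, sorted flag in bit 4, `offset_size_minus_one` in bits 6-7). `PackedMetadataHeader`, the mask constants, and the accessor names are hypothetical and only mirror the methods in this diff; validation of the version is left out.

```rust
// Hypothetical sketch: the type and constant names are illustrative,
// not the crate's actual API.

// The metadata header occupies one byte.
const NUM_HEADER_BYTES: u32 = 1;

// Assumed bit layout, per the Variant encoding spec:
//   bits 0-3: version
//   bit  4  : sorted_strings flag
//   bit  5  : unused
//   bits 6-7: offset_size_minus_one
const VERSION_MASK: u8 = 0x0F;
const SORTED_MASK: u8 = 0x10;
const OFFSET_SIZE_SHIFT: u8 = 6;

/// Stores only the raw header byte; every component is derived on demand.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PackedMetadataHeader(u8);

impl PackedMetadataHeader {
    pub const fn new(byte: u8) -> Self {
        Self(byte)
    }

    /// Low four bits of the header byte.
    pub const fn version(&self) -> u8 {
        self.0 & VERSION_MASK
    }

    /// True if the sorted flag (bit 4) is set.
    pub const fn is_sorted(&self) -> bool {
        (self.0 & SORTED_MASK) != 0
    }

    /// Offset width in bytes (1..=4): the top two bits store the width minus one.
    pub const fn offset_size(&self) -> u32 {
        ((self.0 >> OFFSET_SIZE_SHIFT) as u32) + 1
    }

    /// Index of the first offset byte, computed rather than stored.
    pub const fn first_offset_byte(&self) -> u32 {
        NUM_HEADER_BYTES + self.offset_size()
    }
}
```

   With this shape the struct is one byte wide, and the accessors stay `const fn`, so call sites such as `first_offset_byte()` would not need to change.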