jorgecarleitao commented on a change in pull request #416:
URL: https://github.com/apache/arrow-rs/pull/416#discussion_r646800597



##########
File path: arrow/src/util/bit_chunk_iterator.rs
##########
@@ -137,14 +137,16 @@ impl Iterator for BitChunkIterator<'_> {
         // so when reading as u64 on a big-endian machine, the bytes need to be swapped
         let current = unsafe { std::ptr::read_unaligned(raw_data.add(index)).to_le() };
 
-        let combined = if self.bit_offset == 0 {
+        let bit_offset = self.bit_offset;
+
+        let combined = if bit_offset == 0 {
             current
         } else {
-            let next =
-                unsafe { std::ptr::read_unaligned(raw_data.add(index + 1)).to_le() };
+            let next = unsafe {
+                std::ptr::read_unaligned(raw_data.add(index + 1) as *const u8) as u64

Review comment:
       Since this is not the remainder, don't we potentially need to read more than 8 bits? That is, doesn't the word at `index + 1` hold between 1 and 63 bits that need to be "merged" into `current`?
   
   I suspect this will ignore every bit of `next` from the 8th up to the 64th. At least that is what I remember from fixing it in arrow2 [here](https://github.com/jorgecarleitao/arrow2/blob/main/src/bitmap/utils/chunk_iterator/mod.rs#L149).
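   
   To make the concern concrete, here is a minimal sketch of the merge step. It assumes (the hunk above does not show it) that `combined` is built by shifting `current` right by `bit_offset` and filling the vacated high bits with the low `bit_offset` bits of `next`. `merge` is a hypothetical helper written for illustration, not a function from the crate.

```rust
/// Hypothetical helper: combine the upper bits of `current` with the
/// lowest `bit_offset` bits of `next` into one 64-bit chunk.
/// Assumes 0 < bit_offset < 64 (the zero case is handled separately
/// in the iterator, where `combined` is just `current`).
fn merge(current: u64, next: u64, bit_offset: u32) -> u64 {
    debug_assert!(bit_offset > 0 && bit_offset < 64);
    // Keep only the low `bit_offset` bits of `next`.
    let mask = (1u64 << bit_offset) - 1;
    // Drop the first `bit_offset` bits of `current`, then place the
    // bits taken from `next` at the top of the result.
    (current >> bit_offset) | ((next & mask) << (64 - bit_offset))
}
```

   With `current = 0` and a following word of `u64::MAX`, passing the full word or only its lowest byte (`u64::MAX as u8 as u64`) gives the same result for `bit_offset = 3`. For `bit_offset = 12`, the full word yields `0xFFF << 52` while the byte-only value yields `0xFF << 52`, so bits 8 through 11 are lost. If `bit_offset` is always below 8, the single-byte read is enough; whether that holds here depends on how the iterator derives `bit_offset`, which this hunk does not show.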
   



