LouisLou2 opened a new issue, #2068:
URL: https://github.com/apache/fury/issues/2068

   ### Question
   
   ```java
     /** Read {@code len} bytes into a long using little-endian order. */
     public long readBytesAsInt64(int len) {
       int readerIdx = readerIndex;
       // use subtract to avoid overflow
       int remaining = size - readerIdx;
       if (remaining >= 8) {
         readerIndex = readerIdx + len;
         long v =
             UNSAFE.getLong(heapMemory, address + readerIdx)
                 & (0xffffffffffffffffL >>> ((8 - len) * 8));
         return LITTLE_ENDIAN ? v : Long.reverseBytes(v);
       }
       return slowReadBytesAsInt64(remaining, len);
     }
   ```
   I doubt that the method above is correct on big-endian platforms, that is, when `LITTLE_ENDIAN` is `false`.

   ### Problem Analysis
   
   1. **Incorrect Mask Calculation**
      - The issue arises when `len < 8`. The current mask calculation:
        ```java
        0xffffffffffffffffL >>> ((8 - len) * 8)
        ```
        keeps the **lower `len * 8` bits**. On a **big-endian** platform, however,
        `UNSAFE.getLong` places the first byte in memory in the most significant
        position, so the first `len` bytes occupy the **upper `len * 8` bits**.
      
      **Example:**
      When `len = 3`, the mask calculation becomes:
      ```java
      0xffffffffffffffffL >>> 40
      ```
      which yields the mask `0x0000000000ffffff` and keeps only the **lower 24 bits**.
      On a big-endian platform the first 3 bytes sit in the **upper 24 bits**, so
      this mask discards the valid data and keeps the last 3 of the 8 bytes read.
   
   2. **Incorrect Byte Order Handling**
      - The byte-order reversal should happen **before** the mask is applied.
        Currently it happens after:
        ```java
        return LITTLE_ENDIAN ? v : Long.reverseBytes(v);
        ```
        On a big-endian platform the mask has already cleared the upper `len * 8`
        bits, which hold the requested bytes, before `Long.reverseBytes` runs. The
        reversal then only rearranges the unrelated bytes that survived the mask,
        so the valid data is lost. Reversing first would move the requested bytes
        into the lower bits, where the existing mask is correct.
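
   The two points above can be checked on any machine with a small sketch. It is
   not Fury code: the class `EndianMaskDemo` and its methods are hypothetical
   names, and a big-endian `ByteBuffer` read stands in for what `UNSAFE.getLong`
   would return on a big-endian platform. It compares the current order (mask,
   then reverse) with the proposed order (reverse, then mask) for `len = 3`.

   ```java
   import java.nio.ByteBuffer;
   import java.nio.ByteOrder;

   public class EndianMaskDemo {
     /** Assumption: stands in for UNSAFE.getLong on a big-endian platform. */
     public static long nativeBigEndianLoad(byte[] mem, int offset) {
       return ByteBuffer.wrap(mem).order(ByteOrder.BIG_ENDIAN).getLong(offset);
     }

     /** Order used by the quoted method: mask first, then reverse. */
     public static long maskThenReverse(long raw, int len) {
       long v = raw & (0xffffffffffffffffL >>> ((8 - len) * 8));
       return Long.reverseBytes(v);
     }

     /** Proposed order: reverse first, then mask. */
     public static long reverseThenMask(long raw, int len) {
       return Long.reverseBytes(raw) & (0xffffffffffffffffL >>> ((8 - len) * 8));
     }

     public static void main(String[] args) {
       byte[] mem = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, (byte) 0x88};
       long raw = nativeBigEndianLoad(mem, 0); // 0x1122334455667788

       // A little-endian read of the first 3 bytes (11 22 33) should give 0x332211.
       System.out.printf("%016x%n", maskThenReverse(raw, 3)); // 8877660000000000
       System.out.printf("%016x%n", reverseThenMask(raw, 3)); // 0000000000332211
     }
   }
   ```

   In this sketch the mask-then-reverse order returns the last three bytes of
   the word shifted into the high bits, while reverse-then-mask returns the
   expected `0x332211`.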


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

