lurongjiang commented on PR #1044:
URL: https://github.com/apache/poi/pull/1044#issuecomment-4207636944

   ### **Changes Made**  
   1. **Removed zero-padding logic**  
      - Zero-padding caused unexpected exceptions when parsing standard `.doc` 
files, because the padded buffer content was invalid.  
      - Added test cases for both **standard (Microsoft Office 97-2003 
`.doc`)** and **non-standard (WPS-compatible)** formats to ensure robustness.  
   
   2. **Enhanced out-of-bounds handling in `FileBackedDataSource`**  
      - File-based operations go through `FileBackedDataSource`, which 
previously threw an exception when `position >= size()`.  
      - It now returns a zero-filled `ByteBuffer.allocate(length)` instead of 
throwing, so its behavior is consistent with `ByteArrayBackedDataSource`.  
      - Added corresponding test cases to validate boundary conditions for 
file-backed reads.  
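
To make the boundary rule in item 2 concrete, here is a minimal sketch using only `java.nio`. It is not the code from this PR: `BoundsSafeFileRead` and its `read` method are hypothetical names, and the only behavior taken from the description above is "return `ByteBuffer.allocate(length)` when `position >= size()`".

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; NOT POI's FileBackedDataSource.
final class BoundsSafeFileRead {

    // Reads up to `length` bytes starting at `position`.
    // Assumption from the PR description: when `position` is at or past
    // the end of the file, return ByteBuffer.allocate(length) instead of
    // throwing.
    static ByteBuffer read(FileChannel channel, int length, long position)
            throws IOException {
        if (position >= channel.size()) {
            return ByteBuffer.allocate(length);
        }
        ByteBuffer dst = ByteBuffer.allocate(length);
        long pos = position;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                break; // reached end of file before filling the buffer
            }
            pos += n;
        }
        dst.flip(); // make the bytes that were read available to the caller
        return dst;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bounds", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                System.out.println(read(ch, 2, 1).remaining()); // 2
                System.out.println(read(ch, 8, 4).remaining()); // 8
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

One thing the sketch makes visible: `ByteBuffer.allocate(length)` is zero-filled and has `remaining() == length`, so a caller cannot tell an out-of-bounds read from a read of real zero bytes by buffer size alone.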
   
   ### **Impact Assessment**  
   - Valid (in-bounds) reads are unaffected, so backward compatibility is 
maintained for them; only out-of-bounds access changes, from throwing an 
exception to returning an empty buffer.  
   - For invalid cases (e.g., corrupted files or incomplete reads), the change 
prevents abrupt exceptions, allowing callers to handle incomplete data 
gracefully (e.g., failing silently or logging errors).  
   - **No functional regression expected**: invalid data still fails to 
parse, which is the same outcome as when the exception was thrown.  
   
   ### **Testing Validation**  
   - Verified that:  
     - Standard `.doc` files parse correctly (no zero-padding artifacts).  
     - Non-standard WPS files remain supported.  
     - Out-of-bounds reads (both byte-array and file-backed) now return empty 
buffers instead of exceptions.  
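
The byte-array side of the last check could look roughly like the sketch below. `ByteArrayReadCheck` is a hypothetical stand-in written for this example, not POI's `ByteArrayBackedDataSource`; it assumes the same `position >= size` rule described above.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a byte-array-backed source; NOT POI's class.
final class ByteArrayReadCheck {

    // Assumed rule: out-of-bounds reads return ByteBuffer.allocate(length);
    // in-bounds reads return a view over the bytes that are available.
    static ByteBuffer read(byte[] data, int length, long position) {
        if (position >= data.length) {
            return ByteBuffer.allocate(length);
        }
        int toRead = (int) Math.min(length, data.length - position);
        return ByteBuffer.wrap(data, (int) position, toRead);
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30};

        // In-bounds: two bytes starting at index 1.
        ByteBuffer ok = read(data, 2, 1);
        if (ok.remaining() != 2 || ok.get() != 20) {
            throw new AssertionError("in-bounds read returned wrong data");
        }

        // Out-of-bounds: position == size, so no exception is expected.
        ByteBuffer past = read(data, 4, 3);
        if (past.remaining() != 4) {
            throw new AssertionError("out-of-bounds read has wrong size");
        }
        System.out.println("boundary checks passed");
    }
}
```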
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

