ppkarwasz commented on code in PR #699:
URL: https://github.com/apache/commons-compress/pull/699#discussion_r2304735466


##########
src/changes/changes.xml:
##########
@@ -58,8 +58,8 @@ The <action> type attribute can be add,update,fix,remove.
       <action type="fix" dev="ggregory" due-to="Gary Gregory">Don't loose 
precision while reading folders from a SevenZFile.</action>
       <action type="fix" dev="ggregory" due-to="Roel van Dijk, Gary 
Gregory">Improve some exception messages in TarUtils and 
TarArchiveEntry.</action>
       <!-- FIX bzip2 -->      
-      <action type="fix" dev="ggregory" due-to="Tyler Nighswander, Gary 
Gregory">org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream
 constructors now throws CompressorException instead of 
ArrayIndexOutOfBoundsException.</action>
-      <action type="fix" dev="ggregory" due-to="Gary 
Gregory">org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream
 throws CompressorException (an IOException subclass) instead of IOException 
when it encounters formatting problems.</action>
+      <action type="fix" dev="ggregory" due-to="Tyler Nighswander, Gary 
Gregory">BZip2 input streams now throw CompressorException (a subclass of 
IOException) for invalid or corrupted data, providing more specific error 
reporting.</action>
+      <action type="fix" dev="pkarwasz" due-to="Tyler Nighswander, Piotr P. 
Karwasz">BZip2 input streams treat Huffman codes longer than 20 bits as 
corrupted data, matching the behavior of the reference implementation.</action>

Review Comment:
   @garydgregory,
   
   I consolidated the BZip2 changelog entries into these two, which I think 
strike a good balance between clarity and user relevance across the 5 commits 
and this PR:
   
   * They explain that BZip2 streams now distinguish between I/O errors 
(`IOException`) and content-related errors (`CompressorException`), which helps 
users decide whether to retry or report a clearer error.
   * They highlight the Huffman code length restriction, which may matter for 
anyone relying on custom-encoded data.
   
   I didn’t include the removal of unchecked exceptions (e.g., 
`ArrayIndexOutOfBoundsException`), since those are more indicative of internal 
bugs than user-facing changes. Do you think it’s worth adding an explicit note 
like:
   
   > “Fixed `ArrayIndexOutOfBoundsException` for archives with non-compliant 
Huffman codes.”
   
   or should we keep that detail implicit in the general error handling change?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to