Copilot commented on code in PR #699:
URL: https://github.com/apache/commons-compress/pull/699#discussion_r2303621589


##########
src/test/java/org/apache/commons/compress/compressors/bzip2/BZip2CompressorInputStreamTest.java:
##########
@@ -149,4 +160,103 @@ void testSingleByteReadConsistentlyReturnsMinusOneAtEof() throws IOException {
             assertEquals(-1, in.read());
         }
     }
+
+    @Test
+    void testCreateHuffmanDecodingTablesWithLargeAlphaSize() {
+        final Data data = new Data(1);
+        // Use a codeLengths array with length equal to MAX_ALPHA_SIZE (258) to test array bounds.
+        final char[] codeLengths = new char[258];
+        for (int i = 0; i < codeLengths.length; i++) {
+            // Use all code lengths within valid range [1, 20]
+            codeLengths[i] = (char) ((i % MAX_CODE_LEN) + 1);
+        }
+        data.temp_charArray2d[0] = codeLengths;
+        assertDoesNotThrow(
+                () -> BZip2CompressorInputStream.createHuffmanDecodingTables(codeLengths.length, 1, data),
+                "createHuffmanDecodingTables should not throw for valid codeLengths array of MAX_ALPHA_SIZE");
+        assertEquals(data.minLens[0], 1, "Minimum code length should be 1");
+    }
+
+    @ParameterizedTest(name = "code length {0} -> must be rejected")
+    @ValueSource(ints = {MIN_CODE_LEN - 1, MAX_CODE_LEN + 1})
+    void testRecvDecodingTablesWithOutOfRangeCodeLength(final int codeLength) throws IOException {
+        try (BitInputStream tables = prepareDecodingTables(codeLength)) {
+            final Data data = new Data(1);
+
+            final CompressorException ex = assertThrows(
+                    CompressorException.class,
+                    () -> BZip2CompressorInputStream.recvDecodingTables(tables, data),
+                    "Expected CompressorException for invalid code length " + codeLength);
+
+            final String msg = ex.getMessage();
+            assertAll(
+                    () -> assertNotNull(msg, "Exception message must not be null"),
+                    () -> assertTrue(msg.toLowerCase().contains("code length"), "Message should mention 'code length'"),
+                    () -> assertTrue(
+                            msg.contains("[" + MIN_CODE_LEN + ", " + MAX_CODE_LEN + "]"),
+                            "Message should mention valid range [" + MIN_CODE_LEN + ", " + MAX_CODE_LEN + "]"),
+                    () -> assertTrue(
+                            msg.contains(Integer.toString(codeLength)),
+                            "Message should include the offending value " + codeLength));
+        }
+    }
+
+    @ParameterizedTest(name = "code length {0} -> accepted and stored")
+    @ValueSource(ints = {MIN_CODE_LEN, MAX_CODE_LEN})
+    void testRecvDecodingTablesWithValidCodeLength(final int codeLength) throws IOException {
+        try (BitInputStream tables = prepareDecodingTables(codeLength)) {
+            final Data data = new Data(1);
+
+            assertDoesNotThrow(
+                    () -> BZip2CompressorInputStream.recvDecodingTables(tables, data),
+                    "Should accept code length " + codeLength + " within [" + MIN_CODE_LEN + ", " + MAX_CODE_LEN + "]");
+
+            // We encoded 2 Huffman groups; both minLens should equal the encoded codeLength
+            assertAll(
+                    () -> assertEquals(codeLength, data.minLens[0], "Group 0 min code length mismatch"),
+                    () -> assertEquals(codeLength, data.minLens[1], "Group 1 min code length mismatch"));
+        }
+    }
+
+    /**
+     * Builds a minimal bitstream for recvDecodingTables():
+     * <ul>
+     *     <li>Uses only one symbol 'A' (0x41).</li>
+     *     <li>Number of groups: 2 (minimum).</li>
+     *     <li>Number of selectors: 3.</li>
+     *     <li>Selectors: all three encode j=1 (unary "10").</li>
+     *     <li>Huffman code lengths for 2 groups over alphabet size 3 (RUNA, RUNB, EOB) are all equal to {@code codeLength}.</li>
+     * </ul>
+     * <p>
+     *     <strong>Note:</strong> The values are chosen to keep everything byte-aligned.
+     * </p>
+     * @param codeLength the code length to use for each symbol in each group; must be in [0, 31]
+     */
+    private BitInputStream prepareDecodingTables(final int codeLength) {
+        assertTrue(0 <= codeLength && codeLength <= 31, "codeLength must be between 0 and 31");

Review Comment:
   The helper validates `codeLength` against [0, 31], while production code
   accepts only [MIN_CODE_LEN, MAX_CODE_LEN], i.e. [1, 20]. Narrowing the helper
   to the production range is not an option, because
   testRecvDecodingTablesWithOutOfRangeCodeLength passes MIN_CODE_LEN - 1 and
   MAX_CODE_LEN + 1 to it and would then fail before reaching assertThrows.
   Consider making the Javadoc explicit about why the broader range is needed
   for testing.
   ```suggestion
        * @param codeLength the code length to use for each symbol in each group; must be in [0, 31],
        *                   which is deliberately wider than [MIN_CODE_LEN, MAX_CODE_LEN] so that
        *                   out-of-range values can be encoded for the rejection tests
        */
       private BitInputStream prepareDecodingTables(final int codeLength) {
           assertTrue(0 <= codeLength && codeLength <= 31, "codeLength must be between 0 and 31");
   ```
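
The two ranges discussed above can be kept apart by naming them. The following is only a sketch: `CodeLengthRanges`, `isEncodable` and `isValid` are hypothetical names that do not exist in commons-compress, the constants 1 and 20 are taken from the "[1, 20]" range quoted in the diff, and the upper bound 31 is assumed to come from the helper writing the length into a 5-bit field.

```java
// Hypothetical helper, not part of commons-compress.
final class CodeLengthRanges {

    // Production range, as quoted in the diff ("valid range [1, 20]").
    static final int MIN_CODE_LEN = 1;
    static final int MAX_CODE_LEN = 20;

    // Assumption: the test helper stores the length in 5 bits, hence [0, 31].
    static final int MAX_ENCODABLE_LEN = 31;

    private CodeLengthRanges() {
    }

    /** True if the test helper can write this value into the bitstream. */
    static boolean isEncodable(final int codeLength) {
        return 0 <= codeLength && codeLength <= MAX_ENCODABLE_LEN;
    }

    /** True if the decoder is expected to accept this value. */
    static boolean isValid(final int codeLength) {
        return MIN_CODE_LEN <= codeLength && codeLength <= MAX_CODE_LEN;
    }
}
```

With this split, the values 0 and 21 used by the out-of-range test are encodable but not valid, which is exactly the case the rejection test needs to construct.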



##########
src/main/java/org/apache/commons/compress/compressors/bzip2/BZip2CompressorInputStream.java:
##########
@@ -42,10 +42,12 @@
  */
 public class BZip2CompressorInputStream extends CompressorInputStream implements BZip2Constants, InputStreamStatistics {
 
-    private static final class Data {
+    // package private for testing
+    static final class Data {
 
         // (with blockSize 900k)
         final boolean[] inUse = new boolean[256]; // 256 byte
+        private int inUseCount;

Review Comment:
   [nitpick] The field `inUseCount` should be initialized explicitly to 0 for 
clarity, even though Java initializes int fields to 0 by default.
   ```suggestion
           private int inUseCount = 0;
   ```
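
For reference, a minimal sketch of the language rule the nitpick relies on: an `int` instance field without an initializer starts at 0, so both declarations below behave identically. `DefaultInitDemo` and its members are hypothetical names used for illustration only.

```java
// Hypothetical demo class, not part of commons-compress.
final class DefaultInitDemo {

    // No initializer: the Java language guarantees the default value 0.
    private int implicitCount;

    // Explicit initializer: same runtime value, stated for the reader.
    private int explicitCount = 0;

    int implicit() {
        return implicitCount;
    }

    int explicit() {
        return explicitCount;
    }
}
```

Whether the explicit form is preferable is a style choice; some static-analysis configurations flag explicit default initialization as redundant.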



##########
src/main/java/org/apache/commons/compress/compressors/bzip2/BZip2CompressorInputStream.java:
##########
@@ -155,38 +159,46 @@ private static void checkBounds(final int checkVal, final int limitExclusive, fi
 
     /**
      * Called by createHuffmanDecodingTables() exclusively.
+     *
+     * @param minLen minimum code length in the range [1, {@value MAX_CODE_LEN}] guaranteed by the caller.
+     * @param maxLen maximum code length in the range [1, {@value MAX_CODE_LEN}] guaranteed by the caller.
      */
     private static void hbCreateDecodeTables(final int[] limit, final int[] base, final int[] perm, final char[] length, final int minLen, final int maxLen,
-            final int alphaSize) throws IOException {
+            final int alphaSize) {
         for (int i = minLen, pp = 0; i <= maxLen; i++) {
             for (int j = 0; j < alphaSize; j++) {
                 if (length[j] == i) {
                     perm[pp++] = j;
                 }
             }
         }
-        for (int i = MAX_CODE_LEN; --i > 0;) {
-            base[i] = 0;
-            limit[i] = 0;
-        }
+        // Ensure the arrays were not reused.
+        Arrays.fill(base, 0);
+        Arrays.fill(limit, 0);

Review Comment:
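   [nitpick] `Arrays.fill()` now clears both arrays in full, whereas the removed
   loop reset only indices 1 to MAX_CODE_LEN - 1. If this is a
   performance-critical path, consider clearing only the range [minLen, maxLen].
   The suggestion below covers `limit` only; before narrowing the fill for
   `base` as well, verify that no later read touches an index outside that
   range, since the new comment says the full fill guards against reused arrays.
   ```suggestion
           Arrays.fill(limit, minLen, maxLen + 1, 0);
   ```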
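
To make the difference between the two `Arrays.fill` overloads concrete, here is a minimal sketch. `RangeFillDemo` and its method names are hypothetical and not part of commons-compress; only `Arrays.fill(int[], int)` and `Arrays.fill(int[], int, int, int)` are real JDK calls. The range overload takes an exclusive upper index, hence `maxLen + 1`.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of commons-compress.
final class RangeFillDemo {

    private RangeFillDemo() {
    }

    /** Clears every element, as the code in the PR does. */
    static int[] clearAll(final int[] values) {
        Arrays.fill(values, 0);
        return values;
    }

    /** Clears only indices minLen..maxLen inclusive, as the suggestion does. */
    static int[] clearRange(final int[] values, final int minLen, final int maxLen) {
        // toIndex is exclusive, so maxLen + 1 includes index maxLen itself.
        Arrays.fill(values, minLen, maxLen + 1, 0);
        return values;
    }
}
```

Elements outside [minLen, maxLen] keep whatever they held before the call, so the range variant is only safe when nothing later reads those stale entries.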



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
