celinke97 commented on PR #771:
URL: https://github.com/apache/commons-compress/pull/771#issuecomment-4337653136

   Thanks for checking this. The EOFException from the attached one-byte 
ByteArrayInputStream case is expected and does not contradict the OOME report: 
with `c_namesize == 0`, `readEntryName` computes `lengthWithNull - 1 == -1`, 
and the current `readRange(-1)` path drains/copies the remaining stream. For a 
tiny remaining stream that reaches EOF and then throws EOFException; for a 
larger remaining stream under a constrained heap it can exhaust the heap before 
EOF is reached.
   
   I agree that the original custom `MalformedCpioStream` test was not the best 
way to communicate this. A deterministic unit test should not try to assert 
OOME. It should assert the important invariant instead: a zero CPIO name size 
must be rejected before Commons Compress reads attacker-controlled trailing 
data as the entry name.
   
   Here is a ByteArrayInputStream-based regression test that uses a normal 
finite byte array and a larger, but still test-sized, trailing payload:
   
   ```java
   @Test
   void testZeroNameSizeRejectedBeforeReadingTrailingData() throws Exception {
       final int trailingBytes = 1024 * 1024;
       final ByteArrayInputStream inputStream = new 
ByteArrayInputStream(newAsciiCpioEntryWithNameSize("00000000", trailingBytes));
       try (CpioArchiveInputStream cpio = 
CpioArchiveInputStream.builder().setInputStream(inputStream).get()) {
           assertThrows(IOException.class, () -> cpio.getNextEntry());
       }
       assertTrue(inputStream.available() > 0,
               "A c_namesize of zero must be rejected before reading the 
attacker-controlled trailing data");
   }
   ```
   
   On the vulnerable code path, the same input is drained completely and then 
EOFException is thrown, so the test fails at `inputStream.available() > 0`. 
That small-input EOFException is the observable symptom before heap exhaustion; 
increasing the trailing payload relative to `-Xmx` turns the same 
unbounded-copy behavior into the reported OutOfMemoryError.
   
   I also reran this with a real Java runtime. The ByteArrayInputStream case 
consumed the entire 1 MiB tail before throwing EOFException:
   
   ```text
   runtime=21.0.11 OpenJDK 64-Bit Server VM
   trailingBytes=1048576
   headerBytes=110
   outcome=java.io.EOFException: null
   byteArrayInputStream.available=0
   byteArrayInputStream.consumed=1048686
   ```
   
   With `-Xmx16m` and a 64 MiB generated finite tail, the same path throws 
`java.lang.OutOfMemoryError: Java heap space` from 
`ByteArrayOutputStream.ensureCapacity`, called via 
`org.apache.commons.compress.utils.IOUtils.readRange` and 
`CpioArchiveInputStream.readEntryName`.
   
   I have attached a patch version of the test as 
`bytearray-regression-test.patch`.
   
[CpioZeroNameSizeByteArrayRuntime.java](https://github.com/user-attachments/files/27175504/CpioZeroNameSizeByteArrayRuntime.java)
   
[CpioZeroNameSizeGeneratedTailRuntime.java](https://github.com/user-attachments/files/27175505/CpioZeroNameSizeGeneratedTailRuntime.java)
   
[bytearray-regression-test.patch](https://github.com/user-attachments/files/27175506/bytearray-regression-test.patch)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to