On Wed, 10 Jan 2024 13:39:52 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:

>> ZipInputStream.readEnd currently assumes a Zip64 data descriptor if the 
>> number of compressed or uncompressed bytes read from the inflater is larger 
>> than the Zip64 magic value.
>> 
>> While the ZIP format mandates that the data descriptor `SHOULD be stored in 
>> ZIP64 format (as 8 byte values) when a file's size exceeds 0xFFFFFFFF`, it 
>> also states that `ZIP64 format MAY be used regardless of the size of a 
>> file`. For such small entries, the above assumption does not hold.
>> 
>> This PR augments ZipInputStream.readEnd to also assume 8-byte sizes if the 
>> ZipEntry includes a Zip64 extra information field AND the 'compressed size' 
>> and 'uncompressed size' have the expected Zip64 "magic" value 0xFFFFFFFF. 
>> This brings ZipInputStream into alignment with the APPNOTE format spec:
>> 
>> 
>> When extracting, if the zip64 extended information extra 
>> field is present for the file the compressed and 
>> uncompressed sizes will be 8 byte values.
>> 
>> 
>> While small Zip64 files with 8-byte data descriptors are not commonly found 
>> in the wild, it is possible to create one using the Info-ZIP command line 
>> `-fd` flag:
>> 
>> `echo hello | zip -fd > hello.zip`
>> 
>> The PR also adds a test verifying that such a small Zip64 file can be parsed 
>> by ZipInputStream.
>
> Eirik Bjørsnøs has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Remove trailing whitespace
>  - Remove trailing whitespace

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 664:

> 662: 
> 663:         // The LOC's 'compressed size' and 'uncompressed size' must both be marked for Zip64
> 664:         if (csize != ZIP64_MAGICVAL || size != ZIP64_MAGICVAL) {

The spec says something different here:

>
> 4.4.4 general purpose bit flag:
> ...
>    Bit 3: If this bit is set, the fields crc-32, compressed size and 
> uncompressed size are set to zero in the local header.  The correct values 
> are put in the data descriptor immediately following the compressed data.  

So it expects the value zero for the compressed/uncompressed sizes in the LOC 
when the data descriptor bit is set.
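
To make the two readings concrete, here is a minimal sketch (not the 
ZipInputStream code) that reads the relevant LOC fields and evaluates both 
conditions: sizes set to zero, as 4.4.4 describes, and sizes set to the Zip64 
magic value, as the check under review requires. The class name 
`LocHeaderSketch` and all of its helpers are hypothetical; the field offsets 
are assumed from the local file header layout in APPNOTE 4.3.7.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of java.util.zip.
class LocHeaderSketch {
    // Offsets within the 30-byte fixed part of a local file header (assumed from APPNOTE 4.3.7)
    static final int LOCFLG = 6;   // general purpose bit flag
    static final int LOCSIZ = 18;  // compressed size
    static final int LOCLEN = 22;  // uncompressed size
    static final int LOCHDR = 30;  // size of the fixed header part

    static final int DATA_DESCRIPTOR_FLAG = 1 << 3;   // bit 3
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL;

    // ZIP fields are little-endian
    private static ByteBuffer le(byte[] loc) {
        return ByteBuffer.wrap(loc).order(ByteOrder.LITTLE_ENDIAN);
    }

    static boolean hasDataDescriptor(byte[] loc) {
        int flag = le(loc).getShort(LOCFLG) & 0xFFFF;
        return (flag & DATA_DESCRIPTOR_FLAG) != 0;
    }

    static long compressedSize(byte[] loc) {
        return le(loc).getInt(LOCSIZ) & 0xFFFFFFFFL;  // read as unsigned 32-bit
    }

    static long uncompressedSize(byte[] loc) {
        return le(loc).getInt(LOCLEN) & 0xFFFFFFFFL;  // read as unsigned 32-bit
    }

    // Reading of 4.4.4: with bit 3 set, the LOC sizes are zero
    static boolean sizesAreZero(byte[] loc) {
        return compressedSize(loc) == 0 && uncompressedSize(loc) == 0;
    }

    // Condition from the reviewed check: both LOC sizes carry the Zip64 magic value
    static boolean sizesAreZip64Magic(byte[] loc) {
        return compressedSize(loc) == ZIP64_MAGICVAL
                && uncompressedSize(loc) == ZIP64_MAGICVAL;
    }

    // Builds a synthetic fixed-size LOC header; all other fields are left zero
    static byte[] sampleLoc(int flag, long csize, long size) {
        ByteBuffer buf = ByteBuffer.allocate(LOCHDR).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, 0x04034b50);           // local file header signature
        buf.putShort(LOCFLG, (short) flag);
        buf.putInt(LOCSIZ, (int) csize);
        buf.putInt(LOCLEN, (int) size);
        return buf.array();
    }
}
```

For a header built with `sampleLoc(0x08, 0, 0)`, `hasDataDescriptor` and 
`sizesAreZero` are true while `sizesAreZip64Magic` is false, which is the 
case a check for the magic value alone would not treat as Zip64.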

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1453460177
