On Wed, 10 Jan 2024 13:39:52 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:
>> ZipInputStream.readEnd currently assumes a Zip64 data descriptor if the
>> number of compressed or uncompressed bytes read from the inflater is larger
>> than the Zip64 magic value.
>>
>> While the ZIP format mandates that the data descriptor `SHOULD be stored in
>> ZIP64 format (as 8 byte values) when a file's size exceeds 0xFFFFFFFF`, it
>> also states that `ZIP64 format MAY be used regardless of the size of a
>> file`. For such small entries, the above assumption does not hold.
>>
>> This PR augments ZipInputStream.readEnd to also assume 8-byte sizes if the
>> ZipEntry includes a Zip64 extra information field AND the 'compressed size'
>> and 'uncompressed size' have the expected Zip64 "magic" value 0xFFFFFFFF.
>> This brings ZipInputStream into alignment with the APPNOTE format spec:
>>
>>
>> When extracting, if the zip64 extended information extra
>> field is present for the file the compressed and
>> uncompressed sizes will be 8 byte values.
>>
>>
>> While small Zip64 files with 8-byte data descriptors are not commonly found
>> in the wild, it is possible to create one using the Info-ZIP command line
>> `-fd` flag:
>>
>> `echo hello | zip -fd > hello.zip`
>>
>> The PR also adds a test verifying that such a small Zip64 file can be parsed
>> by ZipInputStream.
>
> Eirik Bjørsnøs has updated the pull request incrementally with two additional
> commits since the last revision:
>
>  - Remove trailing whitespace
>  - Remove trailing whitespace

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 664:

> 662:
> 663:             // The LOC's 'compressed size' and 'uncompressed size' must both be marked for Zip64
> 664:             if (csize != ZIP64_MAGICVAL || size != ZIP64_MAGICVAL) {

The spec for this says different. It says:

> 4.4.4 general purpose bit flag:
> ...
> Bit 3: If this bit is set, the fields crc-32, compressed size and
> uncompressed size are set to zero in the local header. The correct values
> are put in the data descriptor immediately following the compressed data.

So it expects the value zero for the compressed/uncompressed sizes in the LOC when the data descriptor bit is set.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1453460177
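
To make the disagreement concrete, here is a minimal, hypothetical sketch (not the JDK's ZipInputStream code) that reads the general purpose flag and the two 4-byte size fields from a 30-byte local file header and evaluates both conditions under discussion: sizes set to zero, as the quoted 4.4.4 text describes for bit 3, and sizes set to 0xFFFFFFFF, as the check at line 664 requires. The class `LocSizeCheck`, the record `LocSizes`, and the helper `sampleLoc` are illustrative names only; the field offsets follow the standard LOC layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LocSizeCheck {
    // Same value the review refers to as ZIP64_MAGICVAL.
    static final long ZIP64_MAGICVAL = 0xFFFFFFFFL;
    // General purpose bit 3: sizes and CRC follow in a data descriptor.
    static final int FLAG_DATA_DESCRIPTOR = 1 << 3;
    static final int LOC_SIGNATURE = 0x04034b50;

    // Hypothetical holder for the three LOC fields relevant here.
    record LocSizes(int flag, long csize, long size) {
        boolean hasDataDescriptor() {
            return (flag & FLAG_DATA_DESCRIPTOR) != 0;
        }
        // Condition described by the quoted 4.4.4 text for bit 3.
        boolean sizesZeroed() {
            return csize == 0 && size == 0;
        }
        // Condition required by the check at line 664 of the PR.
        boolean sizesZip64Marked() {
            return csize == ZIP64_MAGICVAL && size == ZIP64_MAGICVAL;
        }
    }

    // Parse the fixed 30-byte part of a local file header (little-endian).
    static LocSizes parse(byte[] loc) {
        ByteBuffer b = ByteBuffer.wrap(loc).order(ByteOrder.LITTLE_ENDIAN);
        if (b.getInt(0) != LOC_SIGNATURE) {
            throw new IllegalArgumentException("not a LOC header");
        }
        int flag = b.getShort(6) & 0xFFFF;          // general purpose bit flag
        long csize = b.getInt(18) & 0xFFFFFFFFL;    // compressed size
        long size = b.getInt(22) & 0xFFFFFFFFL;     // uncompressed size
        return new LocSizes(flag, csize, size);
    }

    // Build a synthetic header for experimentation; all other fields stay zero.
    static byte[] sampleLoc(int flag, long csize, long size) {
        ByteBuffer b = ByteBuffer.allocate(30).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(0, LOC_SIGNATURE);
        b.putShort(6, (short) flag);
        b.putInt(18, (int) csize);
        b.putInt(22, (int) size);
        return b.array();
    }
}
```

A header with bit 3 set and zeroed sizes satisfies `sizesZeroed()` but not `sizesZip64Marked()`, so a reader that only accepts the magic value would not treat such an entry as having an 8-byte data descriptor.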