On Thu, 18 Jul 2024 21:36:54 GMT, Liam Miller-Cushon <[email protected]> wrote:
>> This change fixes a zip64 bug in the launcher that is prevent it from
>> reading the manifest of jars where the 'relative offset of local header'
>> field in the central directory entry is >4GB. As described in APPNOTE.TXT
>> 4.5.3, the offset is too large to be stored in the central directory it is
>> stored in a 'Zip64 Extended Information Extra Field'.
>
> Liam Miller-Cushon has updated the pull request with a new target base due to
> a merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains 10 additional
> commits since the last revision:
>
> - Add a missing `break`
> - Merge branch 'master' into JDK-8328995
> - Merge remote-tracking branch 'origin/master' into JDK-8328995
> - Move test to test/jdk/tools/launcher
> - Add some more comments
> - Maximum Zip64 extra field length is 32
> - Make cendsk an unsigned short
> - Fix disk number size
> - Improvements
>
> * don't rely on variable length arrays
> * only run the test of 64 bit machines, since it requires >4GB of heap
> - 8328995: launcher can't open jar files where the offset of the manifest is
> >4GB
src/java.base/share/native/libjli/parse_manifest.c line 507:
> 505: || censiz == ZIP64_MAGICVAL
> 506: || cenoff == ZIP64_MAGICVAL)
> 507: && cenext > 0) {
I went through these changes and here's my first round of review.
Before the changes proposed in this PR, this part of the code which is
responsible for finding and returning a constructed zip entry instance would
blindly use the local header offset value from the central directory of the
entry being looked up. It would then "seek" to that position and read the
metadata of that entry (details like uncompressed length, compressed length,
the compression method) and return back that entry instance. Clearly this isn't
right when zip64 entries are involved since various details of such an entry
can reside in the zip64 extra field block of that entry instead of being in the
central directory.
This change proposes that this part of the code first identify that a zip64
extra block exists for a particular entry and then read that zip64 block,
validate some parts of the zip64 block and if that validation succeeds, then
use the entry metadata from the zip64 block for constructing and returning the
entry instance.
The idea to identify and support zip64 entries in this part of the code I think
is agreed as the right one.
Coming to the implementation, I think we need to come to an agreement on what
we want to do here. Specifically:
- What logic do we use to decide when to look for zip64 extra block for an
entry? The changes in this PR (at this line here) proposes that we look for
zip64 extra block for an entry if any of the compressed size, uncompressed size
or the local header offset value is `0xFFFFFFFFL` and the extra field size
noted in this central directory entry is greater than 0. This however doesn't
match with what we do in the implementation of `java.util.zip.ZipFile` which
too does similar checks for zip64 entry presence when parsing the central
directory. Very specifically, in the ZipFile where this logic is implemented is
here
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/zip/ZipFile.java#L1254.
That code in the ZipFile has gone through the necessary vetting for dealing
with various possibilities with the zip files. I think we should implement that
same logic here while checking for zip64 entries.
- The next one to decide is what kind of validations do we want to do in this
code for zip64 extra field block. I think here too we should match whatever w
currently do in the `java.util.zip.ZipFile` implementation, specifically what's
being done in the `checkExtraFields` and `checkZip64ExtraFieldValues` methods
of that class.
- Next, what do we do when these validations fail. Right now in this proposed
implementation, we appear to consider some validation failures as the entry
being absent. My opinion is that we should be more stricter and fail the jar
like we do in the implementation of `ZipFile` for such validation failures.
One additional thing that this proposed change does is that it validates that
the local header offset determined for an entry (either through the central
directory or through the zip64 extra field block) does indeed point to a local
header for an entry with the same file name. I don't think we do that in the
ZipFile implementation. But I think doing that check in this code is fine.
There are few other implementation details that I haven't commented about
because they may not matter depending on what conclusions we arrive at for the
above few points.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18479#discussion_r1684450259