On Sat, 24 Feb 2024 14:56:01 GMT, Lance Andersen <lan...@openjdk.org> wrote:

> Please review this PR which addresses the handling of invalid UTF-8 byte 
> sequences in the entry name of a LOC file header and a Zip file comment which 
> is returned via ZipFile::getComment.
> 
> As part of the change, `ZipFile::getComment` will now return `null` if an 
> invalid UTF-8 byte sequence is encountered while converting the byte array to 
> a String.  The CSR for this change has also been approved.
> 
> Mach5 tiers 1-3 are clean with this change.

Left a few comment on the code changes and the test. Some needs to be fixed, 
others may be a matter of preference.

src/java.base/share/classes/java/util/zip/ZipFile.java line 331:

> 329:             try {
> 330:                 return res.zsrc.zc.toString(res.zsrc.comment);
> 331:             } catch(IllegalArgumentException iae) {

Suggestion:

            } catch (IllegalArgumentException iae) {

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 526:

> 524:             throw  (ZipException) new ZipException(
> 525:                     "invalid LOC header (bad entry name)").initCause(ex);
> 526:         }

There is a lot going on here:

* The creation of a ZipEntry
* The interpretation of the general purpose bit flag
* The split to call either a static or non-static toString depending on the 
flag interpretation
* The exception handling, including the somewhat unusual initCause and cast

The comment at 517 seems even more detached from the logic it tries to describe 
after this change.

Perhaps we could benefit from introducing a `zipCoderFromFlag(int flag)` 
method, similar to that in ZipFile?

This would allow the section to be rewritten to something that hide the flag 
parsing details, leaves us with a single (polymorphic) toString invocation and 
extracts `createZipEntry` from the try/catch scope. 

Something like this: (I also changed the exception handling here, that's a bit 
orthogonal to the main points above)

Suggestion:

        String name;
        try {
            name = zipCoderForFlag(flag).toString(b, len);
        } catch (IllegalArgumentException ex) {
            ZipException e = new ZipException("invalid LOC header (bad entry 
name)");
            e.initCause(ex);
            throw e;
        }
        ZipEntry e = createZipEntry(name);

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 49:

> 47:  * @summary Validate that a ZipException is thrown when opening a ZIP 
> file via
> 48:  * ZipFile, traversing a Zip File via ZipInputStream, with invalid UTF-8
> 49:  * byte sequences in the name or comment fields fails with ZipException.

The leading summary sentence needs a cleanup. Here's one suggestion:

Suggestion:

 * @summary Validate that a ZipException is thrown when a ZIP file with 
 * invalid UTF-8 byte sequences in the name or comment fields is opened via
 * ZipFile or traversed via ZipInputStream.

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 60:

> 58:     private static final byte[] INVALID_UTF8_BYTE_SEQUENCE = {(byte) 
> 0xF0, (byte) 0xA4, (byte) 0xAD};
> 59:     // Expected error message when an invalid entry name or entry comment 
> is
> 60:     // encountered when accessing a CEN Header

This two-line comment could benefit from an earlier line break.

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 64:

> 62: 
> 63:     // Expected error message when an invalid entry name is encountered 
> when
> 64:     // accessing a LOC Header

This two-line comment could benefit from an earlier line break.

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 65:

> 63:     // Expected error message when an invalid entry name is encountered 
> when
> 64:     // accessing a LOC Header
> 65:     private static final String LOC_BAD_ENTRY_NAME_OR_COMMENT = "invalid 
> LOC header (bad entry name)";

Should this be renamed `LOC_BAD_ENTRY_NAME`?

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 128:

> 126:      * 0x93: Comment     :   [ZipFileComment]
> 127:      */
> 128:     public static byte[] VALID_ZIP = {

I see this byte array encoding of ZIP files is a pattern used across tests. My 
preference would be to encode this in Base64 or hex, since that would make the 
consumption by external command line tools easier and the encoded 
representation somewhat prettier. But no big deal, this might just come down to 
personal preference. (This isn't human readable anyhow for the normal 
definition of "human" :)

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 182:

> 180:     /**
> 181:      * Validate that the original Zip file can be opened via ZipFile.
> 182:      * @throws IOException if an entry occurs

Suggestion:

     * @throws IOException if an error occurs

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 196:

> 194:      * Validate that the original Zip file can be opened and traversed 
> via
> 195:      * ZipinputStream::getNextEntry.
> 196:      * @throws IOException if an entry occurs

Suggestion:

     * @throws IOException if an error occurs

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 201:

> 199:     public void traverseZipWithZipInputStreamTest() throws IOException {
> 200:         Files.write(ZIP_FILE, zipArray);
> 201:         try( ZipInputStream zis = new ZipInputStream(new 
> FileInputStream(ZIP_FILE.toFile()))) {

Suggestion:

        try ( ZipInputStream zis = new ZipInputStream(new 
FileInputStream(ZIP_FILE.toFile()))) {

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 203:

> 201:         try( ZipInputStream zis = new ZipInputStream(new 
> FileInputStream(ZIP_FILE.toFile()))) {
> 202:             ZipEntry ze;
> 203:             while((ze = zis.getNextEntry()) != null ) {

Suggestion:

            while ((ze = zis.getNextEntry()) != null ) {

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 243:

> 241: 
> 242:     /**
> 243:      * Validate that a ZipException is thrown when an entry name is 
> encountered

Is the "is" here out of place or are we missing a "which"?

Suggestion:

     * Validate that a ZipException is thrown when an entry name encountered

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 267:

> 265:      * @throws IOException if an error occurs
> 266:      */
> 267:     private void createInvalidUTFEntryInZipFile(int offset) throws 
> IOException {

Is the "In" in this method name helpful? (To me it suggests a file argument, 
not a result) 
Suggestion:

    private void createInvalidUTFEntryInZipFile(int offset) throws IOException {

test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 304:

> 302:      *        System.out.println(result);
> 303:      *      }
> 304:      * </pre>

Can we use `@snippet` in tests? Would allow syntax highlighting in IDEs

-------------

PR Review: https://git.openjdk.org/jdk/pull/17995#pullrequestreview-1899525053
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501631653
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501630004
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501613107
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501601638
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501603203
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501600929
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501605456
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501606660
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501606918
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501607429
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501607377
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501609316
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501610399
PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501615750

Reply via email to