On Sat, 24 Feb 2024 14:56:01 GMT, Lance Andersen <lan...@openjdk.org> wrote:
> Please review this PR which addresses the handling of invalid UTF-8 byte > sequences in the entry name of a LOC file header and a Zip file comment which > is returned via ZipFile::getComment. > > As part of the change, `ZipFile::getComment` will now return `null` if an > invalid UTF-8 byte sequence is encountered while converting the byte array to > a String. The CSR for this change has also been approved. > > Mach5 tiers 1-3 are clean with this change. Left a few comment on the code changes and the test. Some needs to be fixed, others may be a matter of preference. src/java.base/share/classes/java/util/zip/ZipFile.java line 331: > 329: try { > 330: return res.zsrc.zc.toString(res.zsrc.comment); > 331: } catch(IllegalArgumentException iae) { Suggestion: } catch (IllegalArgumentException iae) { src/java.base/share/classes/java/util/zip/ZipInputStream.java line 526: > 524: throw (ZipException) new ZipException( > 525: "invalid LOC header (bad entry name)").initCause(ex); > 526: } There is a lot going on here: * The creation of a ZipEntry * The interpretation of the general purpose bit flag * The split to call either a static or non-static toString depending on the flag interpretation * The exception handling, including the somewhat unusual initCause and cast The comment at 517 seems even more detached from the logic it tries to describe after this change. Perhaps we could benefit from introducing a `zipCoderFromFlag(int flag)` method, similar to that in ZipFile? This would allow the section to be rewritten to something that hide the flag parsing details, leaves us with a single (polymorphic) toString invocation and extracts `createZipEntry` from the try/catch scope. Something like this: (I also changed the exception handling here, that's a bit orthogonal to the main points above) Suggestion: String name; try { name = zipCoderForFlag(flag).toString(b, len); } catch (IllegalArgumentException ex) { ZipException e = new ZipException("invalid LOC header (bad entry name)"); e.initCause(ex); throw e; } ZipEntry e = createZipEntry(name); test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 49: > 47: * @summary Validate that a ZipException is thrown when opening a ZIP > file via > 48: * ZipFile, traversing a Zip File via ZipInputStream, with invalid UTF-8 > 49: * byte sequences in the name or comment fields fails with ZipException. The leading summary sentence needs a cleanup. Here's one suggestion: Suggestion: * @summary Validate that a ZipException is thrown when a ZIP file with * invalid UTF-8 byte sequences in the name or comment fields is opened via * ZipFile or traversed via ZipInputStream. test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 60: > 58: private static final byte[] INVALID_UTF8_BYTE_SEQUENCE = {(byte) > 0xF0, (byte) 0xA4, (byte) 0xAD}; > 59: // Expected error message when an invalid entry name or entry comment > is > 60: // encountered when accessing a CEN Header This two-line comment could benefit from an earlier line break. test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 64: > 62: > 63: // Expected error message when an invalid entry name is encountered > when > 64: // accessing a LOC Header This two-line comment could benefit from an earlier line break. test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 65: > 63: // Expected error message when an invalid entry name is encountered > when > 64: // accessing a LOC Header > 65: private static final String LOC_BAD_ENTRY_NAME_OR_COMMENT = "invalid > LOC header (bad entry name)"; Should this be renamed `LOC_BAD_ENTRY_NAME`? test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 128: > 126: * 0x93: Comment : [ZipFileComment] > 127: */ > 128: public static byte[] VALID_ZIP = { I see this byte array encoding of ZIP files is a pattern used across tests. My preference would be to encode this in Base64 or hex, since that would make the consumption by external command line tools easier and the encoded representation somewhat prettier. But no big deal, this might just come down to personal preference. (This isn't human readable anyhow for the normal definition of "human" :) test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 182: > 180: /** > 181: * Validate that the original Zip file can be opened via ZipFile. > 182: * @throws IOException if an entry occurs Suggestion: * @throws IOException if an error occurs test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 196: > 194: * Validate that the original Zip file can be opened and traversed > via > 195: * ZipinputStream::getNextEntry. > 196: * @throws IOException if an entry occurs Suggestion: * @throws IOException if an error occurs test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 201: > 199: public void traverseZipWithZipInputStreamTest() throws IOException { > 200: Files.write(ZIP_FILE, zipArray); > 201: try( ZipInputStream zis = new ZipInputStream(new > FileInputStream(ZIP_FILE.toFile()))) { Suggestion: try ( ZipInputStream zis = new ZipInputStream(new FileInputStream(ZIP_FILE.toFile()))) { test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 203: > 201: try( ZipInputStream zis = new ZipInputStream(new > FileInputStream(ZIP_FILE.toFile()))) { > 202: ZipEntry ze; > 203: while((ze = zis.getNextEntry()) != null ) { Suggestion: while ((ze = zis.getNextEntry()) != null ) { test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 243: > 241: > 242: /** > 243: * Validate that a ZipException is thrown when an entry name is > encountered Is the "is" here out of place or are we missing a "which"? Suggestion: * Validate that a ZipException is thrown when an entry name encountered test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 267: > 265: * @throws IOException if an error occurs > 266: */ > 267: private void createInvalidUTFEntryInZipFile(int offset) throws > IOException { Is the "In" in this method name helpful? (To me it suggests a file argument, not a result) Suggestion: private void createInvalidUTFEntryInZipFile(int offset) throws IOException { test/jdk/java/util/zip/ZipFile/InvalidBytesInEntryNameOrComment.java line 304: > 302: * System.out.println(result); > 303: * } > 304: * </pre> Can we use `@snippet` in tests? Would allow syntax highlighting in IDEs ------------- PR Review: https://git.openjdk.org/jdk/pull/17995#pullrequestreview-1899525053 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501631653 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501630004 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501613107 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501601638 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501603203 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501600929 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501605456 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501606660 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501606918 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501607429 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501607377 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501609316 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501610399 PR Review Comment: https://git.openjdk.org/jdk/pull/17995#discussion_r1501615750