Re: RFR: 8380542: ZipOutputStream.setComment and ZipEntry.setComment spec updates for rejecting unmappable characters and allowing null or empty comments [v10]

Lance Andersen Wed, 08 Apr 2026 05:39:27 -0700

On Wed, 8 Apr 2026 08:40:07 GMT, Eirik Bjørsnøs <[email protected]> wrote:


>>> Let's see what Alan/Jai have for a preference.
>> 
>> The Charset is used to encode entry names and comments so it's okay to say 
>> that.
>> 
>> I see Eirik has inlined another patch. The class description part is okay 
>> except for "uses UTF-8 to encode strings". This is problematic because it 
>> jumps from entry names and comments to "strings", we should keep it 
>> consistent as otherwise it creates doubt that the 1-arg and 2-arg 
>> constructors do something different. For the same reason, I don't think the 
>> changes in the inlined patch to the setComment and constructors should be 
>> included.
>
>> > Let's see what Alan/Jai have for a preference.
>> 
>> The Charset is used to encode entry names and comments so it's okay to say 
>> that.
>> 
>> I see Eirik has inlined another patch. The class description part is okay 
>> except for "uses UTF-8 to encode strings". 
> 
> I've pushed a change introducing the change in the class description, sans 
> the "to encode strings" ending.
> 
>> For the same reason, I don't think the changes in the inlined patch to the 
>> setComment and constructors should be included.
> 
> Looking at the constructor descriptions I noticed that they have identical 
> and quite non-descript leading sentences. This means that constructor 
> summaries end up saying just "Creates a new ZIP output stream" for both.
> 
> To figure out how they differ, you need to navigate to second paragraph for 
> the 1-arg constructor and to the `@param` description for the 2-arg 
> constructors. This made me wonder if it would make sense to lift charset 
> concerns into the leading sentence, making their difference more obvious 
> sooner.
> 
> Somewhat unrelated or tangential to the change here though, just wanted to 
> share the observation:
> 
> 
> Index: src/java.base/share/classes/java/util/zip/ZipOutputStream.java
> IDEA additional info:
> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> <+>UTF-8
> ===================================================================
> diff --git a/src/java.base/share/classes/java/util/zip/ZipOutputStream.java 
> b/src/java.base/share/classes/java/util/zip/ZipOutputStream.java
> --- a/src/java.base/share/classes/java/util/zip/ZipOutputStream.java  
> (revision b74e3c4cada4c5581b75c85370f06a0137fbc3b0)
> +++ b/src/java.base/share/classes/java/util/zip/ZipOutputStream.java  (date 
> 1775637217668)
> @@ -114,10 +114,9 @@
>      public static final int DEFLATED = ZipEntry.DEFLATED;
>  
>      /**
> -     * Creates a new ZIP output stream.
> -     *
> -     * <p>The UTF-8 {@link java.nio.charset.Charset charset} is used
> -     * to encode the entry names and comments.
> +     * Creates a new ZIP output stream using the UTF-8
> +     * {@link java.nio.charset.Charset charset} to encode
> +     * the entry names and comments.
>       *
>       * @param out the actual output stream
>       */
> @@ -126,7 +125,9 @@
>      }
>  
>      /**
> -     * Creates a new ZIP output stream.
> +     * Creates a new ZIP output stream using the provided
> +     * {@linkplain java.nio.charset.Charset charset} to encode
> +     * the entry names and comments
>       *
>       * @param out the actual output stream
>       *

I would prefer "specified" vs. "provided" in the above

WRT:

> A stream created without specifying
> + * a {@code Charset} uses UTF-8 to encode strings.

I might wordsmith a bit more  to something similar to

By default, the  UTF-8 charset is used to encode entry names and comments.  
ZipOutPutStream(OutputStream, CharSet) maybe be used to specify an alternative 
charset)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/30338#discussion_r3051341361

Re: RFR: 8380542: ZipOutputStream.setComment and ZipEntry.setComment spec updates for rejecting unmappable characters and allowing null or empty comments [v10]

Reply via email to