On Tue, 17 Mar 2026 18:44:37 GMT, Raffaello Giulietti <[email protected]>
wrote:
>> The encodedLengthUTF8() method uses an int accumulator (dp) for the LATIN1
>> code path, while the UTF16 path (encodedLengthUTF8_UTF16) correctly uses a
>> long accumulator with an overflow check. When a LATIN1 string contains more
>> than Integer.MAX_VALUE/2 non-ASCII bytes, the int dp overflows, potentially
>> causing NegativeArraySizeException in downstream buffer allocation.
>>
>> Fix: change dp from int to long and add the same overflow check used in the
>> UTF16 path.
>
> src/java.base/share/classes/java/lang/String.java line 1519:
>
>> 1517: throw new OutOfMemoryError("Required length exceeds
>> implementation limit");
>> 1518: }
>> 1519: return (int) dp;
>
> I think you can leave the code as it currently is and throw when `dp < 0`.
> But this variant only works when `dp` is incremented by at most 2 at each
> iteration, like here.
> Your variant with `long` is more robust.

Thank you for the suggestion. You're right that checking `dp < 0` would work
here, since `dp` is incremented by at most 2 per iteration. However, I prefer
to keep the `long` approach because:

1. It's more explicit and robust: the overflow check is clear rather than
   implicit.
2. It matches the existing UTF16 path pattern (`encodedLengthUTF8_UTF16`).
3. It doesn't rely on the assumption that `dp` always increments by at most 2,
   making it more maintainable if the code evolves.

The performance difference is negligible, so I believe the clarity and
robustness are worth the slight verbosity.

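For readers following the thread, here is a rough, self-contained sketch of the
two variants being compared. It is not the actual `java.lang.String` source:
the class name and both method names are made up for illustration, and the
loops are simplified to show only the accumulator handling.

```java
// Illustrative sketch only; names are hypothetical, not JDK internals.
final class Latin1Utf8LengthSketch {

    // Variant kept in the PR: accumulate in a long, then check the limit explicitly.
    static int encodedLengthLong(byte[] val) {
        long dp = 0;
        for (byte b : val) {
            // Latin-1 bytes 0x80..0xFF (negative as Java bytes) need two UTF-8 bytes.
            dp += (b < 0) ? 2 : 1;
        }
        if (dp > Integer.MAX_VALUE) {
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return (int) dp;
    }

    // Reviewer's alternative: keep the int and treat a negative result as overflow.
    // This is only sound because each step adds at most 2, so the true sum stays
    // below 2^32 and any wrapped value is necessarily negative.
    static int encodedLengthInt(byte[] val) {
        int dp = 0;
        for (byte b : val) {
            dp += (b < 0) ? 2 : 1;
        }
        if (dp < 0) {
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return dp;
    }

    public static void main(String[] args) {
        byte[] sample = {'a', (byte) 0xff, (byte) 0x80};
        System.out.println(encodedLengthLong(sample)); // prints 5
        System.out.println(encodedLengthInt(sample));  // prints 5

        // The wraparound the int variant depends on, shown without a huge array.
        int nearLimit = Integer.MAX_VALUE - 1;
        nearLimit += 2;                                // wraps to Integer.MIN_VALUE
        System.out.println(nearLimit < 0);             // prints true
    }
}
```

If a future change ever added 3 or more per step, the int variant could wrap
back into positive territory and go unnoticed, which is the maintainability
concern in point 3.
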
> test/jdk/java/lang/String/EncodedLengthUTF8Overflow.java line 111:
>
>> 109: }
>> 110: bigArray = null; // allow GC
>> 111:
>
> Have you considered simplifying the above code with just `bigString =
> String.valueOf('\u00ff').repeat(length)`?

Great suggestion! I've applied this simplification in the latest commit
(f0c2830e1c1). The `String.repeat()` approach is indeed much cleaner: it
eliminates the manual byte array allocation, `Arrays.fill()`, and the need for
explicit GC hints.

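As a small-scale illustration of the simplified construction, a sketch along
these lines shows why a string of U+00FF characters exercises the LATIN1 path
with two UTF-8 bytes per character. The class and helper names are
hypothetical, and the tiny length is chosen only so the example runs anywhere;
the real test presumably uses a length above Integer.MAX_VALUE/2, which needs
a correspondingly large heap.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not the code from EncodedLengthUTF8Overflow.java.
final class RepeatSketch {

    // Hypothetical helper: a string made of `length` copies of U+00FF.
    // Every character fits in Latin-1 but is non-ASCII.
    static String nonAsciiLatin1(int length) {
        return String.valueOf('\u00ff').repeat(length);
    }

    public static void main(String[] args) {
        String s = nonAsciiLatin1(8);
        System.out.println(s.length());                                // prints 8
        // U+00FF encodes as the two bytes C3 BF in UTF-8.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // prints 16
    }
}
```
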
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2951446437
PR Review Comment: https://git.openjdk.org/jdk/pull/30189#discussion_r2951446582