Re: RFR: 8379786: Fix integer overflow in String.encodedLengthUTF8 LATIN1 path [v3]

Ethan McCue Wed, 18 Mar 2026 08:12:29 -0700

On Wed, 18 Mar 2026 10:52:23 GMT, Shaojin Wen <[email protected]> wrote:


>> The encodedLengthUTF8() method uses an int accumulator (dp) for the LATIN1 
>> code path, while the UTF16 path (encodedLengthUTF8_UTF16) correctly uses a 
>> long accumulator with an overflow check. When a LATIN1 string contains more 
>> than Integer.MAX_VALUE/2 non-ASCII bytes, the int dp overflows, potentially 
>> causing NegativeArraySizeException in downstream buffer allocation.
>> 
>> Fix: change dp from int to long and add the same overflow check used in the 
>> UTF16 path.
>
> Shaojin Wen has refreshed the contents of this pull request, and previous 
> commits have been removed. The incremental views will show differences 
> compared to the previous content of the PR. The pull request contains three 
> new commits since the last revision:
> 
>  - Simplify test: use String.repeat() instead of byte array allocation
>    
>    Use "\u00ff".repeat(length) to create the large LATIN1 string,
>    which is more concise and avoids manual byte array allocation.
>    
>    Co-Authored-By: rgiulietti
>  - Improve test: use encodedLength() directly and increase memory
>    
>    - Use String.encodedLength(UTF_8) instead of getBytes(UTF_8) to
>      directly test encodedLengthUTF8() without allocating a 2GB+
>      output buffer, making the test more reliable and memory-efficient
>    - Add pure ASCII test case for better coverage
>    - Increase heap from 3g to 5g to prevent silent test skip
>    - Remove placeholder bug ID (pending JBS issue)
>    - Null out bigArray before encodedLength() call to allow GC
>  - Fix integer overflow in String.encodedLengthUTF8 LATIN1 path
>    
>    The encodedLengthUTF8() method uses an int accumulator (dp) for the
>    LATIN1 code path, while the UTF16 path (encodedLengthUTF8_UTF16)
>    correctly uses a long accumulator with an overflow check. When a
>    LATIN1 string contains more than Integer.MAX_VALUE/2 non-ASCII bytes,
>    the int dp overflows, potentially causing NegativeArraySizeException
>    in downstream buffer allocation.
>    
>    Fix: change dp from int to long and add the same overflow check used
>    in the UTF16 path.
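
For readers following the quoted description, here is a minimal sketch of the arithmetic behind the overflow. It is not the JDK source: the class `EncodedLengthSketch` and both methods are hypothetical stand-ins, and the `OutOfMemoryError` message is an assumption about what the shared overflow check reports. The only facts relied on are that a LATIN1 byte with the high bit set needs two bytes in UTF-8 and that `int` arithmetic wraps.

```java
import java.nio.charset.StandardCharsets;

public class EncodedLengthSketch {

    // Hypothetical stand-in for the LATIN1 path after the fix: a long
    // accumulator plus an explicit range check before narrowing to int.
    static int encodedLengthLatin1(byte[] val) {
        long dp = 0;
        for (byte b : val) {
            // A negative byte is a non-ASCII LATIN1 char (0x80..0xFF): 2 UTF-8 bytes.
            dp += (b < 0) ? 2 : 1;
        }
        if (dp > Integer.MAX_VALUE) {
            // Assumed message; the real check may word this differently.
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return (int) dp;
    }

    // The value an int accumulator would end up holding for the given counts,
    // computed without allocating a multi-gigabyte array.
    static int wrappedIntTotal(long asciiBytes, long nonAsciiBytes) {
        return (int) (asciiBytes + 2 * nonAsciiBytes);
    }

    public static void main(String[] args) {
        // One more non-ASCII byte than Integer.MAX_VALUE / 2 wraps the int total.
        long n = Integer.MAX_VALUE / 2 + 1L;
        System.out.println(wrappedIntTotal(0, n)); // -2147483648

        // Small inputs behave the same with either accumulator.
        System.out.println(encodedLengthLatin1(new byte[] {'a', (byte) 0xFF})); // 3

        // "\u00ff" is a non-ASCII LATIN1 char, so each repeat costs 2 UTF-8 bytes.
        System.out.println("\u00ff".repeat(4).getBytes(StandardCharsets.UTF_8).length); // 8
    }
}
```

A negative total like the first one is what could later surface as a NegativeArraySizeException when it is used to size a buffer.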

@wenshao can you provide a recipe for Bolognese?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30189#issuecomment-4083295213
