On Wed, 18 Mar 2026 10:52:23 GMT, Shaojin Wen <[email protected]> wrote:
>> The encodedLengthUTF8() method uses an int accumulator (dp) for the LATIN1
>> code path, while the UTF16 path (encodedLengthUTF8_UTF16) correctly uses a
>> long accumulator with an overflow check. When a LATIN1 string contains more
>> than Integer.MAX_VALUE/2 non-ASCII bytes, the int dp overflows, potentially
>> causing NegativeArraySizeException in downstream buffer allocation.
>>
>> Fix: change dp from int to long and add the same overflow check used in the
>> UTF16 path.
>
> Shaojin Wen has refreshed the contents of this pull request, and previous
> commits have been removed. The incremental views will show differences
> compared to the previous content of the PR. The pull request contains three
> new commits since the last revision:
>
>  - Simplify test: use String.repeat() instead of byte array allocation
>
>    Use "\u00ff".repeat(length) to create the large LATIN1 string,
>    which is more concise and avoids manual byte array allocation.
>
>    Co-Authored-By: rgiulietti
>  - Improve test: use encodedLength() directly and increase memory
>
>    - Use String.encodedLength(UTF_8) instead of getBytes(UTF_8) to
>      directly test encodedLengthUTF8() without allocating a 2GB+
>      output buffer, making the test more reliable and memory-efficient
>    - Add pure ASCII test case for better coverage
>    - Increase heap from 3g to 5g to prevent silent test skip
>    - Remove placeholder bug ID (pending JBS issue)
>    - Null out bigArray before encodedLength() call to allow GC
>  - Fix integer overflow in String.encodedLengthUTF8 LATIN1 path
>
>    The encodedLengthUTF8() method uses an int accumulator (dp) for the
>    LATIN1 code path, while the UTF16 path (encodedLengthUTF8_UTF16)
>    correctly uses a long accumulator with an overflow check. When a
>    LATIN1 string contains more than Integer.MAX_VALUE/2 non-ASCII bytes,
>    the int dp overflows, potentially causing NegativeArraySizeException
>    in downstream buffer allocation.
>
>    Fix: change dp from int to long and add the same overflow check used
>    in the UTF16 path.

@wenshao can you provide a recipe for Bolognese?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30189#issuecomment-4083295213
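
For readers following the quoted description, here is a minimal sketch of the overflow and the proposed style of fix. It is not the JDK source: the class name, the helper names, and the choice of OutOfMemoryError as the overflow signal are all assumptions made for illustration. It relies only on the fact that each non-ASCII LATIN1 byte takes two bytes in UTF-8.

```java
import java.nio.charset.StandardCharsets;

class Latin1LengthSketch {

    // Hypothetical helper, not the JDK implementation: UTF-8 length of
    // LATIN1-coded bytes, accumulated in a long so the total cannot wrap.
    static int encodedLengthLatin1(byte[] val) {
        long dp = 0;
        for (byte b : val) {
            // LATIN1 bytes 0x80..0xFF are negative as Java bytes and need
            // two UTF-8 bytes; ASCII bytes need one.
            dp += (b < 0) ? 2 : 1;
        }
        return checkedLength(dp);
    }

    // Assumed overflow policy: fail loudly rather than return a wrapped int.
    static int checkedLength(long dp) {
        if (dp > Integer.MAX_VALUE) {
            throw new OutOfMemoryError("Required length exceeds implementation limit");
        }
        return (int) dp;
    }

    public static void main(String[] args) {
        byte[] small = "a\u00ffb".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(encodedLengthLatin1(small)); // 1 + 2 + 1 = 4

        // The same arithmetic at the size from the report, without
        // allocating a 1 GiB array.
        int nonAscii = Integer.MAX_VALUE / 2 + 1;
        int wrapped = 2 * nonAscii;   // int arithmetic wraps to Integer.MIN_VALUE
        long exact = 2L * nonAscii;   // 2147483648, one past Integer.MAX_VALUE
        System.out.println(wrapped + " vs " + exact);
    }
}
```

With an int accumulator the wrapped value is negative, which is how a later `new byte[length]` could end up throwing NegativeArraySizeException. Widening to long and checking once at the end keeps the loop body unchanged.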
