On Tue, 27 Aug 2024 12:20:04 GMT, David Holmes <dhol...@openjdk.org> wrote:

>> I think the Java string would only need to be INT_MAX/3 in length, if all
>> the characters require surrogate encoding.
>
> IIUC for compact strings, with non-latin-1 each pair of bytes would require
> at most 3-bytes to encode so you'd need 2/3 of INT_MAX. With latin-1 it would
> be 1/2 INT_MAX. But yes I suppose in theory you might be able to get an
> overflow on 32-bit. Need to think more about what could even be done for
> this case ... and whether it is worth trying ...

SymbolTable does check the length and truncates with a warning (see https://github.com/openjdk/jdk/blob/0c332e9de919184d8a4678bfd7c274fcef02b3e2/src/hotspot/share/classfile/symbolTable.cpp#L351-L360), though it does not seem to check for values < 0. Maybe we should add that.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20560#discussion_r1732816650
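
To make the concern concrete, here is a minimal sketch of the two points raised in this thread: the worst-case UTF-8 length can exceed INT_MAX once a string has more than INT_MAX/3 chars, and a length check that only tests the upper bound lets a wrapped (negative) value through. This is not the HotSpot code. The function names and the 65535 limit used in the checks are hypothetical, chosen only for illustration, and the 3-bytes-per-char bound is the assumption stated in the quoted text.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper (not a HotSpot function): worst-case UTF-8 byte count
// for a string of `utf16_units` Java chars, assuming at most 3 bytes per
// char. The multiplication is done in 64 bits so the result cannot wrap.
int64_t worst_case_utf8_length(int32_t utf16_units) {
  return static_cast<int64_t>(utf16_units) * 3;
}

// Hypothetical guard (not a HotSpot function): clamp a length to `max_len`.
// A negative length is treated as "too long", since it can only come from an
// earlier int overflow. `truncated` reports whether clamping happened.
int checked_symbol_length(int len, int max_len, bool* truncated) {
  if (len < 0 || len > max_len) {
    *truncated = true;
    return max_len;
  }
  *truncated = false;
  return len;
}
```

With these definitions, `worst_case_utf8_length(INT_MAX / 3 + 1)` is already larger than INT_MAX, so storing it in an `int` would not preserve the value. The `len < 0` test in the guard is the extra check suggested above.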