On Tue, 3 Sep 2024 03:00:56 GMT, David Holmes <dhol...@openjdk.org> wrote:

>> This is the implementation of a new method added to the JNI specification.
>> 
>> From the CSR request:
>> 
>> The `GetStringUTFLength` function returns the length as a `jint` (`jsize`) 
>> value and so is limited to returning at most `Integer.MAX_VALUE`. But a Java 
>> string can itself consist of `Integer.MAX_VALUE` characters, each of which 
>> may require more than one byte to represent them in modified UTF-8 format.** 
>> It follows then that this function cannot return the correct answer for all 
>> String values and yet the specification makes no mention of this, nor of any 
>> possible error to report if this situation is encountered.
>> 
>> **The modified UTF-8 format used by the VM can require up to six bytes to 
>> represent one unicode character, but six byte characters are stored as UTF16 
>> surrogate pairs. Hence the most bytes per character is 3, and so the maximum 
>> length is 3*`Integer.MAX_VALUE`. With compact strings this reduces to 
>> 2*`Integer.MAX_VALUE`.
>> 
>> Solution
>> 
>> Deprecate the existing JNI `GetStringUTFLength` method noting that it may 
>> return a truncated length, and add a new method, JNI 
>> `GetStringUTFLengthAsLong` that returns the string length as a `jlong` value.
>> 
>> ---
>> 
>> We also add a truncation warning to `GetStringUTFLength` under -Xcheck:jni
>> 
>> There are some incidental whitespace changes in 
>> `src/hotspot/os/posix/dtrace/hotspot_jni.d` along with the new method 
>> entries.
>> 
>> Testing:
>>  - new test added
>>  - tiers 1-3 sanity
>> 
>> Thanks
>
> David Holmes has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   The JNI version update was incompete

Marked as reviewed by alanb (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/20784#pullrequestreview-2276469775

Reply via email to