On Thu, 20 Nov 2025 08:59:11 GMT, Liam Miller-Cushon <[email protected]> wrote:
>> This PR proposes adding a new overload to `MemorySegment::getString` that >> takes a known byte length of the content. >> >> This was previously proposed in https://github.com/openjdk/jdk/pull/20725, >> but the outcome of >> [JDK-8333843](https://bugs.openjdk.org/browse/JDK-8333843) was to update >> `MemorySegment#getString` to suggest >> >> >> byte[] bytes = new byte[length]; >> MemorySegment.copy(segment, JAVA_BYTE, offset, bytes, 0, length); >> return new String(bytes, charset); >> >> >> However this is less efficient than what the implementation of getString >> does after [JDK-8362893](https://bugs.openjdk.org/browse/JDK-8362893), it >> now uses `JavaLangAccess::uncheckedNewStringNoRepl` to avoid the copy. >> >> See also discussion in [this panama-dev@ >> thread](https://mail.openjdk.org/pipermail/panama-dev/2025-November/021193.html), >> and mcimadamore's document [Pulling the (foreign) >> string](https://cr.openjdk.org/~mcimadamore/panama/strings_ffm.html) >> >> Benchmark results: >> >> >> Benchmark (size) Mode Cnt Score Error >> Units >> ToJavaStringTest.jni_readString 5 avgt 30 55.339 ± 0.401 >> ns/op >> ToJavaStringTest.jni_readString 20 avgt 30 59.887 ± 0.295 >> ns/op >> ToJavaStringTest.jni_readString 100 avgt 30 84.288 ± 0.419 >> ns/op >> ToJavaStringTest.jni_readString 200 avgt 30 119.275 ± 0.496 >> ns/op >> ToJavaStringTest.jni_readString 451 avgt 30 193.106 ± 1.528 >> ns/op >> ToJavaStringTest.panama_copyLength 5 avgt 30 7.348 ± 0.048 >> ns/op >> ToJavaStringTest.panama_copyLength 20 avgt 30 7.440 ± 0.125 >> ns/op >> ToJavaStringTest.panama_copyLength 100 avgt 30 11.766 ± 0.058 >> ns/op >> ToJavaStringTest.panama_copyLength 200 avgt 30 16.096 ± 0.089 >> ns/op >> ToJavaStringTest.panama_copyLength 451 avgt 30 25.844 ± 0.054 >> ns/op >> ToJavaStringTest.panama_readString 5 avgt 30 5.857 ± 0.046 >> ns/op >> ToJavaStringTest.panama_readString 20 avgt 30 7.750 ± 0.046 >> ns/op >> ToJavaStringTest.panama_readString 100 avgt 30 14.109 ± 0.187 >> ns/op >> ToJavaStringTest.panama_readString 200 avgt 30 18.035 ± 0.130 >> ns/op >> ToJavaStringTest.panama_readString 451 avgt 30 35.896 ± 0.227 >> ns/op >> ToJavaStringTest.panama_readStringLength 5 avgt 30 4.565 ± 0.038 >> ns/op >> ToJavaStringTest.panama_readStringLength 20... > > Liam Miller-Cushon has updated the pull request incrementally with one > additional commit since the last revision: > > Review feedback Thanks for working on this! I've left some inline comments. src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1339: > 1337: * @param charset the charset used to {@linkplain > Charset#newDecoder() decode} the > 1338: * string bytes > 1339: * @param length length to be used for string conversion, in bytes Please keep the same format Suggestion: * @param length length in bytes of the string to read src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1341: > 1339: * @param length length to be used for string conversion, in bytes > 1340: * @return a Java string constructed from the bytes read from the > given starting > 1341: * address reading the given length of characters Suggestion: * address reading the given length of bytes src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2662: > 2660: * are {@code < 0} > 2661: * @throws IndexOutOfBoundsException if the {@code endIndex} is > larger than the length of > 2662: * this {@code String} object, or {@code beginIndex} is > larger than {@code endIndex}. There's no `beginIndex` and `endIndex`? src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 157: > 155: > 156: /** > 157: * Converts a Java string into a null-terminated C string using the > provided charset, Not null-terminated, right? src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 177: > 175: * @return a new native segment containing the converted C string > 176: * @throws IllegalArgumentException if {@code charset} is not a > 177: * {@linkplain StandardCharsets standard charset} Same here, I don't think we have this limitation. src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java line 180: > 178: * @throws IndexOutOfBoundsException if either {@code srcIndex} or > {@code numChars} are {@code < 0} > 179: * @throws IndexOutOfBoundsException if the {@code endIndex} is > larger than the length of > 180: * this {@code String} object, or {@code beginIndex} is > larger than {@code endIndex}. There is no 'endIndext'? src/java.base/share/classes/jdk/internal/foreign/StringSupport.java line 68: > 66: @ForceInline > 67: public static String readBytes(AbstractMemorySegmentImpl segment, > long offset, Charset charset, long length) { > 68: final int lengthBytes = (int) length; I think we should do something more here than just ignore the upper bits. We probably need to throw an exception when the value is > Integer.MAX_VALUE test/jdk/java/foreign/TestStringEncoding.java line 135: > 133: } > 134: } > 135: We need some more tests for the other new methods as well. Also, it would be nice to test non-standard charsets. ------------- PR Review: https://git.openjdk.org/jdk/pull/28043#pullrequestreview-3488572591 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546633357 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546635587 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546650214 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546610473 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546614368 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546616502 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546607537 PR Review Comment: https://git.openjdk.org/jdk/pull/28043#discussion_r2546628459
