On Fri, 29 Nov 2024 09:46:55 GMT, Per Minborg <[email protected]> wrote:
>> This PR proposes to rewrite the `StringSupport::chunkedStrlen*` methods,
>> fixing a bug in the `short_strlen` variant for odd offsets (`offset % 2 !=
>> 0`).
>>
>> This PR also improves performance on modern hardware, as there is no need
>> for pre-looping alignment. Removing this improves performance by about 30%
>> for larger strings.
>>
>> It passes the `jdk_foreign` test suit. Also, `tier1` through `tier3` passes
>> on various platforms and configurations.
>>
>> Base:
>>
>>
>> Benchmark (size) Mode Cnt Score Error
>> Units
>> InternalStrLen.changedElementQuad 1 avgt 30 2.057 ? 0.012
>> ns/op
>> InternalStrLen.changedElementQuad 4 avgt 30 3.776 ? 0.031
>> ns/op
>> InternalStrLen.changedElementQuad 16 avgt 30 6.690 ? 0.060
>> ns/op
>> InternalStrLen.changedElementQuad 251 avgt 30 48.581 ? 0.764
>> ns/op
>> InternalStrLen.changedElementQuad 1024 avgt 30 196.188 ? 3.484
>> ns/op
>> InternalStrLen.chunkedDouble 1 avgt 30 1.903 ? 0.013
>> ns/op
>> InternalStrLen.chunkedDouble 4 avgt 30 3.446 ? 0.025
>> ns/op
>> InternalStrLen.chunkedDouble 16 avgt 30 5.759 ? 0.062
>> ns/op
>> InternalStrLen.chunkedDouble 251 avgt 30 26.892 ? 0.141
>> ns/op
>> InternalStrLen.chunkedDouble 1024 avgt 30 72.940 ? 1.562
>> ns/op
>> InternalStrLen.chunkedSingle 1 avgt 30 1.897 ? 0.015
>> ns/op
>> InternalStrLen.chunkedSingle 4 avgt 30 5.357 ? 0.560
>> ns/op
>> InternalStrLen.chunkedSingle 16 avgt 30 3.821 ? 0.052
>> ns/op
>> InternalStrLen.chunkedSingle 251 avgt 30 19.482 ? 0.190
>> ns/op
>> InternalStrLen.chunkedSingle 1024 avgt 30 38.938 ? 0.411
>> ns/op
>> InternalStrLen.chunkedSingleMisaligned 1 avgt 30 2.230 ? 0.147
>> ns/op
>> InternalStrLen.chunkedSingleMisaligned 4 avgt 30 5.424 ? 0.688
>> ns/op
>> InternalStrLen.chunkedSingleMisaligned 16 avgt 30 9.573 ? 0.063
>> ns/op
>> InternalStrLen.chunkedSingleMisaligned 251 avgt 30 22.242 ? 0.182
>> ns/op
>> InternalStrLen.chunkedSingleMisaligned 1024 avgt 30 45.442 ? 0.252
>> ns/op
>> InternalStrLen.elementByteMisaligned 1 avgt 30 1.616 ? 0.041
>> ns/op
>> InternalStrLen.elementByteMisaligned 4 avgt 30 2.982 ? 0.018
>> ns/op
>> InternalStrLen.elementByteMisaligned 16 avgt 30 8.662 ? 0.085
>> ns/op
>> InternalStrLe...
>
> Per Minborg has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Fix imports
src/java.base/share/classes/jdk/internal/foreign/StringSupport.java line 187:
> 185: // Prevent over scanning as we step by 2
> 186: final long endScan = toOffset & ~1; // The last bit is zero
> 187: for (; offset < endScan; offset += Short.BYTES) {
Question: do we care about finding unaligned sequences of two consecutive zero
bytes? Because I don't think this method can detect those. E.g. if the first
zero byte in a terminator sequence starts at an odd offset I don't think we can
recognize the terminator.
This is a bit of a weird side-effect, because now this method no longer behaves
like a plain byte loop where we look for two consecutive bytes. But, I suppose,
in a way it also makes sense: if I want to read a C string whose characters are
2-bytes each, then I definitively need to scan the segment (at given offset)
two bytes at a time. A terminator at an "odd offset" from the start of the scan
is not good. I believe the javadoc is a bit "light" on how this terminator is
actually found :-)
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22451#discussion_r1863463824