On Wed, 9 Mar 2022 16:52:54 GMT, Claes Redestad <redes...@openjdk.org> wrote:

>> I'm requesting comments and, hopefully, some help with this patch to replace 
>> `StringCoding.hasNegatives` with `countPositives`. The new method does a 
>> very similar pass, but alters the intrinsic to return the number of leading 
>> bytes in the `byte[]` range which only has positive bytes. This allows for 
>> dealing much more efficiently with those `byte[]`s that has a ASCII prefix, 
>> with no measurable cost on ASCII-only or latin1/UTF16-mostly input.
>> 
>> Microbenchmark results: 
>> https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
>> 
>> - Only implemented on x86 for now, but I want to verify that implementations 
>> of `countPositives` can be implemented with similar efficiency on all 
>> platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) 
>> before moving ahead. This pretty much means holding up this until it's 
>> implemented on all platforms, which can either contributed to this PR or as 
>> dependent follow-ups.
>> 
>> - An alternative to holding up until all platforms are on board is to allow 
>> the implementation of `StringCoding.hasNegatives` and `countPositives` to be 
>> implemented so that the non-intrinsified method calls into the intrinsified. 
>> This requires structuring the implementations differently based on which 
>> intrinsic - if any - is actually implemented. One way to do this could be to 
>> mimic how `java.nio` handles unaligned accesses and expose which intrinsic 
>> is available via `Unsafe` into a `static final` field.
>> 
>> - There are a few minor regressions (~5%) in the x86 implementation on 
>> `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, 
>> for example `encode-/decodeShortMixed` even see a minor improvement, which 
>> makes me consider those corner case regressions with little real world 
>> implications (if you have latin1 Strings, you're likely to also have 
>> ASCII-only strings in your mix).
>
> Claes Redestad has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Restructure encodeUTF8 to reduce code gen issues

The regressions I observe on aarch64 in `encodeLatin1Short` and a few others 
are not in the intrinsic itself but due changes to the surrounding code. 
Reverting the changes to `String.encodeUTF8` removes the regressions (but also 
the improvements).  Seems some loop optimization is not taking place like it 
should - or just differently. Going back to check I see that x64 is also 
affected, meaning this is something that has come in when syncing up with 
master. 

I've experimented with adjusting the code to try and workaround and improve 
code gen, but with only partial success. I'll back out the changes to 
`String.encodeUTF8`, see if we can deal with the loop opt regression 
(separately) and return to do the `encodeUTF8` optimization later on.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7231

Reply via email to