On Fri, 11 Feb 2022 12:11:54 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> I'm requesting comments and, hopefully, some help with this patch to replace >> `StringCoding.hasNegatives` with `countPositives`. The new method does a >> very similar pass, but alters the intrinsic to return the number of leading >> bytes in the `byte[]` range which only has positive bytes. This allows for >> dealing much more efficiently with those `byte[]`s that has a ASCII prefix, >> with no measurable cost on ASCII-only or latin1/UTF16-mostly input. >> >> Microbenchmark results: >> https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904 >> >> - Only implemented on x86 for now, but I want to verify that implementations >> of `countPositives` can be implemented with similar efficiency on all >> platforms that today implement a `hasNegatives` intrinsic (aarch64, ppc etc) >> before moving ahead. This pretty much means holding up this until it's >> implemented on all platforms, which can either contributed to this PR or as >> dependent follow-ups. >> >> - An alternative to holding up until all platforms are on board is to allow >> the implementation of `StringCoding.hasNegatives` and `countPositives` to be >> implemented so that the non-intrinsified method calls into the intrinsified. >> This requires structuring the implementations differently based on which >> intrinsic - if any - is actually implemented. One way to do this could be to >> mimic how `java.nio` handles unaligned accesses and expose which intrinsic >> is available via `Unsafe` into a `static final` field. >> >> - There are a few minor regressions (~5%) in the x86 implementation on >> `encode-/decodeLatin1Short`. Those regressions disappear when mixing inputs, >> for example `encode-/decodeShortMixed` even see a minor improvement, which >> makes me consider those corner case regressions with little real world >> implications (if you have latin1 Strings, you're likely to also have >> ASCII-only strings in your mix). > > Claes Redestad has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 23 additional > commits since the last revision: > > - Merge branch 'master' into count_positives > - Restore partial vector checks in AVX2 and SSE intrinsic variants > - Let countPositives use hasNegatives to allow ports not implementing the > countPositives intrinsic to stay neutral > - Simplify changes to encodeUTF8 > - Fix little-endian error caught by testing > - Reduce jumps in the ascii path > - Remove unused tail_mask > - Remove has_negatives intrinsic on x86 (and hook up 32-bit x86 to use > count_positives) > - Add more comments, simplify tail branching in AVX512 variant > - Resolve issues in the precise implementation > - ... and 13 more: > https://git.openjdk.java.net/jdk/compare/811eb365...c4bb3612 Hi Claes, doing it for all platforms and cleaning it up sounds good. My PPC64 contribution is already tested and reviewed. I'll try to find a volunteer for s390 which uses exactly the same algorithm as PPC64. ------------- PR: https://git.openjdk.java.net/jdk/pull/7231