Re: RFR: 8303401: Add a Vector API equalsIgnoreCase micro benchmark [v2]
On Tue, 28 Feb 2023 22:19:46 GMT, Sandhya Viswanathan wrote:

>> Eirik Bjorsnos has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>>   Use GE, LE, NE operations instead of the lt,not combinations. Use
>>   uppercase in vectorized code to match the scalar version.
>
> test/micro/org/openjdk/bench/jdk/incubator/vector/EqualsIgnoreCaseBenchmark.java
> line 85:
>
>> 83:
>> 84:         // ASCII and Latin-1 were designed to optimize case-twiddling operations
>> 85:         ByteVector lowerA = va.or((byte) 0x20);
>
> Just curious, here you use lower whereas in scalar code upper is being used.
> Any reasons?

I can't remember the exact reason. Perhaps I thought I could get simpler range
checking of the Latin-1 code points because they are almost at the end of the
Latin-1 range.

I changed the vectorized version to use uppercase for better alignment with
the scalar version.

> test/micro/org/openjdk/bench/jdk/incubator/vector/EqualsIgnoreCaseBenchmark.java
> line 88:
>
>> 86:
>> 87:         // Determine which bytes represent ASCII or Latin-1 letters:
>> 88:         VectorMask asciiLetter = lowerA.lt((byte) '{').and(lowerA.lt((byte) 0x60).not());
>
> We do have GT/GE/NE etc comparison operators supported in Vector API, which
> you can use here and other places in this benchmark. e.g.
> You could do lowerA.compare(GE, (byte)0x60) instead of using lt() followed
> by not(). BTW did you mean to use GT here?
>
> You will need to do the following import:
> import static jdk.incubator.vector.VectorOperators.*;

Thanks, that's a lot more readable!

This checks that bytes are in the range >= (byte) 'A', <= (byte) 'Z', so I
think this "not less than" was a poor man's compare(GE, (byte) 'A') here.

I've updated the code to use LE, GE, NE where appropriate.

-------------

PR: https://git.openjdk.org/jdk/pull/12790
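
For readers following along, here is a minimal scalar sketch of the range check discussed above, written with plain `>=`, `<=` and `!=` where the vectorized code would use `compare(GE, ...)`, `compare(LE, ...)` and `compare(NE, ...)`. It is not the benchmark's actual code. The class and method names are hypothetical, and the Latin-1 bounds (0xC0 to 0xDE, excluding the multiplication sign 0xD7) are an assumption based on the Latin-1 code chart rather than something quoted in this thread.

```java
// Hypothetical illustration only; not taken from EqualsIgnoreCaseBenchmark.java.
class CaseInsensitiveByteSketch {

    // Clearing bit 5 (0x20) maps a lowercase ASCII or Latin-1 letter
    // to its uppercase counterpart. The result is in the range 0..0xDF.
    static int clearCaseBit(byte b) {
        return b & 0xDF;
    }

    // True if the case-folded value is an uppercase letter:
    // 'A'..'Z', or 0xC0..0xDE except 0xD7 (assumed Latin-1 letter range).
    static boolean isUpperLetter(int upper) {
        boolean ascii = upper >= 'A' && upper <= 'Z';                      // GE, LE
        boolean latin1 = upper >= 0xC0 && upper <= 0xDE && upper != 0xD7; // GE, LE, NE
        return ascii || latin1;
    }

    // Two bytes match if they are identical, or if they are letters
    // that differ only in the case bit.
    static boolean equalsIgnoreCase(byte a, byte b) {
        if (a == b) {
            return true;
        }
        int upperA = clearCaseBit(a);
        return isUpperLetter(upperA) && upperA == clearCaseBit(b);
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCase((byte) 'a', (byte) 'A'));   // letters differing in case
        System.out.println(equalsIgnoreCase((byte) '@', (byte) '`'));   // non-letters differing in bit 5
        System.out.println(equalsIgnoreCase((byte) 0xE9, (byte) 0xC9)); // Latin-1 e-acute, both cases
        System.out.println(equalsIgnoreCase((byte) 0xF7, (byte) 0xD7)); // division vs. multiplication sign
    }
}
```

The letter check is what keeps pairs such as '@' and '`' from comparing equal: they differ only in bit 5, but neither is a letter.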
Re: RFR: 8303401: Add a Vector API equalsIgnoreCase micro benchmark [v2]
> This PR suggests we add a vectorized equalsIgnoreCase benchmark to the set of
> benchmarks in `org.openjdk.bench.jdk.incubator.vector`. This benchmark serves
> as an example of how vectorization can be useful also in the area of text
> processing. It takes advantage of the fact that ASCII and Latin-1 were
> designed to optimize case-twiddling operations.
>
> The code came about during the work on #12632, where vectorization was deemed
> out of scope.
>
> Benchmark results:
>
> Benchmark                             (size)  Mode  Cnt     Score   Error  Units
> EqualsIgnoreCaseBenchmark.scalar          16  avgt   15    20.671 ± 0.718  ns/op
> EqualsIgnoreCaseBenchmark.scalar          32  avgt   15    46.155 ± 3.258  ns/op
> EqualsIgnoreCaseBenchmark.scalar          64  avgt   15    68.248 ± 1.767  ns/op
> EqualsIgnoreCaseBenchmark.scalar         128  avgt   15   148.948 ± 0.890  ns/op
> EqualsIgnoreCaseBenchmark.scalar        1024  avgt   15  1090.708 ± 7.540  ns/op
> EqualsIgnoreCaseBenchmark.vectorized      16  avgt   15    21.872 ± 0.232  ns/op
> EqualsIgnoreCaseBenchmark.vectorized      32  avgt   15    11.378 ± 0.097  ns/op
> EqualsIgnoreCaseBenchmark.vectorized      64  avgt   15    13.703 ± 0.135  ns/op
> EqualsIgnoreCaseBenchmark.vectorized     128  avgt   15    21.632 ± 0.735  ns/op
> EqualsIgnoreCaseBenchmark.vectorized    1024  avgt   15   105.509 ± 7.493  ns/op

Eirik Bjorsnos has updated the pull request incrementally with one additional
commit since the last revision:

  Use GE, LE, NE operations instead of the lt,not combinations. Use uppercase
  in vectorized code to match the scalar version.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12790/files
  - new: https://git.openjdk.org/jdk/pull/12790/files/c01a464d..d8c0c2ed

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12790&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12790&range=00-01

  Stats: 18 lines in 1 file changed: 7 ins; 4 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/12790.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12790/head:pull/12790

PR: https://git.openjdk.org/jdk/pull/12790