Re: RFR: 8303401: Add a Vector API equalsIgnoreCase micro benchmark [v6]

Eirik Bjorsnos Wed, 01 Mar 2023 01:33:01 -0800

On Wed, 1 Mar 2023 09:10:47 GMT, Eirik Bjorsnos wrote:


>> This PR suggests we add a vectorized equalsIgnoreCase benchmark to the set
>> of benchmarks in `org.openjdk.bench.jdk.incubator.vector`. This benchmark
>> serves as an example of how vectorization can also be useful in text
>> processing. It takes advantage of the fact that ASCII and Latin-1 were
>> designed to optimize case-twiddling operations.
>> 
>> The code came about during the work on #12632, where vectorization was 
>> deemed out of scope.
>> 
>> Benchmark results:
>> 
>> 
>> Benchmark                             (size)  Mode  Cnt     Score   Error  Units
>> EqualsIgnoreCaseBenchmark.scalar          16  avgt   15    20.671 ± 0.718  ns/op
>> EqualsIgnoreCaseBenchmark.scalar          32  avgt   15    46.155 ± 3.258  ns/op
>> EqualsIgnoreCaseBenchmark.scalar          64  avgt   15    68.248 ± 1.767  ns/op
>> EqualsIgnoreCaseBenchmark.scalar         128  avgt   15   148.948 ± 0.890  ns/op
>> EqualsIgnoreCaseBenchmark.scalar        1024  avgt   15  1090.708 ± 7.540  ns/op
>> EqualsIgnoreCaseBenchmark.vectorized      16  avgt   15    21.872 ± 0.232  ns/op
>> EqualsIgnoreCaseBenchmark.vectorized      32  avgt   15    11.378 ± 0.097  ns/op
>> EqualsIgnoreCaseBenchmark.vectorized      64  avgt   15    13.703 ± 0.135  ns/op
>> EqualsIgnoreCaseBenchmark.vectorized     128  avgt   15    21.632 ± 0.735  ns/op
>> EqualsIgnoreCaseBenchmark.vectorized    1024  avgt   15   105.509 ± 7.493  ns/op
>
> Eirik Bjorsnos has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   The equal.allTrue check early in the loop does not cover cases where some
> bytes are equal, but not all. Reverting this change.

While using the compare method with the LE/GE/NE operators allows for cleaner
code, it also seems to come with a significant performance penalty (roughly 21%
slower at size 1024 in the runs below).

Is this to be expected?

Before, using lt, not:


Benchmark                             (size)  Mode  Cnt   Score   Error  Units
EqualsIgnoreCaseBenchmark.vectorized    1024  avgt   15  98.903 ± 1.508  ns/op


After, using compare with LE, GE, NE:


Benchmark                             (size)  Mode  Cnt    Score   Error  Units
EqualsIgnoreCaseBenchmark.vectorized    1024  avgt   15  119.723 ± 2.903  ns/op


The lt, not version:


// Determine which bytes represent ASCII or Latin-1 letters:
VectorMask<Byte> asciiLetter = upperA.lt((byte) '[') // <= 'Z'
        .and(upperA.lt((byte) 'A').not()); // >= 'A'

VectorMask<Byte> lat1Letter = upperA
        .lt((byte) 0xDF)  // <= Thorn
        .and(upperA.lt((byte) 0xC0).not()) // >= A-grave
        .and(upperA.eq((byte) 0xD7).not()); // Excluding multiplication
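
One detail worth spelling out for the snippet above: byte lanes are compared as
signed values, so (byte) 0xDF is -33 rather than 223. The range checks still
hold because both Latin-1 bounds are negative and signed order matches unsigned
order within the 0x80..0xFF half. The following is only a scalar sketch of that
reasoning: the class and method names are hypothetical, and plain Java
comparisons stand in for the vector lt/not calls.

```java
public class SignedRangeSketch {
    // Scalar stand-in for v.lt(c): a signed byte comparison.
    static boolean lt(byte v, byte c) {
        return v < c;
    }

    // "lt, not" style range check: v < hiExclusive and !(v < lo).
    static boolean inRangeLtNot(byte v, byte lo, byte hiExclusive) {
        return lt(v, hiExclusive) && !lt(v, lo);
    }

    // Reference check using the unsigned value of the byte.
    static boolean inRangeUnsigned(byte v, int lo, int hiExclusive) {
        int u = v & 0xFF;
        return u >= lo && u < hiExclusive;
    }

    // True if the signed and unsigned checks agree for all 256 byte values.
    static boolean agreeForAllBytes(int lo, int hiExclusive) {
        for (int i = 0; i < 256; i++) {
            byte v = (byte) i;
            if (inRangeLtNot(v, (byte) lo, (byte) hiExclusive)
                    != inRangeUnsigned(v, lo, hiExclusive)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeForAllBytes('A', '['));   // true: 'A'..'Z'
        System.out.println(agreeForAllBytes(0xC0, 0xDF)); // true: A-grave..Thorn
        System.out.println(agreeForAllBytes(0x70, 0x90)); // false: straddles the sign bit
    }
}
```

The last call shows the limit of the trick: a range that crosses 0x7F/0x80
cannot be expressed with two signed comparisons joined by and.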


And the LE, GE, NE version:


// Determine which bytes represent ASCII or Latin-1 letters:
VectorMask<Byte> asciiLetter = upperA.compare(GE, (byte) 'A') // >= 'A'
        .and(upperA.compare(LE, (byte) 'Z')); // <= 'Z'

VectorMask<Byte> lat1Letter = upperA.compare(GE, (byte) 0xC0) // >= A-grave
        .and(upperA.compare(LE, (byte) 0xDE))  // <= Thorn
        .and(upperA.compare(NE, (byte) 0xD7)); // Excluding multiplication
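
For reference, here is a scalar sketch of what both mask computations are meant
to select. It assumes that upperA holds the input bytes with bit 0x20 cleared
(that definition is not shown in this mail), and the class and method names are
hypothetical. It is not the benchmark's scalar implementation, only a model for
checking the bounds.

```java
import java.nio.charset.StandardCharsets;

public class Latin1CaseSketch {
    // Clearing bit 0x20 maps a lowercase ASCII/Latin-1 letter to its uppercase
    // form. The result is a non-negative int, so plain comparisons are safe.
    static int clearCaseBit(byte b) {
        return b & 0xDF;
    }

    // True if the byte is a letter whose two cases differ only in bit 0x20.
    static boolean isCasedLetter(byte b) {
        int upper = clearCaseBit(b);
        boolean asciiLetter = upper >= 'A' && upper <= 'Z';
        boolean lat1Letter = upper >= 0xC0 && upper <= 0xDE // A-grave..Thorn
                && upper != 0xD7;                           // excluding multiplication
        return asciiLetter || lat1Letter;
    }

    static boolean equalsIgnoreCase(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                continue;
            }
            // Differing bytes match only if they are the same letter in
            // different case.
            if (clearCaseBit(a[i]) != clearCaseBit(b[i]) || !isCasedLetter(a[i])) {
                return false;
            }
        }
        return true;
    }

    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // e-acute vs E-acute differ only in bit 0x20
        System.out.println(equalsIgnoreCase(latin1("H\u00E9llo"), latin1("h\u00C9LLO"))); // true
        // '@' and '`' also differ only in bit 0x20, but are not letters
        System.out.println(equalsIgnoreCase(latin1("@"), latin1("`")));                   // false
        // multiplication vs division sign
        System.out.println(equalsIgnoreCase(latin1("\u00D7"), latin1("\u00F7")));         // false
    }
}
```

The '@' and multiplication-sign cases are the ones that exact bounds protect
against: both have a partner byte that differs only in bit 0x20.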

-------------

PR: https://git.openjdk.org/jdk/pull/12790
