On Mon, 17 Jun 2024 16:38:55 GMT, Volodymyr Paprotski <d...@openjdk.org> wrote:

>> This fix recovers XDH performance but gives back some of the P256 gains
>> (roughly 8-14%). P256 is still faster, just by a smaller margin.
>> 
>> The fix reverts the 'int' return type on mult()/square(). Without it, these
>> methods may return a partially reduced result (e.g. this avoids extra
>> reductions when a mult() result is fed into an addition). This restores the
>> behaviour from before the Montgomery ECC PR.
>> 
>> ---
>> XDH.generateSecret performance 
>> before Montgomery PR:
>> 
>> Benchmark                             (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> KeyAgreementBench.XDH.generateSecret          XDH          255             XDH              thrpt    3  8435.277 ± 27.230  ops/s
>> 
>> after Montgomery PR:
>> 
>> Benchmark                             (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> KeyAgreementBench.XDH.generateSecret          XDH          255             XDH              thrpt    3  8309.028 ± 22.071  ops/s
>> 
>> with this PR:
>> 
>> Benchmark                             (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> KeyAgreementBench.XDH.generateSecret          XDH          255             XDH              thrpt    3  8491.268 ± 32.858  ops/s
>> 
>> ---
>> 
>> P256 performance with/without mult intrinsic:
>> 
>> Performance before Montgomery PR:
>> 
>> Benchmark                        (algorithm)  (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6398.727 ±  7.400  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6129.739 ±  5.995  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1889.928 ± 54.660  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1866.339 ± 42.438  ops/s
>> 
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1350.745 ± 28.514  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1349.393 ± 32.050  ops/s
>> 
>> Performance in master without mult() intrinsic
>> 
>> Benchmark                        ...
>
> Volodymyr Paprotski has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   comment from Sandhya

Approved with review by @ferakocz
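
For readers following the quoted description, the point about partially reduced results can be illustrated with a toy sketch. This is not the JDK's field arithmetic: the class name, the single-word modulus 2^19 - 1, and the helpers fold(), multPartial(), eager() and lazy() are all hypothetical, chosen only to show why a mult() that is allowed to return a partially reduced value saves work when its result goes straight into an addition.

```java
import java.math.BigInteger;

// Toy illustration only; the names and the modulus are assumptions,
// not the JDK's multi-limb field implementation.
final class LazyReductionSketch {
    static final int BITS = 19;
    // Toy modulus 2^19 - 1 (a Mersenne prime).
    static final long P = (1L << BITS) - 1;

    // Full reduction into the canonical range [0, P).
    static long reduce(long x) {
        return Math.floorMod(x, P);
    }

    // Partial reduction: a single fold, valid because 2^19 is congruent to 1 mod P.
    // The result is congruent to x mod P but may still be >= P.
    static long fold(long x) {
        return (x & P) + (x >>> BITS);
    }

    // A mult() that is allowed to return a partially reduced value.
    static long multPartial(long a, long b) {
        return fold(a * b);
    }

    // Eager variant: every intermediate value is fully reduced (three reductions).
    static long eager(long a, long b, long c, long d) {
        long ab = reduce(a * b);
        long cd = reduce(c * d);
        return reduce(ab + cd);
    }

    // Lazy variant: products stay partially reduced, the addition accepts them,
    // and a single full reduction happens at the end.
    static long lazy(long a, long b, long c, long d) {
        long sum = multPartial(a, b) + multPartial(c, d);
        return reduce(sum);
    }

    public static void main(String[] args) {
        long a = 123456, b = 234567, c = 345678, d = 456789;
        // Reference value computed with BigInteger.
        long expected = BigInteger.valueOf(a).multiply(BigInteger.valueOf(b))
                .add(BigInteger.valueOf(c).multiply(BigInteger.valueOf(d)))
                .mod(BigInteger.valueOf(P))
                .longValue();
        System.out.println("eager matches reference: " + (eager(a, b, c, d) == expected));
        System.out.println("lazy matches reference:  " + (lazy(a, b, c, d) == expected));
    }
}
```

Both variants agree with the BigInteger reference; the lazy one simply defers the full reduction until the value is actually needed in canonical form.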

-------------

Marked as reviewed by ascarpino (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19728#pullrequestreview-2139971558
