On Fri, 14 Jun 2024 20:23:04 GMT, Volodymyr Paprotski <d...@openjdk.org> wrote:
> This fix recovers XDH performance but removes some of the P256 gains
> (~-8-14%). Still faster, but not as much.
>
> The fix is to undo 'int' return type on mult()/square(), which allowed to
> return partially reduced result (i.e. this avoids extra reductions when
> mult() result is fed into addition). This is the behaviour before the
> Montgomery ECC PR.
>
> I have a slightly better mult() intrinsic that does reduction at the end, but
> decided to use a more conservative fix and just keep the reduction in Java
> (i.e. original mult() refactored into multImpl() and reducePositive()) Will
> commit these optimizations I discovered while working on this in next release.
>
> ---
>
> Performance before Montgomery PR:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6398.727 ±   7.400  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6129.739 ±   5.995  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1889.928 ±  54.660  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1866.339 ±  42.438  ops/s
>
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score     Error  Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1350.745 ±  28.514  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1349.393 ±  32.050  ops/s
>
> Benchmark                             (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score     Error  Units
> KeyAgreementBench.XDH.generateSecret          XDH          255             XDH              thrpt    3  8435.277 ±  27.230  ops/s
>
> Performance in master without mult() intrinsic
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6539.589 ± 132.844  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6202.530 ± 124.496  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1967.0...

@ascarpino Would you mind reviewing this again please? Mostly java you reviewed before.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19728#issuecomment-2168728473