Module Name:    src
Committed By:   riastradh
Date:           Wed Mar 29 13:07:46 UTC 2023

Modified Files:
        src/crypto/external/bsd/openssl/dist/crypto/bn/asm: x86_64-gcc.c

Log Message:
openssl: Remove local micro-optimization on AMD (but not Intel).

Upstream OpenSSL changed

        loop    1b

to

        dec     %rcx
        jnz     1b

which has mostly the same semantics, in this change:

https://github.com/openssl/openssl/pull/4743

For some reason, in one of the OpenSSL updates, we ended up with a
local change to revert this.

The Intel and AMD optimization guides are silent on the LOOP
instruction, but Agner Fog's tables show that while LOOP is one
cycle shorter than DEC;JNZ on AMD Zen microarchitectures, it is a
good half dozen cycles longer than DEC;JNZ on recent Intel
microarchitectures.

The history of the OpenSSL change suggests it was intended, and I
can't find any indication other than `merge conflicts' that we
intended to keep the LOOP version.  So let's reduce the local diff by
nixing it.

To generate a diff of this commit:
cvs rdiff -u -r1.11 -r1.12 \
    src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c
diff -u src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c:1.11 src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c:1.12
--- src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c:1.11	Sun Mar 22 00:53:03 2020
+++ src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gcc.c	Wed Mar 29 13:07:46 2023
@@ -219,9 +219,10 @@ BN_ULONG bn_add_words(BN_ULONG *rp, cons
                   "       adcq    (%5,%2,8),%0    \n"
                   "       movq    %0,(%3,%2,8)    \n"
                   "       lea     1(%2),%2        \n"
-                  "       loop    1b              \n"
-                  "       sbbq    %0,%0           \n":"=&r" (ret), "+c"(n),
-                  "+r"(i)
+                  "       dec     %1              \n"
+                  "       jnz     1b              \n"
+                  "       sbbq    %0,%0           \n"
+                  :"=&r" (ret), "+c"(n), "+r"(i)
                   :"r"(rp), "r"(ap), "r"(bp)
                   :"cc", "memory");
 
@@ -245,9 +246,10 @@ BN_ULONG bn_sub_words(BN_ULONG *rp, cons
                   "       sbbq    (%5,%2,8),%0    \n"
                   "       movq    %0,(%3,%2,8)    \n"
                   "       lea     1(%2),%2        \n"
-                  "       loop    1b              \n"
-                  "       sbbq    %0,%0           \n":"=&r" (ret), "+c"(n),
-                  "+r"(i)
+                  "       dec     %1              \n"
+                  "       jnz     1b              \n"
+                  "       sbbq    %0,%0           \n"
+                  :"=&r" (ret), "+c"(n), "+r"(i)
                   :"r"(rp), "r"(ap), "r"(bp)
                   :"cc", "memory");
 
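
For readers less familiar with the flag handling involved, the following is a rough portable C sketch of what the add loop in the diff above computes. It is not code from the tree: the name ref_add_words and the use of uint64_t in place of BN_ULONG are assumptions made for illustration. The point it illustrates is that the carry chain survives the loop counter update, because LOOP modifies no flags and DEC modifies the other arithmetic flags but leaves CF alone, so the next adcq still sees the carry from the previous word.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical portable reference for the add loop (not from the tree).
 * Adds n words of ap and bp into rp and returns the carry out (0 or 1).
 * The variable "carry" stands in for the CPU carry flag, CF.
 */
uint64_t
ref_add_words(uint64_t *rp, const uint64_t *ap, const uint64_t *bp, size_t n)
{
	uint64_t carry = 0;	/* the asm clears CF before entering the loop */
	size_t i = 0;

	if (n == 0)
		return 0;

	do {
		/* movq + adcq: ap[i] + bp[i] + incoming carry */
		uint64_t t = ap[i] + carry;
		uint64_t c1 = (t < carry);	/* wrapped adding the carry */
		uint64_t s = t + bp[i];
		uint64_t c2 = (s < t);		/* wrapped adding bp[i] */

		rp[i] = s;			/* movq %0,(%3,%2,8) */
		carry = c1 | c2;		/* at most one of them is set */
		i++;				/* lea 1(%2),%2: touches no flags */
	} while (--n != 0);			/* dec %1; jnz 1b: CF untouched */

	/* sbbq %0,%0 turns CF into 0 or all-ones; this sketch returns 0 or 1 */
	return carry;
}
```

The count-down do/while mirrors the shape of both instruction sequences: the body runs at least once and the counter is tested after the decrement. The same structure applies to the subtract loop with a borrow in place of the carry.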