Hi, I've had a look at the latest gcd_11 asm, and it's really neat, including naturally getting %rdx zero on return.
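For reference, here is a rough C model of the loop as I read it. It is only a sketch of the usual binary scheme for odd inputs, not a transcription of the asm. The names gcd_11_model, ctz64 and d_out are made up for illustration, with d standing in for %rdx:

```c
#include <assert.h>
#include <stdint.h>

/* Portable trailing-zero count. Like bsf, it is only meaningful for
   x != 0, so the precondition is asserted. */
static unsigned ctz64(uint64_t x)
{
    unsigned n = 0;
    assert(x != 0);
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical model of a gcd_11-style loop; u and v must both be odd.
   d plays the role of %rdx, and *d_out receives its value at exit. */
static uint64_t gcd_11_model(uint64_t u, uint64_t v, uint64_t *d_out)
{
    uint64_t d = v - u;              /* difference, modulo 2^64 */

    while (d != 0) {
        unsigned c = ctz64(d);       /* only reached with d != 0 */

        if (u < v) {                 /* u = |u - v|, v = min(u, v) */
            uint64_t t = u;
            u = v - u;
            v = t;
        } else {
            u = u - v;
        }
        u >>= c;                     /* strip the factors of two; u is odd again */
        d = v - u;
    }

    *d_out = d;                      /* the loop can only exit with d == 0 */
    return v;
}
```

In this model the exit condition is exactly d == 0, so the zero on return comes for free, and the trailing-zero count is never taken on a zero input.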
One question: the bd2 and bd4 versions use

  L(top): rep;bsf %rdx, %rcx    C tzcnt!

I've not seen this before, but a quick web search indicates that tzcnt is the same as bsf, except that it has a well defined result also when the input is zero. But in these loops, we should get to this instruction only for non-zero %rdx. So are there any other subtleties?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.