ni...@lysator.liu.se (Niels Möller) writes:

> I might convert the former, but I am tempted to simply delete the k6
> gcd_1.asm.

The k6 code uses a branch on the u - v sign? Might be slower than the
current C code.

What hardware is it used for?

https://en.wikipedia.org/wiki/AMD_K6
https://en.wikipedia.org/wiki/AMD_K6-2
https://en.wikipedia.org/wiki/AMD_K6-III

I.e., not the latest.

I also see that there's no x86/gcd_1.asm. We might want to add a _11
there, perhaps using the C algorithm. (That would be used also for k6,
if anybody cares.)

Actually, the trick of keeping the lsb implicit might not be needed for
x86 (or other ISAs with a carry flag). A subtract-with-carry x,x,x
yields the needed mask.

The default x86 code cannot rely on bsf/bsr as these are awfully slow on
many older systems. We need the table trick used in x86/k7/gcd_11.asm,
and other places.

Below a gcd_11 unit test. Not terribly interesting, but we'll need
something similar for testing gcd_22. And we get a place to add tests
for any problematic corner cases.

Nice! Assuming the test code has been tested, please push!

-- 
Torbjörn
Please encrypt, key id 0xC8601622
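
For reference, the two ideas mentioned above (a mask derived from the
borrow of u - v instead of a branch, and a table lookup in place of bsf)
can be sketched in plain C roughly as follows. This is only an
illustration under stated assumptions, not the code in the GMP
repository: the names gcd_11_sketch, ctz_table and ctz_tab are
hypothetical, the word size is fixed at 64 bits, and the 16-entry table
is an arbitrary choice for the sketch. C has no direct access to the
carry flag, so the mask is written as a negated comparison, which a
compiler may or may not turn into a cmp/sbb pair.

```c
#include <stdint.h>

/* ctz_tab[i] = number of trailing zero bits of i, for 1 <= i < 16.
   Entry 0 is never read by the lookup loop below.  (Hypothetical table,
   sized for illustration only.) */
static const unsigned char ctz_tab[16] = {
  0, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0
};

/* Count trailing zeros of x, x != 0, without using bsf. */
static unsigned
ctz_table (uint64_t x)
{
  unsigned c = 0;
  while ((x & 15) == 0)         /* low nibble empty: skip 4 bits */
    {
      x >>= 4;
      c += 4;
    }
  return c + ctz_tab[x & 15];
}

/* gcd of two odd 64-bit words, branch-free inside the loop body
   apart from the table lookup loop. */
static uint64_t
gcd_11_sketch (uint64_t u, uint64_t v)
{
  while (u != v)
    {
      uint64_t t = u - v;                   /* even and nonzero */
      uint64_t mask = -(uint64_t) (u < v);  /* all ones iff u < v,
                                               i.e. iff u - v borrows */
      v += mask & t;                        /* v <- min (u, v) */
      u = (t ^ mask) - mask;                /* u <- |u - v| */
      u >>= ctz_table (u);                  /* make u odd again */
    }
  return u;
}
```

Since the mask comes from the comparison (the borrow) rather than from
the high bit of t, the operands are kept in full here, with the lsb
explicit.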