ni...@lysator.liu.se (Niels Möller) writes:

  > I went ahead and committed that version, replacing the old
  > HGCD2_METHOD=2. I expect it to be the fastest method on some platform.

  Will be interesting to see results on thresholds.

Nobody loves the new METHOD 2.  :-(
(Not many machines reported results, but some machines that I expected
might prefer METHOD 2 did report.)

We might as well switch the default to METHOD 3 or 1.

  > (We might want to arrange for longlong.h to use lzcnt instead of bsr for
  > modern AMD processors; the initial two count_leading_zeros would
  > terminate in one cycle instead of 8 thereby!)

  Looks like you did that too.

Yes, and caused some widespread breakage with the mulx change I also
committed.  (A fix is ready.)

But at least one failure, with ivyfbsd64v12, is related to tuneup.c.

  I've now tried the similar #if:ed out div2 code, and enabling it gives
  an 8% speedup on my laptop.

Nice!

  Next, I think we should go ahead with the rename HGCD2_METHOD to
  DIV11_METHOD or possibly HGCD2_DIV1_METHOD.

Feel free.

-- 
Torbjörn
Please encrypt, key id 0xC8601622
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
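
For readers following the lzcnt-versus-bsr remark: on x86, bsr reports the bit index of the most significant set bit (for a nonzero operand), while lzcnt reports the number of leading zeros directly, so a count_leading_zeros built on bsr needs one extra step. The portable C sketch below only illustrates that relationship. It is not the longlong.h code, it uses no inline assembly, and the names bsr_index, clz_via_bsr and clz_direct are hypothetical helpers made up for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit index of the highest set bit, i.e. the value
   x86 bsr would report for a nonzero 64-bit operand. */
static unsigned bsr_index(uint64_t x)
{
    unsigned i = 0;
    assert(x != 0);            /* bsr leaves its result undefined for 0 */
    while (x >>= 1)
        i++;
    return i;
}

/* Leading-zero count derived from a bsr-style index.  For an index in
   0..63, 63 ^ index equals 63 - index. */
static unsigned clz_via_bsr(uint64_t x)
{
    return 63u ^ bsr_index(x);
}

/* Leading-zero count computed directly, which is the value lzcnt would
   report for a nonzero operand. */
static unsigned clz_direct(uint64_t x)
{
    unsigned n = 0;
    assert(x != 0);
    while (!(x & (UINT64_C(1) << 63))) {
        x <<= 1;
        n++;
    }
    return n;
}
```

Both routes agree for every nonzero input; for example, an operand with only bit 15 set has bsr index 15 and 48 leading zeros. Which instruction is faster on a given processor is a separate question that this sketch does not address.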