ni...@lysator.liu.se (Niels Möller) writes: > I believe METHOD 1 could almost always be pointless if METHOD 3's > straight line code is made really efficient.
I'd guess so too. But benchmarks are better than faith alone ;-) I have a lot of faith in benchmarking! :-) > That should in particular be true if the q = 1 code is removed from > hgcd2, for machines with good multiplication throughput. Can we remove that for everything but HGCD2_METHOD == 2? It would be nice to not have an explosion of cases. Do you really mean HGCD2_METHOD 2? I'd say that only _METHOD 1 is poor at q = 1, while _METHOD 2 and 3 handle it well. Or perhaps you reason along the lines of "if a CPU is good at small quotient division, it ought to be good at multiplication, therefore _METHOD 1 does not need q = 1 code which principally avoids some limb multiplications"? Perhaps there is some confusion about which method is chosen for each HGCD2_METHOD value? You designated the original method number 2, your plain / based one number 1. I added lead texts to https://gmplib.org/devel/thres/HGCD2_METHOD.html for clarity. We call the normalizing div2 only when q > bh, which implies bh^2 < ah. Then bh can't be normalized. When debugging (on 64-bit), I only saw this code entered with size of bh exactly 32 bits, ah exactly 64 bits. But I think it might be called with bh even smaller. A comment stating that cnt = 0 is not possible (or an ASSERT!). > Should be revive the commented-out div2? It looks good to me, the > comment about that it is slower than the branchy div2 is probably from > the era where branchy div1/div2 was great. I haven't had the time to benchmark it, but I wouldn't be surprised if it runs better than the other div2. I suggest that we rename HGCD2_METHOD to DIV_11_METHOD. Agree? Than we might want DIV_22_METHOD too. The thing is, we don't want q = 2 and one number reduced to a zero. We want to terminate with a matrix corresponding to both a and b roughly half a limb (full limb, if it weren't for the truncation when going from double precision to single precision), and |a - b| smaller. [snip] Pretty subtle! -- Torbjörn Please encrypt, key id 0xC8601622 _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel