Zimmermann Paul <paul.zimmerm...@loria.fr> writes: on January 25 I reported efficiency issues with unbalanced multiplication. Here I report similar issues with unbalanced division. Consider the program below. I get on a 2Ghz Intel Xeon E7540 with GMP 5.1.0: barbecue% ./bug2 2000001 1000001 mpz_tdiv_q(2000001,1000001) took 2244ms barbecue% ./bug2 1698971 698971 mpz_tdiv_q(1698971,698971) took 3144ms The first run computes the 10^6-limb quotient of the division of 2000001 limbs by 1000001 limbs. The second run computes the 10^6-limb quotient of the division of 1698971 limbs by 698971 limbs. The second run should be faster (or equal in speed) to the first one, since one can multiply both numerator and divisor by B^(1000001-698971) where B=2^64, then perform a 2000001/1000001 division. Instead the second run is 40% slower... This is more alarming than the rather unalarming multiply performance fluctuations. I must admit that it is actually unsurprising.
I put some diagrams at gmplib.org/devel/ that highlights the problem over a large range. The 2nd plot covers your test case but for an AMD Phenom 2. The 2st plot is for 10 times smaller operands, which demonstrates the main problem but gives better smoothness. I suspect some of the non-smoothness of the 2nd plot is due to limited FFT tables. As soon as the dividend size is twice the divisor size one expect (average) complexity to be constant for a given quotient size. The diagrams indicate that the present code do behave ideally in that range. (I measured much longer, but truncated these plot to highlight the problematic ranges.) The reason for this to be unsurprising is that the "MU" code didn't get much thought about strategy decisions. I expect it to quite simple to greatly improve the situation with some analysis. To get the curves perfectly smooth might require more work. -- Torbjörn _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel