Marco,

> Date: Wed, 6 Feb 2013 17:59:44 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao Paul!

Ciao!!!

> Of course. With current implementation, unbalanced multiplications need
> some more memory and a few additions/subtractions, but this should not
> give a measurable slow-down. The "matrix" obtained with
>
> $ tune/speed -s 400000-800000 -t 100000 mpn_mul.800000 mpn_mul.900000
> mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
>
> shows that times are not as monotonic as desired, but "unbalancement" does
> not really have an influence.

indeed:

frite% ./speed -s 400000-800000 -t 100000 mpn_mul.800000 mpn_mul.900000 mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
overhead 0.000000002 secs, precision 10000 units of 3.33e-10 secs, CPU freq 3000.00 MHz
        mpn_mul.800000  mpn_mul.900000 mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
400000    #0.460029000     0.472029000     0.564035000     0.572035000     0.712044000
500000    #0.476029000     0.560035000     0.548034000     0.692043000     0.708044000
600000    #0.572036000     0.576036000     0.704044000     0.680042000     0.696044000
700000    #0.556035000     0.688043000     0.672042000     0.676042000     0.724046000
800000     0.712045000     0.700044000    #0.668042000     0.688043000     0.772048000

> I think that the culprit is the tune/speed program, but I'm not able to
> correct it. I just tested the attached patch.
> After patching, the results are:
>
> $ tune/speed -s 800000-1000000 -t 100000 mpn_mul_n mpn_mul mpn_mul_bal
> overhead 0.000000000 secs, precision 10000 units of 3.13e-11 secs, CPU freq 31990.26 MHz
>              mpn_mul_n       mpn_mul   mpn_mul_bal
> 800000     0.646571000   0.682834000  #0.632961000
> 900000    #0.652178000   0.678564000   0.655979000
> 1000000    0.710674000   0.740998000  #0.702508000

I can reproduce this on GMP 5.1.0 with your patch:

frite% ./speed -s 800000-1000000 -t 100000 mpn_mul_n mpn_mul mpn_mul_bal
overhead 0.000000002 secs, precision 10000 units of 3.33e-10 secs, CPU freq 3000.00 MHz
             mpn_mul_n       mpn_mul   mpn_mul_bal
800000     0.668041000   0.716045000  #0.664041000
900000    #0.652040000   0.696043000   0.656041000
1000000    0.724045000   0.748046000  #0.720045000

> As you can see mpn_mul_n and mpn_mul_bal are comparable, and mpn_mul is
> always slower. All the three functions measure the time to multiply two
> numbers of the same size. mpn_mul_bal and mpn_mul measure the same
> function, but the first uses the same macro that tune/speed uses for
> mpn_mul_n...

if the culprit is the macro used in speed, it should be fixed!

Paul
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel