Hi all, I've been doing some timings of the new divide and conquer style division with remainder code I wrote.
I did some timings of a generic build (C only) of MPIR-2.6.0 with the new C code I wrote (BSDNT). Remarkably, the new code I have written displays quite a few interesting features in comparison. The timings below are for a 2*size - 1 by size division. I call the MPIR low level functions directly. MPIR-2.6.0: size = 10: classical = 1e-06s, divconquer = 1.5e-06s size = 11: classical = 1.5e-06s, divconquer = 1.5e-06s size = 13: classical = 1.5e-06s, divconquer = 2e-06s size = 15: classical = 2.5e-06s, divconquer = 2.5e-06s size = 17: classical = 3e-06s, divconquer = 3e-06s size = 19: classical = 3.5e-06s, divconquer = 4e-06s size = 21: classical = 4e-06s, divconquer = 4.5e-06s size = 24: classical = 5.5e-06s, divconquer = 5.5e-06s size = 27: classical = 7e-06s, divconquer = 7e-06s size = 30: classical = 8.5e-06s, divconquer = 8.5e-06s size = 33: classical = 1e-05s, divconquer = 1e-05s size = 37: classical = 1.25e-05s, divconquer = 1.25e-05s size = 41: classical = 1.5e-05s, divconquer = 1.5e-05s size = 46: classical = 1.9e-05s, divconquer = 1.9e-05s size = 51: classical = 2.3e-05s, divconquer = 2.3e-05s size = 57: classical = 2.8e-05s, divconquer = 2.9e-05s size = 63: classical = 3.45e-05s, divconquer = 3.45e-05s size = 70: classical = 4.25e-05s, divconquer = 4.25e-05s size = 77: classical = 5.1e-05s, divconquer = 5.15e-05s size = 85: classical = 6.2e-05s, divconquer = 6.2e-05s size = 94: classical = 7.6e-05s, divconquer = 7.55e-05s size = 104: classical = 8.1e-05s, divconquer = 9.25e-05s size = 115: classical = 9.95e-05s, divconquer = 0.0001125s size = 127: classical = 0.0001125s, divconquer = 0.0001375s size = 140: classical = 0.000135s, divconquer = 0.000167s size = 154: classical = 0.000162s, divconquer = 0.000201s size = 170: classical = 0.000195s, divconquer = 0.0002445s size = 188: classical = 0.0002375s, divconquer = 0.0002985s size = 207: classical = 0.0002655s, divconquer = 0.000362s size = 228: classical = 0.000318s, divconquer = 0.0004385s size = 251: classical = 0.0003595s, divconquer = 0.000531s size = 277: classical = 0.000414s, divconquer = 0.0006465s size = 305: classical = 0.0004935s, divconquer = 0.0007825s size = 336: classical = 0.0005905s, divconquer = 0.000949s size = 370: classical = 0.000691s, divconquer = 0.0011505s size = 408: classical = 0.00077s, divconquer = 0.001397s size = 449: classical = 0.0009165s, divconquer = 0.001691s size = 494: classical = 0.0011015s, divconquer = 0.002046s BSDNT: size = 10: classical = 2e-06s, divconquer = 1.5e-06s size = 11: classical = 2e-06s, divconquer = 2e-06s size = 13: classical = 1.5e-06s, divconquer = 2.5e-06s size = 15: classical = 2.5e-06s, divconquer = 2.5e-06s size = 17: classical = 4e-06s, divconquer = 3.5e-06s size = 19: classical = 4.5e-06s, divconquer = 4e-06s size = 21: classical = 5.5e-06s, divconquer = 5e-06s size = 24: classical = 6.5e-06s, divconquer = 6e-06s size = 27: classical = 6e-06s, divconquer = 6e-06s size = 30: classical = 9.5e-06s, divconquer = 7.5e-06s size = 33: classical = 8e-06s, divconquer = 7.5e-06s size = 37: classical = 1.35e-05s, divconquer = 1.1e-05s size = 41: classical = 1.5e-05s, divconquer = 1.05e-05s size = 46: classical = 1.45e-05s, divconquer = 1.25e-05s size = 51: classical = 2.15e-05s, divconquer = 1.6e-05s size = 57: classical = 2.3e-05s, divconquer = 1.7e-05s size = 63: classical = 2.2e-05s, divconquer = 1.9e-05s size = 70: classical = 3.45e-05s, divconquer = 2.45e-05s size = 77: classical = 4.5e-05s, divconquer = 2.9e-05s size = 85: classical = 3.95e-05s, divconquer = 3.1e-05s size = 94: classical = 6.15e-05s, divconquer = 3.85e-05s size = 104: classical = 6.35e-05s, divconquer = 4.4e-05s size = 115: classical = 7.05e-05s, divconquer = 5.1e-05s size = 127: classical = 0.000114s, divconquer = 6.6e-05s size = 140: classical = 0.0001365s, divconquer = 7.85e-05s size = 154: classical = 0.000112s, divconquer = 8.2e-05s size = 170: classical = 0.000198s, divconquer = 0.000101s size = 188: classical = 0.000203s, divconquer = 0.0001165s size = 207: classical = 0.0002885s, divconquer = 0.0001415s size = 228: classical = 0.000354s, divconquer = 0.000167s size = 251: classical = 0.0003345s, divconquer = 0.00019s size = 277: classical = 0.0005065s, divconquer = 0.0002325s size = 305: classical = 0.000671s, divconquer = 0.000272s size = 336: classical = 0.000739s, divconquer = 0.000305s size = 370: classical = 0.0008925s, divconquer = 0.0003525s size = 408: classical = 0.0006595s, divconquer = 0.0004095s size = 449: classical = 0.0013055s, divconquer = 0.000495s size = 494: classical = 0.001575s, divconquer = 0.000583s Well, remarkably the MPIR divconquer code is actually slower than the MPIR schoolbook (classical) code after about 100 limbs! Secondly, the crossover between classical and divconquer is about 30 limbs for my new code and at that point the classical code is about the same speed in both MPIR and BSDNT. The BSDNT classical code keeps pace with MPIR fairly well all the way up (the slight difference would almost entirely be due to only using 1 limb when computing a precomputed inverse). However, the BSDNTdivconquer code is nearly 4 times faster than the MPIR divconquer code! This is very surprising to me. If I compile MPIR with assembly support, of course it becomes many times faster. Remarkably however, BSDNT is only 2 times slower than MPIR at 494 limbs, even though it has no assembly optimisation, no toom4 multiplication and only uses a single limb of the divisor for computing a precomputed inverse instead of two limbs. I think with these and assembly optimised mulmid basecase, it could end up being two to four times faster than MPIR, or more. The next thing I intend to implement is toom63_mulmid which might speed things up even more, although it has been noted that this is an especially difficult algorithm to implement and to optimise, so one does not have all that much hope for this algorithm. Bill. -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to mpir-devel+unsubscr...@googlegroups.com. To post to this group, send email to mpir-devel@googlegroups.com. Visit this group at http://groups.google.com/group/mpir-devel?hl=en. For more options, visit https://groups.google.com/groups/opt_out.