Hi all,

I've been doing some timings of the new divide and conquer style
division with remainder code I wrote.

I did some timings of a generic build (C only) of MPIR-2.6.0 with the
new C code I wrote (BSDNT). Remarkably, the new code I have written
displays quite a few interesting features in comparison.

The timings below are for a 2*size - 1 by size division. I call the
MPIR low level functions directly.

MPIR-2.6.0:

size = 10: classical = 1e-06s, divconquer = 1.5e-06s
size = 11: classical = 1.5e-06s, divconquer = 1.5e-06s
size = 13: classical = 1.5e-06s, divconquer = 2e-06s
size = 15: classical = 2.5e-06s, divconquer = 2.5e-06s
size = 17: classical = 3e-06s, divconquer = 3e-06s
size = 19: classical = 3.5e-06s, divconquer = 4e-06s
size = 21: classical = 4e-06s, divconquer = 4.5e-06s
size = 24: classical = 5.5e-06s, divconquer = 5.5e-06s
size = 27: classical = 7e-06s, divconquer = 7e-06s
size = 30: classical = 8.5e-06s, divconquer = 8.5e-06s
size = 33: classical = 1e-05s, divconquer = 1e-05s
size = 37: classical = 1.25e-05s, divconquer = 1.25e-05s
size = 41: classical = 1.5e-05s, divconquer = 1.5e-05s
size = 46: classical = 1.9e-05s, divconquer = 1.9e-05s
size = 51: classical = 2.3e-05s, divconquer = 2.3e-05s
size = 57: classical = 2.8e-05s, divconquer = 2.9e-05s
size = 63: classical = 3.45e-05s, divconquer = 3.45e-05s
size = 70: classical = 4.25e-05s, divconquer = 4.25e-05s
size = 77: classical = 5.1e-05s, divconquer = 5.15e-05s
size = 85: classical = 6.2e-05s, divconquer = 6.2e-05s
size = 94: classical = 7.6e-05s, divconquer = 7.55e-05s
size = 104: classical = 8.1e-05s, divconquer = 9.25e-05s
size = 115: classical = 9.95e-05s, divconquer = 0.0001125s
size = 127: classical = 0.0001125s, divconquer = 0.0001375s
size = 140: classical = 0.000135s, divconquer = 0.000167s
size = 154: classical = 0.000162s, divconquer = 0.000201s
size = 170: classical = 0.000195s, divconquer = 0.0002445s
size = 188: classical = 0.0002375s, divconquer = 0.0002985s
size = 207: classical = 0.0002655s, divconquer = 0.000362s
size = 228: classical = 0.000318s, divconquer = 0.0004385s
size = 251: classical = 0.0003595s, divconquer = 0.000531s
size = 277: classical = 0.000414s, divconquer = 0.0006465s
size = 305: classical = 0.0004935s, divconquer = 0.0007825s
size = 336: classical = 0.0005905s, divconquer = 0.000949s
size = 370: classical = 0.000691s, divconquer = 0.0011505s
size = 408: classical = 0.00077s, divconquer = 0.001397s
size = 449: classical = 0.0009165s, divconquer = 0.001691s
size = 494: classical = 0.0011015s, divconquer = 0.002046s

BSDNT:

size = 10: classical = 2e-06s, divconquer = 1.5e-06s
size = 11: classical = 2e-06s, divconquer = 2e-06s
size = 13: classical = 1.5e-06s, divconquer = 2.5e-06s
size = 15: classical = 2.5e-06s, divconquer = 2.5e-06s
size = 17: classical = 4e-06s, divconquer = 3.5e-06s
size = 19: classical = 4.5e-06s, divconquer = 4e-06s
size = 21: classical = 5.5e-06s, divconquer = 5e-06s
size = 24: classical = 6.5e-06s, divconquer = 6e-06s
size = 27: classical = 6e-06s, divconquer = 6e-06s
size = 30: classical = 9.5e-06s, divconquer = 7.5e-06s
size = 33: classical = 8e-06s, divconquer = 7.5e-06s
size = 37: classical = 1.35e-05s, divconquer = 1.1e-05s
size = 41: classical = 1.5e-05s, divconquer = 1.05e-05s
size = 46: classical = 1.45e-05s, divconquer = 1.25e-05s
size = 51: classical = 2.15e-05s, divconquer = 1.6e-05s
size = 57: classical = 2.3e-05s, divconquer = 1.7e-05s
size = 63: classical = 2.2e-05s, divconquer = 1.9e-05s
size = 70: classical = 3.45e-05s, divconquer = 2.45e-05s
size = 77: classical = 4.5e-05s, divconquer = 2.9e-05s
size = 85: classical = 3.95e-05s, divconquer = 3.1e-05s
size = 94: classical = 6.15e-05s, divconquer = 3.85e-05s
size = 104: classical = 6.35e-05s, divconquer = 4.4e-05s
size = 115: classical = 7.05e-05s, divconquer = 5.1e-05s
size = 127: classical = 0.000114s, divconquer = 6.6e-05s
size = 140: classical = 0.0001365s, divconquer = 7.85e-05s
size = 154: classical = 0.000112s, divconquer = 8.2e-05s
size = 170: classical = 0.000198s, divconquer = 0.000101s
size = 188: classical = 0.000203s, divconquer = 0.0001165s
size = 207: classical = 0.0002885s, divconquer = 0.0001415s
size = 228: classical = 0.000354s, divconquer = 0.000167s
size = 251: classical = 0.0003345s, divconquer = 0.00019s
size = 277: classical = 0.0005065s, divconquer = 0.0002325s
size = 305: classical = 0.000671s, divconquer = 0.000272s
size = 336: classical = 0.000739s, divconquer = 0.000305s
size = 370: classical = 0.0008925s, divconquer = 0.0003525s
size = 408: classical = 0.0006595s, divconquer = 0.0004095s
size = 449: classical = 0.0013055s, divconquer = 0.000495s
size = 494: classical = 0.001575s, divconquer = 0.000583s

Well, remarkably the MPIR divconquer code is actually slower than the
MPIR schoolbook (classical) code after about 100 limbs!

Secondly, the crossover between classical and divconquer is about 30
limbs for my new code and at that point the classical code is about
the same speed in both MPIR and BSDNT.

The BSDNT classical code keeps pace with MPIR fairly well all the way
up (the slight difference would almost entirely be due to only using 1
limb when computing a precomputed inverse). However, the
BSDNTdivconquer code is nearly 4 times faster than the MPIR divconquer
code! This is very surprising to me.

If I compile MPIR with assembly support, of course it becomes many
times faster. Remarkably however, BSDNT is only 2 times slower than
MPIR at 494 limbs, even though it has no assembly optimisation, no
toom4 multiplication and only uses a single limb of the divisor for
computing a precomputed inverse instead of two limbs. I think with
these and assembly optimised mulmid basecase, it could end up being
two to four times faster than MPIR, or more.

The next thing I intend to implement is toom63_mulmid which might
speed things up even more, although it has been noted that this is an
especially difficult algorithm to implement and to optimise, so one
does not have all that much hope for this algorithm.

Bill.

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mpir-devel+unsubscr...@googlegroups.com.
To post to this group, send email to mpir-devel@googlegroups.com.
Visit this group at http://groups.google.com/group/mpir-devel?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to