Brilliant!!

2010/1/10 Case Vanhorsen <cas...@gmail.com>:
> On Sun, Jan 10, 2010 at 2:02 PM, Bill Hart <goodwillh...@googlemail.com>
> wrote:
>> Thanks Case,
>>
>> those are very useful timings. I see more of what I expected to see
>> for unbalanced multiplication, especially here:
>>
>> 5NxN mpz multiplication:   MPIR 1.3.0         GMP 5.0.0
>>        1000 digits:    0.00002064 sec    0.00001863 sec
>>        5000 digits:    0.00023417 sec    0.00021708 sec
>>       10000 digits:    0.00064239 sec    0.00058681 sec
>>       50000 digits:    0.00608666 sec    0.00436574 sec
>>
>> You also clearly see the asymptotically fast division code in GMP here:
>>
>> 17N/N mpz quotient:        MPIR 1.3.0         GMP 5.0.0
>>      500000 digits:    1.24625611 sec    0.44447249 sec
>>     1000000 digits:    2.99730897 sec    1.01921391 sec
>>    10000000 digits:   55.43707895 sec   15.77324915 sec
>>   100000000 digits:  915.73236585 sec  200.43964195 sec
>>
>> The squaring is not a surprise. This is due to the mpn_sqr_basecase
>> function not being optimised in assembly for the Core 2. It's not bad,
>> but for the last couple of releases we have suffered on that one.
>>
>> I think our strategy to continue working on the multiplication code,
>> from karatsuba right through FFT, then completely redo division
>> (pretty much from scratch, apart from the little we've already
>> touched) is the right one. Of course we also still need to optimise
>> that one file in the gcd/xgcd code and cube root.
>>
>> I see our next_prime is now way faster, but did the two programs agree
>> on what the 10,000,000,000th prime actually was? I suppose that they
>> should.
> They actually agree to the 100,000,000,000th prime and to various
> references on the Internet.
>>
>> Anyhow, I am now about to try and get an rc5 up for testing. There's
>> nothing I can change at present to help out Gianrico, but perhaps we
>> can encourage Jason Moxham to look at the assembly code for p6 when he
>> comes back after his break, perhaps towards the end of March.
>>
>> I would like to see if Gianrico has found an FFT bug though. I'd like
>> to fix that. I think SkyNet/cicero might be similar enough to trigger
>> the assertion he encountered, if we have his tuning values. There are
>> still plenty of people using such machines.
>>
>> Bill.
>>
>> 2010/1/10 Case Vanhorsen <cas...@gmail.com>:
>>> On Sun, Jan 10, 2010 at 1:18 PM, Bill Hart <goodwillh...@googlemail.com>
>>> wrote:
>>>> You are of course welcome to choose whichever package best meets your
>>>> needs. And indeed on your particular system, it seems GMP may well do
>>>> that for you at present.
>>>>
>>>> One thing you should bear in mind however. Here are some times as they
>>>> have changed over the past year and a half:
>>>>
>>>> K8
>>>>
>>>> Multiplication:        GMP 4.3.0  MPIR 1.2.0  MPIR 1.1.2  MPIR 1.0.0  MPIR 0.9.0   GMP 4.2.1
>>>> ====================  ==========  ==========  ==========  ==========  ==========  ==========
>>>> 128 x 128 :             52766506    53794646    51802252    35856598    37299412    25896136
>>>> 512 x 512 :             10879150    12488043    11802334    10928085     8122452     6383542
>>>> 8192 x 8192 :             114927      117404      111772      111641       86301       60819
>>>> 131072 x 131072 :           1757        2062        1873        1650        1165         885
>>>> 2097152 x 2097152 :         52.5        63.4        44.1        44.1        36.8        30.1
>>>>
>>>> So as you can see, the times have changed *MUCH* more for *both*
>>>> projects than the current difference between them. In fact
>>>> multiplication speed (the most important speed by far) has nearly
>>>> doubled in the past year, right across the board. I think with GMP 5
>>>> and MPIR 1.3 it really has doubled.
>>>>
>>>> So it is the improvement *over time* which is the important thing.
>>>> You'll also note that the projects have leapfrogged each other, MPIR
>>>> 0.9.0 beating GMP 4.2.1, GMP 4.3.0 beating MPIR 1.1.2, MPIR 1.2.0
>>>> beating GMP 4.3.0 and so on. So it does depend at what time you do the
>>>> comparison whether one or the other is better.
>>>>
>>>> Also, if you look at the times Case provided, what you said about only
>>>> multiplication above 100000 bits being faster is not really true.
>>>> There are other places where MPIR beats GMP, even on that system. Also
>>>> Case's benchmark only tests certain functionality. The full benchmark
>>>> that we were running earlier shows plenty of other improvements of
>>>> MPIR over GMP and is intended to give a much better overall guide.
>>>> Case is showing us benchmarks that he is personally very interested
>>>> in, and so that will be important for us to look at improving.
>>>
>>> I selected some of the tests specifically because they were improved
>>> in GMP (especially division) or MPIR (next_prime). It is not a
>>> general purpose benchmark. It was intended to highlight differences.
>>>
>>> BTW, the platform is a 64-bit Core2 running Linux. I'm guessing the
>>> results would be radically different on 64-bit Windows.
>>>
>>> casevh
>>>>
>>>> Some of the program benchmarks that we have in our full benchmark
>>>> suite tell a completely different story, putting MPIR well ahead for
>>>> those sorts of things. They show that in an overall program, we do
>>>> quite well.
>>>>
>>>> As I said, it is a mixed bag. Neither is showing clear superiority at
>>>> this point. However, I will accept that on your 32 bit system, the
>>>> assembly code is better optimised in GMP. That is definitely something
>>>> we should look at improving further.
>>>>
>>>> Of course that is not completely trivial to do though. You are welcome
>>>> to give it a go. I believe you will very quickly find that just about
>>>> everything you try will make it slower. The assembly optimisation has
>>>> got to such an art these days it cannot be done by hand. We have
>>>> special optimisation tools for doing it, and it takes large amounts of
>>>> CPU time, and human hours, to do the optimisation work. Progressively,
>>>> over time, all the code will get optimised, but it is a long process!
>>>>
>>>> Bill.
>>>>
>>>> 2010/1/10 Gianrico Fini <gianrico.f...@gmail.com>:
>>>>> It seems that also on your platform (32 bits for you too?) MPIR is faster
>>>>> only for one thing: multiplication (or squaring) above 100000 digits,
>>>>> up to 30%.
>>>>> And slower almost everywhere... somewhere +100% or more...
>>>>>
>>>>> This strengthens my decision...
>>>>>
>>>>> Gian.
>>>>>
>>>>> On 10 Jan, 18:47, Case Vanhorsen <cas...@gmail.com> wrote:
>>>>>> I'll toss in my benchmark results. :-)
>>>>>>
>>>>>> GMPY performance benchmark
>>>>>>
>>>>>> Decimal string to mpz:     MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000021 sec    0.00000022 sec
>>>>>>         100 digits:    0.00000063 sec    0.00000066 sec
>>>>>>         500 digits:    0.00000318 sec    0.00000302 sec
>>>>>>        1000 digits:    0.00000716 sec    0.00000693 sec
>>>>>>        5000 digits:    0.00008661 sec    0.00006298 sec
>>>>>>       10000 digits:    0.00026616 sec    0.00016775 sec
>>>>>>       50000 digits:    0.00265514 sec    0.00168555 sec
>>>>>>      100000 digits:    0.00651324 sec    0.00444604 sec
>>>>>>      500000 digits:    0.04866513 sec    0.03830050 sec
>>>>>>     1000000 digits:    0.11429363 sec    0.09162606 sec
>>>>>>    10000000 digits:    2.31600404 sec    1.59257817 sec
>>>>>>
>>>>>> Mpz to decimal string:     MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000034 sec    0.00000035 sec
>>>>>>         100 digits:    0.00000105 sec    0.00000101 sec
>>>>>>         500 digits:    0.00000717 sec    0.00000589 sec
>>>>>>        1000 digits:    0.00001586 sec    0.00001262 sec
>>>>>>        5000 digits:    0.00014800 sec    0.00010783 sec
>>>>>>       10000 digits:    0.00041150 sec    0.00029588 sec
>>>>>>       50000 digits:    0.00420932 sec    0.00338085 sec
>>>>>>      100000 digits:    0.01185473 sec    0.00920948 sec
>>>>>>      500000 digits:    0.12125288 sec    0.08355007 sec
>>>>>>     1000000 digits:    0.31727976 sec    0.20738387 sec
>>>>>>    10000000 digits:    7.70821309 sec    3.94376493 sec
>>>>>>
>>>>>> Mpz addition:              MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000010 sec    0.00000009 sec
>>>>>>         100 digits:    0.00000010 sec    0.00000010 sec
>>>>>>         500 digits:    0.00000012 sec    0.00000011 sec
>>>>>>        1000 digits:    0.00000014 sec    0.00000013 sec
>>>>>>        5000 digits:    0.00000051 sec    0.00000050 sec
>>>>>>       10000 digits:    0.00000073 sec    0.00000073 sec
>>>>>>       50000 digits:    0.00000430 sec    0.00000429 sec
>>>>>>      100000 digits:    0.00000822 sec    0.00000818 sec
>>>>>>      500000 digits:    0.00003971 sec    0.00003959 sec
>>>>>>     1000000 digits:    0.00007838 sec    0.00007884 sec
>>>>>>    10000000 digits:    0.00357354 sec    0.00354370 sec
>>>>>>   100000000 digits:    0.05413541 sec    0.05324940 sec
>>>>>>
>>>>>> 1NxN mpz multiplication:   MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000009 sec    0.00000009 sec
>>>>>>         100 digits:    0.00000017 sec    0.00000018 sec
>>>>>>         500 digits:    0.00000124 sec    0.00000126 sec
>>>>>>        1000 digits:    0.00000414 sec    0.00000378 sec
>>>>>>        5000 digits:    0.00004730 sec    0.00004805 sec
>>>>>>       10000 digits:    0.00012850 sec    0.00012088 sec
>>>>>>       50000 digits:    0.00123085 sec    0.00109137 sec
>>>>>>      100000 digits:    0.00290135 sec    0.00280582 sec
>>>>>>      500000 digits:    0.01663006 sec    0.01763764 sec
>>>>>>     1000000 digits:    0.03379822 sec    0.03994881 sec
>>>>>>    10000000 digits:    0.68572044 sec    0.61115754 sec
>>>>>>   100000000 digits:    6.44622898 sec    7.93841791 sec
>>>>>>
>>>>>> 5NxN mpz multiplication:   MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000011 sec    0.00000010 sec
>>>>>>         100 digits:    0.00000038 sec    0.00000040 sec
>>>>>>         500 digits:    0.00000604 sec    0.00000652 sec
>>>>>>        1000 digits:    0.00002064 sec    0.00001863 sec
>>>>>>        5000 digits:    0.00023417 sec    0.00021708 sec
>>>>>>       10000 digits:    0.00064239 sec    0.00058681 sec
>>>>>>       50000 digits:    0.00608666 sec    0.00436574 sec
>>>>>>      100000 digits:    0.00847080 sec    0.00917852 sec
>>>>>>      500000 digits:    0.05356821 sec    0.06811212 sec
>>>>>>     1000000 digits:    0.12863311 sec    0.14648414 sec
>>>>>>    10000000 digits:    2.27829909 sec    2.17810798 sec
>>>>>>   100000000 digits:   21.30186605 sec   27.38823199 sec
>>>>>>
>>>>>> 17NxN mpz multiplication:  MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000010 sec    0.00000011 sec
>>>>>>         100 digits:    0.00000113 sec    0.00000108 sec
>>>>>>         500 digits:    0.00002057 sec    0.00002183 sec
>>>>>>        1000 digits:    0.00007094 sec    0.00006423 sec
>>>>>>        5000 digits:    0.00081254 sec    0.00071725 sec
>>>>>>       10000 digits:    0.00217992 sec    0.00197989 sec
>>>>>>       50000 digits:    0.02072028 sec    0.01620061 sec
>>>>>>      100000 digits:    0.02676870 sec    0.03553003 sec
>>>>>>      500000 digits:    0.20828125 sec    0.23191699 sec
>>>>>>     1000000 digits:    0.42618978 sec    0.52746260 sec
>>>>>>    10000000 digits:    5.84609008 sec    7.77125812 sec
>>>>>>   100000000 digits:   74.05822110 sec  100.53587508 sec
>>>>>>
>>>>>> 2N/N mpz quotient:         MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000018 sec    0.00000018 sec
>>>>>>         100 digits:    0.00000041 sec    0.00000037 sec
>>>>>>         500 digits:    0.00000234 sec    0.00000203 sec
>>>>>>        1000 digits:    0.00000729 sec    0.00000638 sec
>>>>>>        5000 digits:    0.00009662 sec    0.00009747 sec
>>>>>>       10000 digits:    0.00029030 sec    0.00029359 sec
>>>>>>       50000 digits:    0.00329851 sec    0.00279975 sec
>>>>>>      100000 digits:    0.00912671 sec    0.00663861 sec
>>>>>>      500000 digits:    0.07756643 sec    0.04376046 sec
>>>>>>     1000000 digits:    0.18805614 sec    0.10166769 sec
>>>>>>    10000000 digits:    3.46835899 sec    1.65955496 sec
>>>>>>   100000000 digits:   57.28032804 sec   21.36209702 sec
>>>>>>
>>>>>> 5N/N mpz quotient:         MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000021 sec    0.00000020 sec
>>>>>>         100 digits:    0.00000095 sec    0.00000085 sec
>>>>>>         500 digits:    0.00000846 sec    0.00000747 sec
>>>>>>        1000 digits:    0.00002843 sec    0.00002508 sec
>>>>>>        5000 digits:    0.00038293 sec    0.00038840 sec
>>>>>>       10000 digits:    0.00115942 sec    0.00117106 sec
>>>>>>       50000 digits:    0.01321486 sec    0.00858406 sec
>>>>>>      100000 digits:    0.03642362 sec    0.02081330 sec
>>>>>>      500000 digits:    0.31158978 sec    0.13223937 sec
>>>>>>     1000000 digits:    0.75152898 sec    0.30322999 sec
>>>>>>    10000000 digits:   13.88208699 sec    4.75602698 sec
>>>>>>   100000000 digits:  228.71033311 sec   60.81353498 sec
>>>>>>
>>>>>> 17N/N mpz quotient:        MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000028 sec    0.00000026 sec
>>>>>>         100 digits:    0.00000343 sec    0.00000287 sec
>>>>>>         500 digits:    0.00003360 sec    0.00002929 sec
>>>>>>        1000 digits:    0.00011155 sec    0.00010037 sec
>>>>>>        5000 digits:    0.00153300 sec    0.00132398 sec
>>>>>>       10000 digits:    0.00462759 sec    0.00346040 sec
>>>>>>       50000 digits:    0.05275750 sec    0.02910585 sec
>>>>>>      100000 digits:    0.14589587 sec    0.07089074 sec
>>>>>>      500000 digits:    1.24625611 sec    0.44447249 sec
>>>>>>     1000000 digits:    2.99730897 sec    1.01921391 sec
>>>>>>    10000000 digits:   55.43707895 sec   15.77324915 sec
>>>>>>   100000000 digits:  915.73236585 sec  200.43964195 sec
>>>>>>
>>>>>> 2N/N mpz quot & rem:       MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000026 sec    0.00000025 sec
>>>>>>         100 digits:    0.00000049 sec    0.00000045 sec
>>>>>>         500 digits:    0.00000242 sec    0.00000207 sec
>>>>>>        1000 digits:    0.00000738 sec    0.00000648 sec
>>>>>>        5000 digits:    0.00009728 sec    0.00009805 sec
>>>>>>       10000 digits:    0.00029146 sec    0.00029408 sec
>>>>>>       50000 digits:    0.00330097 sec    0.00279113 sec
>>>>>>      100000 digits:    0.00913051 sec    0.00663646 sec
>>>>>>      500000 digits:    0.07772918 sec    0.04375675 sec
>>>>>>     1000000 digits:    0.18767852 sec    0.10160725 sec
>>>>>>    10000000 digits:    3.47579503 sec    1.65892482 sec
>>>>>>   100000000 digits:   57.20885682 sec   21.33134699 sec
>>>>>>
>>>>>> Mpz squaring:              MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000015 sec    0.00000013 sec
>>>>>>         100 digits:    0.00000025 sec    0.00000019 sec
>>>>>>         500 digits:    0.00000102 sec    0.00000090 sec
>>>>>>        1000 digits:    0.00000288 sec    0.00000267 sec
>>>>>>        5000 digits:    0.00003559 sec    0.00003249 sec
>>>>>>       10000 digits:    0.00009700 sec    0.00008516 sec
>>>>>>       50000 digits:    0.00088845 sec    0.00078422 sec
>>>>>>      100000 digits:    0.00201402 sec    0.00190495 sec
>>>>>>      500000 digits:    0.01089044 sec    0.01193870 sec
>>>>>>     1000000 digits:    0.02389035 sec    0.02675930 sec
>>>>>>    10000000 digits:    0.49165547 sec    0.42553878 sec
>>>>>>   100000000 digits:    4.66965413 sec    5.35446501 sec
>>>>>>
>>>>>> Mpz square root & rem:     MPIR 1.3.0         GMP 5.0.0
>>>>>>          10 digits:    0.00000030 sec    0.00000026 sec
>>>>>>         100 digits:    0.00000058 sec    0.00000051 sec
>>>>>>         500 digits:    0.00000167 sec    0.00000139 sec
>>>>>>        1000 digits:    0.00000316 sec    0.00000244 sec
>>>>>>        5000 digits:    0.00002410 sec    0.00002248 sec
>>>>>>       10000 digits:    0.00007203 sec    0.00006785 sec
>>>>>>       50000 digits:    0.00083096 sec    0.00080454 sec
>>>>>> ...
-- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to mpir-de...@googlegroups.com. To unsubscribe from this group, send email to mpir-devel+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.
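
The remark about GMP's division pulling ahead at large sizes can be checked by turning the quoted 17N/N quotient timings into MPIR/GMP ratios. The sketch below is only an illustration added alongside the thread, not part of the GMPY benchmark: the names `quotient_17n`, `time_ratios` and `ratios` are hypothetical, and the numbers are copied from Case's table for his 64-bit Core2 Linux machine.

```python
# Timings in seconds, copied from the quoted "17N/N mpz quotient" table.
# Each entry maps a digit count to (MPIR 1.3.0 time, GMP 5.0.0 time).
# The names used here are illustrative; they do not come from GMPY or the thread.
quotient_17n = {
    500_000: (1.24625611, 0.44447249),
    1_000_000: (2.99730897, 1.01921391),
    10_000_000: (55.43707895, 15.77324915),
    100_000_000: (915.73236585, 200.43964195),
}


def time_ratios(table):
    """Return {digits: MPIR time / GMP time}; a value above 1 means GMP was faster."""
    return {digits: mpir / gmp for digits, (mpir, gmp) in table.items()}


ratios = time_ratios(quotient_17n)

if __name__ == "__main__":
    for digits, ratio in ratios.items():
        print(f"{digits:>11} digits: MPIR/GMP time ratio = {ratio:.2f}")
```

On these four rows the ratio rises steadily with operand size, which is what one would expect if the two libraries use division algorithms with different asymptotic cost. It says nothing about other platforms or other operations.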