I added xgcd code too, just for good measure: https://github.com/wbhart/ccas/blob/master/nn/xgcd_ngcd.c https://github.com/wbhart/ccas/blob/master/nn/ngcd_cofactor_update.c
The xgcd times don't seem to differ too much from the gcd times for large inputs, 10% more perhaps. On 10 February 2015 at 16:46, Bill Hart <goodwillh...@googlemail.com> wrote: > In case anyone is interested, I just spent the last week trying to > implement a "straightforward" version of Moller's half gcd algorithm for > asymptotically fast integer GCD: > > http://www.lysator.liu.se/~nisse/archive/S0025-5718-07-02017-0.pdf > > Here it is: > > > https://github.com/wbhart/ccas/blob/master/nn/gcd_ngcd.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_sdiv.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_sdiv_cmp.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_apply.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_update.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_mul.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_identity.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_init.c > https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_clear.c > > The main things I do differently to the paper are: > > 1) I break on limb boundaries instead of bit boundaries (the main theorems > in Moller's paper still hold in this case) > > 2) I always assume that that m >= n when taking gcd of {a, m} and {b, n} > where m and n are the number of words > > 3) I try to only deal with nonnegative integers throughout the > implementation > > 4) I prevent the algorithm from recursing too deep, which it will do > without the additional tests I've added in ngcd.c (which otherwise looks > like the pseudocode in Moller's paper). (Without being careful of this, I > found the complexity to be more like cubic than quasilinear :-)) > > I've done lots of testing and I am fairly sure if there are any bugs left, > they are fairly obscure. > > It's about 10x slower than the much more complicated version in GMP/MPIR. > Their implementation does the following things which are not done in my > implementation: > > * my integer multiplication code is entirely written in C, so is 5-10x > slower (up to about a billion bits) > > * they optimise division, which is usually just a subtraction when done in > gcd algorithms, as all the terms in the remainder sequence are usually only > 1 or 2 bits bigger than the next > > * they fall back to a fast Lehmer gcd, I fall back to the standard > Euclidean algorithm > > * they use 2x2 Strassen matrix multiplication > > * they do lots of minor optimisations, such as allocating all memory up > front and avoiding copies, etc. > > Note that the crossover from Euclidean is only 20 words on a 64 bit > machine! > > Bill. > -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to mpir-devel+unsubscr...@googlegroups.com. To post to this group, send email to mpir-devel@googlegroups.com. Visit this group at http://groups.google.com/group/mpir-devel. For more options, visit https://groups.google.com/d/optout.