I added xgcd code too, just for good measure:

https://github.com/wbhart/ccas/blob/master/nn/xgcd_ngcd.c
https://github.com/wbhart/ccas/blob/master/nn/ngcd_cofactor_update.c

The xgcd times don't seem to differ too much from the gcd times for large
inputs, 10% more perhaps.

On 10 February 2015 at 16:46, Bill Hart <goodwillh...@googlemail.com> wrote:

> In case anyone is interested, I just spent the last week trying to
> implement a "straightforward" version of Moller's half gcd algorithm for
> asymptotically fast integer GCD:
>
> http://www.lysator.liu.se/~nisse/archive/S0025-5718-07-02017-0.pdf
>
> Here it is:
>
>
> https://github.com/wbhart/ccas/blob/master/nn/gcd_ngcd.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_sdiv.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_sdiv_cmp.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_apply.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_update.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_mul.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_identity.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_init.c
> https://github.com/wbhart/ccas/blob/master/nn/ngcd_mat_clear.c
>
> The main things I do differently to the paper are:
>
> 1) I break on limb boundaries instead of bit boundaries (the main theorems
> in Moller's paper still hold in this case)
>
> 2) I always assume that  that m >= n when taking gcd of {a, m} and {b, n}
> where m and n are the number of words
>
> 3) I try to only deal with nonnegative integers throughout the
> implementation
>
> 4) I prevent the algorithm from recursing too deep, which it will do
> without the additional tests I've added in ngcd.c (which otherwise looks
> like the pseudocode in Moller's paper). (Without being careful of this, I
> found the complexity to be more like cubic than quasilinear :-))
>
> I've done lots of testing and I am fairly sure if there are any bugs left,
> they are fairly obscure.
>
> It's about 10x slower than the much more complicated version in GMP/MPIR.
> Their implementation does the following things which are not done in my
> implementation:
>
> * my integer multiplication code is entirely written in C, so is 5-10x
> slower (up to about a billion bits)
>
> * they optimise division, which is usually just a subtraction when done in
> gcd algorithms, as all the terms in the remainder sequence are usually only
> 1 or 2 bits bigger than the next
>
> * they fall back to a fast Lehmer gcd, I fall back to the standard
> Euclidean algorithm
>
> * they use 2x2 Strassen matrix multiplication
>
> * they do lots of minor optimisations, such as allocating all memory up
> front and avoiding copies, etc.
>
> Note that the crossover from Euclidean is only 20 words on a 64 bit
> machine!
>
> Bill.
>

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mpir-devel+unsubscr...@googlegroups.com.
To post to this group, send email to mpir-devel@googlegroups.com.
Visit this group at http://groups.google.com/group/mpir-devel.
For more options, visit https://groups.google.com/d/optout.

Reply via email to