ni...@lysator.liu.se (Niels Möller) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>
>> Your idea of conditionally adding the invariant d * B2 at the right
>> place is also interesting,
>
> I've tried it out. Works nicely, but no speedup on my machine. I'm
> attaching another patch. There are then 4 methods:
>
> method 1: Old loop around udiv_qrnnd_preinv.
>
> method 2: The clever code from 10 years ago, with the microoptimization
>   I committed the other day.
>
> method 3: More or less the same as I posted a few days ago.
>
> method 4: Postpones the update u1 -= u2 d, off the critical recurrence
>   chain. Instead, conditionally adds in the constant B2 (B - d) to the
>   lower u limbs.
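For anyone following along without the patch, here is a rough sketch of the general trick of conditionally adding a precomputed constant instead of reducing inside the loop. It is not the code from the patch and does not reproduce any of the four methods: it assumes 32-bit limbs held in 64-bit C arithmetic, uses the simpler constant B2 = B^2 mod d rather than B2 (B - d), and the names mod_1_ref and mod_1_fold are made up for illustration. Limbs are stored least significant first and d must be nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference: plain remainder recurrence, one division per limb.
   Limbs are 32 bits, least significant limb first; d must be nonzero. */
static uint32_t mod_1_ref(const uint32_t *up, size_t n, uint32_t d)
{
    uint64_t r = 0;
    for (size_t i = n; i-- > 0; )
        r = ((r << 32) | up[i]) % d;   /* r < d < 2^32, so no overflow */
    return (uint32_t) r;
}

/* Illustrative sketch (hypothetical, not the patch): keep a two-limb
   residue r = r1*B + r0, B = 2^32, that is only congruent to the prefix
   modulo d. Each step replaces r1*B^2 by r1*B2, and a carry out of the
   64-bit sum (worth B^2) by a conditional add of B2. */
static uint32_t mod_1_fold(const uint32_t *up, size_t n, uint32_t d)
{
    uint64_t b1 = ((uint64_t) 1 << 32) % d;   /* B mod d */
    uint64_t b2 = (b1 * b1) % d;              /* B^2 mod d, at most d - 1 */
    uint64_t r = 0;

    for (size_t i = n; i-- > 0; ) {
        uint64_t hi = (r >> 32) * b2;         /* r1 * B2, fits in 64 bits */
        uint64_t lo = (r << 32) | up[i];      /* r0 * B + next limb */
        uint64_t s = hi + lo;                 /* may wrap around once */
        uint64_t mask = -(uint64_t) (s < hi); /* all ones iff it wrapped */
        /* After a wrap, s <= B^2 - 3B + 1, so adding B2 cannot wrap again. */
        s += mask & b2;
        r = s;
    }
    return (uint32_t) (r % d);                /* single reduction at the end */
}
```

The point of the sketch is only that the loop body contains no division and no data-dependent branch; the real variants work on full limbs with umul_ppmm-style primitives and a precomputed inverse.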
I'm tempted to commit this code, i.e., the new variants (not enabled)
plus the tuneup changes, to see which variants are favorites on the
various test machines. That should give some guidance as to what's most
promising for an assembly implementation. What do you think?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.