Torbjorn Granlund <t...@gmplib.org> writes: > I think it exists also for b can be factorised into prime powers...
Sure, any factorization can be used with CRT, but it's only powers which allow newton-like hensel-lifts, as far as I'm aware. And the most important case is when b is prime. > You need to have roughly *2k* bits of each operand to make k bits > progress. And you get into trouble at least in the case that those 2k > bits are equal (you need to subtract one operand from the other, but > which depends on the other bits). > > We don't need to insist on keeping operands positive. Hmm. In general, one needs to replace the largest number, to make progres. But I guess in the case of many high bits being equal, it might not matter. > My thought was not to have a table, but plain loop which executed a > limb-size dependent number of iterations, and has not internal branches. I see. Makes some sense. To get a nice progress guarantee, one such "large iteration" would need to apply the matrix, and then do a small number of additional reductions. > I have not looked at right-to-left algorithms for gcdext, at least not > in a very long time. Do they resolve gcd(a,b) = ax + by or something > like gcd(a,b)*2^t = ax + by? I haven't looked at more sophisticated right-to-left gcd for a long time (Stehlé and Zimmermann). But the plain binary algorithm shifts out the zero bits as they appear. And the co-factors (to-be inverse) are divided by 2 (mod b) in each step, by a shift and a conditional add. It's possible, and it might be more efficient, to divide the cofactors by 2^k (mod b) at the end, instead of one bit at a time. > Right, answering my question above. Cannot false factors of 2 be > handled? As long as the iteration only makes the update a <-- a + k b, no false factors can appear. The problem is the k-ary algorithm (Weber's accelarated gcd), which uses an update of the form a <-- u a + v b, with |u| != 1. And as for factors of two, I have only considered inversion modulo an odd b. > I suspect non-leaky code has to compute the sign as a full subtraction. It should be possible to do a non-leaky loop examining all limbs, without carry propagation. Just doing comparisions a[k] < b[k] and a[k] == b[k], combining results with logic operations. No idea if that can be faster than a plain subtraction. Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org http://gmplib.org/mailman/listinfo/gmp-devel