David Harvey <d.har...@unsw.edu.au> writes: > The main problem with this approach is that there is not much > granularity in the choice of r.
In my code, I used some huristics like this: Start with r = 5, and choose the appropriate size k for the splitting into pieces, and transform size 2^n. Then redo this choice with one prime less (which requires a smaller k, and gives a higher degree polynomials), and use as few primes as you can get away with, without doubling the transform size. This eliminates the large bumps. I probably don't get the details right here, but you get steps like switching from 2^n and r = 5 to 2^{n+1} and r = 3. > You don't want r to get too large, because then the reduction/CRT steps > can become quite expensive. I looked into that some years ago. I just put my notes from back then at http://www.lysator.liu.se/~nisse/mist/crc-complexity.pdf. For small r, the cost appears to be sub-squadratic (which was a bit surprising to me, since all multiplications are schoolbook; for large r, one would need to use divide-and-conquer, building on larger multiplies). For CRT with r limb sized moduli, 2 <= r <= 10, you need approximately 3.86 (r-1)^{1.57} limb multiplies. > Another solution, which doesn't work at all, is to use TFTs. One > cannot use TFTs directly for multiplication mod x^n - 1, because they > evaluate at the wrong roots of unity. I implemented TFT, but I think you're right that it's directly applicable only in the no-wraparound case. > I propose using a modulus of the form > > M = (B^{k u_1} + 1) (B^{k u_2 + 1) ... (B^{k u_s} + 1), > > where > > B = 2^64 > k is some integer > u_1 > u_2 > ... > u_s is a decreasing sequence of powers of 2. Seems closely related to variants of the TFT which use polynomial f(x) = (x^{u_1}+1) (x^{u_2} + 1) ... (x^{u_s} + 1) where u_j = 2^{k_j} are distinct powers of two and f(x) is a factor of the larger x^{2^{2u_1} - 1. Which is a TFT, since we evaluate the input polynomial (or reduced variants thereof) at the roots of f, and the roots of f are a subset of the roots of x^{2^n} - 1 (and if needed, one can always go between x^{2^k} - 1 and x^{2^k} + 1 by a simple change of variable). I've seen the details spelled out in some paper on in-place TFT transforms. > REDUCE: takes as input an arbitrary integer, and computes its residues > modulo B^{k u_1} + 1, ..., B^{k u_s} + 1. [...] It ends up looking > very similar to the top layers of the TFT algorithm, working on the > input in chunks of k limbs. I'd guess it really *is* a TFT algorithm... And the cool thing, compared to the plain TFT, is that the subset of roots used really correspond to a useful integer modulo. (*Maybe* that's in some way the case also with the plain TFT, but I'm afraid that the subset of roots used in my code corresponds to a factorization of x^{2^n}-1 which can't be lifted up from Z_p[x] to Z[x]. I haven't thought this through carefully). Regards, /Niels -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance. _______________________________________________ gmp-devel mailing list gmp-devel@gmplib.org https://gmplib.org/mailman/listinfo/gmp-devel