TLDR: How much do square roots matter for ECC primes? I've been working on finding Granger Moss primes where t fits in a 32 bit integer. Along the way, I worked out tighter bounds for the min and max values after reduction.
In the paper, they aren't explicit about how much extra room is needed to handle additions for curve operations. For edwards curves, you need to account for a factor of 6. This changes their formula ceil(log_2(m / 2)) + 2k + 5 <= 2w to: ceil(log_2(3 * m) + 2k + 5 <= 2w for m + 1 = 17 and 19, this requires k < 26.5 Similarly for their reduction algorithm, they work in bits, but I did the math on the actual bounds. Assuming z_i = 2^63 - 1, after 1 reduction the max value is 2^(63 - l) - 1 + (t - c) t - c = (b - 1) * c, the max value carried from z_(i+1), assuming c < b, after 2 reductions the max value is: 2^(63 - 2*l) - 1 + c + (t - c). = 2^(63 - 2*l) + t - 1. The first c occurs because t/b = c. The minimum value is easier to calculate. In this case, the worst case carry is 0. so the min value is 2^(63 - 2 * l). Upon multiplying these values are subtracted. If their difference is less than 2^27.5, then we can adjust the formula above to: ceil(log_2(3 * m) + 2k + 4 <= 2w This does place a greater constraint on l, but it yields larger primes. because we know that the result of the product takes 1 less bit. This allows us to construct larger primes for m + 1 = 17, 19. For m+1 = 19 we can construct a prime with 483 bits, specifically: p = phi 18 t, where t = (2^3 - 1) * (2^24); b = 2^24; l = 24 Unfortunately, these primes are not fast square root primes, by construction. It should be clear that p % (2^24) = 1. How much does this matter? S is 24 in this case, so Tonelli Shanks should be reasonably fast. Equality checking is also a little tricky due to the redundancy, but a subtraction single round of modular carries can be done in parallel. Then you can verify that all the limbs are the same value. I found an edwards curve that satisfies the safecurves criteria, the constant is unusual but rigid: It is the least d in absolute value such that x^2 + y^2 = 1 + (d/b)x^2*y^2 has cofactor {4, 8} and so does its twist. The reason for choosing such a d is that coefficient multiplication requires reduction, which means that even for small constants, they must be in the montgomery domain, taking up at least 2 limbs. But for a constant d/b it can be a single limb. the least such d is: -56904 for that curve the trace is: 1805876552616329139365429358604192390835234313183868353172046570215054410 I prototyped the code in Haskell, for quick turnaround and so that I have something where I can print the intermediate values as test vectors for other implementations. I can tidy it up and share it if there's interest. It's not written for speed but for verification. I plan to try and implement a cuda or opencl version, to try and take advantage of the parallel nature of the primes. Thoughts or suggestions? Would it be worthwhile to write any of this up in a more formal setting? I think at a minimum borrowing the half bit by computing tighter bounds is pretty cool. Kyle
_______________________________________________ Curves mailing list Curves@moderncrypto.org https://moderncrypto.org/mailman/listinfo/curves