Consider attached diff. > The issue is particularly clear when we multiply the generator by > zero. Note that in general, an application shouldn't multiply the > generator by zero since there's no useful cryptographic purpose for > doing so. But, this is a convenient example. > > In the code we have, > > ecp_nistz256_gather_w7(&p.a, preComputedTable[0], wvalue >> 1); > ecp_nistz256_neg(p.p.Z, p.p.Y); > copy_conditional(p.p.Y, p.p.Z, wvalue & 1); > memcpy(p.p.Z, ONE, sizeof(ONE)); > > The generator is all zeros, so we'll start off with the point (0, 0, > 1). Then we add the point at infinity over and over again, leaving > that point unchanged each time. Thus, we'll end with (0, 0, 1). Then: > > r->Z_is_one = is_one(p.p.Z) & 1; > > Thus, the resulting point will be (0, 0, 1). > > After the memcpy quoted above, we need to do a copy_conditional(p.p.Z, > is_infinity, ZERO) or equivalent, where ZERO is all-zeros and where > is_infinity is the result of checking if (x, y) == (0, 0). > > This is just one example of an edge case that is handled in a > surprising way. I think there are more, as described in the quoted > message below. > > Cheers, > Brian > > > Brian Smith <br...@briansmith.org> wrote: >> >> Brian Smith <br...@briansmith.org> wrote: >>> >>> When doing math on short Weierstrass curves like P-256, we have to >>> special case points at infinity. In Jacobian coordinates (X, Y, Z), >>> points at infinity have Z == 0. However, instead of checking for Z == >>> 0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it >>> does, in some sense, the opposite of what I expect it to do. >> >> >> I exchanged email with both of the original authors of the code, Shay and >> Vlad. He that the ecp_nistz256_point_* functions indeed intend to represent >> the point at infinity as (0, 0) and it is expected (but I did not verify) >> that those functions should work when the point at infinity is encoded as >> (0, 0, _). >> >>> The authors >>> instead decided to encode the point at infinity as (0, 0), since the >>> affine point (0, 0) isn't on the P-256 curve. It isn't clear why the >>> authors chose to do that though, since the point at infinity doesn't >>> (can't, logically) appear in the table of precomputed multiples of G >>> anyway. >> >> >> Actually, as the code says, the point at infinity implicitly occurs in the >> table implicitly. Obviously the accumulator point will be at infinity until >> at least a one bit is found in the scalar. >> >>> >>> But, it seems like the functions that do the computations, like >>> >>> ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the >>> point at infinity as (_, _, 0), not necessarily (0, 0, _). Also, >>> ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and >>> ec_GFp_simple_point_set_to_infinity, which represent the point at >>> infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine >>> uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0. >>> This inconsistency is confusing, if not wrong. Given this, it seems >>> like the point-at-infinity checks in the ecp_nistz256_point_add and >>> ecp_nistz256_point_add_affine code should also be checking that z == 0 >>> instead of (x, y) == (0, 0). >> >> >> Instead, when we convert a point from EC_POINT to P256_POINT or >> P256_POINT_AFFINE, we should translate the (_, _, 0) form into the (0, 0, 0) >> form. And/or reject points at infinity as inputs to the function. Similarly, >> when we convert the resultant P256_POINT to an EC_POINT, we chould translate >> the (0, 0) encoding of the point at infinity to the (0, 0, 0) form or at >> least the (_, _, 0) form. >> >> In particular, consider the case where you have an EC_POINT that isn't at >> infinity, e.g. (x, y, 1). Then you call EC_POINT_set_to_infinity on it. Then >> it is (x, y, 0). Then you pass it to EC_POINT_mul/EC_POINTs_mul even though >> you shouldn't. >> >> Maybe the precondition of EC_POINT_mul/EC_POINTs_mul is that none of the >> input points are at infinity, in which case it doesn't matter. (But if so, >> maybe these functions should do a point-at-infinity check.) But if it is >> allowed to pass in the point at infinity, then the ecp_nistz256 code should >> translate the input point from (_, _, 0) form to (0, 0, _) form before doing >> the computation. It can use is_zero and copy_conditional and friends to do >> this. >> >> Similarly, after the computation, it should translate the (0, 0, _) form to >> (_, _, 0) form. In particular, it should do such a translation before the >> code sets Z_is_one, AFAICT. Note that the nistz256 code might be using the >> form (0, 0, 0) instead of (0, 0, _) in which case this translation might not >> be necessary. >> >> Regardless, it would be useful to write tests for these cases: >> 1. Verify that the result is correct when any of the input points are (0, 0, >> 0) >> 2. Verify that the result is correct when any of the input points are (_, _, >> 0). >> 3. Verify that the result is correct, and in particular that Z_is_one is set >> correctly on the result, when the final result is at infinity, especially >> for the cases where neither the input points are at infinity, e.g. when >> adding (n-1)G to 1G. >> >> Note that all of the above cases have interesting sub-cases: the G scalar is >> NULL, the G scalar is non-NULL and zero-valued, G scalar is a multiple of G, >> G scalar is larger than G. And same for the P scalars. >> >> Cheers, >> Brian >> -- >> https://briansmith.org/ >> > > >
-- Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4636 Please log in as guest with password guest if prompted
>From 846d293b3cc040e19a5f6fd4492ba4a6ba5a1064 Mon Sep 17 00:00:00 2001 From: Andy Polyakov <ap...@openssl.org> Date: Fri, 19 Aug 2016 23:16:04 +0200 Subject: [PATCH 1/2] ec/ecp_nistz256: harmonize is_infinity with ec_GFp_simple_is_at_infinity. RT#4625 --- crypto/ec/asm/ecp_nistz256-x86_64.pl | 24 +++++++---------- crypto/ec/ecp_nistz256.c | 52 ++++++++++++++++++++++++------------ 2 files changed, 45 insertions(+), 31 deletions(-) diff --git a/crypto/ec/asm/ecp_nistz256-x86_64.pl b/crypto/ec/asm/ecp_nistz256-x86_64.pl index cce92b9..32da387 100755 --- a/crypto/ec/asm/ecp_nistz256-x86_64.pl +++ b/crypto/ec/asm/ecp_nistz256-x86_64.pl @@ -2291,16 +2291,14 @@ $code.=<<___; mov $b_org, $a_ptr # reassign movdqa %xmm0, $in1_x(%rsp) movdqa %xmm1, $in1_x+0x10(%rsp) - por %xmm0, %xmm1 movdqa %xmm2, $in1_y(%rsp) movdqa %xmm3, $in1_y+0x10(%rsp) - por %xmm2, %xmm3 movdqa %xmm4, $in1_z(%rsp) movdqa %xmm5, $in1_z+0x10(%rsp) - por %xmm1, %xmm3 + por %xmm4, %xmm5 movdqu 0x00($a_ptr), %xmm0 # copy *(P256_POINT *)$b_ptr - pshufd \$0xb1, %xmm3, %xmm5 + pshufd \$0xb1, %xmm5, %xmm3 movdqu 0x10($a_ptr), %xmm1 movdqu 0x20($a_ptr), %xmm2 por %xmm3, %xmm5 @@ -2312,14 +2310,14 @@ $code.=<<___; movdqa %xmm0, $in2_x(%rsp) pshufd \$0x1e, %xmm5, %xmm4 movdqa %xmm1, $in2_x+0x10(%rsp) - por %xmm0, %xmm1 - movq $r_ptr, %xmm0 # save $r_ptr + movdqu 0x40($a_ptr),%xmm0 # in2_z again + movdqu 0x50($a_ptr),%xmm1 movdqa %xmm2, $in2_y(%rsp) movdqa %xmm3, $in2_y+0x10(%rsp) - por %xmm2, %xmm3 por %xmm4, %xmm5 pxor %xmm4, %xmm4 - por %xmm1, %xmm3 + por %xmm0, %xmm1 + movq $r_ptr, %xmm0 # save $r_ptr lea 0x40-$bias($a_ptr), $a_ptr # $a_ptr is still valid mov $src0, $in2_z+8*0(%rsp) # make in2_z copy @@ -2330,8 +2328,8 @@ $code.=<<___; call __ecp_nistz256_sqr_mont$x # p256_sqr_mont(Z2sqr, in2_z); pcmpeqd %xmm4, %xmm5 - pshufd \$0xb1, %xmm3, %xmm4 - por %xmm3, %xmm4 + pshufd \$0xb1, %xmm1, %xmm4 + por %xmm1, %xmm4 pshufd \$0, %xmm5, %xmm5 # in1infty pshufd \$0x1e, %xmm4, %xmm3 por %xmm3, %xmm4 @@ -2662,16 +2660,14 @@ $code.=<<___; mov 0x40+8*3($a_ptr), $acc0 movdqa %xmm0, $in1_x(%rsp) movdqa %xmm1, $in1_x+0x10(%rsp) - por %xmm0, %xmm1 movdqa %xmm2, $in1_y(%rsp) movdqa %xmm3, $in1_y+0x10(%rsp) - por %xmm2, %xmm3 movdqa %xmm4, $in1_z(%rsp) movdqa %xmm5, $in1_z+0x10(%rsp) - por %xmm1, %xmm3 + por %xmm4, %xmm5 movdqu 0x00($b_ptr), %xmm0 # copy *(P256_POINT_AFFINE *)$b_ptr - pshufd \$0xb1, %xmm3, %xmm5 + pshufd \$0xb1, %xmm5, %xmm3 movdqu 0x10($b_ptr), %xmm1 movdqu 0x20($b_ptr), %xmm2 por %xmm3, %xmm5 diff --git a/crypto/ec/ecp_nistz256.c b/crypto/ec/ecp_nistz256.c index f824a82..411220e 100644 --- a/crypto/ec/ecp_nistz256.c +++ b/crypto/ec/ecp_nistz256.c @@ -310,19 +310,16 @@ static void ecp_nistz256_point_add(P256_POINT *r, const BN_ULONG *in2_y = b->Y; const BN_ULONG *in2_z = b->Z; - /* We encode infinity as (0,0), which is not on the curve, - * so it is OK. */ - in1infty = (in1_x[0] | in1_x[1] | in1_x[2] | in1_x[3] | - in1_y[0] | in1_y[1] | in1_y[2] | in1_y[3]); + /* + * Infinity in encoded as (,,0) + */ + in1infty = (in1_z[0] | in1_z[1] | in1_z[2] | in1_z[3]); if (P256_LIMBS == 8) - in1infty |= (in1_x[4] | in1_x[5] | in1_x[6] | in1_x[7] | - in1_y[4] | in1_y[5] | in1_y[6] | in1_y[7]); + in1infty |= (in1_z[4] | in1_z[5] | in1_z[6] | in1_z[7]); - in2infty = (in2_x[0] | in2_x[1] | in2_x[2] | in2_x[3] | - in2_y[0] | in2_y[1] | in2_y[2] | in2_y[3]); + in2infty = (in2_z[0] | in2_z[1] | in2_z[2] | in2_z[3]); if (P256_LIMBS == 8) - in2infty |= (in2_x[4] | in2_x[5] | in2_x[6] | in2_x[7] | - in2_y[4] | in2_y[5] | in2_y[6] | in2_y[7]); + in2infty |= (in2_z[4] | in2_z[5] | in2_z[6] | in2_z[7]); in1infty = is_zero(in1infty); in2infty = is_zero(in2infty); @@ -411,15 +408,16 @@ static void ecp_nistz256_point_add_affine(P256_POINT *r, const BN_ULONG *in2_y = b->Y; /* - * In affine representation we encode infty as (0,0), which is not on the - * curve, so it is OK + * Infinity in encoded as (,,0) */ - in1infty = (in1_x[0] | in1_x[1] | in1_x[2] | in1_x[3] | - in1_y[0] | in1_y[1] | in1_y[2] | in1_y[3]); + in1infty = (in1_z[0] | in1_z[1] | in1_z[2] | in1_z[3]); if (P256_LIMBS == 8) - in1infty |= (in1_x[4] | in1_x[5] | in1_x[6] | in1_x[7] | - in1_y[4] | in1_y[5] | in1_y[6] | in1_y[7]); + in1infty |= (in1_z[4] | in1_z[5] | in1_z[6] | in1_z[7]); + /* + * In affine representation we encode infinity as (0,0), which is + * not on the curve, so it is OK + */ in2infty = (in2_x[0] | in2_x[1] | in2_x[2] | in2_x[3] | in2_y[0] | in2_y[1] | in2_y[2] | in2_y[3]); if (P256_LIMBS == 8) @@ -1249,6 +1247,8 @@ __owur static int ecp_nistz256_points_mul(const EC_GROUP *group, } else #endif { + BN_ULONG infty; + /* First window */ wvalue = (p_str[0] << 1) & mask; idx += window_size; @@ -1261,7 +1261,25 @@ __owur static int ecp_nistz256_points_mul(const EC_GROUP *group, ecp_nistz256_neg(p.p.Z, p.p.Y); copy_conditional(p.p.Y, p.p.Z, wvalue & 1); - memcpy(p.p.Z, ONE, sizeof(ONE)); + infty = (p.p.X[0] | p.p.X[1] | p.p.X[2] | p.p.X[3] | + p.p.Y[0] | p.p.Y[1] | p.p.Y[2] | p.p.Y[3]); + if (P256_LIMBS == 8) + infty |= (p.p.X[4] | p.p.X[5] | p.p.X[6] | p.p.X[7] | + p.p.Y[4] | p.p.Y[5] | p.p.Y[6] | p.p.Y[7]); + + infty = 0 - is_zero(infty); + infty = ~infty; + + p.p.Z[0] = ONE[0] & infty; + p.p.Z[1] = ONE[1] & infty; + p.p.Z[2] = ONE[2] & infty; + p.p.Z[3] = ONE[3] & infty; + if (P256_LIMBS == 8) { + p.p.Z[4] = ONE[4] & infty; + p.p.Z[5] = ONE[5] & infty; + p.p.Z[6] = ONE[6] & infty; + p.p.Z[7] = ONE[7] & infty; + } for (i = 1; i < 37; i++) { unsigned int off = (idx - 1) / 8; -- 2.7.4
-- openssl-dev mailing list To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev