Re: RFR 8181594: Efficient and constant-time modular arithmetic

Adam Petcher Tue, 30 Jan 2018 08:53:22 -0800

+core-libs-dev


On 1/26/2018 4:06 PM, Adam Petcher wrote:

JBS: https://bugs.openjdk.java.net/browse/JDK-8181594
Webrev: http://cr.openjdk.java.net/~apetcher/8181594/webrev.00/
This is a code review for the field arithmetic that will be used inimplementations of X25519/X448 key agreement, the Poly1305authenticator, and EdDSA signatures. I believe that the library hasall the features necessary for X25519/X448 and Poly1305, and I expectat most a couple of minor enhancements will be required to supportEdDSA. There is no public API for this library, so we can change it inthe future to suit the needs of new algorithms without breakingcompatibility with external code. Still, I made an attempt to clearlystructure and document the (internal) API, and I want to make sure itis understandable and easy to use.
This is not a general-purpose modular arithmetic library. It will onlywork well in circumstances where the sequence of operations isrestricted, and where the prime that defines the field has some usefulstructure. Moreover, each new field will require some field-specificcode that takes into account the structure of the prime and the waythe field is used in the application. The initial implementationincludes a field for Poly1305 and the fields for X25519/X448 whichshould also work for EdDSA.
The benefits of using this library are that it is much more efficientthan using similar operations in BigInteger. Also, many operations arebranch-free, making them suitable for use in a side-channel resistantimplementation that does not branch on secrets.
To provide some context, I have attached a code snippet describing howthis library can be used. The snippet is the constant-time Montgomeryladder from my X25519/X448 implementation, which I expect to be outfor review soon. X25519/X448 only uses standard arithmetic operations,and the more unusual features (e.g. add modulo a power of 2) areneeded by Poly1305.
The field arithmetic (for all fields) is implemented using a 32-bitrepresentation similar to the one described in the Ed448 paper[1] (inthe "Implementation on 32-bit platforms" section). Though myimplementation uses signed limbs, and grade-school multiplicationinstead of Karatsuba. The argument for correctness is essentially thesame for all three fields: the magnitude of each 64-bit limb is atmost 2^(k-1) after reduction, except for the last limb which may havea magnitude of up to 2^k. The values of k are between 26 to 28(depending on the field), and we can calculate that the maximummagnitude for any limb during an add-multiply-carry-reduce sequence isalways less than 2^63. Therefore, no overflow occurs and alloperations are correct.
Process note: this enhancement is part of JEP 324 (Key Agreement withCurve25519 and Curve448). When this code review is complete, nothingwill happen until all other work for this JEP is complete, and the JEPis accepted as part of some release. This means that this code will bepushed to the repo along with the X25519/X448 code that uses it.
[1] https://eprint.iacr.org/2015/625.pdf

    private IntegerModuloP_Base pointMultiply(byte[] k, IntegerModuloP u){

        IntegerModuloP x_1 = u;
        MutableIntegerModuloP x_2 = one.mutable();
        MutableIntegerModuloP z_2 = zero.mutable();
        MutableIntegerModuloP x_3 = u.mutable();
        MutableIntegerModuloP z_3 = one.mutable();
        int swap = 0;

        // Variables below are reused to avoid unnecessary allocation
        // They will be assigned in the loop, so initial value doesn't matter
        MutableIntegerModuloP m1 = zero.mutable();
        MutableIntegerModuloP DA = zero.mutable();
        MutableIntegerModuloP E = zero.mutable();
        MutableIntegerModuloP a24_times_E = zero.mutable();

        for(int t = params.getBits() - 1; t >= 0; t--){
            int k_t = bitAt(k, t);
            swap = swap ^ k_t;
            x_2.conditionalSwapWith(x_3, swap);
            z_2.conditionalSwapWith(z_3, swap);
            swap = k_t;

            // A(m1) = x_2 + z_2
            m1.setValue(x_2).setSum(z_2);
            // D = x_3 - z_3
            // DA = D * A(m1)
            DA.setValue(x_3).setDifference(z_3).setProduct(m1);
            // AA(m1) = A(m1)^2
            m1.setSquare();
            // B(x_2) = x_2 - z_2
            x_2.setDifference(z_2);
            // C = x_3 + z_3
            // CB(x_3) = C * B(x_2)
            x_3.setSum(z_3).setProduct(x_2);
            // BB(x_2) = B^2
            x_2.setSquare();
            // E = AA(m1) - BB(x_2)
            E.setValue(m1).setDifference(x_2);
            // compute a24 * E using SmallValue
            a24_times_E.setValue(E);
            a24_times_E.setProduct(a24);

            // assign results to x_3, z_3, x_2, z_2
            // x_2 = AA(m1) * BB
            x_2.setProduct(m1);
            // z_2 = E * (AA(m1) + a24 * E)
            z_2.setValue(m1).setSum(a24_times_E).setProduct(E);
            // z_3 = x_1*(DA - CB(x_3))^2
            z_3.setValue(DA).setDifference(x_3).setSquare().setProduct(x_1);
            // x_3 = (CB(x_3) + DA)^2
            x_3.setSum(DA).setSquare();
        }

        x_2.conditionalSwapWith(x_3, swap);
        z_2.conditionalSwapWith(z_3, swap);

        // return (x_2 * z_2^(p - 2))
        return x_2.setProduct(z_2.multiplicativeInverse());
    }

Re: RFR 8181594: Efficient and constant-time modular arithmetic

Reply via email to