Re: 8215441: Increase uniformity of the distribution of BigIntegers constructed by BigInteger(int, Random)
I'm not sure what the problem is. If X is uniform, then "the number of leading zero bits" of X is exponential. The probability of getting a "small" number, in which the first (say) 32 bits are 0 is 2^-32. If this is what is measured by the histrogram[5], then the current implementation looks correct to me. On 12/19/2018 11:01 PM, Brian Burkhalter wrote: https://bugs.openjdk.java.net/browse/JDK-8215441 This issue was filed to cover improving the uniformity of randomly generated BigIntegers. It is not intended to resolve [1] which is deliberately left open. The proposed patch implements a modified version of the “workaround” suggested in [1]. The problem is that the magnitude of the random BigInteger is created from a sequence of bytes generated by Random.nextBytes() [2]. The likelihood that any of these bytes is zero is small so the distribution of the resulting random BigIntegers is skewed towards values close to the maximum bit size “numBits” specified to the constructor. The workaround suggested in [1] is to randomly change numBits to the value numBits = Random.nextInt(numBits + 1) [3]. (Actually the suggested workaround is nextInt(numBits) which is incorrect as the parameter is an exclusive upper bound.) This greatly improves the uniformity of the distribution. A remaining problem however is that now the very largest numbers in the interval [0,2^numBits) are underrepresented. A modification of this approach is to increment the new value of numBits as numBits = Random.nextInt(numBits + 1) + 1 [4]. This was empirically observed to improve the underrepresentation of the largest values. The distribution of the random BigIntegers was estimated using [5]. For a given maximum bit length, bin size, and number of random values to generate, this creates a histogram and calculates the coefficient of variation of the numbers generated. The histogram bin at index zero represents the largest valued bin in the result. The count in a given histogram bin is the number of values for which that bin is the leftmost (largest valued) with at least one non-zero bit. The bin of maximum index represents zero. Results for the current and two modified approaches for 256 bits with a 1-bit bin size and for 4096 bits with a 4-bit bin size are given at [6-11]. As may be observed, the original histogram is clustered towards the largest possible value 2^numBits - 1, and the coefficient of variation is small. The results for the two variants of the patch show a flattened distribution, i.e., more uniform, and a significantly larger coefficient of variation. The second approach shows better flattening of both ends of the histogram. These results are samples only but are exemplary of the results observed over numerous runs of this code. The test ModPow is modified as the modPow() method throws an ArithmeticException for a zero modulus. The current algorithm never generates a random BigInteger equal to zero however so that exception never occurs. That is not the case for either modified version. Thanks, Brian [1] https://bugs.openjdk.java.net/browse/JDK-8146153 [2] https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Random.html#nextBytes(byte[]) [3] http://cr.openjdk.java.net/~bpb/8146153/webrev.00/ [4] http://cr.openjdk.java.net/~bpb/8146153/webrev.01/ [5] http://cr.openjdk.java.net/~bpb/8146153/StatsBigIntegerRandom.java [6] http://cr.openjdk.java.net/~bpb/8146153/before-256-1.txt [7] http://cr.openjdk.java.net/~bpb/8146153/after-256-1.txt [8] http://cr.openjdk.java.net/~bpb/8146153/after-extra-bit-256-1.txt [9] http://cr.openjdk.java.net/~bpb/8146153/before-4096-4.txt [10] http://cr.openjdk.java.net/~bpb/8146153/after-4096-4.txt [11] http://cr.openjdk.java.net/~bpb/8146153/after-extra-bit-4096-4.txt
Re: Please review EdDSA API
+core-libs-dev for additional API expertise. On 7/25/2018 10:29 AM, Adam Petcher wrote: The draft CSR[1] for the EdDSA API[2] is ready for review. Please take a look and send me any feedback you may have. Here are a few high-level notes to explain the API: 1) Where possible, this API is similar to the API for X25519/X448. To get the complete background/motivation for the API design, you can review the discussion[3] on this topic. 2) Similar to X25519/X448, private keys are byte arrays, and public keys coordinates. Though we can't get by with a single BigInteger coordinate for EdDSA, so I am using the new EdPoint class to hold the coordinates. 3) EdDSA has multiple signature modes defined in the RFC[4], including some that "prehash" the input before signing. The draft API uses the EdDSAParameterSpec class to specify parameters of these modes. The standard does not allow an arbitrary choice of prehash function, so the API for EdDSA does not support algorithm names like "SHA256withEdDSA". [1] https://wiki.openjdk.java.net/display/csr/Main [2] https://bugs.openjdk.java.net/browse/JDK-8190219 [3] http://mail.openjdk.java.net/pipermail/security-dev/2017-September/016325.html [4] https://tools.ietf.org/html/rfc8032
Re: RFR 8181594: Efficient and constant-time modular arithmetic
Latest webrev: http://cr.openjdk.java.net/~apetcher/8181594/webrev.02/ Comments inline below. In addition, I also changed the name of IntegerModuloP_Base to IntegerModuloP, and IntegerModuloP to ImmutableIntegerModuloP. On 3/11/2018 12:04 PM, Xuelei Fan wrote: On 2/26/2018 10:39 AM, Adam Petcher wrote: On 2/23/2018 12:46 PM, Xuelei Fan wrote: ArrayUtil.java: === I'm not very sure how widely this utilities will be used in the future. Looks like only BigIntegerModuloP uses this classes. I may prefer to define private methods for byte array swap in BigIntegerModuloP. It is also used by XDHPublicKeyImpl (in the XDH code review). XDH public keys are represented as BigInteger, and I use the array reverse method to convert encoded keys to BigInteger. If it is not widely used by other classes, please have these methods in the class where is get called. The sun.security.util is exported to other modules as well, we may not want to add stuff into this package unless it is really necessary. Okay. I put these methods in BigIntegerModuloP and removed the ArrayUtil class. This array reverse code will be duplicated where it is used by XDH public keys (in the other code review). MutableIntegerModuloP.java == void conditionalSwapWith(MutableIntegerModuloP b, int swap); As the 'swap' parameter can only be 0 or 1, could it be a boolean parameter? I couldn't come up with a way to implement this without branching when the swap parameter is boolean. See IntegerPolynomial.conditionalSwap to see how this is implemented in arithmetic with an int swap argument. If you (or anyone) can think of a way to do this with boolean, let me know. I added a sentence to the comment above conditionalSwapWith that describes why it is an int instead of a boolean. I did not get the point about the need to avoid branching. Can you have more details? The goal is to avoid things like if(secret){...}, in order to prevent the secrets from leaking into side channels (mostly timing and cache). The way this method is used by XDH, the swap parameter is a single bit of the private key. By storing this bit as an integer, and then doing the swap using only integer arithmetic, we can avoid branching which may leak the bits of the key. Except the conditionalSwapWith() method, I did not get the points why we need a mutable version. Would you please have more description of this requirement? The comment above the class definition has this sentence: "This interface can be used to improve performance and avoid the allocation of a large number of temporary objects." Do you need more information than this in the comments? The performance motivation is so that a.add(b).multiply(c)... can be done without allocating a new buffer for each operation. For example, without mutable field elements, an X25519 point multiplication would allocate around 4,300 temporary arrays totaling 350,000 bytes. If I remember correctly, switching the X25519 implementation to mutable field elements reduced the point multiplication time by about half. I see your point. The benefits is obviously. OK, why you need the immutable version then? Sounds like the mutable version interface is sufficient, including performance. If an immutable version is really needed, we can have the implementation making the decision. Accordingly, the conditionalSwapWith() can be defined as optional method, if it is not required to be implemented in immutable implementation. It's confusing to me that the immutable and mutable and the base versions/interfaces mixed together. It would be nice if we can simplify the interface a little bit. For internal APIs, sometimes we don't want the same quality level as public APIs. I think this set of class will be widely used by new EC curves, ChaCha20/Poly1305, or more in the future. It would be nice if we could do it good at the beginning. The mutable version adds the conditional swap as well as mutable versions of many of the basic operations. The XDH implementation uses both the mutable and immutable versions. The immutable version allows me to simplify the client code because I don't have to worry about whether some value has been modified. For example, the XDH code keeps a representation of 0, 1, and the constant that defines the curve as immutable values. So I prefer to have both. It complicates this API a bit, but it allows simpler and more robust code in the client of this API. IntegerModuloP_Base.java default byte[] addModPowerTwo(IntegerModuloP_Base b, int len) void addModPowerTwo(IntegerModuloP_Base b, byte[] result); For the first sign of the method names, I thought it is to calculate as "(this + b) ^ 2 mod m". To be precise, it calculates "((this % p) + (b % p)) % 2^m" (where p is the prime that defines the field, and m is the desired length, in bits
Re: RFR 8181594: Efficient and constant-time modular arithmetic
this would not produce the desired result. I guess, but not very sure, it is for constant time calculation. If the function is required, could it be renamed as: // the result is inside of the size range IntegerModuloP addModSize(IntegerModuloP_Base b, int size) Or // the result is wrapped if outside of the size range IntegerModuloP addOnWrap(IntegerModuloP_Base b, int size) and the use may look like: this.addModSize(b, size).asByteArray() Any attempt to perform the addition in IntegerModuloP and then pull out the byte array will not work. This class can only represent field elements, so the sum would be in the field, which is not what we want. Will review the rest when I understand more about the interfaces design. Thanks, Xuelei On 1/30/2018 8:52 AM, Adam Petcher wrote: +core-libs-dev On 1/26/2018 4:06 PM, Adam Petcher wrote: JBS: https://bugs.openjdk.java.net/browse/JDK-8181594 Webrev: http://cr.openjdk.java.net/~apetcher/8181594/webrev.00/ This is a code review for the field arithmetic that will be used in implementations of X25519/X448 key agreement, the Poly1305 authenticator, and EdDSA signatures. I believe that the library has all the features necessary for X25519/X448 and Poly1305, and I expect at most a couple of minor enhancements will be required to support EdDSA. There is no public API for this library, so we can change it in the future to suit the needs of new algorithms without breaking compatibility with external code. Still, I made an attempt to clearly structure and document the (internal) API, and I want to make sure it is understandable and easy to use. This is not a general-purpose modular arithmetic library. It will only work well in circumstances where the sequence of operations is restricted, and where the prime that defines the field has some useful structure. Moreover, each new field will require some field-specific code that takes into account the structure of the prime and the way the field is used in the application. The initial implementation includes a field for Poly1305 and the fields for X25519/X448 which should also work for EdDSA. The benefits of using this library are that it is much more efficient than using similar operations in BigInteger. Also, many operations are branch-free, making them suitable for use in a side-channel resistant implementation that does not branch on secrets. To provide some context, I have attached a code snippet describing how this library can be used. The snippet is the constant-time Montgomery ladder from my X25519/X448 implementation, which I expect to be out for review soon. X25519/X448 only uses standard arithmetic operations, and the more unusual features (e.g. add modulo a power of 2) are needed by Poly1305. The field arithmetic (for all fields) is implemented using a 32-bit representation similar to the one described in the Ed448 paper[1] (in the "Implementation on 32-bit platforms" section). Though my implementation uses signed limbs, and grade-school multiplication instead of Karatsuba. The argument for correctness is essentially the same for all three fields: the magnitude of each 64-bit limb is at most 2^(k-1) after reduction, except for the last limb which may have a magnitude of up to 2^k. The values of k are between 26 to 28 (depending on the field), and we can calculate that the maximum magnitude for any limb during an add-multiply-carry-reduce sequence is always less than 2^63. Therefore, no overflow occurs and all operations are correct. Process note: this enhancement is part of JEP 324 (Key Agreement with Curve25519 and Curve448). When this code review is complete, nothing will happen until all other work for this JEP is complete, and the JEP is accepted as part of some release. This means that this code will be pushed to the repo along with the X25519/X448 code that uses it. [1] https://eprint.iacr.org/2015/625.pdf
Re: RFR 8181594: Efficient and constant-time modular arithmetic
+core-libs-dev On 1/26/2018 4:06 PM, Adam Petcher wrote: JBS: https://bugs.openjdk.java.net/browse/JDK-8181594 Webrev: http://cr.openjdk.java.net/~apetcher/8181594/webrev.00/ This is a code review for the field arithmetic that will be used in implementations of X25519/X448 key agreement, the Poly1305 authenticator, and EdDSA signatures. I believe that the library has all the features necessary for X25519/X448 and Poly1305, and I expect at most a couple of minor enhancements will be required to support EdDSA. There is no public API for this library, so we can change it in the future to suit the needs of new algorithms without breaking compatibility with external code. Still, I made an attempt to clearly structure and document the (internal) API, and I want to make sure it is understandable and easy to use. This is not a general-purpose modular arithmetic library. It will only work well in circumstances where the sequence of operations is restricted, and where the prime that defines the field has some useful structure. Moreover, each new field will require some field-specific code that takes into account the structure of the prime and the way the field is used in the application. The initial implementation includes a field for Poly1305 and the fields for X25519/X448 which should also work for EdDSA. The benefits of using this library are that it is much more efficient than using similar operations in BigInteger. Also, many operations are branch-free, making them suitable for use in a side-channel resistant implementation that does not branch on secrets. To provide some context, I have attached a code snippet describing how this library can be used. The snippet is the constant-time Montgomery ladder from my X25519/X448 implementation, which I expect to be out for review soon. X25519/X448 only uses standard arithmetic operations, and the more unusual features (e.g. add modulo a power of 2) are needed by Poly1305. The field arithmetic (for all fields) is implemented using a 32-bit representation similar to the one described in the Ed448 paper[1] (in the "Implementation on 32-bit platforms" section). Though my implementation uses signed limbs, and grade-school multiplication instead of Karatsuba. The argument for correctness is essentially the same for all three fields: the magnitude of each 64-bit limb is at most 2^(k-1) after reduction, except for the last limb which may have a magnitude of up to 2^k. The values of k are between 26 to 28 (depending on the field), and we can calculate that the maximum magnitude for any limb during an add-multiply-carry-reduce sequence is always less than 2^63. Therefore, no overflow occurs and all operations are correct. Process note: this enhancement is part of JEP 324 (Key Agreement with Curve25519 and Curve448). When this code review is complete, nothing will happen until all other work for this JEP is complete, and the JEP is accepted as part of some release. This means that this code will be pushed to the repo along with the X25519/X448 code that uses it. [1] https://eprint.iacr.org/2015/625.pdf private IntegerModuloP_Base pointMultiply(byte[] k, IntegerModuloP u){ IntegerModuloP x_1 = u; MutableIntegerModuloP x_2 = one.mutable(); MutableIntegerModuloP z_2 = zero.mutable(); MutableIntegerModuloP x_3 = u.mutable(); MutableIntegerModuloP z_3 = one.mutable(); int swap = 0; // Variables below are reused to avoid unnecessary allocation // They will be assigned in the loop, so initial value doesn't matter MutableIntegerModuloP m1 = zero.mutable(); MutableIntegerModuloP DA = zero.mutable(); MutableIntegerModuloP E = zero.mutable(); MutableIntegerModuloP a24_times_E = zero.mutable(); for(int t = params.getBits() - 1; t >= 0; t--){ int k_t = bitAt(k, t); swap = swap ^ k_t; x_2.conditionalSwapWith(x_3, swap); z_2.conditionalSwapWith(z_3, swap); swap = k_t; // A(m1) = x_2 + z_2 m1.setValue(x_2).setSum(z_2); // D = x_3 - z_3 // DA = D * A(m1) DA.setValue(x_3).setDifference(z_3).setProduct(m1); // AA(m1) = A(m1)^2 m1.setSquare(); // B(x_2) = x_2 - z_2 x_2.setDifference(z_2); // C = x_3 + z_3 // CB(x_3) = C * B(x_2) x_3.setSum(z_3).setProduct(x_2); // BB(x_2) = B^2 x_2.setSquare(); // E = AA(m1) - BB(x_2) E.setValue(m1).setDifference(x_2); // compute a24 * E using SmallValue a24_times_E.setValue(E); a24_times_E.setProduct(a24); // assign results to x_3, z_3, x_2, z_2 // x_2 = AA(m1) * BB x_2.setProduct(m1); // z_2 = E * (AA(m1) + a24 * E) z_2.setValue(m1).setSu
Re: RFR 8139206: Add InputStream readNBytes(int len)
The spec of the new method doesn't give me enough information to determine whether it is safe to call it when the value of the length argument is much larger than the number of bytes I expect to actually read. This use case comes up frequently in security libraries, because we have to handle length values that were chosen by an attacker. Would it be possible to add a sentence or two to the spec to clarify this situation? Possible wording, if this method can be called with large length values: "The total amount of memory allocated by this method is proportional to the number of bytes read from the stream. Therefore, the method may be safely called with very large values of {@code len}. Possible wording, otherwise: "The total amount of memory allocated by this method may be proportional to the value of {@code len}. Therefore, calling this method with very large values of {@code len} is not recommended." On 1/17/2018 11:24 AM, Brian Burkhalter wrote: The proposed change has been modified to replace the two methods byte[] InputStream.readAllBytes(int) // reads at most ‘len’ bytes byte[] InputStream.readNBytes(int) // reads exactly ‘len’ bytes or throws IOException with a single method byte[] InputStream.readNBytes(int) // reads at most ‘len’ bytes A negative value of ‘len’ will now cause an IllegalArgumentException instead of an IndexOutOfBoundsException. Also some verbiage has been improved. http://cr.openjdk.java.net/~bpb/8139206/webrev.01/ Thanks, Brian On Jan 16, 2018, at 11:17 AM, Brian Burkhalterwrote: https://bugs.openjdk.java.net/browse/JDK-8139206 http://cr.openjdk.java.net/~bpb/8139206/webrev.00/ This change would add a new method “byte[] InputStream.readNBytes(int len)” which would read up to at most ‘len’ bytes from the stream and return them in an internally allocated array.
Re: API review for X25519/X448
+core-libs-dev (to get some additional API guidance) On 1/3/2018 11:26 AM, Adam Petcher wrote: Now that the JEP[1] for X25519/X448 key agreement is a candidate, we can proceed with the API and specification review. Please review the proposed API spec[2] and provide comments by the end of Saturday, January 13, anywhere on earth. At that point, I will combine your feedback with the initial feedback from the CSR group[3] and submit the API for final review by the CSR. The only significant change to the API since our last discussion[4] is that I changed the names of the key specs and interfaces from "XDH..." to "XEC..." This makes them more general and reusable in things like XEdDSA[5] and other non-Diffie-Hellman cryptosystems based on the representations/operations defined in RFC 7748[6]. [1] http://openjdk.java.net/jeps/324 [2] https://bugs.openjdk.java.net/browse/JDK-8189806 [3] https://wiki.openjdk.java.net/display/csr/Main [4] http://mail.openjdk.java.net/pipermail/security-dev/2017-September/016325.html [5] https://signal.org/docs/specifications/xeddsa/ [6] https://tools.ietf.org/html/rfc7748