On 10/8/2020 9:20 AM, Alexander Scheel wrote:
Hi all,

I saw that ALPN support from JEP 244 was backported to JDK8 and I've
recently had the time to take a closer look at it. For context, I'm
one of the maintainers of JSS, a NSS wrapper for Java. I've been
discussing this with another contributor, Fraser (cc'd).

Hi, thanks for looking it over, and especially thanks for reporting this. I've filed:

    https://bugs.openjdk.java.net/browse/JDK-8254631

to track.

One of the concerns we have with the implementation (and its exposure
in the corresponding SSLEngine/SSLSocket/SSLParameters interface) is
that protocols are passed in as Strings. However, RFC 7301 says in
section 6:

    o  Identification Sequence: The precise set of octet values that
       identifies the protocol.  This could be the UTF-8 encoding
       [RFC3629] of the protocol name.

This "could be" is probably what the original designer of the ALPN API went with for API ease-of-use, and it made sense at the time as everything in the IANA TLS Extensions list was in the ASCII range (0x00-0x7F). But the GREASE values (0x80-0xFF) invalidated that assumption.

When applied with GREASE'd values from RFC 8701, Strings don't work
well. In particular, most of the registered values [0] are non-UTF-8,

0x0A-0x7A does work, but 0x8A-0xFA won't as you pointed out.

which can't be easily round-tripped in Java. This means that while
precise octet values are specified by IANA, they cannot be properly
specified in Java.

In particular:

     byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
     String encoded = new String(desired, StandardCharsets.UTF_8);
     byte[] wire    = encoded.getBytes(StandardCharsets.UTF_8);
     String round   = new String(wire, StandardCharsets.UTF_8);

Right. These 2 bytes are mapped by the decoder to 2 Replacement Characters (U+FFFD, usually rendered as "?"), which UTF-8 then encodes as 6 bytes:

    0xef, 0xbf, 0xbd,     0xef, 0xbf, 0xbd

    https://www.fileformat.info/info/unicode/char/fffd/index.htm
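
For anyone who wants to reproduce this locally, here is a minimal sketch of the UTF-8 round trip described above. The class and method names are made up for illustration; only the standard String/charset calls are real.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of SunJSSE.
class Utf8RoundTripSketch {

    // Decode the raw ALPN bytes as UTF-8, then encode the String back to
    // UTF-8, which mirrors what a String-based API ends up doing.
    static byte[] roundTrip(byte[] desired) {
        String encoded = new String(desired, StandardCharsets.UTF_8);
        return encoded.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated hex, e.g. "0xef 0xbf 0xbd".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] grease = { (byte) 0xFA, (byte) 0xFA };
        // prints "0xef 0xbf 0xbd 0xef 0xbf 0xbd"
        System.out.println(hex(roundTrip(grease)));

        byte[] lowGrease = { (byte) 0x0A, (byte) 0x0A };
        // prints "0x0a 0x0a" (values below 0x80 survive unchanged)
        System.out.println(hex(roundTrip(lowGrease)));
    }
}
```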

fails, as does choosing US_ASCII for the encoding:

     byte[] desired = new byte[]{ (byte) 0xFA, (byte) 0xFA };
     String encoded = new String(desired, StandardCharsets.US_ASCII);
     byte[] wire    = encoded.getBytes(StandardCharsets.UTF_8);
     String round   = new String(wire, StandardCharsets.UTF_8);

Yes, US_ASCII only uses the low 7 bits, so each byte also decodes to a replacement character. Encoded back with US_ASCII, those become 2 question marks ("?"):

    0x3f    0x3f
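
A small sketch of the US_ASCII case, in case it helps to see both directions separately. The class and helper names are illustrative only. Note that the bytes that reach the wire depend on which charset does the final encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of SunJSSE.
class AsciiReplacementSketch {

    // Decode raw bytes as US_ASCII. Bytes >= 0x80 are malformed input and
    // are replaced with U+FFFD by the decoder.
    static String decodeAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    // Encode a String with the given charset. Characters that US_ASCII
    // cannot represent are replaced with '?' (0x3f) by the encoder.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] grease = { (byte) 0xFA, (byte) 0xFA };
        String decoded = decodeAscii(grease);

        // Both characters are U+FFFD.
        System.out.printf("U+%04X U+%04X%n",
                (int) decoded.charAt(0), (int) decoded.charAt(1));

        // Encoding with US_ASCII gives 2 bytes, encoding with UTF-8 gives 6.
        System.out.println(encode(decoded, StandardCharsets.US_ASCII).length);
        System.out.println(encode(decoded, StandardCharsets.UTF_8).length);
    }
}
```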

Note that we (at the application level) can't control the final (wire
/ round-tripped) encoding to UTF_8 as this is done within the SunJSSE
implementation:

Correct.

and perhaps other files I'm missing.

This decreases interoperability with other TLS implementations.
OpenSSL [1], NSS [2], and GnuTLS [3] support setting opaque blobs as
the ALPN protocol list, meaning the caller is free to supply GREASE'd
values. Go on the other hand still uses its string [4], but that
string class supports round-tripping non-UTF8 values correctly [5].

Additionally, it means that GREASE'd values sent by Java applications
aren't compliant with the RFC 8701/IANA wire values.

Is there some workaround I'm missing?

Nothing is coming to mind.

I believe that setting US_ASCII internally in SunJSSE isn't sufficient
to ensure the right wire encoding gets used. I'm thinking the only
real fix is to deprecate the String methods and provide byte[] methods
for all identifiers.

There is one other option that doesn't introduce a new API but does have some compatibility risk, and that is to use the ISO_8859_1/LATIN-1 charset instead of UTF-8. This would require folks who use UTF-8 to update their code, but I haven't yet found any code in the wild which actually uses anything U+0080 and above. I'm proposing a Security (or System?) property which would revert the behavior if it becomes a problem.
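
To show why ISO_8859_1 is attractive here, the following is a minimal sketch (class and method names are mine, not anything in the JDK) that checks every single-byte value survives a bytes -> String -> bytes round trip, because ISO_8859_1 maps byte value N directly to U+00NN:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of SunJSSE.
class Latin1RoundTripSketch {

    // True if the bytes come back unchanged after being carried in a String.
    static boolean roundTrips(byte[] original) {
        String carried = new String(original, StandardCharsets.ISO_8859_1);
        byte[] wire = carried.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(original, wire);
    }

    // Build an array holding every byte value 0x00..0xFF once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(roundTrips(allByteValues()));
        // The GREASE value that failed with UTF-8; prints "true"
        System.out.println(roundTrips(new byte[]{ (byte) 0xFA, (byte) 0xFA }));
    }
}
```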

See the attached file, which is a proposal+code example which will eventually be turned into a formal CSR barring any significant issue.

I talked to our CSR lead, and he felt that in this case interoperability probably trumps compatibility, given that the affected character values likely aren't being used anyway and the behavior was underspecified.

Brad
/*
 * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

import java.nio.charset.StandardCharsets;

/*
 * (This text will likely form the basis of a future CSR.)
 * 
 * https://bugs.openjdk.java.net/browse/JDK-8254631
 * 
 * ALPN (RFC7301) values are sent in TLS extensions using byte arrays, but the
 * Java ALPN APIs selected Strings for ease of use. Internally, these Java
 * Strings are converted to byte arrays using UTF-8 as suggested as a possible
 * encoding in Section 6 of RFC 7301. This encoding convention was never
 * specified by the RFC or Java documentation/APIs.
 *
 * It is currently not possible for SunJSSE to output ALPN characters in the
 * range (U+0080-U+00FF) as single bytes. The UTF-8 encoder/decoder instead
 * converts them to a multi-byte representation.
 *
 * The GREASE mechanism (RFC 8701) was subsequently developed to help prevent
 * extensibility failures in the TLS ecosystem. Unfortunately, half of the
 * defined GREASE values fall into the (U+0080-U+00FF) range, and thus can't be
 * represented by SunJSSE (client or server side).
 *
 * A new API could be defined to use byte arrays, but this would not be
 * helpful for earlier Java releases (8/11/15) without a Maintenance
 * Release (MR).  e.g. 
 * 
 *     https://jcp.org/aboutJava/communityprocess/mrel/jsr337/index3.html
 * 
 * The proposed workaround/fix is to have the Java JSSE implementation encode
 * Strings directly as ISO_8859_1/LATIN-1 which correctly outputs
 * (U+0000-U+00FF), but other Unicode values U+0100-U+10FFFF will need to be
 * converted by applications to multiple consecutive bytes before sending
 * (e.g. UTF-8) rather than depending on SunJSSE to automatically provide
 * the (possibly incorrect) encoding.
 *
 * We don't anticipate this to be a significant interoperability issue, since
 * all known/current values in the IETF/IANA TLS ALPN extension list can be
 * encoded as ISO_8859_1/LATIN-1:
 *
 *     https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids
 *
 * This change will actually enhance interoperability with other
 * implementations which use byte arrays.
 *
 * The only compatibility issue is if characters larger than U+007F are used.
 * We don't know of any applications currently using such ALPN values, but
 * there could be.  These values must be converted to the format required by
 * their peer.
 *
 * For compatibility issues, we introduce the following Java Security Property
 * to reverse this change:
 *
 *     #
 *     # The default Character set for converting ALPN values between byte
 *     # arrays and Strings. Older versions of JDK used UTF-8.
 *     #
 *     # jdk.tls.alpnCharacterEncoder=UTF-8
 *     jdk.tls.alpnCharacterEncoder=ISO_8859_1
 *
 * which can be overridden to restore the previous conversion process.
 */
public class ALPNStringToBytesExample {
    
    /*
     * Any Unicode/Supplemental Unicode Values that need to be passed as UTF-8
     * must be first converted (see below):
     *
     *     'MEETEI MAYEK LETTER HUK'
     *     'MEETEI MAYEK LETTER UN'
     *     'MEETEI MAYEK LETTER I'
     *
     *     'DESERET CAPITAL LETTER LONG I'
     *     'DESERET CAPITAL LETTER LONG E'
     */
    private static final String HUKUNI = "\uabcd\uabce\uabcf";
    private static final String IE
            = new String(new int[]{0x10400, 0x10401}, 0, 2);

    // ALPN String array that will eventually be passed to SSLEngine/SSLSocket.
    private static final String[] ALPN_STRINGS = new String[]{
        
        // From the IETF/IANA TLS ALPN extension list.
        
        // ASCII/ISO_8859_1/LATIN-1 Strings
        "http/1.1",    // 0x68 0x74 0x74 0x70 0x2f 0x31 0x2e 0x31
        "h2",          // 0x68 0x32
        "imap",        // 0x69 0x6d 0x61 0x70
        "sunrpc",      // 0x73 0x75 0x6e 0x72 0x70 0x63
                       // etc.

        // GREASE (RFC 8701)
        toISO_8859_1((byte) 0x0A, (byte) 0x0A),
        toISO_8859_1((byte) 0x1A, (byte) 0x1A),
        toISO_8859_1((byte) 0x2A, (byte) 0x2A),
        toISO_8859_1((byte) 0x3A, (byte) 0x3A),
        toISO_8859_1((byte) 0x4A, (byte) 0x4A),
        toISO_8859_1((byte) 0x5A, (byte) 0x5A),
        toISO_8859_1((byte) 0x6A, (byte) 0x6A),
        toISO_8859_1((byte) 0x7A, (byte) 0x7A),
        toISO_8859_1((byte) 0x8A, (byte) 0x8A),
        toISO_8859_1((byte) 0x9A, (byte) 0x9A),
        toISO_8859_1((byte) 0xAA, (byte) 0xAA),
        toISO_8859_1((byte) 0xBA, (byte) 0xBA),
        toISO_8859_1((byte) 0xCA, (byte) 0xCA),
        toISO_8859_1((byte) 0xDA, (byte) 0xDA),
        toISO_8859_1((byte) 0xEA, (byte) 0xEA),
        toISO_8859_1((byte) 0xFA, (byte) 0xFA),

        // Additional Regular and Supplemental Unicode Points (above)
        toISO_8859_1(HUKUNI.getBytes(StandardCharsets.UTF_8)),
        toISO_8859_1(IE.getBytes(StandardCharsets.UTF_8))
    };

    public static void main(String[] args) throws Exception {

        /*
         * Create SSLEngine and set ALPN parameters.
         * 
         *     SSLContext sslContext = SSLContext.getDefault();
         *     SSLEngine sslEngine = sslContext.createSSLEngine("peer", 80);
         *     SSLParameters sslParameters = sslEngine.getSSLParameters();
         *     sslParameters.setApplicationProtocols(ALPN_STRINGS);
         *     sslEngine.setSSLParameters(sslParameters);
         *     sslEngine.beginHandshake(); sslEngine.wrap()/unwrap();
         *     // etc.
         */

        /*
         * Local SunJSSE will now encode the String array as ISO_8859_1
         * byte array as expected by RFC 8701.
         */
        byte[][] outgoingBytes = new byte[ALPN_STRINGS.length][];
        for (int i = 0; i < ALPN_STRINGS.length; i++) {
            outgoingBytes[i]
                    = ALPN_STRINGS[i].getBytes(StandardCharsets.ISO_8859_1);
        }

        /*
         * Peer SunJSSE receives byte array and parses back into ISO_8859_1
         * String array.
         */
        String[] incomingStrings = new String[outgoingBytes.length];
        for (int i = 0; i < incomingStrings.length; i++) {
            incomingStrings[i]
                    = new String(outgoingBytes[i], StandardCharsets.ISO_8859_1);
        }

        // Check the ASCII/LATIN chars.
        for (int i = 0; i < incomingStrings.length - 2; i++) {
            checkStrings(i, incomingStrings[i], ALPN_STRINGS[i]);
        }

        // Last 2 Strings need to be decoded back as UTF-8.
        checkStrings(incomingStrings.length - 2,
                toUTF_8String(incomingStrings[incomingStrings.length - 2]),
                HUKUNI);
        checkStrings(incomingStrings.length - 1,
                toUTF_8String(incomingStrings[incomingStrings.length - 1]),
                IE);
    }

    // Shorten method calls above.
    private static String toISO_8859_1(byte... bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
    
    // Shorten method calls above.
    private static String toUTF_8String(String incomingString) {
        return new String(incomingString.getBytes(
                StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    private static void checkStrings(int i, String incoming, String alpn) {
        System.out.println(i + ": \"" + incoming + "\" = \""
                + alpn + "\"");

        if (!incoming.equals(alpn)) {
            System.out.println("ISO_8859_1 didn't convert cleanly");
        }
    }
}
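
As a side note to the attachment, the 16 GREASE ALPN entries listed by hand above could also be generated. This is only a sketch under the assumption that SunJSSE encodes the Strings as ISO_8859_1 as proposed. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes the proposed ISO_8859_1 String encoding.
class GreaseAlpnSketch {

    // Build the 16 two-byte GREASE ALPN values 0x0A0A, 0x1A1A, ... 0xFAFA
    // (RFC 8701) as Strings whose chars equal the wire byte values.
    static List<String> greaseAlpnStrings() {
        List<String> values = new ArrayList<>();
        for (int high = 0x0; high <= 0xF; high++) {
            byte b = (byte) ((high << 4) | 0x0A);
            values.add(new String(new byte[]{ b, b },
                    StandardCharsets.ISO_8859_1));
        }
        return values;
    }

    // Count the values that fall outside 7-bit ASCII (0x8A..0xFA).
    static long countAboveAscii(List<String> values) {
        return values.stream().filter(v -> v.charAt(0) > 0x7F).count();
    }

    public static void main(String[] args) {
        List<String> grease = greaseAlpnStrings();
        // prints "16 values, 8 above 0x7F"
        System.out.println(grease.size() + " values, "
                + countAboveAscii(grease) + " above 0x7F");
    }
}
```

The resulting list could be converted with toArray(new String[0]) and combined with the regular protocol names before calling setApplicationProtocols().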
