Hi Roger, I will plan to push tomorrow morning pending any last minute reviews
Best Lance > On Jul 21, 2020, at 9:57 AM, Roger Riggs <roger.ri...@oracle.com> wrote: > > Hi Rafaello, Lance, > > That looks good to me. > > Thanks, Roger > > On 7/19/20 2:31 PM, Lance Andersen wrote: >> Hi Raffaello, >> >> I think the changes look reasonable. >> >> I have created a webrev, >> http://cr.openjdk.java.net/~lancea/8222187/webrev.00/ >> <http://cr.openjdk.java.net/~lancea/8222187/webrev.00/>, for others to >> review as it is a bit easier than the patch. >> >> I have also run the JCK tests in this area as well as mach5 tiers1-3 (which >> I believe Roger has also) >> >> Best >> Lance >> >>> On Jul 15, 2020, at 5:44 PM, Raffaello Giulietti >>> <raffaello.giulie...@gmail.com <mailto:raffaello.giulie...@gmail.com>> >>> wrote: >>> >>> Hi Roger, >>> >>> On 2020-07-15 22:21, Roger Riggs wrote: >>>> Hi Raffaello, >>>> Base64.java: >>>> 2: Please update 2nd copyright year to 2020 >>> >>> I was unsure what to do here, thanks for guidance. >>> I also removed the seemingly useless import at former L33. >>> >>> >>> >>>> leftovers(): >>>> 996: "&" the shortcutting && is more typical in a condition. >>>> 997: the -= is buried in an expression and a reader might miss it. >>> >>> Corrected, as well as the analogous -= usage for wpos now at L1106-7 >>> >>> >>>> 1053: "can be reallocated to other vars": not a useful comment, >>>> reflecting on only one possible implementation that is out of the control >>>> of the developer. >>>> I'd almost rather see 'len' than 'limit - off'; it might be informative to >>>> the reader if 'limit' was declared final. (1056:) >>> >>> Done, as well as declaring i and v final in the loop. >>> >>> >>> >>>> TestBase54.java: >>>> 2: Update 2nd copyright year to 2020 >>>> 27: Please add the bug number to the @bug line. >>>> Style-wise, I would remove the blank lines at the beginning of blocks. >>>> (1095, 1115) >>> >>> Done. >>> >>> >>>> Thanks, Roger >>>> On 7/14/20 11:47 AM, Raffaello Giulietti wrote: >>>>> Hi Roger, >>>>> >>>>> here's the latest version of the patch. >>>>> >>>>> I did: >>>>> * Withdraw the simplification in encodedOutLength(), as it is indeed out >>>>> of scope for this bug fix. >>>>> * Restore field names in DecInputStream except for nextin (now wpos) and >>>>> nextout (now rpos) because they have slightly different semantics. That's >>>>> just for clarity. I would have nothing against reusing the old names for >>>>> the proposed purpose. >>>>> * Switch back to the original no-arg read() implementation. >>>>> * Adjust comment continuation lines. >>>>> * Preserve the proposed logic of read(byte[], int, int) and the >>>>> supporting methods. >>>>> >>>>> Others needed changes are: >>>>> * Correct the copyright years: that's something better left to Oracle. >>>>> * Remove the apparently useless import at L33. I could build and run >>>>> without it. >>>>> >>>>> >>>>> Greetings >>>>> Raffaello >>>>> >>> >>> >>> ---- >>> >>> # HG changeset patch >>> # User lello >>> # Date 1594848775 -7200 >>> # Wed Jul 15 23:32:55 2020 +0200 >>> # Node ID f7407f35e83ab508f0e6deab4f776bb1a1c85e68 >>> # Parent a5649ebf94193770ca763391dd63807059d333b4 >>> 8222187: java.util.Base64.Decoder stream adds unexpected null bytes at the >>> end >>> Reviewed-by: TBD >>> Contributed-by: Raffaello Giulietti <raffaello.giulie...@gmail.com >>> <mailto:raffaello.giulie...@gmail.com>> >>> >>> diff --git a/src/java.base/share/classes/java/util/Base64.java >>> b/src/java.base/share/classes/java/util/Base64.java >>> --- a/src/java.base/share/classes/java/util/Base64.java >>> +++ b/src/java.base/share/classes/java/util/Base64.java >>> @@ -1,5 +1,5 @@ >>> /* >>> - * Copyright (c) 2012, 2019, Oracle and/or its affiliates. All rights >>> reserved. >>> + * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All rights >>> reserved. >>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>> * >>> * This code is free software; you can redistribute it and/or modify it >>> @@ -30,7 +30,6 @@ >>> import java.io.IOException; >>> import java.io.OutputStream; >>> import java.nio.ByteBuffer; >>> -import java.nio.charset.StandardCharsets; >>> >>> import sun.nio.cs.ISO_8859_1; >>> >>> @@ -961,12 +960,15 @@ >>> >>> private final InputStream is; >>> private final boolean isMIME; >>> - private final int[] base64; // base64 -> byte mapping >>> - private int bits = 0; // 24-bit buffer for decoding >>> - private int nextin = 18; // next available "off" in "bits" >>> for input; >>> - // -> 18, 12, 6, 0 >>> - private int nextout = -8; // next available "off" in "bits" >>> for output; >>> - // -> 8, 0, -8 (no byte for >>> output) >>> + private final int[] base64; // base64 -> byte mapping >>> + private int bits = 0; // 24-bit buffer for decoding >>> + >>> + /* writing bit pos inside bits; one of 24 (left, msb), 18, 12, 6, >>> 0 */ >>> + private int wpos = 0; >>> + >>> + /* reading bit pos inside bits: one of 24 (left, msb), 16, 8, 0 */ >>> + private int rpos = 0; >>> + >>> private boolean eof = false; >>> private boolean closed = false; >>> >>> @@ -983,107 +985,153 @@ >>> return read(sbBuf, 0, 1) == -1 ? -1 : sbBuf[0] & 0xff; >>> } >>> >>> - private int eof(byte[] b, int off, int len, int oldOff) >>> - throws IOException >>> - { >>> + private int leftovers(byte[] b, int off, int pos, int limit) { >>> eof = true; >>> - if (nextin != 18) { >>> - if (nextin == 12) >>> - throw new IOException("Base64 stream has one >>> un-decoded dangling byte."); >>> - // treat ending xx/xxx without padding character legal. >>> - // same logic as v == '=' below >>> - b[off++] = (byte)(bits >> (16)); >>> - if (nextin == 0) { // only one padding byte >>> - if (len == 1) { // no enough output space >>> - bits >>= 8; // shift to lowest byte >>> - nextout = 0; >>> - } else { >>> - b[off++] = (byte) (bits >> 8); >>> - } >>> - } >>> + >>> + /* >>> + * We use a loop here, as this method is executed only a few >>> times. >>> + * Unrolling the loop would probably not contribute much here. >>> + */ >>> + while (rpos - 8 >= wpos && pos != limit) { >>> + rpos -= 8; >>> + b[pos++] = (byte) (bits >> rpos); >>> } >>> - return off == oldOff ? -1 : off - oldOff; >>> + return pos - off != 0 || rpos - 8 >= wpos ? pos - off : -1; >>> } >>> >>> - private int padding(byte[] b, int off, int len, int oldOff) >>> - throws IOException >>> - { >>> - // = shiftto==18 unnecessary padding >>> - // x= shiftto==12 dangling x, invalid unit >>> - // xx= shiftto==6 && missing last '=' >>> - // xx=y or last is not '=' >>> - if (nextin == 18 || nextin == 12 || >>> - nextin == 6 && is.read() != '=') { >>> - throw new IOException("Illegal base64 ending sequence:" + >>> nextin); >>> + private int eof(byte[] b, int off, int pos, int limit) throws >>> IOException { >>> + /* >>> + * pos != limit >>> + * >>> + * wpos == 18: x dangling single x, invalid unit >>> + * accept ending xx or xxx without padding characters >>> + */ >>> + if (wpos == 18) { >>> + throw new IOException("Base64 stream has one un-decoded >>> dangling byte."); >>> } >>> - b[off++] = (byte)(bits >> (16)); >>> - if (nextin == 0) { // only one padding byte >>> - if (len == 1) { // no enough output space >>> - bits >>= 8; // shift to lowest byte >>> - nextout = 0; >>> - } else { >>> - b[off++] = (byte) (bits >> 8); >>> - } >>> + rpos = 24; >>> + return leftovers(b, off, pos, limit); >>> + } >>> + >>> + private int padding(byte[] b, int off, int pos, int limit) throws >>> IOException { >>> + /* >>> + * pos != limit >>> + * >>> + * wpos == 24: = (unnecessary padding) >>> + * wpos == 18: x= (dangling single x, invalid unit) >>> + * wpos == 12 and missing last '=': xx= (invalid padding) >>> + * wpos == 12 and last is not '=': xx=x (invalid padding) >>> + */ >>> + if (wpos >= 18 || wpos == 12 && is.read() != '=') { >>> + throw new IOException("Illegal base64 ending sequence:" + >>> wpos); >>> } >>> - eof = true; >>> - return off - oldOff; >>> + rpos = 24; >>> + return leftovers(b, off, pos, limit); >>> } >>> >>> @Override >>> public int read(byte[] b, int off, int len) throws IOException { >>> - if (closed) >>> + if (closed) { >>> throw new IOException("Stream is closed"); >>> - if (eof && nextout < 0) // eof and no leftover >>> - return -1; >>> - if (off < 0 || len < 0 || len > b.length - off) >>> - throw new IndexOutOfBoundsException(); >>> - int oldOff = off; >>> - while (nextout >= 0) { // leftover output byte(s) in >>> bits buf >>> - if (len == 0) >>> - return off - oldOff; >>> - b[off++] = (byte)(bits >> nextout); >>> - len--; >>> - nextout -= 8; >>> + } >>> + Objects.checkFromIndexSize(off, len, b.length); >>> + if (len == 0) { >>> + return 0; >>> } >>> - bits = 0; >>> - while (len > 0) { >>> - int v = is.read(); >>> - if (v == -1) { >>> - return eof(b, off, len, oldOff); >>> + >>> + /* >>> + * Rather than keeping 2 running vars (e.g., off and len), >>> + * we only keep one (pos), while definitely fixing the >>> boundaries >>> + * of the range [off, limit). >>> + * More specifically, each use of pos as an index in b meets >>> + * pos - off >= 0 & limit - pos > 0 >>> + * >>> + * Note that limit can overflow to Integer.MIN_VALUE. However, >>> + * as long as comparisons with pos are as coded, there's no >>> harm. >>> + */ >>> + int pos = off; >>> + final int limit = off + len; >>> + if (eof) { >>> + return leftovers(b, off, pos, limit); >>> + } >>> + >>> + /* >>> + * Leftovers from previous invocation; here, wpos = 0. >>> + * There can be at most 2 leftover bytes (rpos <= 16). >>> + * Further, b has at least one free place. >>> + * >>> + * The logic could be coded as a loop, (as in method >>> leftovers()) >>> + * but the explicit "unrolling" makes it possible to generate >>> + * better byte extraction code. >>> + */ >>> + if (rpos == 16) { >>> + b[pos++] = (byte) (bits >> 8); >>> + rpos = 8; >>> + if (pos == limit) { >>> + return len; >>> } >>> - if ((v = base64[v]) < 0) { >>> - if (v == -2) { // padding byte(s) >>> - return padding(b, off, len, oldOff); >>> - } >>> - if (v == -1) { >>> - if (!isMIME) >>> - throw new IOException("Illegal base64 >>> character " + >>> - Integer.toString(v, 16)); >>> - continue; // skip if for rfc2045 >>> - } >>> - // neve be here >>> - } >>> - bits |= (v << nextin); >>> - if (nextin == 0) { >>> - nextin = 18; // clear for next in >>> - b[off++] = (byte)(bits >> 16); >>> - if (len == 1) { >>> - nextout = 8; // 2 bytes left in bits >>> - break; >>> - } >>> - b[off++] = (byte)(bits >> 8); >>> - if (len == 2) { >>> - nextout = 0; // 1 byte left in bits >>> - break; >>> - } >>> - b[off++] = (byte)bits; >>> - len -= 3; >>> - bits = 0; >>> - } else { >>> - nextin -= 6; >>> + } >>> + if (rpos == 8) { >>> + b[pos++] = (byte) bits; >>> + rpos = 0; >>> + if (pos == limit) { >>> + return len; >>> } >>> } >>> - return off - oldOff; >>> + >>> + bits = 0; >>> + wpos = 24; >>> + for (;;) { >>> + /* pos != limit & rpos == 0 */ >>> + final int i = is.read(); >>> + if (i < 0) { >>> + return eof(b, off, pos, limit); >>> + } >>> + final int v = base64[i]; >>> + if (v < 0) { >>> + /* >>> + * i not in alphabet, thus >>> + * v == -2: i is '=', the padding >>> + * v == -1: i is something else, typically CR or >>> LF >>> + */ >>> + if (v == -1) { >>> + if (isMIME) { >>> + continue; >>> + } >>> + throw new IOException("Illegal base64 character >>> 0x" + >>> + Integer.toHexString(i)); >>> + } >>> + return padding(b, off, pos, limit); >>> + } >>> + wpos -= 6; >>> + bits |= v << wpos; >>> + if (wpos != 0) { >>> + continue; >>> + } >>> + if (limit - pos >= 3) { >>> + /* frequently taken fast path, no need to track rpos */ >>> + b[pos++] = (byte) (bits >> 16); >>> + b[pos++] = (byte) (bits >> 8); >>> + b[pos++] = (byte) bits; >>> + bits = 0; >>> + wpos = 24; >>> + if (pos == limit) { >>> + return len; >>> + } >>> + continue; >>> + } >>> + >>> + /* b has either 1 or 2 free places */ >>> + b[pos++] = (byte) (bits >> 16); >>> + if (pos == limit) { >>> + rpos = 16; >>> + return len; >>> + } >>> + b[pos++] = (byte) (bits >> 8); >>> + /* pos == limit, no need for an if */ >>> + rpos = 8; >>> + return len; >>> + } >>> } >>> >>> @Override >>> diff --git a/test/jdk/java/util/Base64/TestBase64.java >>> b/test/jdk/java/util/Base64/TestBase64.java >>> --- a/test/jdk/java/util/Base64/TestBase64.java >>> +++ b/test/jdk/java/util/Base64/TestBase64.java >>> @@ -1,5 +1,5 @@ >>> /* >>> - * Copyright (c) 2012, 2017, Oracle and/or its affiliates. All rights >>> reserved. >>> + * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All rights >>> reserved. >>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>> * >>> * This code is free software; you can redistribute it and/or modify it >>> @@ -24,7 +24,7 @@ >>> /** >>> * @test >>> * @bug 4235519 8004212 8005394 8007298 8006295 8006315 8006530 8007379 >>> 8008925 >>> - * 8014217 8025003 8026330 8028397 8129544 8165243 8176379 >>> + * 8014217 8025003 8026330 8028397 8129544 8165243 8176379 8222187 >>> * @summary tests java.util.Base64 >>> * @library /test/lib >>> * @build jdk.test.lib.RandomFactory >>> @@ -144,6 +144,10 @@ >>> testDecoderKeepsAbstinence(Base64.getDecoder()); >>> testDecoderKeepsAbstinence(Base64.getUrlDecoder()); >>> testDecoderKeepsAbstinence(Base64.getMimeDecoder()); >>> + >>> + // tests patch addressing JDK-8222187 >>> + // https://bugs.openjdk.java.net/browse/JDK-8222187 >>> <https://bugs.openjdk.java.net/browse/JDK-8222187> >>> + testJDK_8222187(); >>> } >>> >>> private static void test(Base64.Encoder enc, Base64.Decoder dec, >>> @@ -607,4 +611,26 @@ >>> } >>> } >>> } >>> + >>> + private static void testJDK_8222187() throws Throwable { >>> + byte[] orig = "12345678".getBytes("US-ASCII"); >>> + byte[] encoded = Base64.getEncoder().encode(orig); >>> + // decode using different buffer sizes, up to a longer one than >>> needed >>> + for (int bufferSize = 1; bufferSize <= encoded.length + 1; >>> bufferSize++) { >>> + try ( >>> + InputStream in = Base64.getDecoder().wrap( >>> + new ByteArrayInputStream(encoded)); >>> + ByteArrayOutputStream baos = new >>> ByteArrayOutputStream(); >>> + ) { >>> + byte[] buffer = new byte[bufferSize]; >>> + int read; >>> + while ((read = in.read(buffer, 0, bufferSize)) >= 0) { >>> + baos.write(buffer, 0, read); >>> + } >>> + // compare result, output info if lengths do not match >>> + byte[] decoded = baos.toByteArray(); >>> + checkEqual(decoded, orig, "Base64 stream decoding >>> failed!"); >>> + } >>> + } >>> + } >>> } >>> <JDK-8222187.patch> >> >> <oracle_sig_logo.gif> >> <http://oracle.com/us/design/oracle-email-sig-198324.gif> >> <http://oracle.com/us/design/oracle-email-sig-198324.gif> >> <http://oracle.com/us/design/oracle-email-sig-198324.gif> >> <http://oracle.com/us/design/oracle-email-sig-198324.gif>Lance Andersen| >> Principal Member of Technical Staff | +1.781.442.2037 >> Oracle Java Engineering >> 1 Network Drive >> Burlington, MA 01803 >> lance.ander...@oracle.com <mailto:lance.ander...@oracle.com> >> >> >> > Best Lance ------------------ Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 Oracle Java Engineering 1 Network Drive Burlington, MA 01803 lance.ander...@oracle.com