Hi Rafaello, Lance,

That looks good to me.

Thanks, Roger

On 7/19/20 2:31 PM, Lance Andersen wrote:
Hi Raffaello,

I think the changes look reasonable.

I have created a webrev, http://cr.openjdk.java.net/~lancea/8222187/webrev.00/, for others to review as it is a bit easier than the patch.

I have also run the JCK tests in this area as well as mach5 tiers1-3 (which I believe Roger has also)

Best
Lance

On Jul 15, 2020, at 5:44 PM, Raffaello Giulietti <raffaello.giulie...@gmail.com <mailto:raffaello.giulie...@gmail.com>> wrote:

Hi Roger,

On 2020-07-15 22:21, Roger Riggs wrote:
Hi Raffaello,
Base64.java:
2: Please update 2nd copyright year to 2020

I was unsure what to do here, thanks for guidance.
I also removed the seemingly useless import at former L33.



leftovers():
996: "&" the shortcutting && is more typical in a condition.
997: the -= is buried in an expression and a reader might miss it.

Corrected, as well as the analogous -= usage for wpos now at L1106-7


1053:  "can be reallocated to other vars":  not a useful comment, reflecting on only one possible implementation that is out of the control of the developer. I'd almost rather see 'len' than 'limit - off'; it might be informative to the reader if 'limit' was declared final. (1056:)

Done, as well as declaring i and v final in the loop.



TestBase54.java:
2: Update 2nd copyright year to 2020
27:  Please add the bug number to the @bug line.
Style-wise, I would remove the blank lines at the beginning of blocks. (1095, 1115)

Done.


Thanks, Roger
On 7/14/20 11:47 AM, Raffaello Giulietti wrote:
Hi Roger,

here's the latest version of the patch.

I did:
* Withdraw the simplification in encodedOutLength(), as it is indeed out of scope for this bug fix. * Restore field names in DecInputStream except for nextin (now wpos) and nextout (now rpos) because they have slightly different semantics. That's just for clarity. I would have nothing against reusing the old names for the proposed purpose.
* Switch back to the original no-arg read() implementation.
* Adjust comment continuation lines.
* Preserve the proposed logic of read(byte[], int, int) and the supporting methods.

Others needed changes are:
* Correct the copyright years: that's something better left to Oracle.
* Remove the apparently useless import at L33. I could build and run without it.


Greetings
Raffaello



----

# HG changeset patch
# User lello
# Date 1594848775 -7200
#      Wed Jul 15 23:32:55 2020 +0200
# Node ID f7407f35e83ab508f0e6deab4f776bb1a1c85e68
# Parent  a5649ebf94193770ca763391dd63807059d333b4
8222187: java.util.Base64.Decoder stream adds unexpected null bytes at the end
Reviewed-by: TBD
Contributed-by: Raffaello Giulietti <raffaello.giulie...@gmail.com <mailto:raffaello.giulie...@gmail.com>>

diff --git a/src/java.base/share/classes/java/util/Base64.java b/src/java.base/share/classes/java/util/Base64.java
--- a/src/java.base/share/classes/java/util/Base64.java
+++ b/src/java.base/share/classes/java/util/Base64.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2012, 2019, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
@@ -30,7 +30,6 @@
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
-import java.nio.charset.StandardCharsets;

import sun.nio.cs.ISO_8859_1;

@@ -961,12 +960,15 @@

        private final InputStream is;
        private final boolean isMIME;
-        private final int[] base64;      // base64 -> byte mapping
-        private int bits = 0;            // 24-bit buffer for decoding
-        private int nextin = 18;         // next available "off" in "bits" for input;
-                                         // -> 18, 12, 6, 0
-        private int nextout = -8;        // next available "off" in "bits" for output; -                                         // -> 8, 0, -8 (no byte for output)
+        private final int[] base64;     // base64 -> byte mapping
+        private int bits = 0;           // 24-bit buffer for decoding
+
+        /* writing bit pos inside bits; one of 24 (left, msb), 18, 12, 6, 0 */
+        private int wpos = 0;
+
+        /* reading bit pos inside bits: one of 24 (left, msb), 16, 8, 0 */
+        private int rpos = 0;
+
        private boolean eof = false;
        private boolean closed = false;

@@ -983,107 +985,153 @@
            return read(sbBuf, 0, 1) == -1 ? -1 : sbBuf[0] & 0xff;
        }

-        private int eof(byte[] b, int off, int len, int oldOff)
-            throws IOException
-        {
+        private int leftovers(byte[] b, int off, int pos, int limit) {
            eof = true;
-            if (nextin != 18) {
-                if (nextin == 12)
-                    throw new IOException("Base64 stream has one un-decoded dangling byte.");
-                // treat ending xx/xxx without padding character legal.
-                // same logic as v == '=' below
-                b[off++] = (byte)(bits >> (16));
-                if (nextin == 0) {           // only one padding byte
-                    if (len == 1) {          // no enough output space
-                        bits >>= 8;          // shift to lowest byte
-                        nextout = 0;
-                    } else {
-                        b[off++] = (byte) (bits >>  8);
-                    }
-                }
+
+            /*
+             * We use a loop here, as this method is executed only a few times. +             * Unrolling the loop would probably not contribute much here.
+             */
+            while (rpos - 8 >= wpos && pos != limit) {
+                rpos -= 8;
+                b[pos++] = (byte) (bits >> rpos);
            }
-            return off == oldOff ? -1 : off - oldOff;
+            return pos - off != 0 || rpos - 8 >= wpos ? pos - off : -1;
        }

-        private int padding(byte[] b, int off, int len, int oldOff)
-            throws IOException
-        {
-            // =     shiftto==18 unnecessary padding
-            // x=    shiftto==12 dangling x, invalid unit
-            // xx=   shiftto==6 && missing last '='
-            // xx=y  or last is not '='
-            if (nextin == 18 || nextin == 12 ||
-                nextin == 6 && is.read() != '=') {
-                throw new IOException("Illegal base64 ending sequence:" + nextin); +        private int eof(byte[] b, int off, int pos, int limit) throws IOException {
+            /*
+             * pos != limit
+             *
+             * wpos == 18: x     dangling single x, invalid unit
+             * accept ending xx or xxx without padding characters
+             */
+            if (wpos == 18) {
+                throw new IOException("Base64 stream has one un-decoded dangling byte.");
            }
-            b[off++] = (byte)(bits >> (16));
-            if (nextin == 0) {           // only one padding byte
-                if (len == 1) {          // no enough output space
-                    bits >>= 8;          // shift to lowest byte
-                    nextout = 0;
-                } else {
-                    b[off++] = (byte) (bits >>  8);
-                }
+            rpos = 24;
+            return leftovers(b, off, pos, limit);
+        }
+
+        private int padding(byte[] b, int off, int pos, int limit) throws IOException {
+            /*
+             * pos != limit
+             *
+             * wpos == 24: =    (unnecessary padding)
+             * wpos == 18: x=   (dangling single x, invalid unit)
+             * wpos == 12 and missing last '=': xx=  (invalid padding)
+             * wpos == 12 and last is not '=': xx=x (invalid padding)
+             */
+            if (wpos >= 18 || wpos == 12 && is.read() != '=') {
+                throw new IOException("Illegal base64 ending sequence:" + wpos);
            }
-            eof = true;
-            return off - oldOff;
+            rpos = 24;
+            return leftovers(b, off, pos, limit);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
-            if (closed)
+            if (closed) {
                throw new IOException("Stream is closed");
-            if (eof && nextout < 0)    // eof and no leftover
-                return -1;
-            if (off < 0 || len < 0 || len > b.length - off)
-                throw new IndexOutOfBoundsException();
-            int oldOff = off;
-            while (nextout >= 0) {       // leftover output byte(s) in bits buf
-                if (len == 0)
-                    return off - oldOff;
-                b[off++] = (byte)(bits >> nextout);
-                len--;
-                nextout -= 8;
+            }
+            Objects.checkFromIndexSize(off, len, b.length);
+            if (len == 0) {
+                return 0;
            }
-            bits = 0;
-            while (len > 0) {
-                int v = is.read();
-                if (v == -1) {
-                    return eof(b, off, len, oldOff);
+
+            /*
+             * Rather than keeping 2 running vars (e.g., off and len),
+             * we only keep one (pos), while definitely fixing the boundaries
+             * of the range [off, limit).
+             * More specifically, each use of pos as an index in b meets
+             *      pos - off >= 0 & limit - pos > 0
+             *
+             * Note that limit can overflow to Integer.MIN_VALUE. However, +             * as long as comparisons with pos are as coded, there's no harm.
+             */
+            int pos = off;
+            final int limit = off + len;
+            if (eof) {
+                return leftovers(b, off, pos, limit);
+            }
+
+            /*
+             * Leftovers from previous invocation; here, wpos = 0.
+             * There can be at most 2 leftover bytes (rpos <= 16).
+             * Further, b has at least one free place.
+             *
+             * The logic could be coded as a loop, (as in method leftovers()) +             * but the explicit "unrolling" makes it possible to generate
+             * better byte extraction code.
+             */
+            if (rpos == 16) {
+                b[pos++] = (byte) (bits >> 8);
+                rpos = 8;
+                if (pos == limit) {
+                    return len;
                }
-                if ((v = base64[v]) < 0) {
-                    if (v == -2) {       // padding byte(s)
-                        return padding(b, off, len, oldOff);
-                    }
-                    if (v == -1) {
-                        if (!isMIME)
-                            throw new IOException("Illegal base64 character " +
-                                Integer.toString(v, 16));
-                        continue;        // skip if for rfc2045
-                    }
-                    // neve be here
-                }
-                bits |= (v << nextin);
-                if (nextin == 0) {
-                    nextin = 18;         // clear for next in
-                    b[off++] = (byte)(bits >> 16);
-                    if (len == 1) {
-                        nextout = 8;    // 2 bytes left in bits
-                        break;
-                    }
-                    b[off++] = (byte)(bits >> 8);
-                    if (len == 2) {
-                        nextout = 0;    // 1 byte left in bits
-                        break;
-                    }
-                    b[off++] = (byte)bits;
-                    len -= 3;
-                    bits = 0;
-                } else {
-                    nextin -= 6;
+            }
+            if (rpos == 8) {
+                b[pos++] = (byte) bits;
+                rpos = 0;
+                if (pos == limit) {
+                    return len;
                }
            }
-            return off - oldOff;
+
+            bits = 0;
+            wpos = 24;
+            for (;;) {
+                /* pos != limit & rpos == 0 */
+                final int i = is.read();
+                if (i < 0) {
+                    return eof(b, off, pos, limit);
+                }
+                final int v = base64[i];
+                if (v < 0) {
+                    /*
+                     * i not in alphabet, thus
+                     *      v == -2: i is '=', the padding
+                     *      v == -1: i is something else, typically CR or LF
+                     */
+                    if (v == -1) {
+                        if (isMIME) {
+                            continue;
+                        }
+                        throw new IOException("Illegal base64 character 0x" +
+                                Integer.toHexString(i));
+                    }
+                    return padding(b, off, pos, limit);
+                }
+                wpos -= 6;
+                bits |= v << wpos;
+                if (wpos != 0) {
+                    continue;
+                }
+                if (limit - pos >= 3) {
+                    /* frequently taken fast path, no need to track rpos */
+                    b[pos++] = (byte) (bits >> 16);
+                    b[pos++] = (byte) (bits >> 8);
+                    b[pos++] = (byte) bits;
+                    bits = 0;
+                    wpos = 24;
+                    if (pos == limit) {
+                        return len;
+                    }
+                    continue;
+                }
+
+                /* b has either 1 or 2 free places */
+                b[pos++] = (byte) (bits >> 16);
+                if (pos == limit) {
+                    rpos = 16;
+                    return len;
+                }
+                b[pos++] = (byte) (bits >> 8);
+                /* pos == limit, no need for an if */
+                rpos = 8;
+                return len;
+            }
        }

        @Override
diff --git a/test/jdk/java/util/Base64/TestBase64.java b/test/jdk/java/util/Base64/TestBase64.java
--- a/test/jdk/java/util/Base64/TestBase64.java
+++ b/test/jdk/java/util/Base64/TestBase64.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2012, 2017, Oracle and/or its affiliates. All rights reserved. + * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
@@ -24,7 +24,7 @@
/**
 * @test
 * @bug 4235519 8004212 8005394 8007298 8006295 8006315 8006530 8007379 8008925
- *      8014217 8025003 8026330 8028397 8129544 8165243 8176379
+ *      8014217 8025003 8026330 8028397 8129544 8165243 8176379 8222187
 * @summary tests java.util.Base64
 * @library /test/lib
 * @build jdk.test.lib.RandomFactory
@@ -144,6 +144,10 @@
        testDecoderKeepsAbstinence(Base64.getDecoder());
        testDecoderKeepsAbstinence(Base64.getUrlDecoder());
        testDecoderKeepsAbstinence(Base64.getMimeDecoder());
+
+        // tests patch addressing JDK-8222187
+        // https://bugs.openjdk.java.net/browse/JDK-8222187
+        testJDK_8222187();
    }

    private static void test(Base64.Encoder enc, Base64.Decoder dec,
@@ -607,4 +611,26 @@
            }
        }
    }
+
+    private static void testJDK_8222187() throws Throwable {
+        byte[] orig = "12345678".getBytes("US-ASCII");
+        byte[] encoded = Base64.getEncoder().encode(orig);
+        // decode using different buffer sizes, up to a longer one than needed +        for (int bufferSize = 1; bufferSize <= encoded.length + 1; bufferSize++) {
+            try (
+                    InputStream in = Base64.getDecoder().wrap(
+                            new ByteArrayInputStream(encoded));
+                    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+            ) {
+                byte[] buffer = new byte[bufferSize];
+                int read;
+                while ((read = in.read(buffer, 0, bufferSize)) >= 0) {
+                    baos.write(buffer, 0, read);
+                }
+                // compare result, output info if lengths do not match
+                byte[] decoded = baos.toByteArray();
+                checkEqual(decoded, orig, "Base64 stream decoding failed!");
+            }
+        }
+    }
}
<JDK-8222187.patch>

<http://oracle.com/us/design/oracle-email-sig-198324.gif>
<http://oracle.com/us/design/oracle-email-sig-198324.gif><http://oracle.com/us/design/oracle-email-sig-198324.gif>
<http://oracle.com/us/design/oracle-email-sig-198324.gif>Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
lance.ander...@oracle.com <mailto:lance.ander...@oracle.com>




Reply via email to