On Jul 15, 2020, at 5:44 PM, Raffaello Giulietti
<raffaello.giulie...@gmail.com
<mailto:raffaello.giulie...@gmail.com>> wrote:
Hi Roger,
On 2020-07-15 22:21, Roger Riggs wrote:
Hi Raffaello,
Base64.java:
2: Please update 2nd copyright year to 2020
I was unsure what to do here, thanks for guidance.
I also removed the seemingly useless import at former L33.
leftovers():
996: "&" the shortcutting && is more typical in a condition.
997: the -= is buried in an expression and a reader might miss it.
Corrected, as well as the analogous -= usage for wpos now at L1106-7
1053: "can be reallocated to other vars": not a useful comment,
reflecting on only one possible implementation that is out of the
control of the developer.
I'd almost rather see 'len' than 'limit - off'; it might be
informative to the reader if 'limit' was declared final. (1056:)
Done, as well as declaring i and v final in the loop.
TestBase54.java:
2: Update 2nd copyright year to 2020
27: Please add the bug number to the @bug line.
Style-wise, I would remove the blank lines at the beginning of
blocks. (1095, 1115)
Done.
Thanks, Roger
On 7/14/20 11:47 AM, Raffaello Giulietti wrote:
Hi Roger,
here's the latest version of the patch.
I did:
* Withdraw the simplification in encodedOutLength(), as it is
indeed out of scope for this bug fix.
* Restore field names in DecInputStream except for nextin (now
wpos) and nextout (now rpos) because they have slightly different
semantics. That's just for clarity. I would have nothing against
reusing the old names for the proposed purpose.
* Switch back to the original no-arg read() implementation.
* Adjust comment continuation lines.
* Preserve the proposed logic of read(byte[], int, int) and the
supporting methods.
Others needed changes are:
* Correct the copyright years: that's something better left to Oracle.
* Remove the apparently useless import at L33. I could build and
run without it.
Greetings
Raffaello
----
# HG changeset patch
# User lello
# Date 1594848775 -7200
# Wed Jul 15 23:32:55 2020 +0200
# Node ID f7407f35e83ab508f0e6deab4f776bb1a1c85e68
# Parent a5649ebf94193770ca763391dd63807059d333b4
8222187: java.util.Base64.Decoder stream adds unexpected null bytes
at the end
Reviewed-by: TBD
Contributed-by: Raffaello Giulietti <raffaello.giulie...@gmail.com
<mailto:raffaello.giulie...@gmail.com>>
diff --git a/src/java.base/share/classes/java/util/Base64.java
b/src/java.base/share/classes/java/util/Base64.java
--- a/src/java.base/share/classes/java/util/Base64.java
+++ b/src/java.base/share/classes/java/util/Base64.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2012, 2019, Oracle and/or its affiliates. All
rights reserved.
+ * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All
rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -30,7 +30,6 @@
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
-import java.nio.charset.StandardCharsets;
import sun.nio.cs.ISO_8859_1;
@@ -961,12 +960,15 @@
private final InputStream is;
private final boolean isMIME;
- private final int[] base64; // base64 -> byte mapping
- private int bits = 0; // 24-bit buffer for decoding
- private int nextin = 18; // next available "off" in
"bits" for input;
- // -> 18, 12, 6, 0
- private int nextout = -8; // next available "off" in
"bits" for output;
- // -> 8, 0, -8 (no byte for
output)
+ private final int[] base64; // base64 -> byte mapping
+ private int bits = 0; // 24-bit buffer for decoding
+
+ /* writing bit pos inside bits; one of 24 (left, msb), 18,
12, 6, 0 */
+ private int wpos = 0;
+
+ /* reading bit pos inside bits: one of 24 (left, msb), 16,
8, 0 */
+ private int rpos = 0;
+
private boolean eof = false;
private boolean closed = false;
@@ -983,107 +985,153 @@
return read(sbBuf, 0, 1) == -1 ? -1 : sbBuf[0] & 0xff;
}
- private int eof(byte[] b, int off, int len, int oldOff)
- throws IOException
- {
+ private int leftovers(byte[] b, int off, int pos, int limit) {
eof = true;
- if (nextin != 18) {
- if (nextin == 12)
- throw new IOException("Base64 stream has one
un-decoded dangling byte.");
- // treat ending xx/xxx without padding character legal.
- // same logic as v == '=' below
- b[off++] = (byte)(bits >> (16));
- if (nextin == 0) { // only one padding byte
- if (len == 1) { // no enough output space
- bits >>= 8; // shift to lowest byte
- nextout = 0;
- } else {
- b[off++] = (byte) (bits >> 8);
- }
- }
+
+ /*
+ * We use a loop here, as this method is executed only a
few times.
+ * Unrolling the loop would probably not contribute much
here.
+ */
+ while (rpos - 8 >= wpos && pos != limit) {
+ rpos -= 8;
+ b[pos++] = (byte) (bits >> rpos);
}
- return off == oldOff ? -1 : off - oldOff;
+ return pos - off != 0 || rpos - 8 >= wpos ? pos - off : -1;
}
- private int padding(byte[] b, int off, int len, int oldOff)
- throws IOException
- {
- // = shiftto==18 unnecessary padding
- // x= shiftto==12 dangling x, invalid unit
- // xx= shiftto==6 && missing last '='
- // xx=y or last is not '='
- if (nextin == 18 || nextin == 12 ||
- nextin == 6 && is.read() != '=') {
- throw new IOException("Illegal base64 ending
sequence:" + nextin);
+ private int eof(byte[] b, int off, int pos, int limit)
throws IOException {
+ /*
+ * pos != limit
+ *
+ * wpos == 18: x dangling single x, invalid unit
+ * accept ending xx or xxx without padding characters
+ */
+ if (wpos == 18) {
+ throw new IOException("Base64 stream has one
un-decoded dangling byte.");
}
- b[off++] = (byte)(bits >> (16));
- if (nextin == 0) { // only one padding byte
- if (len == 1) { // no enough output space
- bits >>= 8; // shift to lowest byte
- nextout = 0;
- } else {
- b[off++] = (byte) (bits >> 8);
- }
+ rpos = 24;
+ return leftovers(b, off, pos, limit);
+ }
+
+ private int padding(byte[] b, int off, int pos, int limit)
throws IOException {
+ /*
+ * pos != limit
+ *
+ * wpos == 24: = (unnecessary padding)
+ * wpos == 18: x= (dangling single x, invalid unit)
+ * wpos == 12 and missing last '=': xx= (invalid padding)
+ * wpos == 12 and last is not '=': xx=x (invalid padding)
+ */
+ if (wpos >= 18 || wpos == 12 && is.read() != '=') {
+ throw new IOException("Illegal base64 ending
sequence:" + wpos);
}
- eof = true;
- return off - oldOff;
+ rpos = 24;
+ return leftovers(b, off, pos, limit);
}
@Override
public int read(byte[] b, int off, int len) throws IOException {
- if (closed)
+ if (closed) {
throw new IOException("Stream is closed");
- if (eof && nextout < 0) // eof and no leftover
- return -1;
- if (off < 0 || len < 0 || len > b.length - off)
- throw new IndexOutOfBoundsException();
- int oldOff = off;
- while (nextout >= 0) { // leftover output byte(s)
in bits buf
- if (len == 0)
- return off - oldOff;
- b[off++] = (byte)(bits >> nextout);
- len--;
- nextout -= 8;
+ }
+ Objects.checkFromIndexSize(off, len, b.length);
+ if (len == 0) {
+ return 0;
}
- bits = 0;
- while (len > 0) {
- int v = is.read();
- if (v == -1) {
- return eof(b, off, len, oldOff);
+
+ /*
+ * Rather than keeping 2 running vars (e.g., off and len),
+ * we only keep one (pos), while definitely fixing the
boundaries
+ * of the range [off, limit).
+ * More specifically, each use of pos as an index in b meets
+ * pos - off >= 0 & limit - pos > 0
+ *
+ * Note that limit can overflow to Integer.MIN_VALUE.
However,
+ * as long as comparisons with pos are as coded, there's
no harm.
+ */
+ int pos = off;
+ final int limit = off + len;
+ if (eof) {
+ return leftovers(b, off, pos, limit);
+ }
+
+ /*
+ * Leftovers from previous invocation; here, wpos = 0.
+ * There can be at most 2 leftover bytes (rpos <= 16).
+ * Further, b has at least one free place.
+ *
+ * The logic could be coded as a loop, (as in method
leftovers())
+ * but the explicit "unrolling" makes it possible to
generate
+ * better byte extraction code.
+ */
+ if (rpos == 16) {
+ b[pos++] = (byte) (bits >> 8);
+ rpos = 8;
+ if (pos == limit) {
+ return len;
}
- if ((v = base64[v]) < 0) {
- if (v == -2) { // padding byte(s)
- return padding(b, off, len, oldOff);
- }
- if (v == -1) {
- if (!isMIME)
- throw new IOException("Illegal base64
character " +
- Integer.toString(v, 16));
- continue; // skip if for rfc2045
- }
- // neve be here
- }
- bits |= (v << nextin);
- if (nextin == 0) {
- nextin = 18; // clear for next in
- b[off++] = (byte)(bits >> 16);
- if (len == 1) {
- nextout = 8; // 2 bytes left in bits
- break;
- }
- b[off++] = (byte)(bits >> 8);
- if (len == 2) {
- nextout = 0; // 1 byte left in bits
- break;
- }
- b[off++] = (byte)bits;
- len -= 3;
- bits = 0;
- } else {
- nextin -= 6;
+ }
+ if (rpos == 8) {
+ b[pos++] = (byte) bits;
+ rpos = 0;
+ if (pos == limit) {
+ return len;
}
}
- return off - oldOff;
+
+ bits = 0;
+ wpos = 24;
+ for (;;) {
+ /* pos != limit & rpos == 0 */
+ final int i = is.read();
+ if (i < 0) {
+ return eof(b, off, pos, limit);
+ }
+ final int v = base64[i];
+ if (v < 0) {
+ /*
+ * i not in alphabet, thus
+ * v == -2: i is '=', the padding
+ * v == -1: i is something else, typically
CR or LF
+ */
+ if (v == -1) {
+ if (isMIME) {
+ continue;
+ }
+ throw new IOException("Illegal base64
character 0x" +
+ Integer.toHexString(i));
+ }
+ return padding(b, off, pos, limit);
+ }
+ wpos -= 6;
+ bits |= v << wpos;
+ if (wpos != 0) {
+ continue;
+ }
+ if (limit - pos >= 3) {
+ /* frequently taken fast path, no need to track
rpos */
+ b[pos++] = (byte) (bits >> 16);
+ b[pos++] = (byte) (bits >> 8);
+ b[pos++] = (byte) bits;
+ bits = 0;
+ wpos = 24;
+ if (pos == limit) {
+ return len;
+ }
+ continue;
+ }
+
+ /* b has either 1 or 2 free places */
+ b[pos++] = (byte) (bits >> 16);
+ if (pos == limit) {
+ rpos = 16;
+ return len;
+ }
+ b[pos++] = (byte) (bits >> 8);
+ /* pos == limit, no need for an if */
+ rpos = 8;
+ return len;
+ }
}
@Override
diff --git a/test/jdk/java/util/Base64/TestBase64.java
b/test/jdk/java/util/Base64/TestBase64.java
--- a/test/jdk/java/util/Base64/TestBase64.java
+++ b/test/jdk/java/util/Base64/TestBase64.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2012, 2017, Oracle and/or its affiliates. All
rights reserved.
+ * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All
rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -24,7 +24,7 @@
/**
* @test
* @bug 4235519 8004212 8005394 8007298 8006295 8006315 8006530
8007379 8008925
- * 8014217 8025003 8026330 8028397 8129544 8165243 8176379
+ * 8014217 8025003 8026330 8028397 8129544 8165243 8176379 8222187
* @summary tests java.util.Base64
* @library /test/lib
* @build jdk.test.lib.RandomFactory
@@ -144,6 +144,10 @@
testDecoderKeepsAbstinence(Base64.getDecoder());
testDecoderKeepsAbstinence(Base64.getUrlDecoder());
testDecoderKeepsAbstinence(Base64.getMimeDecoder());
+
+ // tests patch addressing JDK-8222187
+ // https://bugs.openjdk.java.net/browse/JDK-8222187
+ testJDK_8222187();
}
private static void test(Base64.Encoder enc, Base64.Decoder dec,
@@ -607,4 +611,26 @@
}
}
}
+
+ private static void testJDK_8222187() throws Throwable {
+ byte[] orig = "12345678".getBytes("US-ASCII");
+ byte[] encoded = Base64.getEncoder().encode(orig);
+ // decode using different buffer sizes, up to a longer one
than needed
+ for (int bufferSize = 1; bufferSize <= encoded.length + 1;
bufferSize++) {
+ try (
+ InputStream in = Base64.getDecoder().wrap(
+ new ByteArrayInputStream(encoded));
+ ByteArrayOutputStream baos = new
ByteArrayOutputStream();
+ ) {
+ byte[] buffer = new byte[bufferSize];
+ int read;
+ while ((read = in.read(buffer, 0, bufferSize)) >= 0) {
+ baos.write(buffer, 0, read);
+ }
+ // compare result, output info if lengths do not match
+ byte[] decoded = baos.toByteArray();
+ checkEqual(decoded, orig, "Base64 stream decoding
failed!");
+ }
+ }
+ }
}
<JDK-8222187.patch>