Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding

Roger Riggs Fri, 04 Sep 2020 08:14:22 -0700

Hi Corey,

The idea I had in mind is refactoring the fast path into the method youcall decodeBlock.

Base64: lines 751-768.


It leaves all the unknown/illegal character handling to the Java code.

And yes, it does not need to handle MIME, except to return on illegalcharacters.


The patch is attached.

Regards, Roger



On 8/31/20 6:22 PM, Corey Ashford wrote:

On 8/29/20 1:19 PM, Corey Ashford wrote:
Hi Roger,

Thanks for your reply and thoughts!  Comments interspersed below:

On 8/28/20 10:54 AM, Roger Riggs wrote:
...
Comparing with the way that the Base64 encoder was intrinsified, the
method that is intrinsified should have a method body that does
the same function, so it is interchangable. That likely will justshift
the "fast path" code into the decodeBlock method.
Keeping the symmetry between encoder and decoder will
make it easier to maintain the code.
Good point. I'll investigate what this looks like in terms of theactual code, and will report back (perhaps in a new webrev).
Having looked at this again, I don't think it makes sense. One thingthat differs significantly from the encodeBlock intrinsic is that thedecodeBlock intrinsic only needs to process a prefix of the data, andso it can leave virtually any amount of data at the end of the srcbuffer unprocessed, where as with the encodeBlock intrinsic, if itexists, it must process the entire buffer.
In the (common) case where the decodeBlock intrinsic returns nothaving processed everything, it still needs to call the Java code, andif that Java code is "replaced" by the intrinsic, it's inaccessible.
Is there something I'm overlooking here? Basically I want the decodeAPI to behave differently than the encode API, mostly to make thearch-specific intrinsic easier to implement. If that's not acceptable,then I need to rethink the API, and also figure out how to deal withthe illegal character case. The latter could perhaps be done bythrowing an exception from the intrinsic, or maybe by returning anegative length that specifies the index of the illegal src byte, andthen have the Java code throw the exception).
Regards,

- Corey

diff --git a/src/java.base/share/classes/java/util/Base64.java 
b/src/java.base/share/classes/java/util/Base64.java
index 34b39b18a54..e2b3a686d70 100644
--- a/src/java.base/share/classes/java/util/Base64.java
+++ b/src/java.base/share/classes/java/util/Base64.java
@@ -741,6 +741,27 @@ public class Base64 {
             return 3 * (int) ((len + 3L) / 4) - paddings;
         }
 
+        // Fast path for full 4 byte -> 3 byte conversion w/o errors
+        private int decodeBlock(byte[] src, int sp, int sl, byte[] dst, 
boolean isURL) {
+            int[] base64 = isURL ? fromBase64URL : fromBase64;
+            int dp = 0;
+            int sl0 = sp + ((sl - sp) & ~0b11);
+            while (sp < sl0) {
+                int b1 = base64[src[sp++] & 0xff];
+                int b2 = base64[src[sp++] & 0xff];
+                int b3 = base64[src[sp++] & 0xff];
+                int b4 = base64[src[sp++] & 0xff];
+                if ((b1 | b2 | b3 | b4) < 0) {    // non base64 byte
+                    return dp;
+                }
+                int bits0 = b1 << 18 | b2 << 12 | b3 << 6 | b4;
+                dst[dp++] = (byte)(bits0 >> 16);
+                dst[dp++] = (byte)(bits0 >>  8);
+                dst[dp++] = (byte)(bits0);
+            }
+            return dp;
+        }
+
         private int decode0(byte[] src, int sp, int sl, byte[] dst) {
             int[] base64 = isURL ? fromBase64URL : fromBase64;
             int dp = 0;
@@ -749,23 +770,34 @@ public class Base64 {
 
             while (sp < sl) {
                 if (shiftto == 18 && sp + 4 < sl) {       // fast path
-                    int sl0 = sp + ((sl - sp) & ~0b11);
-                    while (sp < sl0) {
-                        int b1 = base64[src[sp++] & 0xff];
-                        int b2 = base64[src[sp++] & 0xff];
-                        int b3 = base64[src[sp++] & 0xff];
-                        int b4 = base64[src[sp++] & 0xff];
-                        if ((b1 | b2 | b3 | b4) < 0) {    // non base64 byte
-                            sp -= 4;
-                            break;
-                        }
-                        int bits0 = b1 << 18 | b2 << 12 | b3 << 6 | b4;
-                        dst[dp++] = (byte)(bits0 >> 16);
-                        dst[dp++] = (byte)(bits0 >>  8);
-                        dst[dp++] = (byte)(bits0);
-                    }
-                    if (sp >= sl)
-                        break;
+                    int dl = decodeBlock(src, sp, sl, dst, isURL);
+                    /*
+                     * Calculate how many characters were processed by how many
+                     * bytes of data were returned.
+                     */
+
+                    /*
+                     * Base64 characters always come in groups of four,
+                     * producing three bytes of binary data (except for on
+                     * the final four-character piece where it can produce
+                     * one to three data bytes depending on how many fill
+                     * characters there - zero, one, or two).  The only
+                     * case where there should be a non-multiple of three
+                     * returned is if the intrinsic has processed all of
+                     * the characters passed to it.  At this point in the
+                     * logic, however, we know the instrinsic hasn't
+                     * processed all of the chracters.
+                     *
+                     * Round dl down to the nearest three-byte boundary.
+                     */
+                    dl = (dl / 3) * 3;
+
+                    // Recalculate chars_decoded based on the rounded dl
+                    int chars_decoded = (dl / 3) * 4;
+
+                    sp += chars_decoded;
+                    dp += dl;
+                    continue;
                 }
                 int b = src[sp++] & 0xff;
                 if ((b = base64[b]) < 0) {

Re: RFR(M): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API for Base64 decoding

Reply via email to