Hi,
I was thinking about this earlier when I started writing the patch, and
then I forgot about it again. I haven't been able to figure out when that
code path will actually be executed. ByteBuffer is implemented in such a
way that only the JDK can extend it, and as far as I can tell you can only
create three kinds of ByteBuffer (direct, mapped and heap), all of which
are handled by the more efficient calls above.
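Just to illustrate the shape of that dispatch, here is a minimal sketch
(class and method names are placeholders, not the actual webrev code):

    import java.nio.ByteBuffer;

    // Sketch only: heap buffers expose a backing array, direct and mapped
    // buffers take the native-memory path, and only an "other" buffer would
    // ever reach the copying fallback.
    class Crc32cDispatchSketch {
        private int crc = 0xFFFFFFFF;

        void update(ByteBuffer buffer) {
            int rem = buffer.remaining();
            if (rem <= 0) {
                return;
            }
            if (buffer.isDirect()) {
                // Direct and mapped buffers: Unsafe-based path
                crc = updateDirectByteBuffer(crc, buffer);
                buffer.position(buffer.limit());
            } else if (buffer.hasArray()) {
                // Heap buffers: work on the backing array without copying
                crc = updateBytes(crc, buffer.array(),
                                  buffer.arrayOffset() + buffer.position(), rem);
                buffer.position(buffer.limit());
            } else {
                // Anything else: today's fallback copies the whole remainder
                byte[] b = new byte[rem];
                buffer.get(b);
                crc = updateBytes(crc, b, 0, b.length);
            }
        }

        // Stubs standing in for the real table-driven implementations.
        private static int updateBytes(int crc, byte[] b, int off, int len) {
            return crc; // placeholder
        }

        private static int updateDirectByteBuffer(int crc, ByteBuffer buffer) {
            return crc; // placeholder
        }
    }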
That said, just to make the code a bit safer against OOM, it is probably
best to update the default method and all current implementations, which
all use the same pattern.
A reasonable solution would be something like the following:

    // Cap the temporary array at 4096 bytes and drain the buffer in chunks
    byte[] b = new byte[Math.min(buffer.remaining(), 4096)];
    while (buffer.hasRemaining()) {
        int length = Math.min(buffer.remaining(), b.length);
        buffer.get(b, 0, length);
        update(b, 0, length);
    }
Xueming, do you have any further comment?
Regards,
Staffan
On 10/22/2014 03:04 PM, Stanimir Simeonoff wrote:
On Thu, Oct 23, 2014 at 12:10 AM, Bernd Eckenfels <[email protected]> wrote:
Hello,
just a question in the default impl:
+        } else {
+            byte[] b = new byte[rem];
+            buffer.get(b);
+            update(b, 0, b.length);
+        }
would it be a good idea to actually put a ceiling on the size of the
array which is processed at once?
This is an excellent catch.
Should not be too large, probably 4k or so.
Stanimir
On Tue, 21 Oct 2014 10:28:50 -0700, Staffan Friberg
<[email protected]> wrote:
> Hi Peter,
>
> Thanks for the comments.
> >
> > 217         if (Unsafe.ADDRESS_SIZE == 4) {
> > 218             // On 32 bit platforms read two ints instead of a single 64bit long
> >
> > When you're reading from byte[] using Unsafe (updateBytes), you
> > have the option of reading 64bit values on 64bit platforms. When
> > you're reading from DirectByteBuffer memory
> > (updateDirectByteBuffer), you're only using 32bit reads.
> I will add a comment in the code for this decision. The reason is that
> reading a long results in slightly worse performance in this case, while
> in updateBytes it is faster. I was able to get it to run slightly faster
> by working directly with the address instead of always adding
> address + off, but that makes things worse in the 32bit case since all
> calculations would then be using long variables. So using getInt as in
> the current code feels like the best solution, as it strikes the best
> balance between 32 and 64bit. Below is how updateByteBuffer looked with
> the rewrite I mentioned.
>
>
> long address = ((DirectBuffer) buffer).address();
> crc = updateDirectByteBuffer(crc, address + pos, address + limit);
>
>
> private static int updateDirectByteBuffer(int crc, long adr, long end) {
>
>     // Do only byte reads for arrays so short they can't be aligned
>     if (end - adr >= 8) {
>
>         // align on 8 bytes
>         int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
>         for (long alignEnd = adr + alignLength; adr < alignEnd; adr++) {
>             crc = (crc >>> 8)
>                     ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>         }
>
>         if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>             crc = Integer.reverseBytes(crc);
>         }
>
>         // slicing-by-8
>         for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
>             int firstHalf;
>             int secondHalf;
>             if (Unsafe.ADDRESS_SIZE == 4) {
>                 // On 32 bit platforms read two ints instead of a single 64bit long
>                 firstHalf = UNSAFE.getInt(adr);
>                 secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
>             } else {
>                 long value = UNSAFE.getLong(adr);
>                 if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
>                     firstHalf = (int) value;
>                     secondHalf = (int) (value >>> 32);
>                 } else { // ByteOrder.BIG_ENDIAN
>                     firstHalf = (int) (value >>> 32);
>                     secondHalf = (int) value;
>                 }
>             }
>             crc ^= firstHalf;
>             if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
>                 crc = byteTable7[crc & 0xFF]
>                         ^ byteTable6[(crc >>> 8) & 0xFF]
>                         ^ byteTable5[(crc >>> 16) & 0xFF]
>                         ^ byteTable4[crc >>> 24]
>                         ^ byteTable3[secondHalf & 0xFF]
>                         ^ byteTable2[(secondHalf >>> 8) & 0xFF]
>                         ^ byteTable1[(secondHalf >>> 16) & 0xFF]
>                         ^ byteTable0[secondHalf >>> 24];
>             } else { // ByteOrder.BIG_ENDIAN
>                 crc = byteTable0[secondHalf & 0xFF]
>                         ^ byteTable1[(secondHalf >>> 8) & 0xFF]
>                         ^ byteTable2[(secondHalf >>> 16) & 0xFF]
>                         ^ byteTable3[secondHalf >>> 24]
>                         ^ byteTable4[crc & 0xFF]
>                         ^ byteTable5[(crc >>> 8) & 0xFF]
>                         ^ byteTable6[(crc >>> 16) & 0xFF]
>                         ^ byteTable7[crc >>> 24];
>             }
>         }
>
>         if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
>             crc = Integer.reverseBytes(crc);
>         }
>     }
>
>     // Tail
>     for (; adr < end; adr++) {
>         crc = (crc >>> 8)
>                 ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
>     }
>
>     return crc;
> }
>
>
> >
> > Also, in updateBytes, the usage of
> > Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a byte
> > array sounds a little scary. To be ultra portable you could check
> > that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use Unsafe for
> > byte arrays if it is not 1. Then use Integer.BYTES/Long.BYTES to
> > manipulate 'offsets' instead. In updateDirectByteBuffer it would be
> > more appropriate to use Integer.BYTES/Long.BYTES too.
> Good idea. Added a check in the initial if statement and it will get
> automatically optimized away.
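> Something along these lines (just a sketch with placeholder names, not
> the exact change in the webrev):
>
>     import sun.misc.Unsafe;
>
>     // Sketch only: take the Unsafe word-read fast path only when one
>     // byte[] element is exactly one addressable byte; the constant
>     // check folds away at JIT time.
>     class IndexScaleGuardSketch {
>         static boolean useUnsafeFastPath(int len) {
>             return Unsafe.ARRAY_BYTE_INDEX_SCALE == 1 // one element == one byte
>                     && len >= Long.BYTES;             // enough data to bother aligning
>         }
>     }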
>
> > 225                     firstHalf = (int) (value & 0xFFFFFFFF);
> > 226                     secondHalf = (int) (value >>> 32);
> > 227                 } else { // ByteOrder.BIG_ENDIAN
> > 228                     firstHalf = (int) (value >>> 32);
> > 229                     secondHalf = (int) (value & 0xFFFFFFFF);
> >
> > firstHalf = (int) value; // this is equivalent for line 225
> > secondHalf = (int) value; // this is equivalent for line 229
> Done.
>
> Here is the latest webrev,
> http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
>
> Cheers,
> Staffan