Sorry Staffan, another nit...
212 for (; off < (end - Long.BYTES); off += Long.BYTES) {
and
286 for (; off < (end - Long.BYTES); off += Long.BYTES) {
I think you could change "off < (end - Long.BYTES)" to "off <= (end -
Long.BYTES)". Am I right?
Regards, Peter
On 10/21/2014 10:30 PM, Peter Levart wrote:
On 10/21/2014 08:49 PM, Staffan Friberg wrote:
Hi,
Got an offline comment that the package.html should be update as well
to cover CRC-32C.
Otherwise there are no code changes in this new webrev.
http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.04
Hi Staffan,
198 if (end - off >= 8 && Unsafe.ARRAY_BOOLEAN_INDEX_SCALE == 1) {
ARRAY_BOOLEAN_INDEX_SCALE -> ARRAY_BYTE_INDEX_SCALE
Otherwise looks good now.
Regards, Peter
P.S.
I think (by looking at DirectByteBuffer.asIntBuffer() implementation)
that when doing 32 bit (4 byte) reads using Unsafe, the address only
has to be aligned to 4 bytes (8 is necessary alignment for 64 bit
reads). So updateDirectByteBuffer could make this alignment on 4 bytes
as it's only using 32 bit reads (with additional check on
ADDRESS_SIZE, you could do that for updateBytes too).
You don't get much out of it, so you decide if it's worth complication.
//Staffan
On 10/21/2014 10:28 AM, Staffan Friberg wrote:
Hi Peter,
Thanks for the comments..
217 if (Unsafe.ADDRESS_SIZE == 4) {
218 // On 32 bit platforms read two ints
instead of a single 64bit long
When you're reading from byte[] using Unsafe (updateBytes), you
have the option of reading 64bit values on 64bit platforms. When
you're reading from DirectByteBuffer memory
(updateDirectByteBuffer), you're only using 32bit reads.
I will add a comment in the code for this decision. The reason is
that read a long results in slightly worse performance in this case,
in updateBytes it is faster. I was able to get it to run slightly
faster by working directly with the address instead of always adding
address + off, but this makes things worse in the 32bit case since
all calculation will now be using long variables. So using the
getInt as in the current code feels like the best solution as it
strikes the best balance between 32 and 64bit. Below is how
updateByteBuffer looked with the rewrite I mentioned.
ong address = ((DirectBuffer) buffer).address();
crc = updateDirectByteBuffer(crc, address + pos, address + limit);
private static int updateDirectByteBuffer(int crc, long adr,
long end) {
// Do only byte reads for arrays so short they can't be aligned
if (end - adr >= 8) {
// align on 8 bytes
int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
for (long alignEnd = adr + alignLength; adr < alignEnd;
adr++) {
crc = (crc >>> 8)
^ byteTable[(crc ^ UNSAFE.getByte(adr)) &
0xFF];
}
if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
crc = Integer.reverseBytes(crc);
}
// slicing-by-8
for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
int firstHalf;
int secondHalf;
if (Unsafe.ADDRESS_SIZE == 4) {
// On 32 bit platforms read two ints instead of
a single 64bit long
firstHalf = UNSAFE.getInt(adr);
secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
} else {
long value = UNSAFE.getLong(adr);
if (ByteOrder.nativeOrder() ==
ByteOrder.LITTLE_ENDIAN) {
firstHalf = (int) value;
secondHalf = (int) (value >>> 32);
} else { // ByteOrder.BIG_ENDIAN
firstHalf = (int) (value >>> 32);
secondHalf = (int) value;
}
}
crc ^= firstHalf;
if (ByteOrder.nativeOrder() ==
ByteOrder.LITTLE_ENDIAN) {
crc = byteTable7[crc & 0xFF]
^ byteTable6[(crc >>> 8) & 0xFF]
^ byteTable5[(crc >>> 16) & 0xFF]
^ byteTable4[crc >>> 24]
^ byteTable3[secondHalf & 0xFF]
^ byteTable2[(secondHalf >>> 8) & 0xFF]
^ byteTable1[(secondHalf >>> 16) & 0xFF]
^ byteTable0[secondHalf >>> 24];
} else { // ByteOrder.BIG_ENDIAN
crc = byteTable0[secondHalf & 0xFF]
^ byteTable1[(secondHalf >>> 8) & 0xFF]
^ byteTable2[(secondHalf >>> 16) & 0xFF]
^ byteTable3[secondHalf >>> 24]
^ byteTable4[crc & 0xFF]
^ byteTable5[(crc >>> 8) & 0xFF]
^ byteTable6[(crc >>> 16) & 0xFF]
^ byteTable7[crc >>> 24];
}
}
if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
crc = Integer.reverseBytes(crc);
}
}
// Tail
for (; adr < end; adr++) {
crc = (crc >>> 8)
^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
}
return crc;
}
Also, in updateBytes, the usage of
Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a byte
array sounds a little scary. To be ultra portable you could check
that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use Unsafe for
byte arrays if it is not 1. Then use Integer.BYTES/Long.BYTES to
manipulate 'offsets' instead. In updateDirectByteBuffer it would be
more appropriate to use Integer.BYTES/Long.BYTES too.
Good idea. Added a check in the initial if statement and it will get
automatically optimized away.
225 firstHalf = (int) (value & 0xFFFFFFFF);
226 secondHalf = (int) (value >>> 32);
227 } else { // ByteOrder.BIG_ENDIAN
228 firstHalf = (int) (value >>> 32);
229 secondHalf = (int) (value & 0xFFFFFFFF);
firstHalf = (int) value; // this is equivalent for line 225
secondHalf = (int) value; // this is equivalent for line 229
Done.
Here is the latest webrev,
http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.03
Cheers,
Staffan