On 2021-02-02 Brett Okken wrote:
> Thus far I have only tested on jdk 11 64bit windows, but the fairly
> clear winner is:
> 
>     public void update(byte[] buf, int off, int len) {
>         final int end = off + len;
>         int i=off;
>         if (len > 3) {
>             switch (i & 3) {
>                 case 3:
>                     crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^
>                           (crc >>> 8);
>                 case 2:
>                     crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^
>                           (crc >>> 8);
>                 case 1:
>                     crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^
>                           (crc >>> 8);
>             }

To ensure (i & 3) == 0 when entering the main loop, the case-labels
should be 1-2-3, not 3-2-1. This may have messed up your tests. :-(

With a very quick test I didn't see much difference if I changed the
case-label order.

On 2021-02-02 Brett Okken wrote:
> I tested jdk 15 64bit and jdk 11 32bit, client and server and the
> above implementation is consistently quite good.
> The alternate in running does not do the leading alignment. This
> version is really close in 64 bit testing and slightly faster for 32
> bit. The differences are pretty small, and both are noticeably better
> than my original proposal (and all 3 are significantly faster than
> current). I think I would lead towards the simplicity of not doing the
> leading alignment, but I do not have a strong opinion.

Let's go with the simpler option.

>         switch (len & 3) {
>                 case 3:
>                     crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^
>                           (crc >>> 8);

I suppose this should use the same (faster) array indexing style as the
main loop:

    crc = TABLE[0][(buf[off++] & 0xFF) ^ ((int)crc & 0xFF)]
          ^ (crc >>> 8);

Also, does it really help to unroll the loop? With 8191-byte buffers I
see no significant difference (in a quick not-very-accurate test) if
the switch-statement is replaced with a while-loop.

With these two changes the code becomes functionally identical to the
version I posted with the name "Modified slicing-by-4". Is that an OK
version to commit?

Is the following fine to you as the file header? Your email address can
be omitted if you prefer that. I will mention in the commit message
that you adapted the code from XZ Utils and benchmarked it.

/*
 * CRC64
 *
 * Authors: Brett Okken <EMAIL>
 *          Lasse Collin <EMAIL>
 *
 * This file has been put into the public domain.
 * You can do whatever you want with this file.
 */

Thanks!

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode

Reply via email to