While working on FreeBSD's version of gmac.c, I noticed that I could
get a significant speedup by using a mask instead of an if branch in
ghash_gfmul in OpenBSD's gmac.c...

Add a mask variable and replace the code between the comments
"update Z" and "update V" with:
                mask = !!(x[i >> 3] & (1 << (~i & 7)));
                mask = ~(mask - 1);

                z[0] ^= v[0] & mask;
                z[1] ^= v[1] & mask;
                z[2] ^= v[2] & mask;
                z[3] ^= v[3] & mask;

And you should see a nice performance increase...
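
For anyone who wants to convince themselves that the masked form computes
the same thing as the branch, here is a small standalone sketch. The names
(gf_shift, gfmul_branch, gfmul_mask, gfmul_agree) and the pseudo-random test
loop are made up for illustration. The loop bodies follow the shape described
above, but this is not the actual gmac.c source, and it skips the byte-order
conversions by taking the second operand as four host-order words.

```c
#include <stdint.h>
#include <string.h>

/* "update V": multiply v by x in GF(2^128), GCM bit order. */
static void
gf_shift(uint32_t v[4])
{
	uint32_t mul = v[3] & 1;

	v[3] = (v[2] << 31) | (v[3] >> 1);
	v[2] = (v[1] << 31) | (v[2] >> 1);
	v[1] = (v[0] << 31) | (v[1] >> 1);
	v[0] = (v[0] >> 1) ^ (0xe1000000 * mul);
}

/* Branching form: z ^= v only when bit i of x is set. */
static void
gfmul_branch(const uint8_t x[16], const uint32_t y[4], uint32_t z[4])
{
	uint32_t v[4];
	int i, w;

	memcpy(v, y, sizeof(v));
	memset(z, 0, 4 * sizeof(z[0]));
	for (i = 0; i < 128; i++) {
		if (x[i >> 3] & (1 << (~i & 7))) {
			for (w = 0; w < 4; w++)
				z[w] ^= v[w];
		}
		gf_shift(v);
	}
}

/* Masked form: mask is all ones when bit i is set, all zeros otherwise. */
static void
gfmul_mask(const uint8_t x[16], const uint32_t y[4], uint32_t z[4])
{
	uint32_t v[4], mask;
	int i, w;

	memcpy(v, y, sizeof(v));
	memset(z, 0, 4 * sizeof(z[0]));
	for (i = 0; i < 128; i++) {
		mask = !!(x[i >> 3] & (1 << (~i & 7)));	/* 0 or 1 */
		mask = ~(mask - 1);			/* 0 or 0xffffffff */
		for (w = 0; w < 4; w++)
			z[w] ^= v[w] & mask;
		gf_shift(v);
	}
}

/* Returns 1 if both forms agree on 100 pseudo-random inputs, else 0. */
int
gfmul_agree(void)
{
	uint8_t x[16];
	uint32_t y[4], zb[4], zm[4], s = 12345;
	int i, n;

	for (n = 0; n < 100; n++) {
		for (i = 0; i < 16; i++) {
			s = s * 1103515245u + 12345u;
			x[i] = (uint8_t)(s >> 16);
		}
		for (i = 0; i < 4; i++) {
			s = s * 1103515245u + 12345u;
			y[i] = s;
		}
		gfmul_branch(x, y, zb);
		gfmul_mask(x, y, zm);
		if (memcmp(zb, zm, sizeof(zb)) != 0)
			return 0;
	}
	return 1;
}
```

The masked loop does the same XORs on every iteration and lets the mask
decide whether they have any effect, so the loop body no longer contains a
data-dependent branch.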

I also have a ghash implementation that uses a 4-bit lookup table,
with the table split between cache lines, in p4 at:
https://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/opencrypto/sys/opencrypto/gfmult.c&REV=4
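
For readers who have not seen the 4-bit table idea, a rough sketch follows.
It is not the code at the URL above: the names (gf_mulx, gf_mul_bitwise,
gf4_init, gf4_mul, gf4_agree) are hypothetical, it makes no attempt at the
cache-line layout, and it still shifts one bit at a time between nibbles, so
it only illustrates the table structure and says nothing about speed. It
assumes the same host-order four-word representation as the bitwise loop.

```c
#include <stdint.h>
#include <string.h>

/* Multiply v by x in GF(2^128), GCM bit order. */
static void
gf_mulx(uint32_t v[4])
{
	uint32_t mul = v[3] & 1;

	v[3] = (v[2] << 31) | (v[3] >> 1);
	v[2] = (v[1] << 31) | (v[2] >> 1);
	v[1] = (v[0] << 31) | (v[1] >> 1);
	v[0] = (v[0] >> 1) ^ (0xe1000000 * mul);
}

/* Bit-at-a-time reference: z = x * h. */
static void
gf_mul_bitwise(const uint8_t x[16], const uint32_t h[4], uint32_t z[4])
{
	uint32_t v[4], mask;
	int i, w;

	memcpy(v, h, sizeof(v));
	memset(z, 0, 4 * sizeof(z[0]));
	for (i = 0; i < 128; i++) {
		mask = !!(x[i >> 3] & (1 << (~i & 7)));
		mask = ~(mask - 1);
		for (w = 0; w < 4; w++)
			z[w] ^= v[w] & mask;
		gf_mulx(v);
	}
}

/* Precompute tbl[n] = n * h for every 4-bit polynomial n. */
static void
gf4_init(const uint32_t h[4], uint32_t tbl[16][4])
{
	uint32_t v[4];
	int k, n, w;

	memcpy(v, h, sizeof(v));
	memset(tbl, 0, 16 * sizeof(tbl[0]));
	for (k = 0; k < 4; k++) {
		/* Here v == h * x^k; nibble bit (8 >> k) is the x^k term. */
		for (n = 0; n < 16; n++)
			if (n & (8 >> k))
				for (w = 0; w < 4; w++)
					tbl[n][w] ^= v[w];
		gf_mulx(v);
	}
}

/* z = x * h via the table: Horner's rule over nibbles, last nibble first. */
static void
gf4_mul(const uint8_t x[16], uint32_t tbl[16][4], uint32_t z[4])
{
	unsigned nib;
	int j, k, w;

	memset(z, 0, 4 * sizeof(z[0]));
	for (j = 31; j >= 0; j--) {
		nib = (j & 1) ? (x[j >> 1] & 0x0f) : (x[j >> 1] >> 4);
		for (k = 0; k < 4; k++)
			gf_mulx(z);		/* z *= x^4 */
		for (w = 0; w < 4; w++)
			z[w] ^= tbl[nib][w];
	}
}

/* Returns 1 if table and bitwise results agree on 100 inputs, else 0. */
int
gf4_agree(void)
{
	uint8_t x[16];
	uint32_t h[4], tbl[16][4], za[4], zb[4], s = 54321;
	int i, n;

	for (n = 0; n < 100; n++) {
		for (i = 0; i < 16; i++) {
			s = s * 1103515245u + 12345u;
			x[i] = (uint8_t)(s >> 16);
		}
		for (i = 0; i < 4; i++) {
			s = s * 1103515245u + 12345u;
			h[i] = s;
		}
		gf4_init(h, tbl);
		gf_mul_bitwise(x, h, za);
		gf4_mul(x, tbl, zb);
		if (memcmp(za, zb, sizeof(za)) != 0)
			return 0;
	}
	return 1;
}
```

A faster variant would replace the four gf_mulx calls with a single 4-bit
shift plus a small reduction table. The sketch leaves that out to keep the
table construction easy to follow.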

This also has a version which does 4 blocks at a time, for a
further speedup...

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
