(I hope I'm doing this right.) These are my ChaCha implementations (reference, x86, SSE2, SSSE3, XOP, AVX, and AVX2) and Poly1305 implementations (reference, SSE2, AVX, and AVX2) for both 32 and 64 bits. djb's floating-point Poly1305 is used for 32 bits. I do things a bit differently in the assembler code to make supporting multiple versions easier, and I don't know whether that is too non-standard. Everything is as fast as I could make it, so hopefully some or all of it can be used to fill things out, or give you ideas on how to surpass it. (I have 32-bit ARM and NEON versions as well, but no ARM hardware to test on at the moment.)
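For readers who have not looked at ChaCha before, here is a minimal sketch of the quarter round that a portable reference version is built around. It is not taken from the tarball; the names `rotl32`, `chacha_quarter_round`, and `chacha_quarter_round_selftest` are my own for illustration, and the check values are the quarter-round test vector from RFC 7539, section 2.1.1.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

/* One ChaCha quarter round on four state words (illustrative name). */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
                                 uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Returns 1 if the quarter round reproduces the RFC 7539 2.1.1 vector. */
static int chacha_quarter_round_selftest(void)
{
    uint32_t a = 0x11111111, b = 0x01020304;
    uint32_t c = 0x9b8d6f43, d = 0x01234567;

    chacha_quarter_round(&a, &b, &c, &d);

    return a == 0xea2a92f4 && b == 0xcb1cf8ce &&
           c == 0x4581472e && d == 0x5881c4bb;
}
```

The SIMD versions compute the same add, xor, and rotate steps, but on whole rows of the state (and usually several blocks) at once.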
crypto/perlasm/x86gas.pl had to be modified to get AVX2 and floating point working for 32-bit code. (The floating-point modifications are a bit of a hack; I apologize in advance.)

Also included is an EVP cipher compatible with Chrome's ChaCha20-Poly1305 TLS implementation. It is optimized for short messages and takes advantage of whatever optimized asm is available underneath. The CFRG draft isn't final, so I didn't bother to add a generic AEAD construction for it, but the implementations are extremely flexible and can handle whatever is finalized.

I didn't create a patch because I have no idea how you want to integrate everything into the Makefiles/SSL, but it plugs in over this patch fairly easily. CHACHA_ASM_X86/POLY1305_ASM_X86 and CHACHA_ASM_X86_64/POLY1305_ASM_X86_64 are the defines it looks for, and everything else is straightforward.
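To make the role of those defines concrete, here is a rough sketch of how C code might key off them at compile time. Only the four macro names come from the description above; the enum and the functions `chacha_backend` and `poly1305_backend` are hypothetical and are not the layout used in the tarball.

```c
/* Hypothetical backend identifiers, for illustration only. */
enum backend {
    BACKEND_REFERENCE,  /* portable C fallback */
    BACKEND_X86,        /* 32-bit x86 asm */
    BACKEND_X86_64      /* 64-bit x86 asm */
};

/* Report which ChaCha path the build-time defines select. */
static enum backend chacha_backend(void)
{
#if defined(CHACHA_ASM_X86_64)
    return BACKEND_X86_64;
#elif defined(CHACHA_ASM_X86)
    return BACKEND_X86;
#else
    return BACKEND_REFERENCE;
#endif
}

/* Same idea for Poly1305, keyed off its own pair of defines. */
static enum backend poly1305_backend(void)
{
#if defined(POLY1305_ASM_X86_64)
    return BACKEND_X86_64;
#elif defined(POLY1305_ASM_X86)
    return BACKEND_X86;
#else
    return BACKEND_REFERENCE;
#endif
}
```

In this sketch, building with -DCHACHA_ASM_X86_64 -DPOLY1305_ASM_X86_64 would select the 64-bit paths, and building with neither define falls back to the reference code.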
Attachment: chacha_poly1305.tar.gz (GNU Zip compressed data)
