On a SPARC-T4, with AES opcodes ...  enabled:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     501882.74k   836726.87k   993102.76k  1020379.48k  1054083.75k
aes-192 cbc     435068.22k   707080.77k   837915.90k   864243.03k   889279.83k
aes-256 cbc     393746.28k   620463.13k   727483.31k   749580.97k   769029.46k

This system is a T4-2 so it's fun to show off some parallel benchmarks,
for example "openssl speed -multi 16 -evp aes-128-ecb" gives:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
evp            7429568.93k 17815630.93k 28436597.93k 32033047.55k 35120630.44k

35GB/sec AES encryption, not too bad.
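
For readers less familiar with the "openssl speed" output: the figures are reported in thousands of bytes per second, so the 35GB/sec number follows directly from the 8192-byte column. A minimal sketch of the conversion, assuming 1 GB = 10^9 bytes; the helper name is made up for illustration and is not part of OpenSSL:

```python
def speed_to_gb_per_sec(figure: str) -> float:
    """Convert an `openssl speed` figure such as '35120630.44k' to GB/s.

    Assumes the trailing 'k' means 1000 bytes per second and that
    1 GB = 10**9 bytes (decimal, not binary, units).
    """
    kilobytes_per_sec = float(figure.rstrip("k"))
    return kilobytes_per_sec * 1000 / 1e9


if __name__ == "__main__":
    # 8192-byte column of the "-multi 16 -evp aes-128-ecb" run
    print(f"{speed_to_gb_per_sec('35120630.44k'):.2f} GB/s")
    # 8192-byte column of the single-process aes-128 cbc run
    print(f"{speed_to_gb_per_sec('1054083.75k'):.2f} GB/s")
```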

Currently CBC, ECB, CTR, OFB, and CFB modes are explicitly optimized.
Other modes will be optimized in the future.

http://cvs.openssl.org/chngview?cn=22885 is heavily based on the submission, yet differs in several areas. There are pros and cons.

Pros:

- data alignment is handled in-line, i.e. without allocating temporary buffers and memcpy-ing in EVP (see also below);
- not dependent on the deficient Solaris toolchain;
- single-block functions are implemented as loops (as opposed to switching between key-length-specific paths; smaller code, no big deal in context);

Cons:

- no ECB, OFB, CFB yet;
- ~13% worse cumulative multi-process benchmark for CBC encrypt, less than 5% in other cases;

As for the last point: "cumulative multi-process benchmark" is the speed result with -multi X, where X is the number of virtual CPUs in the system, i.e. N*8*8, where N is the number of sockets in a T4-based system; 128 in this case. The regression is caused by additional branches around the in-line alignment code. It is effectively outweighed by the benefits of in-line alignment: I didn't examine all the cases, but I could measure >50% improvement in large-block CTR. Besides, crypto consumes only a portion of the time in real-life applications, so even 13% is hardly a big problem. Once again, this is for the cumulative multi-process benchmark. Single-thread results, on the other hand, are virtually the same.
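
The -multi value above can be worked out as follows. This is only a sketch of the N*8*8 arithmetic; it assumes the two factors of 8 stand for cores per socket and hardware threads per core, and the function names and the choice of aes-128-cbc are illustrative:

```python
CORES_PER_SOCKET = 8   # first factor of 8 in N*8*8 (assumed meaning)
THREADS_PER_CORE = 8   # second factor of 8 in N*8*8 (assumed meaning)


def virtual_cpus(sockets: int) -> int:
    """Number of virtual CPUs in a T4-based system with `sockets` sockets."""
    return sockets * CORES_PER_SOCKET * THREADS_PER_CORE


def speed_command(sockets: int, cipher: str = "aes-128-cbc") -> str:
    """Build the command line for a cumulative multi-process benchmark."""
    return f"openssl speed -multi {virtual_cpus(sockets)} -evp {cipher}"


if __name__ == "__main__":
    # A T4-2 has two sockets.
    print(virtual_cpus(2))
    print(speed_command(2))
```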
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
