On Thu, Oct 28, 2021 at 04:06:42PM -0600, Shawn Heisey wrote:
> The file I transferred is 4GB in size, copied from /dev/urandom with dd.
> Did the pull from another machine on the same gigabit LAN. I picked the
> cipher by watching for TLS 1.2 ciphers shown by testssl.sh and choosing one
> that mentioned AES. The server has plenty of memory to cache that entire
> 4GB file, so disk speed should be irrelevant.
>
> Thank you for hanging onto enough patience to help me navigate this rabbit
> hole.

By the way, on this subject: based on the numbers you reported for openssl
speed, the speed differences are not even relevant on a network as slow as
1 Gbps. Your machine can encrypt/decrypt at roughly 2 Gbps per core even
when not using AES-NI, so in this case it's more important to watch the CPU
utilization during the transfer than the transfer speed itself, which can be
affected by many other factors.

Also, since you performed your transfer using aes-256-gcm, that's the one you
should test. For me the differences are huge with and without AES-NI on this
algorithm:

without:

  $ OPENSSL_ia32cap="~0x200000200000000" openssl speed -elapsed -evp aes-256-gcm
  type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
  aes-256-gcm    128654.97k   155005.99k   162485.42k   165428.22k   166909.27k   166914.73k

with:

  $ openssl speed -elapsed -evp aes-256-gcm
  type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
  aes-256-gcm    547722.32k  1457707.18k  2632156.25k  3890468.52k  4604226.22k  4597268.48k

It's almost 30 times faster on large blocks. At 1 Gbps (~118 MB/s), this
machine would spend roughly 70% of one core in the AES code without AES-NI
versus 2.5% with it. That's where you can see a really measurable difference.

Of course, like Lukas said, "perf" is very useful here to see where time is
spent.

One trick I often use to measure the effects of micro-optimizations, or of
things like this that only bring a benefit at higher data rates, is to chain
many haproxy instances so that the traffic is processed many times. I have
some configs with 100 instances, for example. When your traffic passes 100
times through decryption/encryption, you can hope to start to measure a big
difference :-)

Willy
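
For anyone who wants to redo the CPU-share estimate with their own openssl speed results, here is a minimal sketch of the arithmetic. It assumes the 8192-byte column from the two runs quoted in this message, treats the "k" suffix as thousands of bytes per second, and uses the ~118 MB/s payload rate mentioned for 1 Gbps. The function name `cpu_share` is made up for illustration; it is not part of any tool.

```python
# Rough estimate of the share of one core spent in AES at a given line rate.
# Assumption: "openssl speed" figures are in thousands of bytes per second,
# and the cipher cost scales linearly with the amount of data processed.

LINE_RATE_MB_PER_S = 118.0  # approximate payload rate of a 1 Gbps link

# 8192-byte block results quoted in the message (thousands of bytes/s).
WITHOUT_AESNI_K = 166909.27
WITH_AESNI_K = 4604226.22


def cpu_share(throughput_k, line_rate_mb_per_s=LINE_RATE_MB_PER_S):
    """Fraction of one core needed to encrypt at the given line rate."""
    throughput_mb_per_s = throughput_k / 1000.0
    return line_rate_mb_per_s / throughput_mb_per_s


if __name__ == "__main__":
    print(f"speedup with AES-NI: {WITH_AESNI_K / WITHOUT_AESNI_K:.1f}x")
    print(f"core share without AES-NI: {cpu_share(WITHOUT_AESNI_K):.1%}")
    print(f"core share with AES-NI:    {cpu_share(WITH_AESNI_K):.1%}")
```

Plugging in other block sizes or another cipher's results only requires changing the two constants. The estimate ignores everything except the cipher itself (TLS record handling, syscalls, copies), so it is a lower bound on actual CPU use.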