Hi,

I've noticed that aes-128-gcm throughput with scp(1) on amd64 CPUs is much lower than expected on OpenBSD. I remember it being significantly better some time ago; I think I saw much better throughput around the time LRO and TSO were initially enabled for ix(4). It looks to me as if AES-NI is no longer being used effectively.

I've now found the time to conduct some dedicated testing. To rule out other bottlenecks, I:
- did fresh installs of current (snapshot downloaded ~10 days ago) with default options
- used X520 chipset based 10GbE ix(4) NICs on both sides
- used sufficiently fast SSDs (>= PCIe Gen 3.0)

The actual tests were transfers of a large file. I also did a fresh reboot after each transfer to avoid any caching in RAM.

Transfers of a 10 GB file from the AMD Ryzen 7 5800X to the Xeon E5-2690:
via nc(1): ~413 MB/s
scp with chacha20-poly1305: ~83 MB/s
scp with aes-128-gcm: ~62 MB/s
--> AES-NI-accelerated aes-128-gcm should be much faster (also relative to chacha20-poly1305 -- see below).

Interestingly, the openssl speed command seems to correctly make use of hardware acceleration for aes-128-gcm:
Xeon E5-2690:
aes-128-gcm: 1021834.73k with 1024 bytes
chacha20-poly1305: 120589.22k with 1024 bytes
--> aes-128-gcm is >8X faster than chacha20-poly1305
AMD Ryzen 7 5800X:
aes-128-gcm: 3285108.49k with 1024 bytes
chacha20-poly1305: 450004.45k with 1024 bytes
--> aes-128-gcm is >7X faster than chacha20-poly1305
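
To put the scp numbers next to the openssl speed numbers, here is a rough sketch of the arithmetic. It assumes the "k" figures are thousands of bytes per second (which is how openssl speed labels its output), that the MB/s figures above mean 10^6 bytes per second, and that a transfer cannot go faster than the slower of the two CPUs. The names (OPENSSL_K, SCP_MB_S, summarize) are only illustrative.

```python
# Figures copied from the measurements above.
# openssl speed values are in thousands of bytes per second ("k").
OPENSSL_K = {
    "Xeon E5-2690": {
        "aes-128-gcm": 1021834.73,
        "chacha20-poly1305": 120589.22,
    },
    "AMD Ryzen 7 5800X": {
        "aes-128-gcm": 3285108.49,
        "chacha20-poly1305": 450004.45,
    },
}

# Observed scp throughput in MB/s (assumed to mean 10^6 bytes/s).
SCP_MB_S = {
    "aes-128-gcm": 62.0,
    "chacha20-poly1305": 83.0,
}


def k_to_mb_s(k):
    """Convert an openssl speed 'k' figure to MB/s (10^6 bytes/s)."""
    return k * 1000 / 1e6


def summarize():
    """Return (cipher, scp MB/s, slowest-CPU cipher MB/s, fraction) rows."""
    rows = []
    for cipher, scp in SCP_MB_S.items():
        # Assumption: the slower endpoint bounds the cipher throughput.
        slowest = min(k_to_mb_s(OPENSSL_K[cpu][cipher]) for cpu in OPENSSL_K)
        rows.append((cipher, scp, slowest, scp / slowest))
    return rows


if __name__ == "__main__":
    for cipher, scp, slowest, fraction in summarize():
        print(f"{cipher}: scp {scp:.0f} MB/s, "
              f"openssl (slower CPU) {slowest:.0f} MB/s, "
              f"ratio {fraction:.1%}")
```

By this estimate scp with chacha20-poly1305 reaches roughly 69% of what the Xeon manages in openssl speed, while scp with aes-128-gcm reaches only about 6%. The comparison is crude (single-threaded cipher benchmark with 1024-byte blocks versus a full transfer), but the gap for aes-128-gcm is what makes me suspect the accelerated code path is not being used.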

Is there a way to verify or control whether AES-NI is used by OpenSSH / scp?

Best regards
Andreas
