Hey folks,

[Noise-related, but CCing curves@, since this essentially amounts to a benchmark of 25519.]

I added multi-core handshake processing to WireGuard this afternoon. With that in place, I decided to run some tests on how many real-life network packets could be handled. To do this, I simply replayed the same valid initiation packet over and over from localhost, which means the processing of the packet went all the way through to the timestamp/counter in the payload, at which point it was recognized as a replay and discarded. This means that pretty much all the Noise calculations were being executed.

Measurements below are in kilo-packets per second; each packet requires 2 ECDH() calls and a bunch of hashing.

Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz

AVX-accelerated ChaCha20Poly1305, Blake2s, Curve25519 (sandy2x):
  multi-core: 48k/second
  single-core: 10k/second

AVX-accelerated ChaCha20Poly1305, Blake2s | Curve25519-donna 64-bit:
  multi-core: 42k/second
  single-core: 8.8k/second

Reference C ChaCha20Poly1305, Blake2s | Curve25519-donna 64-bit:
  multi-core: 41k/second
  single-core: 8.6k/second

Having accelerated hashing and encryption helps only a _little_, whereas having accelerated ECDH helps _a bit more than a little_ but still not _tons and tons_.

I found that on this hardware, with an incoming packet queue length of 4096, and a "do not process unless a mac2 is present, thereby requiring a cookie reply message" max queue depth of 512, I was able to fend off a localhost-based (read: infinite bandwidth) DoS attack.

Given that IK computes two ECDH() in the first message, are these measurements more or less how you'd expect 25519 to perform? Is it "expected" that the difference between donna and sandy2x isn't that massive?

Jason
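
For anyone wanting to sanity-check the numbers above, here is a rough back-of-envelope sketch that turns packets per second into a per-ECDH time and cycle budget. It assumes the nominal 2.80GHz clock (no turbo) and attributes the whole per-packet cost to the 2 ECDH() calls, so the results are upper bounds on ECDH cost rather than measurements of it. The function and dictionary names are made up for illustration.

```python
# Packets per second as reported in the message above.
RESULTS = {
    "sandy2x + AVX": {"multi": 48_000, "single": 10_000},
    "donna64 + AVX": {"multi": 42_000, "single": 8_800},
    "donna64 + reference C": {"multi": 41_000, "single": 8_600},
}

ECDH_PER_PACKET = 2   # from the message: each packet requires 2 ECDH() calls
NOMINAL_HZ = 2.8e9    # assumption: nominal clock, turbo boost ignored


def per_ecdh_budget(packets_per_second):
    """Return (microseconds, cycles) available per ECDH call.

    This is an upper bound: hashing, AEAD and queueing overhead are
    all lumped into the ECDH budget.
    """
    seconds = 1.0 / (packets_per_second * ECDH_PER_PACKET)
    return seconds * 1e6, seconds * NOMINAL_HZ


def summarize():
    """Compute the per-ECDH budget and multi-core scaling for each setup."""
    rows = {}
    for name, r in RESULTS.items():
        us, cycles = per_ecdh_budget(r["single"])
        rows[name] = {
            "us_per_ecdh": us,
            "cycles_per_ecdh": cycles,
            "scaling": r["multi"] / r["single"],
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name}: <= {row['us_per_ecdh']:.1f} us/ECDH, "
              f"<= {row['cycles_per_ecdh']:.0f} cycles/ECDH, "
              f"multi/single = {row['scaling']:.2f}x")
```

Under those assumptions, the single-core sandy2x figure works out to at most 50 microseconds, or 140,000 nominal cycles, per ECDH call, and the single-core gain of sandy2x over donna is 10 / 8.8, roughly 14%.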