On 5/25/23 09:08, Willy Tarreau wrote:
> The problem definitely is concurrency, so 1000 curl will show nothing
> and will not even match production traffic. You'll need to use a load
> generator that allows you to tweak the TLS resume support, like we do
> with h1load's argument "--tls-reuse". Also I don't know how often the
> recently modified locks are used per server connection and per client
> connection, that's what the SSL guys want to know since they're not able
> to test their changes.
I finally got a test program together. After trying and failing with
the Jetty HttpClient and Apache HttpClient version 5 (both options that
would have let me do HTTP/2) I got a program together with Apache
HttpClient version 4. I had one version that shelled out to curl, but
it ran about ten times slower.
I know lots of people are going to have bad things to say about writing
a test in Java. It's the only language where I already know how to
write multi-threaded code. I would have to spend a bunch of time
learning how to do that in another language.
It fires up X threads, each of which makes 1000 consecutive requests to
the URL specified. It records the time for each request (reported in
nanoseconds in the output), and when all the threads finish, prints out
statistics. These runs are with 24 threads. I ran it on a different
system so that it would not affect CPU usage on the server running
haproxy. Here are the results:
quictls branch: OpenSSL_1_1_1t+quic
23:01:19.067 [main] INFO o.e.t.h.MainSSLTest Count 24000 1228.69/s
23:01:19.069 [main] INFO o.e.t.h.MainSSLTest Median 7562839 ns
23:01:19.069 [main] INFO o.e.t.h.MainSSLTest 75th % 25138492 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 95th % 70603313 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99th % 120502022 ns
23:01:19.070 [main] INFO o.e.t.h.MainSSLTest 99.9 % 355829439 ns
quictls branch: openssl-3.1.0+quic+locks
22:56:11.457 [main] INFO o.e.t.h.MainSSLTest Count 24000 1267.96/s
22:56:11.459 [main] INFO o.e.t.h.MainSSLTest Median 6827111 ns
22:56:11.459 [main] INFO o.e.t.h.MainSSLTest 75th % 23239248 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 95th % 70625628 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99th % 129494323 ns
22:56:11.460 [main] INFO o.e.t.h.MainSSLTest 99.9 % 307070582 ns
quictls branch: openssl-3.0.8+quic
22:59:12.614 [main] INFO o.e.t.h.MainSSLTest Count 24000 1163.24/s
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest Median 6930268 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 75th % 26238752 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 95th % 75464869 ns
22:59:12.616 [main] INFO o.e.t.h.MainSSLTest 99th % 132522508 ns
22:59:12.617 [main] INFO o.e.t.h.MainSSLTest 99.9 % 445411125 ns
The stats don't show any kind of smoking gun like I had hoped they
would. Not a lot of difference there.
Differences in the requests per second are also not huge, but more in
line with what I was expecting: 1267.96/s for 3.1.0 with the lock
fixes, 1228.69/s for 1.1.1t, and 1163.24/s for 3.0.8. If I can believe
those numbers, and I admit that this kind of micro-benchmark is not the
most reliable way to test performance, it looks like 3.1.0 with the
lock fixes is slightly faster than 1.1.1t (about 3%), while 3.0.8 is
about 5% slower. 24 threads might not be enough to really exercise the
concurrency though.
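For anyone who wants to check the arithmetic behind that comparison, here is a small sketch that computes the relative difference in requests per second against the 1.1.1t run. The three rates are copied from the output above; the class and method names are made up for illustration.

```java
import java.util.Locale;

class ThroughputComparison {
    // Relative difference of `value` against `baseline`, in percent.
    static double percentDiff(double value, double baseline) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Requests per second taken from the "Count" lines of the three runs.
        double base111t = 1228.69;   // OpenSSL_1_1_1t+quic
        double locks310 = 1267.96;   // openssl-3.1.0+quic+locks
        double plain308 = 1163.24;   // openssl-3.0.8+quic

        System.out.println(String.format(Locale.ROOT,
                "3.1.0+locks vs 1.1.1t: %+.1f%%", percentDiff(locks310, base111t)));
        System.out.println(String.format(Locale.ROOT,
                "3.0.8       vs 1.1.1t: %+.1f%%", percentDiff(plain308, base111t)));
    }
}
```

Each figure comes from a single run per build, so differences of a few percent may well be within run-to-run noise.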
I will poke at it a little more tomorrow, trying more threads.
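In case it helps anyone reproduce this, below is a minimal sketch of the kind of harness described above: N threads, each making a fixed number of consecutive requests, per-request timings collected in nanoseconds, and percentiles printed at the end. It is not the actual test program. It assumes Java 11+ and uses the JDK's built-in java.net.http.HttpClient as a stand-in for Apache HttpClient 4. The class name, the nearest-rank percentile method, and the output labels are all choices made for this illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LatencyHarness {
    // Runs `request` requestsPerThread times on each of `threads` workers and
    // returns every per-request duration in nanoseconds, sorted ascending.
    static long[] run(int threads, int requestsPerThread, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<long[]>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    long[] nanos = new long[requestsPerThread];
                    for (int i = 0; i < requestsPerThread; i++) {
                        long start = System.nanoTime();
                        request.run();
                        nanos[i] = System.nanoTime() - start;
                    }
                    return nanos;
                }));
            }
            long[] all = new long[threads * requestsPerThread];
            int pos = 0;
            for (Future<long[]> f : futures) {
                long[] part = f.get();   // blocks until that worker is done
                System.arraycopy(part, 0, all, pos, part.length);
                pos += part.length;
            }
            Arrays.sort(all);
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Nearest-rank percentile on an ascending-sorted array (pct in 0..100).
    static long percentile(long[] sorted, double pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: LatencyHarness <url> [threads]");
            return;
        }
        int threads = args.length > 1 ? Integer.parseInt(args[1]) : 24;
        HttpClient client = HttpClient.newHttpClient();   // shared by all workers
        HttpRequest req = HttpRequest.newBuilder(URI.create(args[0])).build();

        long wallStart = System.nanoTime();
        long[] sorted = run(threads, 1000, () -> {
            try {
                client.send(req, HttpResponse.BodyHandlers.discarding());
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        double seconds = (System.nanoTime() - wallStart) / 1e9;

        System.out.printf("Count %d %.2f/s%n", sorted.length, sorted.length / seconds);
        System.out.printf("Median %d ns%n", percentile(sorted, 50));
        for (double p : new double[] {75, 95, 99, 99.9}) {
            System.out.printf("%s %% %d ns%n", p, percentile(sorted, p));
        }
    }
}
```

One caveat: this sketch leaves connection reuse and TLS session resumption to the client's defaults. If the workers mostly reuse established connections, few handshakes happen, which may be part of why a test shaped like this one exercises the locks less than a load generator with explicit control over TLS resumption.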