On 02/22/2017 02:16 PM, Niklas Edmundsson wrote:
> Any joy with something simpler like gprof? (Caveat: I haven't used it in
> ages, so I don't know if it's even applicable nowadays.)

Well, if I had thought about it a little more, I would have remembered that instrumenting profilers don't profile syscalls very well, and they especially mess with I/O timing. Valgrind was completely inconclusive on the read() vs. mmap() front. :(

(...except that it showed that a good 25% of my test server's CPU time was spent inside OpenSSL in a memcpy(). Interesting...)

> So httpd isn't beaten by the naive openssl s_server approach, at least ;-)

I don't think s_server is particularly optimized for performance anyway.

Oh, and just to complete my local testing table:

- test server, writing from memory: 1.2 GiB/s
- test server, mmap() from disk: 1.1 GiB/s
- test server, 64K read()s from disk: 1.0 GiB/s
- httpd trunk with `EnableMMAP on`, serving from disk: 850 MiB/s
- httpd trunk with `EnableMMAP off`: 580 MiB/s
- httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s

My test server's read() implementation is a really naive "block on read, then block on write, repeat" loop, so there's probably some improvement to be had there, but this is enough proof in my mind that there are major gains to be made regardless.
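
To make the comparison concrete, here is a rough Python sketch of the two disk-reading strategies in the table. This is not the test server's actual code (which isn't included in this message); the function names `copy_with_read` and `copy_with_mmap` are made up for illustration, and the only detail taken from above is the 64K block size.

```python
import mmap
import os

BLOCK_SIZE = 64 * 1024  # 64K reads, matching the block size in the table


def copy_with_read(in_fd, out_fd, block_size=BLOCK_SIZE):
    """Naive loop: block on read, then block on write, repeat until EOF."""
    total = 0
    while True:
        chunk = os.read(in_fd, block_size)
        if not chunk:  # empty read means end of file
            break
        view = memoryview(chunk)
        while view:  # os.write may accept only part of the buffer
            written = os.write(out_fd, view)
            view = view[written:]
        total += len(chunk)
    return total


def copy_with_mmap(in_fd, out_fd):
    """Map the whole file and write straight out of the mapping."""
    size = os.fstat(in_fd).st_size
    if size == 0:  # a zero-length file cannot be mapped
        return 0
    with mmap.mmap(in_fd, 0, access=mmap.ACCESS_READ) as mapped:
        with memoryview(mapped) as view:
            offset = 0
            while offset < size:
                offset += os.write(out_fd, view[offset:])
    return size
```

Both functions take plain file descriptors, so `out_fd` could just as well be a socket's descriptor. The read() version never overlaps disk and network waits, which is the obvious place where a smarter loop could claw back some throughput.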

> Going off on a tangent here:
>
> For those of you who actually know how the ssl stuff really works, is it
> possible to get multiple threads involved in doing the encryption, or do
> you need the results from the previous block in order to do the next
> one?

I'm not a cryptographer, but I think how parallelizable it is depends on the ciphersuite in use. Like you say, some ciphersuites require one block to be fed into the next as an input; others don't.
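
As a rough illustration of that difference, here is a toy Python sketch of the two dependency patterns: CBC-style chaining, where each block needs the previous ciphertext block, and CTR-style counter mode (which GCM builds on), where each block depends only on its index. Big caveat: `toy_block` is a made-up stand-in built from SHA-256, not a real cipher. It isn't invertible or secure, and none of this is how OpenSSL implements anything. It only shows which inputs each block needs, and it assumes the input to `chained_encrypt` is a multiple of 16 bytes.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # block size in bytes, as in AES


def toy_block(key, data):
    """Stand-in for a block cipher call. NOT a real cipher, illustration only."""
    return hashlib.sha256(key + data).digest()[:BLOCK]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def chained_encrypt(key, iv, plaintext):
    """CBC-style: block i cannot start until block i-1 is finished."""
    prev = iv
    out = []
    for block in split_blocks(plaintext):
        prev = toy_block(key, xor(block, prev))  # needs the previous output
        out.append(prev)
    return b"".join(out)


def counter_block(key, nonce, index, block):
    """CTR-style: the keystream depends only on key, nonce, and block index."""
    keystream = toy_block(key, nonce + index.to_bytes(8, "big"))
    return xor(block, keystream)


def counter_encrypt(key, nonce, plaintext, workers=4):
    """Each block is independent, so the work can be handed to a pool."""
    blocks = split_blocks(plaintext)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        out = pool.map(
            lambda item: counter_block(key, nonce, item[0], item[1]),
            enumerate(blocks),
        )
        return b"".join(out)
```

The thread pool here only demonstrates that the counter-style blocks can be computed in any order. Python threads won't necessarily make it faster, and whether a real TLS stack can split a single connection's encryption across threads is a separate question.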

--Jacob
