Re: httpd 2.4.25, mpm_event, ssl: segfaults
Every write to a Linux socket means the kernel copies into 2 KiB buffers used by SKBs. The data is copied somewhere into the middle of that 2 KiB buffer, so that TCP/IP headers can be prepended by the kernel. Even with TCP Segmentation Offload, 2 KiB buffers are still used; it just means that the TCP/IP headers need to be calculated once for an array of buffers, and then the kernel puts an array of pointers in the network card's ring buffer. The kernel will only put on the wire as much data as the current TCP congestion window allows, but it has to keep each packet in its buffers until the remote side ACKs that packet.

On Mon, Feb 27, 2017 at 2:25 PM, William A Rowe Jr wrote:
> On Mon, Feb 27, 2017 at 12:16 PM, Jacob Champion wrote:
>> On 02/23/2017 04:48 PM, Yann Ylavic wrote:
>>> On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote:
>>>> IOW: read(): Three copies: copy from filesystem cache to httpd
>>>> read() buffer to encrypted-data buffer to kernel socket buffer.
>>>
>>> Not really, "copy from filesystem cache to httpd read() buffer" is
>>> likely mapping to userspace, so no copy (on read) here.
>>
>> Oh, cool. Which kernels do this? It seems like the VM tricks would
>> have to be incredibly intricate for this to work; reads typically
>> don't happen in page-sized chunks, nor to aligned addresses. Linux in
>> particular has comments in the source explaining that they *don't* do
>> it for other syscalls (e.g. vmsplice)... but I don't have much
>> experience with non-Linux systems.
>
> I don't understand this claim.
>
> If read() returned an API-provisioned buffer, it could point wherever
> it liked, including a 4k page. As things stand, the void* (or char*) of
> the read() buffer is at an arbitrary offset; no common OS I'm familiar
> with maps a page to a non-page-aligned address.
>
> The kernel socket send[v]() call might avoid a copy in the direct-send
> case, depending on the implementation.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Mon, Feb 27, 2017 at 12:16 PM, Jacob Champion wrote: > > On 02/23/2017 04:48 PM, Yann Ylavic wrote: >> On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote: >>> >>> >>> IOW: read():Three copies: copy from filesystem cache to httpd >>> read() buffer to encrypted-data buffer to kernel socket buffer. > >> >> Not really, "copy from filesystem cache to httpd read() buffer" is >> likely mapping to userspace, so no copy (on read) here. > > Oh, cool. Which kernels do this? It seems like the VM tricks would have to > be incredibly intricate for this to work; reads typically don't happen in > page-sized chunks, nor to aligned addresses. Linux in particular has > comments in the source explaining that they *don't* do it for other syscalls > (e.g. vmsplice)... but I don't have much experience with non-Linux systems. I don't understand this claim. If read() returned an API-provisioned buffer, it could point wherever it liked, including a 4k page. As things stand the void* (or char*) of the read() buffer is at an arbitrary offset, no common OS I'm familiar with maps a page to a non-page-aligned address. The kernel socket send[v]() call might avoid copy in the direct-send case, depending on the implementation.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
[combining two replies]

On 02/23/2017 04:47 PM, Yann Ylavic wrote:
> On Thu, Feb 23, 2017 at 7:16 PM, Jacob Champion wrote:
>> Power users can break the system, and this is a power tool, right?
>
> It's not about power users, I don't think we can recommend anyone to
> use 4MB buffers even if they (seem to) have RAM for it.

Even if this is the case (and I'm still skeptical), there's a middle ground between "don't let users configure it at all" and "let users configure it to dangerous extents".

> I'm not talking about hardcoding anything, nor reading or sending
> hard-limited sizes on filesystems/sockets.
> I'm proposing that the configuration relates to how much we "coalesce"
> data on output, and all buffers' reuses (though each of fixed size)
> should follow the needs.

I consider the fixed block size to be a major tuning point. That's what I'm referring to with "hardcoding". Coalescing doesn't help performance if we're bottlenecking on the overhead of tiny blocks. (Even if *everything* else can be vectored -- and right now, OpenSSL can't be -- allocation itself would still be a fixed overhead.)

>>> I've no idea how much it costs to have 8K vs 16K records, though.
>>> Maybe in the mod_ssl case we'd want 16K buffers, still reasonable?
>>
>> We can't/shouldn't hardcode this especially. People who want maximum
>> throughput may want nice big records, but IIRC users who want
>> progressive rendering need smaller records so that they don't have to
>> wait as long for the first decrypted chunk. It needs to be tunable,
>> possibly per-location.
>
> Well, the limit is 16K at the TLS level.

Maximum limit, yes. We also need to not set a minimum limit, which is, IIUC, what we're currently doing. mod_ssl calls SSL_write() once for every chunk read from apr_bucket_read().

On 02/23/2017 04:48 PM, Yann Ylavic wrote:
> On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote:
>> IOW: read(): Three copies: copy from filesystem cache to httpd
>> read() buffer to encrypted-data buffer to kernel socket buffer.
>
> Not really, "copy from filesystem cache to httpd read() buffer" is
> likely mapping to userspace, so no copy (on read) here.

Oh, cool. Which kernels do this? It seems like the VM tricks would have to be incredibly intricate for this to work; reads typically don't happen in page-sized chunks, nor to aligned addresses. Linux in particular has comments in the source explaining that they *don't* do it for other syscalls (e.g. vmsplice)... but I don't have much experience with non-Linux systems.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Fri, 24 Feb 2017, Yann Ylavic wrote:
> On Thu, Feb 23, 2017 at 8:50 PM, Jacob Champion wrote:
>>> Going off on a tangent here: For those of you who actually know how
>>> the ssl stuff really works, is it possible to get multiple threads
>>> involved in doing the encryption, or do you need the results from
>>> the previous block in order to do the next one?
>>
>> I'm not a cryptographer, but I think how parallelizable it is depends
>> on the ciphersuite in use. Like you say, some ciphersuites require
>> one block to be fed into the next as an input; others don't.
>
> Yes, and the cost of scheduling threads for a non-dedicated crypto
> device is not worth it, I think. But mainly there is not only one
> stream involved in a typical HTTP server, so probably multiple
> simultaneous connections already saturate the AES-NI...

Actually, the AES-NI capability can be seen as a dedicated crypto device of sorts... It's just a bit more versatile with a CPU core stuck in there as well ;-)

I would much prefer if httpd could be able to push full-bandwidth single-stream https using multiple cores instead of enticing users to use silly "parallel get" clients, download accelerators and whatnot. Granted, the use cases are perhaps not the standard serve-many-files-to-the-public ones, but they do exist. And depending on which way the computing trends blow we might start seeing more competing low-power CPUs with more cores and less capability, requiring more threads to perform on single/few-stream workloads.

In any event I don't think the basic idea of multiple-thread-crypto should be dismissed lightly, especially if someone (definitely not me) figures out a neat way to do it :-)

Personally, it's the angst! of having to wait more than 10 seconds for a DVD-sized Linux-distro.iso download when I KNOW that there are 7 cores idling, and knowing that without the single-core bottleneck I would have 6 additional seconds of time to spend on something useful 8-()

/Nikke - thinking that the Subject is not that accurate anymore...
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se --- Sattinger's Law: It works better if you plug it in. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Fri, 24 Feb 2017, Yann Ylavic wrote:
>>> The issue is potentially the huge (order big-n) allocations which
>>> finally may hurt the system (fragmentation, OOM...).
>>
>> Is this a real or theoretical problem?
>
> Both. Fragmentation is a hard issue, but a constant is: the more you
> ask for big allocs, the more likely you won't be serviced one day or
> another (or another task will be killed for that, until yours).
> Long-living (or close to memory limits) systems suffer from this, no
> matter what allocator; small and large allocations fragment the memory
> (httpd is likely not the only program on the system), and the only
> "remedy" is small-order allocations (2^order pages, a "sane" order
> being lower than 2, hence order-1 on a system with 4K pages is 8K
> bytes).

I've only seen this class of issues on Linux systems where vm.min_free_kbytes is left at default in combination with better-than-GigE networking. Since we started to tune vm.min_free_kbytes to be in the order of 0.5s bursts at maximum network speed (i.e. 512M for 10GigE) we haven't seen it in production. I think our working theory was that too little space to handle bursts left the kernel unable to figure out which file cache pages to throw out in time, but I think we never got to figuring out the exact reason...

>> However, for large file performance I really don't buy into the
>> notion that it's a good idea to break everything into tiny puny
>> blocks. The potential for wasting CPU cycles on this micro-management
>> is rather big...
>
> I don't think that a readv/writev of 16 iovecs of 8KB is (noticeably)
> slower than a read/write of a contiguous 128K; both might well end up
> in a scatterlist for the kernel/hardware.

Ah, true. Scatter/gather is magic... I do find iovecs useful, it's the small blocks that get me into skeptic mode...

> Small blocks are not for networking, they're for internal use only.
> And remember that TLS records are 16K max anyway: give 128KB to
> openssl's SSL_write and it will output 8 chunks of 16KB.

Oh, I had completely missed that limit on TLS record size...

>> Kinda related: We also have the support for larger page sizes with
>> modern CPUs. Has anyone investigated if it makes sense allocating
>> memory pools in chunks that fit those large pages? I think PPC64 have
>> 64K pages already.
>
> APR pools are already based on the page size IIRC.

I was thinking of the huge/large pages available on x86:s, 2 MiB and maybe 1 GiB IIRC. My thought was that doing 2 MiB allocations for the large memory pools instead of 4k might make sense for configurations where you have a lot of threads, resulting in allocating that much memory eventually: one page instead of lots. On Linux, transparent huge page support, when enabled, can take advantage of this, leading to fewer TLB entries/misses.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 * <- Tribble
 * <- Tribble having Safe Sex
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 7:48 PM, Yann Ylavic wrote:
> On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote:
>>
>> IOW:
>> read(): Three copies: copy from filesystem cache to httpd read()
>> buffer to encrypted-data buffer to kernel socket buffer.
>
> Not really, "copy from filesystem cache to httpd read() buffer" is
> likely mapping to userspace, so no copy (on read) here.
>
>> mmap(): Two copies: filesystem page already mapped into httpd, so
>> just copy from filesystem (cached) page to encrypted-data buffer to
>> kernel socket buffer.
>
> So, as you said earlier the "write to socket" isn't a copy either,
> hence both read() and mmap() implementations could work with a single
> copy when mod_ssl is involved (this is more than a copy but you
> counted it above so), and no copy at all without it.

When you do a write() system call to a socket, the kernel must copy the data from the userspace buffer to its own buffers, because of data lifetime. When the write() system call returns, userspace is free to modify the buffer (which it owns). But the data from the last write() call must live a long time in the kernel: the kernel needs to keep a copy of it until the remote system ACKs all of it. The data is referenced first in the kernel transmission control system, then in the network card's ring buffers. If the remote system's feedback indicates that a packet was dropped or corrupted, the kernel may send it multiple times. The data has a different lifetime than the userspace buffer, so the kernel must copy it to a buffer it owns.

On the userspace high-order memory allocations, I still don't see what the problem is. Say you're using 64kiB buffers. When you free the buffers (e.g., at the end of the http request), they go into the memory allocator's 64kiB free-list, and they're available to be allocated again (e.g., by another http request). The memory allocator won't use the 64kiB free chunks for smaller allocations unless the free-lists for the smaller orders are emptied out. That'd mean there was a surge in demand for smaller-size allocations, so it'd make sense to start using the higher-order free chunk instead of calling brk(). Only if there are no more high-order free chunks left will the allocator have to call brk(). When the kernel gets the brk() request, if the system is short of high-order contiguous memory, it doesn't have to give contiguous physical pages for that brk() call: the Page Table Entries for that request can be composed of many individual 4kiB pages scattered all over physical memory. That's hidden from userspace; userspace sees a contiguous range of virtual memory.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 10:06 PM, Daniel Lescohier wrote:
> Why would high-order memory allocations be a problem in userspace
> code, which is using virtual memory? I thought high-order allocations
> are a big problem in kernel space, which has to deal with physical
> pages.

Well, both in kernel and user space, the difficulty is finding large contiguous memory. With virtual memory (admittedly virtually larger than physical memory), it needs more "active" regions to fail, but still it can fail if many heterogeneous chunks are to be mapped at a time, and the OOM killer will do its job. It depends on how close to the resident memory limit you are, of course (it won't happen if some memory can be compressed or swapped), but large chunks are no better with lots of RAM either.

> But when you write to a socket, doesn't the kernel scatter the
> userspace buffer into multiple SKBs? SKBs on order-0 pages allocated
> by the kernel?

Right, the Linux network stack (or its drivers) is mainly using (or is moving to) order-0 or order-1 chunks (with scatterlists when needed). But this is where httpd's job ends; what we are talking about happens before this :)

From the other message:

On Wed, Feb 22, 2017 at 8:55 PM, Daniel Lescohier wrote:
>
> IOW:
> read(): Three copies: copy from filesystem cache to httpd read()
> buffer to encrypted-data buffer to kernel socket buffer.

Not really, "copy from filesystem cache to httpd read() buffer" is likely mapping to userspace, so no copy (on read) here.

> mmap(): Two copies: filesystem page already mapped into httpd, so just
> copy from filesystem (cached) page to encrypted-data buffer to kernel
> socket buffer.

So, as you said earlier the "write to socket" isn't a copy either, hence both read() and mmap() implementations could work with a single copy when mod_ssl is involved (this is more than a copy but you counted it above so), and no copy at all without it.

Regards,
Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 8:50 PM, Jacob Champion wrote:
> On 02/22/2017 02:16 PM, Niklas Edmundsson wrote:
>
> I don't think s_server is particularly optimized for performance
> anyway.
>
> Oh, and just to complete my local testing table:
>
> - test server, writing from memory: 1.2 GiB/s
> - test server, mmap() from disk: 1.1 GiB/s
> - test server, 64K read()s from disk: 1.0 GiB/s
> - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s
> - httpd trunk with `EnableMMAP off`: 580 MiB/s
> - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s
>
> My test server's read() implementation is a really naive "block on
> read, then block on write, repeat" loop, so there's probably some
> improvement to be had there, but this is enough proof in my mind that
> there are major gains to be made regardless.

Does no-mmap-2MB-block beat MMap on?

>>> Going off on a tangent here:
>>>
>>> For those of you who actually know how the ssl stuff really works,
>>> is it possible to get multiple threads involved in doing the
>>> encryption, or do you need the results from the previous block in
>>> order to do the next one?
>>
>> I'm not a cryptographer, but I think how parallelizable it is depends
>> on the ciphersuite in use. Like you say, some ciphersuites require
>> one block to be fed into the next as an input; others don't.

Yes, and the cost of scheduling threads for a non-dedicated crypto device is not worth it, I think. But mainly there is not only one stream involved in a typical HTTP server, so probably multiple simultaneous connections already saturate the AES-NI...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 7:16 PM, Jacob Champion wrote:
> On 02/23/2017 08:34 AM, Yann Ylavic wrote:
>> Actually I'm not very pleased with this solution (or the final one
>> that would make this size open / configurable).
>> The issue is potentially the huge (order big-n) allocations which
>> finally may hurt the system (fragmentation, OOM...).
>
> Power users can break the system, and this is a power tool, right?

It's not about power users, I don't think we can recommend anyone to use 4MB buffers even if they (seem to) have RAM for it.

> And we have HugeTLB kernels and filesystems to play with, with 2MB and
> bigger pages... Making all these assumptions for 90% of users is
> perfect for the out-of-the-box experience, but hardcoding them so that
> no one can fix broken assumptions seems Bad.

I'm not talking about hardcoding anything, nor reading or sending hard-limited sizes on filesystems/sockets. I'm proposing that the configuration relates to how much we "coalesce" data on output, and all buffers' reuses (though each of fixed size) should follow the needs.

> (And don't get me wrong, I think applying vectored I/O to the brigade
> would be a great thing to try out and benchmark. I just think it's a
> long-term and heavily architectural fix, when a short-term change to
> get rid of some #defined constants could have immediate benefits.)

Of course, apr_bucket_file_set_read_size() is a quick patch (I dispute it for the general case, but it may be useful, or not, for the 16K SSL case, let's see), and so is another for configuring core_output_filter constants, but we don't need them for testing, right?

>> I've no idea how much it costs to have 8K vs 16K records, though.
>> Maybe in the mod_ssl case we'd want 16K buffers, still reasonable?
>
> We can't/shouldn't hardcode this especially. People who want maximum
> throughput may want nice big records, but IIRC users who want
> progressive rendering need smaller records so that they don't have to
> wait as long for the first decrypted chunk. It needs to be tunable,
> possibly per-location.

Well, the limit is 16K at the TLS level.

Regards,
Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 7:15 PM, Niklas Edmundsson wrote:
> On Thu, 23 Feb 2017, Yann Ylavic wrote:
>
>>>> Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it
>>>> just defines a new buffer size for use with the file bucket. It's a
>>>> little less than 64K, I assume to make room for an allocation
>>>> header:
>>>>
>>>>     #define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64)
>>>
>>> Actually I'm not very pleased with this solution (or the final one
>>> that would make this size open / configurable).
>>> The issue is potentially the huge (order big-n) allocations which
>>> finally may hurt the system (fragmentation, OOM...).
>
> Is this a real or theoretical problem?

Both. Fragmentation is a hard issue, but a constant is: the more you ask for big allocs, the more likely you won't be serviced one day or another (or another task will be killed for that, until yours). Long-living (or close to memory limits) systems suffer from this, no matter what allocator; small and large allocations fragment the memory (httpd is likely not the only program on the system), and the only "remedy" is small-order allocations (2^order pages, a "sane" order being lower than 2, hence order-1 on a system with 4K pages is 8K bytes).

> Our large-file cache module does 128k allocs to get a sane block size
> when copying files to the cache. The only potential drawback we
> noticed was httpd processes becoming bloated due to the default
> MaxMemFree 2048, so we're running with MaxMemFree 256 now.

With MaxMemFree 256 (KB per allocator), each APR allocator reclaims but mainly returns a lot more memory chunks to the system's allocator, which does a better job of recycling many differently sized chunks than APR's (which is pretty basic in this regard; its role is more about quickly recycling common sizes). With MaxMemFree 2048 (2MB), there is more builtin handling of "exotic" chunks, which may be painful. That might be something else, though...

> Granted, doing alloc/free for all outgoing data means way more
> alloc/free:s, so we might just miss the issues because cache fills
> aren't as common.

That's why reuse of common and reasonably sized chunks in the APR allocator can help.

> However, for large file performance I really don't buy into the notion
> that it's a good idea to break everything into tiny puny blocks. The
> potential for wasting CPU cycles on this micro-management is rather
> big...

I don't think that a readv/writev of 16 iovecs of 8KB is (noticeably) slower than a read/write of a contiguous 128K; both might well end up in a scatterlist for the kernel/hardware.

> I do find iovecs useful, it's the small blocks that get me into
> skeptic mode...

Small blocks are not for networking, they're for internal use only. And remember that TLS records are 16K max anyway: give 128KB to openssl's SSL_write and it will output 8 chunks of 16KB.

> Kinda related: We also have the support for larger page sizes with
> modern CPUs. Has anyone investigated if it makes sense allocating
> memory pools in chunks that fit those large pages? I think PPC64 have
> 64K pages already.

APR pools are already based on the page size IIRC.

Regards,
Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Why would high-order memory allocations be a problem in userspace code, which is using virtual memory? I thought high-order allocations are a big problem in kernel space, which has to deal with physical pages. But when you write to a socket, doesn't the kernel scatter the userspace buffer into multiple SKBs? SKBs on order-0 pages allocated by the kernel?

On Thu, Feb 23, 2017 at 1:16 PM, Jacob Champion wrote:
> On 02/23/2017 08:34 AM, Yann Ylavic wrote:
>> Actually I'm not very pleased with this solution (or the final one
>> that would make this size open / configurable).
>> The issue is potentially the huge (order big-n) allocations which
>> finally may hurt the system (fragmentation, OOM...).
>
> Power users can break the system, and this is a power tool, right? And
> we have HugeTLB kernels and filesystems to play with, with 2MB and
> bigger pages... Making all these assumptions for 90% of users is
> perfect for the out-of-the-box experience, but hardcoding them so that
> no one can fix broken assumptions seems Bad.
>
> (And don't get me wrong, I think applying vectored I/O to the brigade
> would be a great thing to try out and benchmark. I just think it's a
> long-term and heavily architectural fix, when a short-term change to
> get rid of some #defined constants could have immediate benefits.)
>
>> I've no idea how much it costs to have 8K vs 16K records, though.
>> Maybe in the mod_ssl case we'd want 16K buffers, still reasonable?
>
> We can't/shouldn't hardcode this especially. People who want maximum
> throughput may want nice big records, but IIRC users who want
> progressive rendering need smaller records so that they don't have to
> wait as long for the first decrypted chunk. It needs to be tunable,
> possibly per-location.
>
> --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/22/2017 02:16 PM, Niklas Edmundsson wrote:
> Any joy with something simpler like gprof? (Caveat: haven't used it in
> ages so I don't know if it's even applicable nowadays).

Well, if I had thought about it a little more, I would have remembered that instrumenting profilers don't profile syscalls very well, and they especially mess with I/O timing. Valgrind was completely inconclusive on the read() vs. mmap() front. :(

(...except that it showed that a good 25% of my test server's CPU time was spent inside OpenSSL in a memcpy(). Interesting...)

> So httpd isn't beat by the naive openssl s_server approach at least
> ;-)

I don't think s_server is particularly optimized for performance anyway. Oh, and just to complete my local testing table:

- test server, writing from memory: 1.2 GiB/s
- test server, mmap() from disk: 1.1 GiB/s
- test server, 64K read()s from disk: 1.0 GiB/s
- httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s
- httpd trunk with `EnableMMAP off`: 580 MiB/s
- httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s

My test server's read() implementation is a really naive "block on read, then block on write, repeat" loop, so there's probably some improvement to be had there, but this is enough proof in my mind that there are major gains to be made regardless.

> Going off on a tangent here:
>
> For those of you who actually know how the ssl stuff really works, is
> it possible to get multiple threads involved in doing the encryption,
> or do you need the results from the previous block in order to do the
> next one?

I'm not a cryptographer, but I think how parallelizable it is depends on the ciphersuite in use. Like you say, some ciphersuites require one block to be fed into the next as an input; others don't.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/23/2017 08:34 AM, Yann Ylavic wrote: > Actually I'm not very pleased with this solution (or the final one > that would make this size open / configurable). > The issue is potentially the huge (order big-n) allocations which > finally may hurt the system (fragmentation, OOM...). Power users can break the system, and this is a power tool, right? And we have HugeTLB kernels and filesystems to play with, with 2MB and bigger pages... Making all these assumptions for 90% of users is perfect for the out-of-the-box experience, but hardcoding them so that no one can fix broken assumptions seems Bad. (And don't get me wrong, I think applying vectored I/O to the brigade would be a great thing to try out and benchmark. I just think it's a long-term and heavily architectural fix, when a short-term change to get rid of some #defined constants could have immediate benefits.) I've no idea how much it costs to have 8K vs 16K records, though. Maybe in the mod_ssl case we'd want 16K buffers, still reasonable? We can't/shouldn't hardcode this especially. People who want maximum throughput may want nice big records, but IIRC users who want progressive rendering need smaller records so that they don't have to wait as long for the first decrypted chunk. It needs to be tunable, possibly per-location. --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 23 Feb 2017, Yann Ylavic wrote:
>> Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it
>> just defines a new buffer size for use with the file bucket. It's a
>> little less than 64K, I assume to make room for an allocation header:
>>
>>     #define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64)
>
> Actually I'm not very pleased with this solution (or the final one
> that would make this size open / configurable).
> The issue is potentially the huge (order big-n) allocations which
> finally may hurt the system (fragmentation, OOM...).

Is this a real or theoretical problem?

Our large-file cache module does 128k allocs to get a sane block size when copying files to the cache. The only potential drawback we noticed was httpd processes becoming bloated due to the default MaxMemFree 2048, so we're running with MaxMemFree 256 now. I don't know if things got much better, but it isn't breaking anything either...

Granted, doing alloc/free for all outgoing data means way more alloc/free:s, so we might just miss the issues because cache fills aren't as common.

However, for large file performance I really don't buy into the notion that it's a good idea to break everything into tiny puny blocks. The potential for wasting CPU cycles on this micro-management is rather big... I can see it working for a small-file workload where files aren't much bigger than tens of kB anyway, but not so much for large-file delivery.

A prudent way forward might be to investigate what impact different block sizes have wrt ssl/https first. As networking speeds go up it is kind of expected that block sizes need to go up as well, especially as per-core clock frequency isn't increasing much (it's been at 2-ish GHz base frequency for server CPUs for the last ten years now?) and we're relying more and more on various offload mechanisms in CPUs/NICs etc. to get us from 1 Gbps to 10 Gbps to 100 Gbps ...

I do find iovecs useful, it's the small blocks that get me into skeptic mode...

Kinda related: We also have the support for larger page sizes with modern CPUs. Has anyone investigated if it makes sense allocating memory pools in chunks that fit those large pages?

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 You need not worry about your future
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 5:34 PM, Yann Ylavic wrote:
> On Thu, Feb 23, 2017 at 4:58 PM, Stefan Eissing wrote:
>>
>>> Am 23.02.2017 um 16:38 schrieb Yann Ylavic:
>>>
>>> On Wed, Feb 22, 2017 at 6:36 PM, Jacob Champion wrote:
>>>> On 02/22/2017 12:00 AM, Stefan Eissing wrote:
>>>>>
>>>>> Just so I do not misunderstand:
>>>>>
>>>>> you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is
>>>>> what you are testing?
>>>>
>>>> Essentially, yes, *and* turn off mmap and sendfile. My hope is to
>>>> disable the mmap-optimization by default while still improving
>>>> overall performance for most users.
>>>>
>>>> Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it
>>>> just defines a new buffer size for use with the file bucket. It's a
>>>> little less than 64K, I assume to make room for an allocation
>>>> header:
>>>>
>>>>     #define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64)
>>>
>>> Actually I'm not very pleased with this solution (or the final one
>>> that would make this size open / configurable).
>>> The issue is potentially the huge (order big-n) allocations which
>>> finally may hurt the system (fragmentation, OOM...).
>>>
>>> So I'm thinking of another way to achieve the same with the current
>>> APR_BUCKET_BUFF_SIZE (2 pages) per alloc.
>>>
>>> The idea is to have a new apr_allocator_allocv() function which
>>> would fill an iovec with what's available in the allocator's
>>> freelist (i.e. spare apr_memnodes) of at least the given min_size
>>> bytes (possibly a max too but I don't see the need for now) and up
>>> to the size of the given iovec.
>>
>> Interesting. Not only for pure files maybe.
>>
>> It would be great if there'd be a SSL_writev()...
>
> Indeed, openssl would fully fill the TLS records.
>
>> but until there is, the TLS case will either turn every iovec into
>> its own TLS record
>
> Yes, but it's more a client issue to work with these records finally
> (because from there all is networking only, and coalescing will happen
> in the core output filter from a socket POV).
>
> Anyway mod_proxy(s) could be the client, so...
>
> I've no idea how much it costs to have 8K vs 16K records, though.
> Maybe in the mod_ssl case we'd want 16K buffers, still reasonable?
>
>> or one needs another copy before that. This last strategy is used by
>> mod_http2. Since there are 9 header bytes per frame, copying data
>> into a right-sized buffer gives better performance. So, it would be
>> nice to read n bytes from a bucket brigade and get iovecs back.
>> Which, as I understand it, you propose?
>
> I didn't think of apr_bucket_readv(), more focused on
> file_bucket_read() to do this internally/transparently, but once file
> buckets can do that I think we can generalize the concept, at worst
> filling only iovec[0] when it's ENOTIMPL and/or makes no sense...
>
> That'd help the mod_ssl case with something like
> apr_bucket_readv(min_size=16K), I'll try to think of it once/if I can
> have a simpler file_bucket_read() working ;)

Hm no, this needs to happen on the producer side, not at the final apr_bucket_read(). So for the mod_ssl case I think we could have a simple (new) apr_bucket_alloc_set_size(16K) in the first place, if it helps more than hurts.

As for your question about "iovecs back" (which I finally didn't answer), it all happens at the bucket alloc level, when buckets are deleted.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 23, 2017 at 4:58 PM, Stefan Eissing wrote: > >> Am 23.02.2017 um 16:38 schrieb Yann Ylavic : >> >> On Wed, Feb 22, 2017 at 6:36 PM, Jacob Champion wrote: >>> On 02/22/2017 12:00 AM, Stefan Eissing wrote: Just so I do not misunderstand: you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is what you are testing? >>> >>> >>> Essentially, yes, *and* turn off mmap and sendfile. My hope is to disable >>> the mmap-optimization by default while still improving overall performance >>> for most users. >>> >>> Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it just >>> defines a new buffer size for use with the file bucket. It's a little less >>> than 64K, I assume to make room for an allocation header: >>> >>>#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) >> >> Actually I'm not very pleased with this solution (or the final one >> that would make this size open / configurable). >> The issue is potentially the huge (order big-n) allocations which >> finally may hurt the system (fragmentation, OOM...). >> >> So I'm thinking of another way to achieve the same with the current >> APR_BUCKET_BUFF_SIZE (2 pages) per alloc. >> >> The idea is to have a new apr_allocator_allocv() function which would >> fill an iovec with what's available in the allocator's freelist (i.e. >> spare apr_memnodes) of at least the given min_size bytes (possibly a >> max too but I don't see the need for now) and up to the size of the >> given iovec. > > Interesting. Not only for pure files maybe. > > It would be great if there'd be a SSL_writev()... Indeed, openssl would fully fill the TLS records. > but until there is, the TLS case will either turn every iovec into > its own TLS record Yes, but it's more a client issue to work with these records finally (because from there all is networking only, and coalescing will happen in the core output filter from a socket POV). Anyway mod_proxy(s) could be the client, so... 
I've no idea how much it costs to have 8K vs 16K records, though. Maybe in the mod_ssl case we'd want 16K buffers, still reasonable? > or one needs another copy before that. This last strategy is used by > mod_http2. Since there are 9 header bytes per frame, copying data > into a right-sized buffer gives better performance. So, it would be > nice to read n bytes from a bucket brigade and get iovecs back. > Which, as I understand it, you propose? I didn't think of apr_bucket_readv(), more focused on file_bucket_read() to do this internally/transparently, but once file buckets can do that I think we can generalize the concept, at worst filling only iovec[0] when it's ENOTIMPL and/or makes no sense... That'd help the mod_ssl case with something like apr_bucket_readv(min_size=16K), I'll try to think of it once/if I can get a simpler file_bucket_read()-only version working ;)
Re: httpd 2.4.25, mpm_event, ssl: segfaults
> Am 23.02.2017 um 16:38 schrieb Yann Ylavic : > > On Wed, Feb 22, 2017 at 6:36 PM, Jacob Champion wrote: >> On 02/22/2017 12:00 AM, Stefan Eissing wrote: >>> >>> Just so I do not misunderstand: >>> >>> you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is what you >>> are testing? >> >> >> Essentially, yes, *and* turn off mmap and sendfile. My hope is to disable >> the mmap-optimization by default while still improving overall performance >> for most users. >> >> Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it just >> defines a new buffer size for use with the file bucket. It's a little less >> than 64K, I assume to make room for an allocation header: >> >>#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) > > Actually I'm not very pleased with this solution (or the final one > that would make this size open / configurable). > The issue is potentially the huge (order big-n) allocations which > finally may hurt the system (fragmentation, OOM...). > > So I'm thinking of another way to achieve the same with the current > APR_BUCKET_BUFF_SIZE (2 pages) per alloc. > > The idea is to have a new apr_allocator_allocv() function which would > fill an iovec with what's available in the allocator's freelist (i.e. > spare apr_memnodes) of at least the given min_size bytes (possibly a > max too but I don't see the need for now) and up to the size of the > given iovec. Interesting. Not only for pure files maybe. It would be great if there'd be a SSL_writev()...but until there is, the TLS case will either turn every iovec into its own TLS record or one needs another copy before that. This last strategy is used by mod_http2. Since there are 9 header bytes per frame, copying data into a right sized buffer gives better performance. So, it would be nice to read n bytes from a bucket brigade and get iovecs back. Which, as I understand it, you propose? 
> This function could be the base of a new apr_bucket_allocv() (and > possibly apr_p[c]allocv(), though out of scope here) which in turn > could be used by file_bucket_read() to get an iovec of available > buffers. > This iovec could then be passed to (new still) apr_file_readv() based > on the readv() syscall, which would allow to read much more data in > one go. > > With this the scheme we'd have iovec from end to end, well, sort of > since mod_ssl would be break the chain but still produce transient > buckets on output which anyway will end up in the core_output_filter's > brigade of aside heap buckets, for apr_socket_sendv() to finally > writev() them. > > We'd also have more recycled heap buckets (hence memnodes in the > allocator) as the core_output_filter retains buckets, all with > APR_BUCKET_BUFF_SIZE, up to THRESHOLD_MAX_BUFFER which, if > configurable and along with MaxMemFree, would be the real limiter of > recycling. > So it's also adaptative. > > Actually it looks like what we need, but I'd like to have feedbacks > before I go further the prototype I have so far (quite straightforward > apr_allocator changes...). > > Thoughts? > > > Regards, > Yann. Stefan Eissing bytes GmbH Hafenstrasse 16 48155 Münster www.greenbytes.de
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, Feb 22, 2017 at 6:36 PM, Jacob Champion wrote: > On 02/22/2017 12:00 AM, Stefan Eissing wrote: >> >> Just so I do not misunderstand: >> >> you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is what you >> are testing? > > > Essentially, yes, *and* turn off mmap and sendfile. My hope is to disable > the mmap-optimization by default while still improving overall performance > for most users. > > Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it just > defines a new buffer size for use with the file bucket. It's a little less > than 64K, I assume to make room for an allocation header: > > #define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) Actually I'm not very pleased with this solution (or the final one that would make this size open / configurable). The issue is potentially the huge (order big-n) allocations which finally may hurt the system (fragmentation, OOM...). So I'm thinking of another way to achieve the same with the current APR_BUCKET_BUFF_SIZE (2 pages) per alloc. The idea is to have a new apr_allocator_allocv() function which would fill an iovec with what's available in the allocator's freelist (i.e. spare apr_memnodes) of at least the given min_size bytes (possibly a max too but I don't see the need for now) and up to the size of the given iovec. This function could be the base of a new apr_bucket_allocv() (and possibly apr_p[c]allocv(), though out of scope here) which in turn could be used by file_bucket_read() to get an iovec of available buffers. This iovec could then be passed to a (new still) apr_file_readv() based on the readv() syscall, which would allow reading much more data in one go. With this scheme we'd have iovec from end to end, well, sort of, since mod_ssl would break the chain but still produce transient buckets on output which anyway will end up in the core_output_filter's brigade of set-aside heap buckets, for apr_socket_sendv() to finally writev() them.
We'd also have more recycled heap buckets (hence memnodes in the allocator) as the core_output_filter retains buckets, all with APR_BUCKET_BUFF_SIZE, up to THRESHOLD_MAX_BUFFER which, if configurable and along with MaxMemFree, would be the real limiter of recycling. So it's also adaptive. Actually it looks like what we need, but I'd like to have feedback before I go further than the prototype I have so far (quite straightforward apr_allocator changes...). Thoughts? Regards, Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, 22 Feb 2017, Jacob Champion wrote: To make results less confusing, any specific patches/branch I should test? My baseline is httpd-2.4.25 + httpd-2.4.25-deps --with-included-apr FWIW. 2.4.25 is just fine. We'll have to make sure there's nothing substantially different about it performance-wise before we backport patches anyway, so it'd be good to start testing it now. OK. - The OpenSSL test server, writing from memory: 1.2 GiB/s - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s - httpd trunk with 'EnableMMAP off': 580 MiB/s - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s At those speeds your results might be skewed by the latency of processing 10 MiB GET:s. Maybe, but keep in mind I care more about the difference between the numbers than the absolute throughput ceiling here. (In any case, I don't see significantly different numbers between 10 MiB and 1 GiB files. Remember, I'm testing via loopback.) Ah, right. Discard the results from the first warm-up access and your results delivering from memory or disk (cache) shouldn't differ. Ah, but they *do*, as Yann pointed out earlier. We can't just deliver the disk cache to OpenSSL for encryption; it has to be copied into some addressable buffer somewhere. That seems to be a major reason for the mmap() advantage, compared to a naive read() solution that just reads into a small buffer over and over again. (I am trying to set up Valgrind to confirm where the test server is spending most of its time, but it doesn't care for the large in-memory static buffer, or for OpenSSL's compressed debugging symbols, and crashes. :( ) Any joy with something simpler like gprof? (Caveat: haven't used it in ages to I don't know if its even applicable nowadays). Numbers on the "memcopy penalty" would indeed be interesting, especially any variation when the block size differs. As I said, our live server does 600 MB/s aes-128-gcm and can deliver 300 MB/s https without mmap. 
That's only a factor 2 difference between aes-128-gcm speed and delivered speed. Your results above are almost a factor 4 off, so something's fishy :-) Well, I can only report my methodology and numbers -- whether the numbers are actually meaningful has yet to be determined. ;D More testers are welcome! :-) I did some repeated tests and my initial results were actually a bit on the low side: Server CPU is an Intel E5606 (1st gen aes offload), openssl speed -evp says: The 'numbers' are in 1000s of bytes per second processed. type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes aes-128-gcm 208536.05k 452980.05k 567523.33k 607578.11k 619192.32k Single-stream https over a 10 Gbps link with 3ms RTT (useful routing SNAFU: when talking to stuff in the neighboring building, traffic takes the "shortcut" through a town 300 km away ;). Using wget -O /dev/null as a client, on a host with Intel E5-2630 CPU (960-ish MB/s aes-128-gcm on 8k blocks). http (sendfile): 1.07 GB/s (repeatedly) httpd (no mmap): 370-380 MB/s openssl s_server: 330-340 MB/s So httpd isn't beaten by the naive openssl s_server approach at least ;-) Going off on a tangent here: For those of you who actually know how the ssl stuff really works, is it possible to get multiple threads involved in doing the encryption, or do you need the results from the previous block in order to do the next one? Yes, I know this wouldn't make sense for most real setups but for a student computer club with old hardware and good connectivity this is a real problem ;-) On the other hand, you would need it to do 100 Gbps single-stream https even on latest&greatest CPUs 8-) /Nikke -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se --- There may be a correlation between humor and sex. - Data =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, Feb 22, 2017 at 2:42 PM, Jacob Champion wrote: > Ah, but they *do*, as Yann pointed out earlier. We can't just deliver the > disk cache to OpenSSL for encryption; it has to be copied into some > addressable buffer somewhere. That seems to be a major reason for the > mmap() advantage, compared to a naive read() solution that just reads into > a small buffer over and over again. > IOW: read():Three copies: copy from filesystem cache to httpd read() buffer to encrypted-data buffer to kernel socket buffer. mmap(): Two copies: filesystem page already mapped into httpd, so just copy from filesystem (cached) page to encrypted-data buffer to kernel socket buffer.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/22/2017 10:34 AM, Niklas Edmundsson wrote: To make results less confusing, any specific patches/branch I should test? My baseline is httpd-2.4.25 + httpd-2.4.25-deps --with-included-apr FWIW. 2.4.25 is just fine. We'll have to make sure there's nothing substantially different about it performance-wise before we backport patches anyway, so it'd be good to start testing it now. - The OpenSSL test server, writing from memory: 1.2 GiB/s - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s - httpd trunk with 'EnableMMAP off': 580 MiB/s - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s At those speeds your results might be skewed by the latency of processing 10 MiB GET:s. Maybe, but keep in mind I care more about the difference between the numbers than the absolute throughput ceiling here. (In any case, I don't see significantly different numbers between 10 MiB and 1 GiB files. Remember, I'm testing via loopback.) Discard the results from the first warm-up access and your results delivering from memory or disk (cache) shouldn't differ. Ah, but they *do*, as Yann pointed out earlier. We can't just deliver the disk cache to OpenSSL for encryption; it has to be copied into some addressable buffer somewhere. That seems to be a major reason for the mmap() advantage, compared to a naive read() solution that just reads into a small buffer over and over again. (I am trying to set up Valgrind to confirm where the test server is spending most of its time, but it doesn't care for the large in-memory static buffer, or for OpenSSL's compressed debugging symbols, and crashes. :( ) As I said, our live server does 600 MB/s aes-128-gcm and can deliver 300 MB/s https without mmap. That's only a factor 2 difference between aes-128-gcm speed and delivered speed. Your results above are almost a factor 4 off, so something's fishy :-) Well, I can only report my methodology and numbers -- whether the numbers are actually meaningful has yet to be determined. 
;D More testers are welcome! --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Tue, 21 Feb 2017, Jacob Champion wrote: Is there interest in more real-life numbers with increasing FILE_BUCKET_BUFF_SIZE or are you already on it? Yes please! My laptop probably isn't representative of most servers; it can do nearly 3 GB/s AES-128-GCM. The more machines we test, the better. To make results less confusing, any specific patches/branch I should test? My baseline is httpd-2.4.25 + httpd-2.4.25-deps --with-included-apr FWIW. I have an older server that can do 600 MB/s aes-128-gcm per core, but is only able to deliver 300 MB/s https single-stream via its 10 GBps interface. My guess is too small blocks causing CPU cycles being spent not delivering data... Right. To give you an idea of where I am in testing at the moment: I have a basic test server written with OpenSSL. It sends a 10 MiB response body from memory (*not* from disk) for every GET it receives. I also have a copy of httpd trunk that's serving an actual 10 MiB file from disk. My test call is just `h2load --h1 -n 100 https://localhost/`, which should send 100 requests over a single TLS connection. The ciphersuite selected for all test cases is ECDHE-RSA-AES256-GCM-SHA384. For reference, I can do in-memory AES-256-GCM at 2.1 GiB/s. - The OpenSSL test server, writing from memory: 1.2 GiB/s - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s - httpd trunk with 'EnableMMAP off': 580 MiB/s - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s At those speeds your results might be skewed by the latency of processing 10 MiB GET:s. I'd go for multiple GiB files (whatever you can cache in RAM) and deliver files from disk. Discard the results from the first warm-up access and your results delivering from memory or disk (cache) shouldn't differ. So just bumping the block size gets me almost to the speed of mmap, without the downside of a potential SIGBUS. Meanwhile, the OpenSSL test server seems to suggest a performance ceiling about 50% above where we are now. 
I'm guessing that if you redo the tests with a bigger file you should see even more potential. As I said, our live server does 600 MB/s aes-128-gcm and can deliver 300 MB/s https without mmap. That's only a factor 2 difference between aes-128-gcm speed and delivered speed. Your results above are almost a factor 4 off, so something's fishy :-) Even with the test server serving responses from memory, that seems like plenty of room to grow. I'm working on a version of the test server that serves files from disk so that I'm not comparing apples to oranges, but my prior testing leads me to believe that disk access is not the limiting factor on my machine. Hmm. Perhaps I should just do a quick test with openssl s_server, just to see what numbers I get... /Nikke
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/22/2017 12:00 AM, Stefan Eissing wrote: Just so I do not misunderstand: you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is what you are testing? Essentially, yes, *and* turn off mmap and sendfile. My hope is to disable the mmap-optimization by default while still improving overall performance for most users. Technically, Yann's patch doesn't redefine APR_BUCKET_BUFF_SIZE, it just defines a new buffer size for use with the file bucket. It's a little less than 64K, I assume to make room for an allocation header: #define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
> Am 22.02.2017 um 00:14 schrieb Jacob Champion : > > On 02/19/2017 01:37 PM, Niklas Edmundsson wrote: >> On Thu, 16 Feb 2017, Jacob Champion wrote: >>> So, I had already hacked my O_DIRECT bucket case to just be a copy of >>> APR's file bucket, minus the mmap() logic. I tried making this change >>> on top of it... >>> >>> ...and holy crap, for regular HTTP it's *faster* than our current >>> mmap() implementation. HTTPS is still slower than with mmap, but >>> faster than it was without the change. (And the HTTPS performance has >>> been really variable.) >> >> I'm guessing that this is with a low-latency storage device, say a >> local SSD with low load? O_DIRECT on anything with latency would require >> way bigger blocks to hide the latency... You really want the OS >> readahead in the generic case, simply because it performs reasonably >> well in most cases. > > I described my setup really poorly. I've ditched O_DIRECT entirely. The > bucket type I created to use O_DIRECT has been repurposed to just be a copy > of the APR file bucket, with the mmap optimization removed entirely, and with > the new 64K bucket buffer limit. This new "no-mmap-plus-64K-block" file > bucket type performs better on my machine than the old "mmap-enabled" file > bucket type. > > (But yes, my testing is all local, with a nice SSD. Hopefully that gets a > little closer to isolating the CPU parts of this equation, which is the thing > we have the most influence over.) > >> I think the big win here is to use appropriate block sizes, you do more >> useful work and less housekeeping. I have no clue on when the block size >> choices were made, but it's likely that it was a while ago. Assuming >> that things will continue to evolve, I'd say making hard-coded numbers >> tunable is a Good Thing to do. > > Agreed. > >> Is there interest in more real-life numbers with increasing >> FILE_BUCKET_BUFF_SIZE or are you already on it? > > Yes please! 
My laptop probably isn't representative of most servers; it can > do nearly 3 GB/s AES-128-GCM. The more machines we test, the better. > >> I have an older server >> that can do 600 MB/s aes-128-gcm per core, but is only able to deliver >> 300 MB/s https single-stream via its 10 GBps interface. My guess is too >> small blocks causing CPU cycles being spent not delivering data... > > Right. To give you an idea of where I am in testing at the moment: I have a > basic test server written with OpenSSL. It sends a 10 MiB response body from > memory (*not* from disk) for every GET it receives. I also have a copy of > httpd trunk that's serving an actual 10 MiB file from disk. > > My test call is just `h2load --h1 -n 100 https://localhost/`, which should > send 100 requests over a single TLS connection. The ciphersuite selected for > all test cases is ECDHE-RSA-AES256-GCM-SHA384. For reference, I can do > in-memory AES-256-GCM at 2.1 GiB/s. > > - The OpenSSL test server, writing from memory: 1.2 GiB/s > - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s > - httpd trunk with 'EnableMMAP off': 580 MiB/s > - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s > > So just bumping the block size gets me almost to the speed of mmap, without > the downside of a potential SIGBUS. Meanwhile, the OpenSSL test server seems > to suggest a performance ceiling about 50% above where we are now. > > Even with the test server serving responses from memory, that seems like > plenty of room to grow. I'm working on a version of the test server that > serves files from disk so that I'm not comparing apples to oranges, but my > prior testing leads me to believe that disk access is not the limiting factor > on my machine. > > --Jacob Just so I do not misunderstand: you increased BUCKET_BUFF_SIZE in APR from 8000 to 64K? That is what you are testing?
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/19/2017 01:37 PM, Niklas Edmundsson wrote: On Thu, 16 Feb 2017, Jacob Champion wrote: So, I had already hacked my O_DIRECT bucket case to just be a copy of APR's file bucket, minus the mmap() logic. I tried making this change on top of it... ...and holy crap, for regular HTTP it's *faster* than our current mmap() implementation. HTTPS is still slower than with mmap, but faster than it was without the change. (And the HTTPS performance has been really variable.) I'm guessing that this is with a low-latency storage device, say a local SSD with low load? O_DIRECT on anything with latency would require way bigger blocks to hide the latency... You really want the OS readahead in the generic case, simply because it performs reasonably well in most cases. I described my setup really poorly. I've ditched O_DIRECT entirely. The bucket type I created to use O_DIRECT has been repurposed to just be a copy of the APR file bucket, with the mmap optimization removed entirely, and with the new 64K bucket buffer limit. This new "no-mmap-plus-64K-block" file bucket type performs better on my machine than the old "mmap-enabled" file bucket type. (But yes, my testing is all local, with a nice SSD. Hopefully that gets a little closer to isolating the CPU parts of this equation, which is the thing we have the most influence over.) I think the big win here is to use appropriate block sizes, you do more useful work and less housekeeping. I have no clue on when the block size choices were made, but it's likely that it was a while ago. Assuming that things will continue to evolve, I'd say making hard-coded numbers tunable is a Good Thing to do. Agreed. Is there interest in more real-life numbers with increasing FILE_BUCKET_BUFF_SIZE or are you already on it? Yes please! My laptop probably isn't representative of most servers; it can do nearly 3 GB/s AES-128-GCM. The more machines we test, the better. 
I have an older server that can do 600 MB/s aes-128-gcm per core, but is only able to deliver 300 MB/s https single-stream via its 10 GBps interface. My guess is too small blocks causing CPU cycles being spent not delivering data... Right. To give you an idea of where I am in testing at the moment: I have a basic test server written with OpenSSL. It sends a 10 MiB response body from memory (*not* from disk) for every GET it receives. I also have a copy of httpd trunk that's serving an actual 10 MiB file from disk. My test call is just `h2load --h1 -n 100 https://localhost/`, which should send 100 requests over a single TLS connection. The ciphersuite selected for all test cases is ECDHE-RSA-AES256-GCM-SHA384. For reference, I can do in-memory AES-256-GCM at 2.1 GiB/s. - The OpenSSL test server, writing from memory: 1.2 GiB/s - httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s - httpd trunk with 'EnableMMAP off': 580 MiB/s - httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s So just bumping the block size gets me almost to the speed of mmap, without the downside of a potential SIGBUS. Meanwhile, the OpenSSL test server seems to suggest a performance ceiling about 50% above where we are now. Even with the test server serving responses from memory, that seems like plenty of room to grow. I'm working on a version of the test server that serves files from disk so that I'm not comparing apples to oranges, but my prior testing leads me to believe that disk access is not the limiting factor on my machine. --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Mon, 20 Feb 2017, Yann Ylavic wrote: On Sun, Feb 19, 2017 at 10:11 PM, Niklas Edmundsson wrote: On Thu, 16 Feb 2017, Yann Ylavic wrote: Here I am, localhost still, 21GB file (client wget -qO- [url] &>/dev/null). Output attached. Looks good with nice big writes if I interpret it correctly. Is this without patching anything, meaning that AP_MAX_SENDFILE has no effect, or did you fix things? That's with unpatched 2.4.x. Actually AP_MAX_SENDFILE seems to be used nowhere in 2.4.x code (but in mod_file_cache, which I know nothing about, and is not widely used IMHO). Goodie. Just a confusing leftover define and a confused Nikke then ;) Maybe you'd want to switch from 2.2.x? ;) ftp.acc.umu.se is running 2.4.25, so we're pretty good on that front. It would however help if I learn not to mistype/tab-expand myself into old directory trees and grep around in the historical archives when trying to look into the current state of things ;-) So, to conclude: All is good on the http/sendfile front, nothing to see here, move along :-) /Nikke - who apparently had Monday every day last week...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Sun, Feb 19, 2017 at 10:11 PM, Niklas Edmundsson wrote: > On Thu, 16 Feb 2017, Yann Ylavic wrote: >> >> Here I am, localhost still, 21GB file (client wget -qO- [url] >> &>/dev/null). >> Output attached. > > Looks good with nice big writes if I interpret it correctly. > > Is this without patching anything, meaning that AP_MAX_SENDFILE has no > effect, or did you fix things? That's with unpatched 2.4.x. Actually AP_MAX_SENDFILE seems to be used nowhere in 2.4.x code (but in mod_file_cache, which I know nothing about, and is not widely used IMHO). Maybe you'd want to switch from 2.2.x? ;) Regards, Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 16 Feb 2017, Jacob Champion wrote: On 02/16/2017 02:49 AM, Yann Ylavic wrote: +#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) /* > APR_BUCKET_BUFF_SIZE */ So, I had already hacked my O_DIRECT bucket case to just be a copy of APR's file bucket, minus the mmap() logic. I tried making this change on top of it... ...and holy crap, for regular HTTP it's *faster* than our current mmap() implementation. HTTPS is still slower than with mmap, but faster than it was without the change. (And the HTTPS performance has been really variable.) I'm guessing that this is with a low-latency storage device, say a local SSD with low load? O_DIRECT on anything with latency would require way bigger blocks to hide the latency... You really want the OS readahead in the generic case, simply because it performs reasonably well in most cases. Yes, you can avoid a memcpy using O_DIRECT, but compared to the SSL stuff a memcpy is rather cheap... Can you confirm that you see a major performance improvement with the new 64K file buffer? I'm pretty skeptical of my own results at this point... but if you see it too, I think we need to make *all* these hard-coded numbers tunable in the config. I think the big win here is to use appropriate block sizes, you do more useful work and less housekeeping. I have no clue on when the block size choices were made, but it's likely that it was a while ago. Assuming that things will continue to evolve, I'd say making hard-coded numbers tunable is a Good Thing to do. Is there interest in more real-life numbers with increasing FILE_BUCKET_BUFF_SIZE or are you already on it? I have an older server that can do 600 MB/s aes-128-gcm per core, but is only able to deliver 300 MB/s https single-stream via its 10 Gbps interface. My guess is too small blocks causing CPU cycles being spent not delivering data...
/Nikke
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 16 Feb 2017, Yann Ylavic wrote:

>>> Outputs (and the patch to produce them) attached.
>>>
>>> TL;DR:
>>> - http + EnableMMap => single write
>>> - http + !EnableMMap + EnableSendfile => single write
>>> - http + !EnableMMap + !EnableSendfile => 125KB writes
>>> - https + EnableMMap => 16KB writes
>>> - https + !EnableMMap => 8KB writes
>>
>> If you try larger file sizes you should start seeing things being
>> broken into chunks even for mmap/sendfile. For example we have
>> #define AP_MAX_SENDFILE 16777216 /* 2^24 */
>> which is unnecessarily low IMHO.
>
> Here I am, localhost still, 21GB file (client wget -qO- [url]
> &>/dev/null). Output attached.

Looks good with nice big writes if I interpret it correctly.

Is this without patching anything, meaning that AP_MAX_SENDFILE has no
effect, or did you fix things?

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 Fortunately... no one's in control.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Feb 17, 2017 2:52 PM, "William A Rowe Jr" wrote:

> On Feb 17, 2017 1:02 PM, "Jacob Champion" wrote:
>> `EnableMMAP on` appears to boost performance for static files, yes,
>> but is that because of mmap() itself, or because our bucket brigades
>> configure themselves more optimally in the mmap() code path? Yann's
>> research is starting to point towards the latter IMO.
>
> This may be as simple as the page manager caching and reusing the
> un-cleared page mapping on subsequent hits. You would need to
> overwhelmingly vary the page content served to test this theory.
>
> But the same caching wins for libld[l] ... which doesn't segv during
> OS updates. Probably due to copy-on-write mechanics.

(With traditional read and sendfile, you still have copy-once-on-read -
even if that file is sitting in FS cache.)
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Feb 17, 2017 1:02 PM, "Jacob Champion" wrote:

> `EnableMMAP on` appears to boost performance for static files, yes,
> but is that because of mmap() itself, or because our bucket brigades
> configure themselves more optimally in the mmap() code path? Yann's
> research is starting to point towards the latter IMO.

This may be as simple as the page manager caching and reusing the
un-cleared page mapping on subsequent hits. You would need to
overwhelmingly vary the page content served to test this theory.

But the same caching wins for libld[l] ... which doesn't segv during OS
updates. Probably due to copy-on-write mechanics.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/17/2017 07:04 AM, Daniel Lescohier wrote:
> Is the high-level issue that: for serving static content over HTTP,
> you can use sendfile() from the OS filesystem cache, avoiding extra
> userspace copying; but if it's SSL, or any other dynamic filtering of
> content, you have to do extra work in userspace?

Yes -- there are a bunch of potential high-level issues, but the one
you've highlighted is the reason that I wouldn't expect our HTTPS
implementation to ever get as fast as HTTP for static responses. At
least not given the current architecture.

(There are potential kernel-level encryption APIs that are popping up; I
keep hoping someone will start playing around with AF_ALG sockets. But
those aren't magic, either; we still have to layer TLS around the
encrypted data.)

That said, that's not what I'm trying to focus on with this thread. I
have a feeling our performance is being artificially limited.
`EnableMMAP on` appears to boost performance for static files, yes, but
is that because of mmap() itself, or because our bucket brigades
configure themselves more optimally in the mmap() code path? Yann's
research is starting to point towards the latter IMO.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Is the high-level issue that: for serving static content over HTTP, you can use sendfile() from the OS filesystem cache, avoiding extra userspace copying; but if it's SSL, or any other dynamic filtering of content, you have to do extra work in userspace? On Thu, Feb 16, 2017 at 6:01 PM, Yann Ylavic wrote: > On Thu, Feb 16, 2017 at 10:51 PM, Jacob Champion > wrote: > > On 02/16/2017 02:49 AM, Yann Ylavic wrote: > >> > >> +#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) /* > > APR_BUCKET_BUFF_SIZE > >> */ > > > > > > So, I had already hacked my O_DIRECT bucket case to just be a copy of > APR's > > file bucket, minus the mmap() logic. I tried making this change on top of > > it... > > > > ...and holy crap, for regular HTTP it's *faster* than our current mmap() > > implementation. HTTPS is still slower than with mmap, but faster than it > was > > without the change. (And the HTTPS performance has been really variable.) > > > > Can you confirm that you see a major performance improvement with the > with > > the new 64K file buffer? > > I can't test speed for now (stick with my laptop/localhost, which > won't be relevant enough I guess). > > > I'm pretty skeptical of my own results at this > > point... but if you see it too, I think we need to make *all* these > > hard-coded numbers tunable in the config. > > We could also improve the apr_bucket_alloc()ator to recycle more > order-n allocations possibilities (saving as much > {apr_allocator_,m}alloc() calls), along with configurable/higher > orders in httpd that'd be great I think. > > I can try this patch... >
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 10:51 PM, Jacob Champion wrote: > On 02/16/2017 02:49 AM, Yann Ylavic wrote: >> >> +#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) /* > APR_BUCKET_BUFF_SIZE >> */ > > > So, I had already hacked my O_DIRECT bucket case to just be a copy of APR's > file bucket, minus the mmap() logic. I tried making this change on top of > it... > > ...and holy crap, for regular HTTP it's *faster* than our current mmap() > implementation. HTTPS is still slower than with mmap, but faster than it was > without the change. (And the HTTPS performance has been really variable.) > > Can you confirm that you see a major performance improvement with the with > the new 64K file buffer? I can't test speed for now (stick with my laptop/localhost, which won't be relevant enough I guess). > I'm pretty skeptical of my own results at this > point... but if you see it too, I think we need to make *all* these > hard-coded numbers tunable in the config. We could also improve the apr_bucket_alloc()ator to recycle more order-n allocations possibilities (saving as much {apr_allocator_,m}alloc() calls), along with configurable/higher orders in httpd that'd be great I think. I can try this patch...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/16/2017 02:49 AM, Yann Ylavic wrote:
> +#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) /* > APR_BUCKET_BUFF_SIZE */

So, I had already hacked my O_DIRECT bucket case to just be a copy of
APR's file bucket, minus the mmap() logic. I tried making this change on
top of it...

...and holy crap, for regular HTTP it's *faster* than our current mmap()
implementation. HTTPS is still slower than with mmap, but faster than it
was without the change. (And the HTTPS performance has been really
variable.)

Can you confirm that you see a major performance improvement with the
new 64K file buffer? I'm pretty skeptical of my own results at this
point... but if you see it too, I think we need to make *all* these
hard-coded numbers tunable in the config.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/16/2017 02:48 AM, Niklas Edmundsson wrote:
> While I applaud the efforts to get https to behave performance-wise I
> would hate for http to be left out of being able to do top-notch on
> latest&greatest networking :-)

My intent in focusing there was to discover why disabling mmap() seemed
to be hitting our HTTPS implementation so badly compared to HTTP, in the
hopes that we could move the crashy mmap() optimization to a non-default
setting. (Reducing HTTPS performance by 30+% out of the box seems like a
non-starter to me.) But agreed that HTTP should not be left out.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/16/2017 03:41 AM, Yann Ylavic wrote:
> I can't reproduce it anymore, somehow I failed with my restarts
> between EnableMMap on=>off. Sorry for the noise...

This is suspiciously similar to what I've been fighting the last three
days. It's still entirely possible that you and I both messed up the
restarts independently... but if anyone else thinks to themselves "huh,
that's funny, didn't I restart?" please speak up. :)

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 11:41 AM, Plüm, Rüdiger, Vodafone Group wrote:
>
>> -----Original Message-----
>> From: Yann Ylavic [mailto:ylavic@gmail.com]
>> Sent: Thursday, 16 February 2017 11:35
>> To: httpd-dev
>> Subject: Re: httpd 2.4.25, mpm_event, ssl: segfaults
>>
>> On Thu, Feb 16, 2017 at 11:20 AM, Plüm, Rüdiger, Vodafone Group
>> wrote:
>> >
>> >> Please note that "EnableMMap on" avoids EnableSendfile (i.e.
>> >> "EnableMMap on" => "EnableSendfile off")
>> >
>> > Just for clarification: If you placed EnableMMap on in your test
>> > configuration you also put EnableSendfile off in your configuration,
>> > correct? Just putting in EnableMMap on does not automatically cause
>> > EnableSendfile to be set to off. I know that at least on trunk
>> > EnableSendfile on is no longer the default.
>>
>> If I "EnableMMap on", core_output_filter() will never use sendfile()
>> (whatever EnableSendfile is).
>>
>> I can try to figure out why (that's a really-all build/conf, so maybe
>> some module's filter is apr_bucket_read()ing in the chain unless
>> EnableMMap, unlikely though)...
>
> And this is what I don't understand and cannot read immediately from
> the code.

Weird. I can't reproduce it anymore, somehow I failed with my restarts
between EnableMMap on=>off. Sorry for the noise...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 11:48 AM, Niklas Edmundsson wrote: > On Thu, 16 Feb 2017, Yann Ylavic wrote: > >> Here are some SSL/core_write outputs (sizes) for me, with 2.4.x. >> This is with a GET for a 2MB file, on localhost... >> >> Please note that "EnableMMap on" avoids EnableSendfile (i.e. >> "EnableMMap on" => "EnableSendfile off"), which is relevant only in >> the http (non-ssl) case anyway. >> >> Outputs (and the patch to produce them) attached. >> >> TL;DR: >> - http + EnableMMap=> single write >> - http + !EnableMMap + EnableSendfile => single write >> - http + !EnableMMap + !EnableSendfile => 125KB writes >> - https + EnableMMap=> 16KB writes >> - https + !EnableMMap=> 8KB writes > > > If you try larger filesizes you should start seeing things being broken into > chunks even for mmap/sendfile. For example we have > #define AP_MAX_SENDFILE 16777216 /* 2^24 */ > which is unneccessarily low IMHO. Here I am, localhost still, 21GB file (client wget -qO- [url] &>/dev/null). Output attached. 
http + !EnableMMap + EnableSendfile (21GB): [Thu Feb 16 12:08:43.809832 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] writev_nonblocking(): 291/291 bytes [Thu Feb 16 12:08:43.812446 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 11219190/21478375424 bytes [Thu Feb 16 12:08:43.812843 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 2095456/21467156234 bytes [Thu Feb 16 12:08:43.812893 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 130966/21465060778 bytes [Thu Feb 16 12:08:43.812908 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/21464929812 bytes [Thu Feb 16 12:08:43.814166 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1964490/21464929812 bytes [Thu Feb 16 12:08:43.814205 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/21462965322 bytes [Thu Feb 16 12:08:43.815408 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1833524/21462965322 bytes [Thu Feb 16 12:08:43.815482 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 196449/21461131798 bytes [Thu Feb 16 12:08:43.815499 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/21460935349 bytes [Thu Feb 16 12:08:43.816843 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 2160939/21460935349 bytes [Thu Feb 16 12:08:43.816881 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/21458774410 bytes [Thu Feb 16 12:08:43.818319 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 
2160939/21458774410 bytes [...] [Thu Feb 16 12:08:44.538137 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1768041/18805599699 bytes [Thu Feb 16 12:08:44.538145 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18803831658 bytes [Thu Feb 16 12:08:44.538601 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1768041/18803831658 bytes [Thu Feb 16 12:08:44.538609 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18802063617 bytes [Thu Feb 16 12:08:44.539060 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1833524/18802063617 bytes [Thu Feb 16 12:08:44.539069 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18800230093 bytes [Thu Feb 16 12:08:44.539593 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1964490/18800230093 bytes [Thu Feb 16 12:08:44.539601 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18798265603 bytes [Thu Feb 16 12:08:44.540136 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1964490/18798265603 bytes [Thu Feb 16 12:08:44.540156 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18796301113 bytes [Thu Feb 16 12:08:44.540632 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 1964490/18796301113 bytes [Thu Feb 16 12:08:44.540639 2017] [core:notice] [pid 26960:tid 139674034870016] [client 127.0.0.1:56722] sendfile_nonblocking(): 0/18794336623 bytes [Thu Feb 16 12:08:44.541157 2017] [core:notice
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 11:48 AM, Niklas Edmundsson wrote: > On Thu, 16 Feb 2017, Yann Ylavic wrote: > >> Here are some SSL/core_write outputs (sizes) for me, with 2.4.x. >> This is with a GET for a 2MB file, on localhost... >> >> Please note that "EnableMMap on" avoids EnableSendfile (i.e. >> "EnableMMap on" => "EnableSendfile off"), which is relevant only in >> the http (non-ssl) case anyway. >> >> Outputs (and the patch to produce them) attached. >> >> TL;DR: >> - http + EnableMMap=> single write >> - http + !EnableMMap + EnableSendfile => single write >> - http + !EnableMMap + !EnableSendfile => 125KB writes >> - https + EnableMMap=> 16KB writes >> - https + !EnableMMap=> 8KB writes > > > If you try larger filesizes you should start seeing things being broken into > chunks even for mmap/sendfile. For example we have > #define AP_MAX_SENDFILE 16777216 /* 2^24 */ > which is unneccessarily low IMHO. Good point, I can try to make it configurable. > > Think being able to do 100 Gbps single-stream effectively, that's what we > need to optimize for now in order to have it in distros some years from now. > And you will waste a lot of CPU doing small writes at those speeds... > > While I applaud the efforts to get https to behave performance-wise I would > hate for http to be left out of being able to do top-notch on > latest&greatest networking :-) > > Granted, latest&greatest CPU:s "only" seem to be able to do approx 5 GB/s > AES128 per core so it will be hard to reach 100 Gbps single-stream https, > but I really think we should set the goal high while at it... Agreed, let's look at the whole scope. Regards, Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 11:01 AM, Yann Ylavic wrote:
> On Thu, Feb 16, 2017 at 10:49 AM, Yann Ylavic wrote:
>>
>> - http + !EnableMMap + !EnableSendfile => 125KB writes
>
> This is due to MAX_IOVEC_TO_WRITE being 16 in
> send_brigade_nonblocking(), 125KB is 16 * 8000B.
> So playing with MAX_IOVEC_TO_WRITE might also be worth a try for
> the !EnableMMap+!EnableSendfile case...

To minimize copying, this patch on APR(-util) could possibly help yet better:

Index: srclib/apr-util/buckets/apr_buckets_file.c
===================================================================
--- srclib/apr-util/buckets/apr_buckets_file.c    (revision 1732829)
+++ srclib/apr-util/buckets/apr_buckets_file.c    (working copy)
@@ -72,6 +72,8 @@ static int file_make_mmap(apr_bucket *e, apr_size_
 }
 #endif
 
+#define FILE_BUCKET_BUFF_SIZE (64 * 1024 - 64) /* > APR_BUCKET_BUFF_SIZE */
+
 static apr_status_t file_bucket_read(apr_bucket *e, const char **str,
                                      apr_size_t *len, apr_read_type_e block)
 {
@@ -108,8 +110,8 @@ static apr_status_t file_bucket_read(apr_bucket *e
     }
 #endif
 
-    *len = (filelength > APR_BUCKET_BUFF_SIZE)
-           ? APR_BUCKET_BUFF_SIZE
+    *len = (filelength > FILE_BUCKET_BUFF_SIZE)
+           ? FILE_BUCKET_BUFF_SIZE
            : filelength;
     *str = NULL;  /* in case we die prematurely */
     buf = apr_bucket_alloc(*len, e->list);
_

If that's the case, we could have an apr_bucket_file_bufsize_set() to
make this configurable, or even better, one more apr_bucket_alloc
freelist for such larger/fixed buffers...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 16 Feb 2017, Yann Ylavic wrote:

> Here are some SSL/core_write outputs (sizes) for me, with 2.4.x.
> This is with a GET for a 2MB file, on localhost...
>
> Please note that "EnableMMap on" avoids EnableSendfile (i.e.
> "EnableMMap on" => "EnableSendfile off"), which is relevant only in
> the http (non-ssl) case anyway.
>
> Outputs (and the patch to produce them) attached.
>
> TL;DR:
> - http + EnableMMap => single write
> - http + !EnableMMap + EnableSendfile => single write
> - http + !EnableMMap + !EnableSendfile => 125KB writes
> - https + EnableMMap => 16KB writes
> - https + !EnableMMap => 8KB writes

If you try larger file sizes you should start seeing things being broken
into chunks even for mmap/sendfile. For example we have
#define AP_MAX_SENDFILE 16777216 /* 2^24 */
which is unnecessarily low IMHO.

Think being able to do 100 Gbps single-stream effectively, that's what
we need to optimize for now in order to have it in distros some years
from now. And you will waste a lot of CPU doing small writes at those
speeds...

While I applaud the efforts to get https to behave performance-wise I
would hate for http to be left out of being able to do top-notch on
latest&greatest networking :-)

Granted, latest&greatest CPUs "only" seem to be able to do approx 5 GB/s
AES128 per core so it will be hard to reach 100 Gbps single-stream
https, but I really think we should set the goal high while at it...

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | ni...@acc.umu.se
---
 Memory allocation error: Reboot System!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 11:20 AM, Plüm, Rüdiger, Vodafone Group wrote: >> >> Please note that "EnableMMap on" avoids EnableSendfile (i.e. >> "EnableMMap on" => "EnableSendfile off") > > Just for clarification: If you placed EnableMMap on in your test > configuration you also put EnableSendfile off in your configuration, > correct? Just putting in EnableMMap on does not automatically cause > EnableSendfile set to off. I know that at least on trunk > EnableSendfile on is no longer the default. If I "EnableMMap on", core_output_filter() will never use sendfile() (whatever EnableSendfile is). I can try to figure out why (that's a really-all build/conf, so maybe some module's filter is apr_bucket_read()ing in the chain unless EnableMMap, unlikely though)... Regards, Yann
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 10:49 AM, Yann Ylavic wrote: > > - http + !EnableMMap + !EnableSendfile => 125KB writes This is due to MAX_IOVEC_TO_WRITE being 16 in send_brigade_nonblocking(), 125KB is 16 * 8000B. So playing with MAX_IOVEC_TO_WRITE might also be worth a try for !EnableMMap+!EnableSendfile case...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Here are some SSL/core_write outputs (sizes) for me, with 2.4.x. This is with a GET for a 2MB file, on localhost... Please note that "EnableMMap on" avoids EnableSendfile (i.e. "EnableMMap on" => "EnableSendfile off"), which is relevant only in the http (non-ssl) case anyway. Outputs (and the patch to produce them) attached. TL;DR: - http + EnableMMap=> single write - http + !EnableMMap + EnableSendfile => single write - http + !EnableMMap + !EnableSendfile => 125KB writes - https + EnableMMap=> 16KB writes - https + !EnableMMap=> 8KB writes http + EnableMMap: [Thu Feb 16 09:42:33.725297 2017] [core:notice] [pid 25096:tid 139655069091584] [client 127.0.0.1:56466] writev_nonblocking(): 2053620/2053620 bytes http + !EnableMMap + EnableSendfile: [Thu Feb 16 10:09:35.504327 2017] [core:notice] [pid 25598:tid 139674043262720] [client 127.0.0.1:56494] writev_nonblocking(): 284/284 bytes [Thu Feb 16 10:09:35.504733 2017] [core:notice] [pid 25598:tid 139674043262720] [client 127.0.0.1:56494] sendfile_nonblocking(): 2053336/2053336 bytes http + !EnableMMap + !EnableSendfile: [Thu Feb 16 09:40:41.377781 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 120284/120284 bytes [Thu Feb 16 09:40:41.377895 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.377973 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378069 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378145 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378215 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378291 2017] 
[core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378366 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378446 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378525 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378592 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378642 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378708 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378778 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.378931 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.379006 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 128000/128000 bytes [Thu Feb 16 09:40:41.379040 2017] [core:notice] [pid 25057:tid 139655060698880] [client 127.0.0.1:56464] writev_nonblocking(): 13336/13336 bytes https + EnableMMap: [Thu Feb 16 09:38:33.699387 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] bio_filter_out_write(): 2088 bytes [Thu Feb 16 09:38:33.699429 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] bio_filter_out_write(): flush [Thu Feb 16 09:38:33.699454 2017] [core:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] 
writev_nonblocking(): 2088/2088 bytes [Thu Feb 16 09:38:33.700821 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] bio_filter_out_write(): 274 bytes [Thu Feb 16 09:38:33.700835 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] bio_filter_out_write(): flush [Thu Feb 16 09:38:33.700853 2017] [core:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] writev_nonblocking(): 274/274 bytes [Thu Feb 16 09:38:33.701481 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] ssl_filter_write(): 284 bytes [Thu Feb 16 09:38:33.701497 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] bio_filter_out_write(): 313 bytes [Thu Feb 16 09:38:33.701502 2017] [ssl:notice] [pid 24930:tid 139655069091584] [client 127.0.0.1:58074] ssl_filter_write(): 2053336 bytes [Thu Feb 16 09:38:33.701524 2017] [ssl:notice] [pid 24
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Not at my computer, but the mod_http2 output has special handling for
file buckets, because apr_bucket_read returns a max of 8k and splits
itself. It instead grabs the file and reads the size it needs, if memory
serves me well. I assume when it's mmapped it does not make much of a
difference.

> On 16.02.2017 at 00:40, Yann Ylavic wrote:
>
>> On Thu, Feb 16, 2017 at 12:31 AM, Yann Ylavic wrote:
>>
>> Actually this is 16K (the maximum size of a TLS record)
>
> ... these are the outputs (records) split/produced by SSL_write()
> when given inputs (plain text) greater than 16K (at once).
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 12:31 AM, Yann Ylavic wrote:
>
> Actually this is 16K (the maximum size of a TLS record)

... these are the outputs (records) split/produced by SSL_write()
when given inputs (plain text) greater than 16K (at once).
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 16, 2017 at 12:06 AM, Jacob Champion wrote:
> On 02/15/2017 02:03 PM, Yann Ylavic wrote:
>
>> Assuming so :) there is also the fact that mod_ssl will encrypt/pass
>> 8K buckets at a time, while the core output filter tries to send the
>> whole mmap()ed file, keeping what remains after EAGAIN for the next
>> call (if any).
>
> Oh, right... we split on APR_MMAP_LIMIT in the mmap() case. Which is
> a nice big 4MiB instead of 8KiB. Could it really be as easy as tuning
> the default file bucket size up?

Actually this is 16K (the maximum size of a TLS record), so 16K buckets
are passed to the core output filter (which itself buffers buckets
smaller than THRESHOLD_MIN_WRITE=4K, up to THRESHOLD_MAX_BUFFER=64K,
before sending). Without SSL, sending is still done up to the size of
the mmap()ed file, which may make a difference.

Maybe you could try to play with THRESHOLD_MIN_WRITE/THRESHOLD_MAX_BUFFER
in server/core_filters.c for the SSL case (e.g. MIN=4M and MAX=16M), but
that'd still cost some transient-to-heap bucket copies which don't
happen in the non-SSL case...
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/15/2017 02:03 PM, Yann Ylavic wrote: On Wed, Feb 15, 2017 at 9:50 PM, Jacob Champion wrote: For the next step, I want to find out why TLS connections see such a big performance hit when I switch off mmap(), but unencrypted connections don't... it's such a huge difference that I feel like I must be missing something obvious. First, you did "EnableSendfile off", right? Yep :) But thanks for the reminder anyway; I would have kicked myself... Also, I have to retract an earlier claim I made: I am now seeing a difference between the performance of mmap'd and non-mmap'd responses for regular HTTP connections. I don't know if I actually changed something, or if the original lack of difference was tester error on my part. Assuming so :) there is also the fact that mod_ssl will encrypt/pass 8K buckets at a time, while the core output filter tries to send the whole mmap()ed file, keeping what remains after EAGAIN for the next call (if any). Oh, right... we split on APR_MMAP_LIMIT in the mmap() case. Which is a nice big 4MiB instead of 8KiB. Could it really be as easy as tuning the default file bucket size up? --Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, Feb 15, 2017 at 9:50 PM, Jacob Champion wrote: > > For the next step, I want to find out why TLS connections see such a big > performance hit when I switch off mmap(), but unencrypted connections > don't... it's such a huge difference that I feel like I must be missing > something obvious. First, you did "EnableSendfile off", right? Assuming so :) there is also the fact that mod_ssl will encrypt/pass 8K buckets at a time, while the core output filter tries to send the whole mmap()ed file, keeping what remains after EAGAIN for the next call (if any). That's I think a big difference too, especially on localhost or a fast/large bandwidth network. Regards, Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/07/2017 02:32 AM, Niklas Edmundsson wrote:
> O_DIRECT also bypasses any read-ahead logic, so you'll have to do
> nice and big IO etc to get good performance.

Yep, confirmed... my naive approach to O_DIRECT, which reads from the
file in the 8K chunks we're used to from the file bucket brigade,
absolutely mutilates our performance (80% slowdown) *and* rails the disk
during the load test. Not good.

(I was hoping that combining the O_DIRECT approach with in-memory
caching would give us the best of both worlds. Nope. A plain read() with
no explicit caching at all is much, much faster on my machine.)

> We've played around with O_DIRECT to optimize the caching process in
> our large-file caching module (our backing store is nfs). However,
> since all our hosts are running Linux we had much better results with
> doing plain reads and utilizing posix_fadvise with
> POSIX_FADV_WILLNEED to trigger read-ahead and POSIX_FADV_DONTNEED to
> drop the original file from cache when read (as future requests will
> be served from local disk cache). We're doing 8MB fadvise chunks to
> get full streaming performance when caching large files.

Hmm, I will keep the file advisory API in the back of my head, thanks
for that.

For the next step, I want to find out why TLS connections see such a big
performance hit when I switch off mmap(), but unencrypted connections
don't... it's such a huge difference that I feel like I must be missing
something obvious.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Here is how cache page replacement is done in Linux: https://linux-mm.org/PageReplacementDesign

On Tue, Feb 7, 2017 at 5:32 AM, Niklas Edmundsson wrote:
> On Mon, 6 Feb 2017, Jacob Champion wrote:
>> Considering the massive amount of caching that's built into the entire HTTP ecosystem already, O_DIRECT *might* be an effective way to do that (in which we give up filesystem optimizations and caching in return for a DMA into userspace). I have a PoC about halfway done, but I need to split my time this week between this and the FCGI stuff I've been neglecting.
>
> As O_DIRECT bypasses the cache, all your IO will hit your storage device. I think this makes it useful only in exotic use cases.
>
> For our workload that's using mod_cache with a local disk cache we still want to use the remaining RAM as a disk read cache; it doesn't make sense to force disk I/O for files that simply could have been served from RAM. We typically see 90-95% of the requests served by mod_cache actually being served from the disk cache in RAM rather than hitting the local disk cache devices. I'm suspecting the picture is similar for most occurrences of static file serving, regardless of using mod_cache etc or not.
>
> O_DIRECT also bypasses any read-ahead logic, so you'll have to do nice and big IO etc to get good performance.
>
> We've played around with O_DIRECT to optimize the caching process in our large-file caching module (our backing store is nfs). However, since all our hosts are running Linux we had much better results with doing plain reads and utilizing posix_fadvise with POSIX_FADV_WILLNEED to trigger read-ahead and POSIX_FADV_DONTNEED to drop the original file from cache when read (as future requests will be served from local disk cache). We're doing 8MB fadvise chunks to get full streaming performance when caching large files.
>
> But that's way out of scope for this discussion, I think ;-)
>
> In conclusion, I wouldn't expect any benefits of using O_DIRECT in the common httpd use cases. That said, I would gladly be proven wrong :)
>
> /Nikke
> --
> Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
> ---
> "The point is I am now a perfectly safe penguin!" -- Ford Prefect
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Mon, 6 Feb 2017, Jacob Champion wrote:
> Considering the massive amount of caching that's built into the entire HTTP ecosystem already, O_DIRECT *might* be an effective way to do that (in which we give up filesystem optimizations and caching in return for a DMA into userspace). I have a PoC about halfway done, but I need to split my time this week between this and the FCGI stuff I've been neglecting.

As O_DIRECT bypasses the cache, all your IO will hit your storage device. I think this makes it useful only in exotic use cases.

For our workload that's using mod_cache with a local disk cache we still want to use the remaining RAM as a disk read cache; it doesn't make sense to force disk I/O for files that simply could have been served from RAM. We typically see 90-95% of the requests served by mod_cache actually being served from the disk cache in RAM rather than hitting the local disk cache devices. I'm suspecting the picture is similar for most occurrences of static file serving, regardless of using mod_cache etc or not.

O_DIRECT also bypasses any read-ahead logic, so you'll have to do nice and big IO etc to get good performance.

We've played around with O_DIRECT to optimize the caching process in our large-file caching module (our backing store is nfs). However, since all our hosts are running Linux we had much better results with doing plain reads and utilizing posix_fadvise with POSIX_FADV_WILLNEED to trigger read-ahead and POSIX_FADV_DONTNEED to drop the original file from cache when read (as future requests will be served from local disk cache). We're doing 8MB fadvise chunks to get full streaming performance when caching large files.

But that's way out of scope for this discussion, I think ;-)

In conclusion, I wouldn't expect any benefits of using O_DIRECT in the common httpd use cases. That said, I would gladly be proven wrong :)

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
"The point is I am now a perfectly safe penguin!" -- Ford Prefect
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/03/2017 12:30 AM, Niklas Edmundsson wrote:
> Methinks this makes mmap+ssl a VERY bad combination if the thing SIGBUS:es due to a simple IO error. I'll proceed with disabling mmap and see if that is a viable way to go for our workload...

(Pulling from a parallel conversation, with permission.) The question has been raised: is our mmap() optimization really giving us the utility we want for the additional instability we pay? Stefan had this to say:

On 02/03/2017 08:32 AM, Stefan Eissing wrote:
> Experimented on my Ubuntu 14.04 image on Parallels on MacOS 10.12, MacBook Pro mid 2012. Loading a 10 MB file 1000 times over 8 connections:
>
>   h2load -c 8 -t 8 -n 1000 -m1 http://xxx/10mb.file
>
> using HTTP/1.1 and HTTP/2 (limit of 1 stream at a time per connection). Plain and with TLS1.2, transfer speeds in GByte/sec from localhost:
>
>   MMAP   H1Plain  H1SSL  H2Plain  H2SSL
>   on     4.3      1.5    3.8      1.3
>   off    3.5      1.1    3.8      1.3
>
> HTTP/2 seems rather unaffected, while HTTP/1.1 experiences significant differences.

and I replied:

On 03.02.2017 at 21:47, Jacob Champion wrote:
> Weird. I can't see any difference for plain HTTP/1.1 when just toggling EnableMMAP, even with EnableSendfile off. I *do* see a significant difference for TLS+HTTP/1.1. That doesn't really make sense to me; is there some other optimization kicking in? sendfile blows the mmap optimization out of the water, but naturally it can't kick in for TLS.
>
> I would be interested to see if an O_DIRECT-aware file bucket could speed up the TLS side of things without exposing people to mmap instability.

I was also interested to see if there was some mmap() flag we were missing that could fix the problem for us. Turns out a few systems (used to?) have one called MAP_COPY. Linus had a few choice words about it:

http://yarchive.net/comp/linux/map_copy.html

Linus-insult-rant aside, his point applies here too, I think. We're using mmap() as an optimized read(). We should be focusing on how to use read() in an optimized way. And surely read() for modern systems has come a long way since that thread in 2001?

Considering the massive amount of caching that's built into the entire HTTP ecosystem already, O_DIRECT *might* be an effective way to do that (in which we give up filesystem optimizations and caching in return for a DMA into userspace). I have a PoC about halfway done, but I need to split my time this week between this and the FCGI stuff I've been neglecting.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Mon, Feb 6, 2017 at 12:10 PM, Ruediger Pluem wrote:
>> What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.
>>
>> Could you please try the attached patch?
>
> Why would we need to handle filter_ctx->pssl == NULL the same way we handle META_BUCKETS? filter_ctx->pssl == NULL already causes ssl_filter_write to fail. Do I miss any code before that could crash in the data case with filter_ctx->pssl == NULL?

No, this hunk was not intended to be proposed/tested (the case should not happen though, so it's harmless imo), and anyway it was not committed in r1781582 ([1]).

However, I opened the thread "ssl_io_filter_output vs EOC" ([2]); maybe we could discuss this there? It seems that we can either error/fail or fall through the filter stack after the EOC, depending on whether subsequent buckets are in the same brigade or not. We probably should clarify (and be consistent on) what to do after the EOC when TLS is in place (i.e. send whatever follows, besides metadata, in the clear, or bail out?).

Regards,
Yann.

[1] http://svn.apache.org/r1781582
[2] https://lists.apache.org/thread.html/714ca91c918e7520b75fae664b2bdee28d7b2a9f9ef78e51d8765c96@%3Cdev.httpd.apache.org%3E
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 11:04 AM, Yann Ylavic wrote:
> Hi Niklas,
>
> On Wed, Feb 1, 2017 at 7:02 PM, Niklas Edmundsson wrote:
>> We've started to see spurious segfaults with httpd 2.4.25, mpm_event, ssl on Ubuntu 14.04LTS. Not frequent, but none the less happening.
>>
>> #4 ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168) at ssl_engine_io.c:1746
>>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>>         len = 4194304
>>         bucket = 0x7f4f840b1ba8
>>         status =
>>         filter_ctx = 0x7f507013cf88
>>         inctx =
>>         outctx = 0x7f507013d008
>>         rblock = APR_NONBLOCK_READ
>
> I suspect some cleanup ordering issue happening in ssl_io_filter_output(), when the EOC bucket is found.
>
>> Are we hitting a corner case of process cleanup that plays merry hell with https/ssl, or are we just having bad luck? Ideas? Suggestions?
>
> 2.4.25 is eager to terminate/shutdown keepalive connections more quickly (than previous versions) on graceful shutdown (e.g. MaxConnectionsPerChild reached).
>
> What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.
>
> Could you please try the attached patch?

Why would we need to handle filter_ctx->pssl == NULL the same way we handle META_BUCKETS? filter_ctx->pssl == NULL already causes ssl_filter_write to fail. Do I miss any code before that could crash in the data case with filter_ctx->pssl == NULL?

Regards
Rüdiger
Re: httpd 2.4.25, mpm_event, ssl: segfaults
>> Hmm, Linux raises SIGBUS if an mmap is used after the underlying file has been truncated (see [1]).
>
> See also https://bz.apache.org/bugzilla/show_bug.cgi?id=46688 .
>
> Niklas, just to clarify: you're not willfully truncating large files as they're being served, right? I *can* reproduce a SIGBUS if I start truncating files during my stress testing. But EnableMMAP's documentation calls that case out explicitly.

We're using our own large-file tuned cache module (I know, we should really publish it properly), and clean on oldest atime with a script that does rm. There's nothing cache-related that I know of that ever truncates files.

Bah. This particular server seems to have sporadic IO errors on the cache filesystem. I'm guessing that this could show up as a truncated file in the mmap sense of things, with SIGBUS and all. I should have caught on when I only saw the issue on one host :-/

Methinks this makes mmap+ssl a VERY bad combination if the thing SIGBUS:es due to a simple IO error. I'll proceed with disabling mmap and see if that is a viable way to go for our workload...

We do sendfile for vanilla http; I'm guessing that's why we've never hit this before. So, sorry for the noise. You learn something new every day...

Just to rule it out, I changed our httpd init script to leave the stacksize ulimit at Linux default (8MB) without changing anything else. I'll probably leave this in though, as I'm guessing this use case is what gets the bulk of testing...

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
This tagline SHAREWARE. Send $5.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Jacob Champion wrote:
>> We've started to see spurious segfaults with httpd 2.4.25, mpm_event, ssl on Ubuntu 14.04LTS. Not frequent, but none the less happening.
>>
>> #4 ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168) at ssl_engine_io.c:1746
>>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>>         len = 4194304
>>         bucket = 0x7f4f840b1ba8
>>         status =
>>         filter_ctx = 0x7f507013cf88
>>         inctx =
>>         outctx = 0x7f507013d008
>>         rblock = APR_NONBLOCK_READ
>
> Idle thoughts: "Cannot access memory" in this case could be a red herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable that the data in question could be mmap'd, given the nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).

Oooh, hadn't even thought of that...

> That doesn't mean we're looking in the wrong place, though, since SIGBUS can also be generated by an out-of-bounds access to an mmap'd region.
>
> Niklas, what version of APR are you using? Are you serving large (> 4 MiB) static files? I have not been able to reproduce so far (Ubuntu 16.04, httpd 2.4.25 + mod_ssl + mpm_event).

Yes, this is a file archive offload target ONLY serving large files.

APR from httpd-2.4.25-deps.tar.bz2 and:

./configure --prefix=/lap/apache/2.4.25.sslcleanuppatch --sysconfdir=/var/conf/apache2 --with-included-apr --enable-nonportable-atomics=yes --enable-layout=GNU --enable-mpms-shared=all --with-gdbm --without-berkeley-db --enable-mods-shared=all --enable-cache=shared --enable-cache-disk=shared --enable-ssl=shared --enable-cgi=shared --enable-suexec --with-suexec-caller=www-srv --with-suexec-uidmin=1000 --with-suexec-gidmin=1000 CFLAGS=-O2

(yes, the cgi/suexec stuff isn't really used but we haven't gotten around to changing our builds). It's indeed using that APR according to /proc/pid/maps: libapr-1.so.0.5.2 in the httpd install tree.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
This tagline SHAREWARE. Send $5.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Jacob Champion wrote:
> On 02/02/2017 03:05 PM, Yann Ylavic wrote:
>> Hmm, Linux raises SIGBUS if an mmap is used after the underlying file has been truncated (see [1]).
>
> See also https://bz.apache.org/bugzilla/show_bug.cgi?id=46688 .
>
> Niklas, just to clarify: you're not willfully truncating large files as they're being served, right? I *can* reproduce a SIGBUS if I start truncating files during my stress testing. But EnableMMAP's documentation calls that case out explicitly.

We're using our own large-file tuned cache module (I know, we should really publish it properly), and clean on oldest atime with a script that does rm. There's nothing cache-related that I know of that ever truncates files.

Just to rule it out, I changed our httpd init script to leave the stacksize ulimit at Linux default (8MB) without changing anything else.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
This tagline SHAREWARE. Send $5.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 03:05 PM, Yann Ylavic wrote:
> Couldn't htcacheclean or alike do something like this? "EnableMMAP off" could definitely help here.

(Didn't mean to ignore this part of your email, but I don't have much experience with htcacheclean yet so I can't really comment...)

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 03:05 PM, Yann Ylavic wrote:
> Hmm, Linux raises SIGBUS if an mmap is used after the underlying file has been truncated (see [1]).

See also https://bz.apache.org/bugzilla/show_bug.cgi?id=46688 .

Niklas, just to clarify: you're not willfully truncating large files as they're being served, right? I *can* reproduce a SIGBUS if I start truncating files during my stress testing. But EnableMMAP's documentation calls that case out explicitly.

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 2, 2017 at 11:36 PM, Jacob Champion wrote: > On 02/02/2017 02:32 PM, Yann Ylavic wrote: >> >> On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion >> wrote: >>> >>> Idle thoughts: "Cannot access memory" in this case could be a red >>> herring, >>> if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems >>> reasonable >>> that the data in question could be mmap'd, given the nice round address >>> and >>> 4 MiB length (equal to APR_MMAP_LIMIT). >>> >>> That doesn't mean we're looking in the wrong place, though, since SIGBUS >>> can >>> also be generated by an out-of-bounds access to an mmap'd region. >> >> >> Right, looks like the memory has been unmapped though (SIGBUS) before >> being (re)used. > > Oh, I thought an access after an unmap would SIGSEGV instead of SIGBUS. I > haven't ever tested that out; I should try it... Hmm, Linux raises SIGBUS if an mmap is used after the underlying file has been truncated (see [1]). Couldn't htcacheclean or alike do something like this? "EnableMMAP off" could definitely help here. [1] http://man7.org/linux/man-pages/man2/mmap.2.html
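For reference, the "EnableMMAP off" workaround mentioned above is a one-line configuration change. The directory path below is purely illustrative; the directive is also valid at server, virtual-host, and .htaccess scope:

```
# Fall back from mmap() to plain read() when delivering files from this
# tree, avoiding SIGBUS if a mapped file is truncated while being served.
<Directory "/var/cache/httpd-disk-cache">
    EnableMMAP Off
</Directory>
```

The trade-off is the read()-path performance cost discussed elsewhere in this thread, particularly for TLS connections where sendfile cannot be used either.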
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 02:32 PM, Yann Ylavic wrote:
> On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion wrote:
>> Idle thoughts: "Cannot access memory" in this case could be a red herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable that the data in question could be mmap'd, given the nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).
>>
>> That doesn't mean we're looking in the wrong place, though, since SIGBUS can also be generated by an out-of-bounds access to an mmap'd region.
>
> Right, looks like the memory has been unmapped though (SIGBUS) before being (re)used.

Oh, I thought an access after an unmap would SIGSEGV instead of SIGBUS. I haven't ever tested that out; I should try it...

--Jacob
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, Feb 2, 2017 at 11:19 PM, Jacob Champion wrote: > > Idle thoughts: "Cannot access memory" in this case could be a red herring, > if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable > that the data in question could be mmap'd, given the nice round address and > 4 MiB length (equal to APR_MMAP_LIMIT). > > That doesn't mean we're looking in the wrong place, though, since SIGBUS can > also be generated by an out-of-bounds access to an mmap'd region. Right, looks like the memory has been unmapped though (SIGBUS) before being (re)used. Does "EnableMMAP off" help or produce another backtrace? > > Niklas, what version of APR are you using? Are you serving large (> 4 MiB) > static files? I have not been able to reproduce so far (Ubuntu 16.04, httpd > 2.4.25 + mod_ssl + mpm_event). The original file bucket comes from mod_cache, and indeed looks larger than 4MB. If it were (htcache)cleaned while being served, SIGBUS shouldn't happen still since we hold an fd (and reference) on it... Regards, Yann.
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On 02/02/2017 02:04 AM, Yann Ylavic wrote:
> Hi Niklas,
>
> On Wed, Feb 1, 2017 at 7:02 PM, Niklas Edmundsson wrote:
>> We've started to see spurious segfaults with httpd 2.4.25, mpm_event, ssl on Ubuntu 14.04LTS. Not frequent, but none the less happening.
>>
>> #4 ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168) at ssl_engine_io.c:1746
>>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>>         len = 4194304
>>         bucket = 0x7f4f840b1ba8
>>         status =
>>         filter_ctx = 0x7f507013cf88
>>         inctx =
>>         outctx = 0x7f507013d008
>>         rblock = APR_NONBLOCK_READ

Idle thoughts: "Cannot access memory" in this case could be a red herring, if Niklas' gdb can't peer into mmap'd memory spaces [1]. It seems reasonable that the data in question could be mmap'd, given the nice round address and 4 MiB length (equal to APR_MMAP_LIMIT).

That doesn't mean we're looking in the wrong place, though, since SIGBUS can also be generated by an out-of-bounds access to an mmap'd region.

Niklas, what version of APR are you using? Are you serving large (> 4 MiB) static files? I have not been able to reproduce so far (Ubuntu 16.04, httpd 2.4.25 + mod_ssl + mpm_event).

--Jacob

[1] https://stackoverflow.com/questions/654393/examining-mmaped-addresses-using-gdb
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Niklas Edmundsson wrote:
> On Thu, 2 Feb 2017, Yann Ylavic wrote:
>>> Are we hitting a corner case of process cleanup that plays merry hell with https/ssl, or are we just having bad luck? Ideas? Suggestions?
>>
>> 2.4.25 is eager to terminate/shutdown keepalive connections more quickly (than previous versions) on graceful shutdown (e.g. MaxConnectionsPerChild reached).
>>
>> What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.
>>
>> Could you please try the attached patch?
>
> Built and deployed, waiting for the most affected host to drain in order to restart it. I'll also lower MaxConnectionsPerChild a bit more in order to stress this.

Still seems to SIGBUS. Backtraces attached.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
New Borg Software Package : Locutus 1-2-3
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

[Attachments: bt.2201.bz2, bt.6429.bz2, bt.8071.bz2]
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Thu, 2 Feb 2017, Yann Ylavic wrote:
>> Are we hitting a corner case of process cleanup that plays merry hell with https/ssl, or are we just having bad luck? Ideas? Suggestions?
>
> 2.4.25 is eager to terminate/shutdown keepalive connections more quickly (than previous versions) on graceful shutdown (e.g. MaxConnectionsPerChild reached).
>
> What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.
>
> Could you please try the attached patch?

Built and deployed, waiting for the most affected host to drain in order to restart it. I'll also lower MaxConnectionsPerChild a bit more in order to stress this.

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
"At last I'm organized", he sighed, and died.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
Hi Niklas,

On Wed, Feb 1, 2017 at 7:02 PM, Niklas Edmundsson wrote:
> We've started to see spurious segfaults with httpd 2.4.25, mpm_event, ssl on Ubuntu 14.04LTS. Not frequent, but none the less happening.
>
> #4 ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168) at ssl_engine_io.c:1746
>         data = 0x7f5075518000 <error: Cannot access memory at address 0x7f5075518000>
>         len = 4194304
>         bucket = 0x7f4f840b1ba8
>         status =
>         filter_ctx = 0x7f507013cf88
>         inctx =
>         outctx = 0x7f507013d008
>         rblock = APR_NONBLOCK_READ

I suspect some cleanup ordering issue happening in ssl_io_filter_output(), when the EOC bucket is found.

> Are we hitting a corner case of process cleanup that plays merry hell with https/ssl, or are we just having bad luck? Ideas? Suggestions?

2.4.25 is eager to terminate/shutdown keepalive connections more quickly (than previous versions) on graceful shutdown (e.g. MaxConnectionsPerChild reached).

What might happen in ssl_io_filter_output() is that buffered output data (already deleted but not cleared) end up being reused on shutdown.

Could you please try the attached patch?

Regards,
Yann.

Index: modules/ssl/ssl_engine_io.c
===================================================================
--- modules/ssl/ssl_engine_io.c (revision 1781324)
+++ modules/ssl/ssl_engine_io.c (working copy)
@@ -138,6 +138,7 @@ static int bio_filter_out_pass(bio_filter_out_ctx_
     outctx->rc = ap_pass_brigade(outctx->filter_ctx->pOutputFilter->next,
                                  outctx->bb);
+    apr_brigade_cleanup(outctx->bb);
     /* Fail if the connection was reset: */
     if (outctx->rc == APR_SUCCESS && outctx->c->aborted) {
         outctx->rc = APR_ECONNRESET;
@@ -1699,13 +1700,12 @@ static apr_status_t ssl_io_filter_output(ap_filter
     while (!APR_BRIGADE_EMPTY(bb) && status == APR_SUCCESS) {
         apr_bucket *bucket = APR_BRIGADE_FIRST(bb);

-        if (APR_BUCKET_IS_METADATA(bucket)) {
+        if (APR_BUCKET_IS_METADATA(bucket) || !filter_ctx->pssl) {
             /* Pass through metadata buckets untouched.  EOC is
              * special; terminate the SSL layer first.
              */
             if (AP_BUCKET_IS_EOC(bucket)) {
                 ssl_filter_io_shutdown(filter_ctx, f->c, 0);
             }
-            AP_DEBUG_ASSERT(APR_BRIGADE_EMPTY(outctx->bb));

             /* Metadata buckets are passed one per brigade; it might
              * be more efficient (but also more complex) to use
@@ -1712,11 +1712,10 @@ static apr_status_t ssl_io_filter_output(ap_filter
              * outctx->bb as a true buffer and interleave these with
              * data buckets. */
             APR_BUCKET_REMOVE(bucket);
-            APR_BRIGADE_INSERT_HEAD(outctx->bb, bucket);
-            status = ap_pass_brigade(f->next, outctx->bb);
-            if (status == APR_SUCCESS && f->c->aborted)
-                status = APR_ECONNRESET;
-            apr_brigade_cleanup(outctx->bb);
+            APR_BRIGADE_INSERT_TAIL(outctx->bb, bucket);
+            if (bio_filter_out_pass(outctx) < 0) {
+                status = outctx->rc;
+            }
         }
         else {
             /* Filter a data bucket. */
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, 1 Feb 2017, Eric Covener wrote:
> On Wed, Feb 1, 2017 at 1:02 PM, Niklas Edmundsson wrote:
>> This might be due to processes being cleaned up due to hitting MaxSpareThreads or MaxConnectionsPerChild; these are tuned to not happen frequently. It's just a wild guess, but the reason for me suspecting this is the weird looking stacktraces that point towards use-after-free issues...
>
> The backtraces of the other threads in the process might give a hint if graceful proc shutdown is occurring -- e.g. one thread might have join_workers / apr_thread_join in the stack.

Nothing obvious to me (grep -i join finds nothing in the backtraces). I have 9 coredumps with the "thread apply all bt full" output summing to 1.3 MB, which feels a bit much to post here, although I guess they would be small if I bzip:ed and attached them...

Before we look too hard in that direction though, I remembered that our httpd init script sets the stacksize to 512k, down from the Linux default of 8MB (historical reasons). Might that be the easy explanation, ie threads overflowing the stack?

We only started serving https this autumn, and recently saw a bump in usage due to the LineageOS mirror. It is entirely possible that this is triggered by a usage pattern change exposing some of our arcane habits ;)

Another observation is that these dumps seem to happen in groups, ie:

[Tue Jan 24 19:32:52.277623 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 425377 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 24 19:32:55.281211 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 29545 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 24 19:32:56.282240 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 20749 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 24 19:32:58.285476 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 679374 exit signal Bus error (7), possible coredump in /tmp
[Wed Jan 25 00:14:29.743371 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 679441 exit signal Bus error (7), possible coredump in /tmp
[Wed Jan 25 00:14:38.753792 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 679547 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 31 15:29:04.767732 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 954024 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 31 15:29:09.773329 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 693010 exit signal Bus error (7), possible coredump in /tmp
[Tue Jan 31 15:29:18.782301 2017] [core:notice] [pid 2520:tid 139983763429184] AH00051: child pid 693899 exit signal Bus error (7), possible coredump in /tmp

Don't know what, if any, conclusions can be drawn from that though...

/Nikke - rambling a bit before falling asleep...
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---
It's a port of call, home away from home...
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: httpd 2.4.25, mpm_event, ssl: segfaults
On Wed, Feb 1, 2017 at 1:02 PM, Niklas Edmundsson wrote: > This might be due to processes being cleaned up due to hitting > MaxSpareThreads or MaxConnectionsPerChild, these are tuned to not happen > frequently. It's just a wild guess, but the reason for me suspecting this is > the weird looking stacktraces that points towards use-after-free issues... The backtraces of the other threads in the process might give a hint if graceful proc shutdown is occurring -- e.g. one thread might have join_workers / apr_thread_join in the stack. -- Eric Covener cove...@gmail.com