On 03/05/2018 07:24 PM, Willy Tarreau wrote:
> On Thu, May 03, 2018 at 02:51:12PM +0200, Pavlos Parissis wrote:
>> On 03/05/2018 02:45 PM, Olivier Houchard wrote:
>>> Hi Pavlos,
>>>
>>> On Thu, May 03, 2018 at 12:45:42PM +0200, Pavlos Parissis wrote:
>>>> Hi,
>>>>
>>>> Linux kernel version 4.14 adds support for zero-copy from user memory to
>>>> TCP sockets by setting MSG_ZEROCOPY flag. This is for the sending side of
>>>> the socket, for the receiving side of the socket we need to wait for
>>>> kernel version 4.18.
>>>>
>>>> Will you consider enabling this on HAProxy?
>>>>
>>>> More info can be found here,
>>>> https://www.kernel.org/doc/html/latest/networking/msg_zerocopy.html
>>>
>>> After some discussion with Willy, we're not sure it is worth it.
>>> It would force us to release buffer much later than we do actually, it can't
>>> be used with SSL, and we already achieve zero-copy by using splicing.
>>>
>>> Is there any specific case where you think it'd be a huge win ?
>>>
>>
>> The only use case that I can think of is HTTP streaming. But, without
>> testing it we can't say a lot.
>
> In fact, for HTTP streaming, splicing already does it all and even
> better since it only manipulates a few pointers in the kernel between
> the source and destination socket buffers. Userspace is not even
> involved.
>
> Also it's important to remember that while copies are best avoided
> whenever possible, they aren't that dramatic at the common traffic
> rates. I've already reached 60 Gbps of forwarded traffic with and
> without splicing on a 4-core machine.
>
> One aspect to keep in mind is the following. A typical Xeon system will
> achieve around 20 GB/s of in-L3 memcpy() bandwidth. For a typical 16kB
> buffer, that's only 760 ns to copy the whole buffer, which is roughly the
> cost of the extra syscall needed to check that the transfer completed.
> At 10 Gbps, this represents only 6.25% of the total processing time.
>
> And there's something much more important : with the copy operation,
> the buffer is released after these 760 ns and immediately recycled for
> other connections. This ensures that the memory usage remains low and
> that most transfer operations are made in L3 instead of RAM. If you
> use zero-copy here, instead your memory will be pinned for the time
> it takes to cycle on many other connections and get back to processing
> this FD. It can very easily become 10-100 microseconds, or 15-150 times
> more, resulting in much more RAM usage for temporary buffers, and thus
> a much higher cache footprint.
>
> In my opinion MSG_ZEROCOPY was designed for servers, those which stream
> video and so on, and which produce their own data, and which don't need
> to recycle their buffers. We're definitely not in this case at all here,
> we're just forwarding ephemeral data so we can recycle buffers very quickly
> and through splicing we can even avoid to see these data at all.
>
> Hoping this helps,
> Willy
>

Thanks for this very detailed response, once again I learned a lot.

Cheers,
Pavlos
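
The figures in Willy's message can be sanity-checked with a few lines of arithmetic. The sketch below is not from the thread; it assumes "16kB" means 16 * 1024 bytes, "20 GB/s" means 20e9 bytes per second and "10 Gbps" means 10e9 bits per second, and all function names are made up for illustration. Under those assumptions the copy takes roughly 0.8 microseconds, the same order as the quoted 760 ns, and the copy's share of the time a buffer spends on a 10 Gbps link comes out at 1/16, i.e. the quoted 6.25%, regardless of buffer size.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# Assumed units (not stated in the thread): 16kB = 16 * 1024 bytes,
# 20 GB/s = 20e9 bytes/s, 10 Gbps = 10e9 bits/s.

BUF_BYTES = 16 * 1024
MEMCPY_BYTES_PER_S = 20e9
LINK_BITS_PER_S = 10e9


def copy_time_ns(buf_bytes=BUF_BYTES, bandwidth=MEMCPY_BYTES_PER_S):
    """Time to memcpy() one buffer at the given bandwidth, in nanoseconds."""
    return buf_bytes / bandwidth * 1e9


def wire_time_ns(buf_bytes=BUF_BYTES, link=LINK_BITS_PER_S):
    """Time for one buffer's worth of data to cross the link, in nanoseconds."""
    return buf_bytes * 8 / link * 1e9


def copy_share(buf_bytes=BUF_BYTES, bandwidth=MEMCPY_BYTES_PER_S,
               link=LINK_BITS_PER_S):
    """Fraction of the per-buffer wire time that the copy itself costs."""
    return copy_time_ns(buf_bytes, bandwidth) / wire_time_ns(buf_bytes, link)


def hold_ratio(hold_us, copy_ns):
    """How many times longer a buffer stays busy if it is held for
    hold_us microseconds instead of being freed after copy_ns nanoseconds."""
    return hold_us * 1000.0 / copy_ns


if __name__ == "__main__":
    print(f"copy time : {copy_time_ns():.1f} ns")
    print(f"wire time : {wire_time_ns():.1f} ns")
    print(f"copy share: {copy_share():.2%}")
    # Using the 760 ns figure from the message as the baseline.
    print(f"hold ratio: {hold_ratio(10, 760):.0f}x to {hold_ratio(100, 760):.0f}x")
```

With the 760 ns baseline from the message, holding a buffer for 10 to 100 microseconds works out to roughly 13 to 130 times longer, which is in line with the "15-150 times" estimate given as a rough range.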

