Re: Considering adding support for TCP Zero Copy

Willy Tarreau Thu, 03 May 2018 10:25:04 -0700

On Thu, May 03, 2018 at 02:51:12PM +0200, Pavlos Parissis wrote:
> On 03/05/2018 02:45 uu, Olivier Houchard wrote:
> > Hi Pavlos,
> > 
> > On Thu, May 03, 2018 at 12:45:42PM +0200, Pavlos Parissis wrote:
> >> Hi,
> >>
> >> Linux kernel version 4.14 adds support for zero-copy from user memory to
> >> TCP sockets by setting MSG_ZEROCOPY flag. This is for the sending side of
> >> the socket, for the receiving side of the socket we need to wait for
> >> kernel version 4.18.
> >>
> >> Will you consider enabling this on HAProxy?
> >>
> >> More info can be found here, 
> >> https://www.kernel.org/doc/html/latest/networking/msg_zerocopy.html
> > 
> > After some discussion with Willy, we're not sure it is worth it.
> > It would force us to release buffer much later than we do actually, it can't
> > be used with SSL, and we already achieve zero-copy by using splicing.
> > 
> > Is there any specific case where you think it'd be a huge win ?
> > 
> 
> The only use case that I can think of is HTTP streaming. But, without testing 
> it we can't say a lot.


In fact, for HTTP streaming, splicing already does all of this, and does
it better, since it only manipulates a few pointers in the kernel between
the source and destination socket buffers. Userspace is not even
involved.

Also it's important to remember that while copies are best avoided
whenever possible, they aren't that expensive at common traffic
rates. I've already reached 60 Gbps of forwarded traffic both with and
without splicing on a 4-core machine.

One aspect to keep in mind is the following. A typical Xeon system will
achieve around 20 GB/s of in-L3 memcpy() bandwidth. For a typical 16kB
buffer, that's only 760 ns to copy the whole buffer, which is roughly the
cost of the extra syscall needed to check that the transfer completed.
At 10 Gbps, this represents only 6.25% of the total processing time.
And there's something much more important: with the copy operation,
the buffer is released after these 760 ns and immediately recycled for
other connections. This ensures that the memory usage remains low and
that most transfer operations are made in L3 instead of RAM. If you
use zero-copy here instead, your memory will be pinned for the time
it takes to cycle through many other connections and get back to processing
this FD. That can very easily become 10-100 microseconds, or roughly 13-130
times longer, resulting in much more RAM usage for temporary buffers, and
thus a much larger cache footprint.
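
The figures in the paragraph above can be checked with a few lines of
arithmetic. This is a rough sketch, not a benchmark. The unit
interpretations are assumptions: the ~760 ns copy time matches binary units
(16 KiB at 20 GiB/s), while the 6.25% share matches decimal units (10 Gbps
against 20 GB/s, i.e. 160 Gbps of copy bandwidth). The in-flight estimate
(throughput multiplied by hold time) is also an added illustration, not
something stated in the message.

```python
def copy_time_ns(buffer_bytes: float, memcpy_bytes_per_s: float) -> float:
    """Time to memcpy() one buffer, in nanoseconds."""
    return buffer_bytes / memcpy_bytes_per_s * 1e9


def copy_share_of_link(link_bits_per_s: float, memcpy_bytes_per_s: float) -> float:
    """Fraction of the per-buffer wire time that is spent copying it."""
    return link_bits_per_s / (memcpy_bytes_per_s * 8)


def hold_ratio(zero_copy_hold_ns: float, copy_hold_ns: float) -> float:
    """How many times longer a buffer stays pinned with zero-copy."""
    return zero_copy_hold_ns / copy_hold_ns


def in_flight_bytes(link_bits_per_s: float, hold_ns: float) -> float:
    """Bytes held in temporary buffers at a given throughput and hold time."""
    return link_bits_per_s / 8 * hold_ns / 1e9


if __name__ == "__main__":
    print(f"copy time: {copy_time_ns(16 * 1024, 20 * 2**30):.0f} ns")
    print(f"share at 10 Gbps: {copy_share_of_link(10e9, 20e9):.2%}")
    for hold_ns in (10_000, 100_000):
        print(f"hold {hold_ns} ns: {hold_ratio(hold_ns, 760):.0f}x longer, "
              f"{in_flight_bytes(10e9, hold_ns):.0f} bytes in flight")
```

With a 760 ns baseline, the hold-time ratio works out to roughly 13x for
10 microseconds and about 130x for 100 microseconds, the same order of
magnitude as the range quoted above.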

In my opinion MSG_ZEROCOPY was designed for servers that produce their own
data, such as those streaming video, and that don't need to recycle their
buffers. That's definitely not our case here: we're just forwarding
ephemeral data, so we can recycle buffers very quickly, and through
splicing we can even avoid seeing this data at all.

Hoping this helps,
Willy
