On 03/05/2018 07:24 PM, Willy Tarreau wrote:
> On Thu, May 03, 2018 at 02:51:12PM +0200, Pavlos Parissis wrote:
>> On 03/05/2018 02:45 PM, Olivier Houchard wrote:
>>> Hi Pavlos,
>>>
>>> On Thu, May 03, 2018 at 12:45:42PM +0200, Pavlos Parissis wrote:
>>>> Hi,
>>>>
>>>> Linux kernel version 4.14 adds support for zero-copy from user memory to
>>>> TCP sockets by setting the MSG_ZEROCOPY flag. This covers the sending side
>>>> of the socket; for the receiving side we need to wait for kernel
>>>> version 4.18.
>>>>
>>>> Will you consider enabling this on HAProxy?
>>>>
>>>> More info can be found here:
>>>> https://www.kernel.org/doc/html/latest/networking/msg_zerocopy.html
>>>
>>> After some discussion with Willy, we're not sure it is worth it.
>>> It would force us to release buffers much later than we currently do, it can't
>>> be used with SSL, and we already achieve zero-copy by using splicing.
>>>
>>> Is there any specific case where you think it'd be a huge win?
>>>
>>
>> The only use case that I can think of is HTTP streaming. But without
>> testing it, we can't say a lot.
> 
> In fact, for HTTP streaming, splicing already does it all and even
> better since it only manipulates a few pointers in the kernel between
> the source and destination socket buffers. Userspace is not even
> involved.
> 
> Also it's important to remember that while copies are best avoided
> whenever possible, they aren't that dramatic at the common traffic
> rates. I've already reached 60 Gbps of forwarded traffic with and
> without splicing on a 4-core machine.
> 
> One aspect to keep in mind is the following. A typical Xeon system will
> achieve around 20 GB/s of in-L3 memcpy() bandwidth. For a typical 16kB
> buffer, that's only 760 ns to copy the whole buffer, which is roughly the
> cost of the extra syscall needed to check that the transfer completed.
> At 10 Gbps, this represents only 6.25% of the total processing time.
> And there's something much more important: with the copy operation,
> the buffer is released after these 760 ns and immediately recycled for
> other connections. This ensures that the memory usage remains low and
> that most transfer operations are made in L3 instead of RAM. If you
> use zero-copy here instead, your memory will be pinned for the time
> it takes to cycle on many other connections and get back to processing
> this FD. It can very easily become 10-100 microseconds, or 15-150 times
> more, resulting in much more RAM usage for temporary buffers, and thus
> a much higher cache footprint.
> 
> In my opinion MSG_ZEROCOPY was designed for servers, those which stream
> video and so on, and which produce their own data, and which don't need
> to recycle their buffers. We're definitely not in that situation here:
> we're just forwarding ephemeral data, so we can recycle buffers very quickly,
> and through splicing we can even avoid seeing this data at all.
> 
> Hoping this helps,
> Willy
> 
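
The copy-cost figures in Willy's reply can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names copy_time_ns() and copy_share() are hypothetical, and reading "20 GB/s" and "16kB" as binary units (GiB/s and KiB) is an assumption made here because it reproduces the quoted 760 ns.

```c
#include <stdio.h>

/* Time, in nanoseconds, needed to copy `bytes` at `bytes_per_sec`. */
double copy_time_ns(double bytes, double bytes_per_sec)
{
    return bytes / bytes_per_sec * 1e9;
}

/*
 * Share of a buffer's wire time that is spent copying it:
 * (bytes / copy_bw) / (bytes * 8 / link_bw) = (link_bw / 8) / copy_bw.
 * The buffer size cancels out.
 */
double copy_share(double copy_bytes_per_sec, double link_bits_per_sec)
{
    return (link_bits_per_sec / 8.0) / copy_bytes_per_sec;
}

/* Print the figures discussed in the thread (assumed units, see above). */
void print_copy_figures(void)
{
    double t = copy_time_ns(16384.0, 20.0 * 1073741824.0);

    printf("copy time for 16 KiB: %.0f ns\n", t);
    printf("share of time at 10 Gbps: %.2f%%\n",
           100.0 * copy_share(20e9, 10e9));
    printf("10 us pinned vs copy: %.0fx\n", 10000.0 / t);
    printf("100 us pinned vs copy: %.0fx\n", 100000.0 / t);
}
```

With binary units the copy takes about 763 ns; with decimal units (16384 bytes at 20e9 bytes/s) it would be about 819 ns. The 6.25% share is simply 1.25 GB/s of link rate divided by 20 GB/s of copy bandwidth. Pinning a buffer for 10-100 microseconds comes out at roughly 13 to 131 times the copy time under the same assumption, the same order of magnitude as the 15-150 quoted.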

Thanks for this very detailed response, once again I learned a lot.

Cheers,
Pavlos
