On Wed, Jan 9, 2019 at 7:50 AM Marek Majkowski <ma...@cloudflare.com> wrote:
>
> I got it slightly wrong, and it's even worse than this. As far as I
> understand it, the current semantics of MSG_ZEROCOPY on TCP make it
> close to unusable. The problem is that the remote party can move your
> MSG_ZEROCOPY socket from ESTABLISHED to CLOSE_WAIT without your
> involvement. This will mean that even though the program can still
> send() data to the socket, MSG_ZEROCOPY operations will fail with
> EINVAL.
>
> In other words: because the socket needs to be ESTABLISHED for
> MSG_ZEROCOPY to work, and because remote party can send FIN and move
> the socket to CLOSE_WAIT, a sending party must implement a fallback
> from EINVAL return code on the transmission code. An adversarial
> client who does shutdown(SHUT_WR), will trigger EINVAL in the sender.

An adversarial client only affects its own stream, so the impact is limited.

>
> Marek
>
> On Wed, Jan 9, 2019 at 1:01 PM Marek Majkowski <ma...@cloudflare.com> wrote:
> >
> > Hi,
> >
> > Current implementation of MSG_ZEROCOPY for TCP requires the socket to
> > be ESTABLISHED:
> > https://elixir.bootlin.com/linux/v5.0-rc1/source/net/ipv4/tcp.c#L1188
> >
> > if (sk->sk_state != TCP_ESTABLISHED) {
> >         err = -EINVAL;
> >         goto out_err;
> > }
> >
> > In TCP it's totally fine to have half-open sockets, for example:
> >
> > shutdown(5, SHUT_RD)
> >
> > Moves the socket from ESTABLISHED to CLOSE_WAIT. In such TCP state
> > it's possible to continue sending data. This is not supported by
> > MSG_ZEROCOPY, which will fail with EINVAL in such case. I think it's a
> > bug.

Thanks for the report. At first blush it seems like extending the check
to include state CLOSE_WAIT would resolve the issue:

         if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) {
-                if (sk->sk_state != TCP_ESTABLISHED) {
+                if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
                         err = -EINVAL;
                         goto out_err;
                 }
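
For anyone not used to the TCPF_* idiom, here is a rough standalone userspace sketch of what the old and new conditions accept. It is not kernel code: the ZC_-prefixed names and helper functions are made up for illustration, and the enum only mirrors the state numbering I believe include/net/tcp_states.h uses (kernel-internal states beyond CLOSING are left out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical local mirror of the kernel's TCP state numbering.
 * The ZC_ prefix is an assumption made here to avoid clashing with
 * definitions from system headers. */
enum zc_tcp_state {
    ZC_TCP_ESTABLISHED = 1,
    ZC_TCP_SYN_SENT,
    ZC_TCP_SYN_RECV,
    ZC_TCP_FIN_WAIT1,
    ZC_TCP_FIN_WAIT2,
    ZC_TCP_TIME_WAIT,
    ZC_TCP_CLOSE,
    ZC_TCP_CLOSE_WAIT,
    ZC_TCP_LAST_ACK,
    ZC_TCP_LISTEN,
    ZC_TCP_CLOSING
};

/* One bit per state, in the style of the kernel's TCPF_* flags. */
static unsigned int zc_tcpf(enum zc_tcp_state s)
{
    assert(s >= ZC_TCP_ESTABLISHED && s <= ZC_TCP_CLOSING);
    return 1u << s;
}

/* Old condition: anything other than ESTABLISHED is rejected. */
static bool zc_old_check_rejects(enum zc_tcp_state s)
{
    return s != ZC_TCP_ESTABLISHED;
}

/* Proposed condition: reject unless the state bit is in the allowed
 * set {ESTABLISHED, CLOSE_WAIT}. */
static bool zc_new_check_rejects(enum zc_tcp_state s)
{
    unsigned int allowed = zc_tcpf(ZC_TCP_ESTABLISHED) |
                           zc_tcpf(ZC_TCP_CLOSE_WAIT);

    return (zc_tcpf(s) & ~allowed) != 0;
}
```

With these helpers, ESTABLISHED passes both checks, CLOSE_WAIT is rejected only by the old one, and every other listed state is still rejected by both. Whether CLOSE_WAIT is the only extra state worth allowing is a separate question that the sketch does not answer.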