On Fri, Mar 24, 2023 at 11:47:20AM +0100, Florian Westphal wrote:
> qemu-nbd doesn't set TCP_NODELAY on the tcp socket.
>
> Kernel waits for more data and avoids transmission of small packets.
> Without TLS this is barely noticeable, but with TLS this really shows.
>
> Booting a VM via qemu-nbd on localhost (with tls) takes more than
> 2 minutes on my system. tcpdump shows frequent wait periods, where no
> packets get sent for a 40ms period.
Thank you for this analysis.

> Add explicit (un)corking when processing (and responding to) requests.
> "TCP_CORK, &zero" after earlier "CORK, &one" will flush pending data.
>
> VM Boot time:
> main:    no tls: 23s, with tls: 2m45s
> patched: no tls: 14s, with tls: 15s
>
> VM Boot time, qemu-nbd via network (same lan):
> main:    no tls: 18s, with tls: 1m50s
> patched: no tls: 17s, with tls: 18s

And the timings prove that it matters.

> Future optimization: if we could detect if there is another pending
> request we could defer the uncork operation because more data would be
> appended.

nbdkit and libnbd do this with the MSG_MORE flag (plaintext) and TLS
corking (tls); when building up a message to the other side, a flag is
set any time we know we are likely to send more data very shortly.
nbdkit wraps it under a flag SEND_MORE, which applies both to plaintext:

https://gitlab.com/nbdkit/nbdkit/-/blob/master/server/connections.c#L415

and to TLS connections:

https://gitlab.com/nbdkit/nbdkit/-/blob/master/server/crypto.c#L396

while libnbd uses MSG_MORE a bit more directly for the same purpose for
plaintext, but isn't (yet) doing TLS corking:

https://gitlab.com/nbdkit/libnbd/-/blob/master/generator/states-issue-command.c#L53
https://gitlab.com/nbdkit/libnbd/-/blob/master/lib/internal.h#L57

I would love to see a future patch to the qio_channel code to support
MSG_MORE in the same way nbdkit uses its SEND_MORE flag: the caller
often has more information on whether it is sending a short prefix or
is done with a conceptual message and ready to uncork, and a flag can
be more efficient than separate passes through cork/uncork calls.  But
even your initial work at properly corking is a good step in the right
direction.

And surprisingly, qemu IS already using corking on the client side:

https://gitlab.com/qemu-project/qemu/-/blob/master/block/nbd.c#L525

just not on the server side, before your patch.
> Signed-off-by: Florian Westphal <f...@strlen.de>
> ---
>  nbd/server.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/nbd/server.c b/nbd/server.c
> index a4750e41880a..848836d41405 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -2667,6 +2667,8 @@ static coroutine_fn void nbd_trip(void *opaque)
>          goto disconnect;
>      }
>
> +    qio_channel_set_cork(client->ioc, true);
> +
>      if (ret < 0) {
>          /* It wasn't -EIO, so, according to nbd_co_receive_request()
>           * semantics, we should return the error to the client. */
> @@ -2692,6 +2694,7 @@ static coroutine_fn void nbd_trip(void *opaque)
>          goto disconnect;
>      }
>
> +    qio_channel_set_cork(client->ioc, false);

Reviewed-by: Eric Blake <ebl...@redhat.com>

>  done:
>      nbd_request_put(req);
>      nbd_client_put(client);
> --
> 2.39.2

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.  +1-919-301-3266
Virtualization:  qemu.org | libvirt.org