[adding the NBD list into cc]

On Mon, Aug 23, 2021 at 09:26:34PM +0530, Abhay Raj Singh wrote:
> I had an idea for optimizing my current approach; it's good in some
> ways, but it could be faster with some breaking changes to the
> protocol.
>
> Currently, we read (from the socket connected to the source) one
> request at a time. The simple flow looks like
>   `read_header(io_uring) -- success --> recv(data) -- success -->
>    send(data) & queue another read_header`
> but it's not as efficient as it could be; at best it's a hack.
>
> Another approach I am thinking about is a large buffer
> into which we can read all of the socket's data, processing packets
> from that buffer as the I/O completes.
> This minimizes the number of read requests to the kernel, as we do
> one read for multiple NBD packets.
>
> Further optimization requires changing the NBD protocol a bit.
> Current protocol:
> 1. Memory representation of a response (20-byte header + data)
> 2. Memory representation of a request (28-byte header + data)
>
>   HHHHH_DDDDDDDDD...
>   HHHHHHH_DDDDDDDDD...
>
> (each H and D represents 4 bytes; _ represents 0 bytes)

You are correct that requests are currently a 28-byte header plus any
payload (where a payload currently occurs only in NBD_CMD_WRITE).
But responses come in two different lengths: simple replies are 16
bytes + payload (payload only for NBD_CMD_READ, and only if
structured replies were not negotiated), while structured replies are
20 bytes + payload (but while NBD_CMD_READ and NBD_CMD_BLOCK_STATUS
require structured replies, a compliant server can still send simple
replies to other commands).  So it's even trickier than you represent
here: always reading 20-byte reply headers is not going to do the
right thing.
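For reference, the three fixed-size headers look like this on the
wire.  This is an untested sketch in C; the magic constants and field
names follow the NBD protocol document, and all fields are big-endian
on the wire, so real code needs explicit byte swapping rather than
relying on struct overlays:

  #include <stdint.h>
  #include <stddef.h>

  #define NBD_REQUEST_MAGIC           0x25609513
  #define NBD_SIMPLE_REPLY_MAGIC      0x67446698
  #define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef

  struct nbd_request {            /* 28 bytes + optional payload */
      uint32_t magic;             /* NBD_REQUEST_MAGIC */
      uint16_t flags;             /* NBD_CMD_FLAG_* */
      uint16_t type;              /* NBD_CMD_* */
      uint64_t handle;            /* echoed back in the reply */
      uint64_t offset;
      uint32_t length;
  } __attribute__((packed));

  struct nbd_simple_reply {       /* 16 bytes + optional payload */
      uint32_t magic;             /* NBD_SIMPLE_REPLY_MAGIC */
      uint32_t error;
      uint64_t handle;
  } __attribute__((packed));

  struct nbd_structured_reply {   /* 20 bytes + payload */
      uint32_t magic;             /* NBD_STRUCTURED_REPLY_MAGIC */
      uint16_t flags;             /* NBD_REPLY_FLAG_* */
      uint16_t type;              /* NBD_REPLY_TYPE_* */
      uint64_t handle;
      uint32_t length;            /* payload bytes that follow */
  } __attribute__((packed));

  /* How many header bytes a reply occupies, given its magic.
   * Returns 0 for an unrecognized magic. */
  static size_t
  reply_header_size (uint32_t magic)
  {
      switch (magic) {
      case NBD_SIMPLE_REPLY_MAGIC:     return 16;
      case NBD_STRUCTURED_REPLY_MAGIC: return 20;
      default:                         return 0;
      }
  }

Note that a reader must inspect the magic before it knows whether 16
or 20 header bytes follow, which complicates exactly the sort of
batched buffer parsing you are describing.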
> With the large-buffer approach, we read data into a large buffer,
> then copy each NBD packet's data to a new buffer, strap a new header
> onto it, and send it.
> This copying is what we wanted to avoid in the first place.
>
> If the response header were 28 bytes, or the first 8 bytes of data
> were useless, we could have just overwritten the header part and
> sent the data directly from the large buffer, thereby avoiding the
> copy.
>
> What are your thoughts?

There are already discussions about what it would take to extend the
NBD protocol to support 64-bit requests (not that we'd want to go
beyond current server restrictions of a 32M or 64M maximum for
NBD_CMD_READ and NBD_CMD_WRITE, but more so that we can permit quick
image zeroing via a 64-bit NBD_CMD_WRITE_ZEROES).  Your observation
that equally-sized request and response headers would allow more
efficient handling is worth considering in making such a protocol
extension.  Of necessity, it would have to be via an NBD_OPT_* option
requested by the client during negotiation and responded to
affirmatively by the server, before both sides then use the new-size
packets in both directions after NBD_OPT_GO (and a client would still
have to be prepared to fall back to the unequal-sized headers if the
server doesn't understand the option).

For that matter, is there a benefit to cache-line-optimized sizing,
where all headers are exactly 32 bytes (both requests and responses,
and both simple and structured replies)?  I'm thinking maybe
NBD_OPT_FIXED_SIZE_HEADER might be a sane name for such an option.
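As a strawman only (no such option exists today, and the layout below
is purely illustrative, not taken from any spec), a shared 32-byte
header might look like:

  /* Hypothetical layout for a negotiated fixed-size header; every
   * field choice here is illustrative, not part of the NBD protocol.
   */
  struct nbd_fixed_header {   /* 32 bytes: two per 64-byte cache line */
      uint32_t magic;         /* distinguishes request from reply */
      uint16_t flags;
      uint16_t type;          /* command type, or reply chunk type */
      uint64_t handle;
      uint64_t offset;        /* a reply could reuse this for status */
      uint32_t length;
      uint32_t reserved;      /* padding up to 32 bytes */
  } __attribute__((packed));

With one size in both directions, a copier such as nbdcopy could
patch the header in place in its receive buffer and send header plus
payload as a single contiguous write, avoiding the copy described
above.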
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.  +1-919-301-3266
Virtualization:  qemu.org | libvirt.org