On 11/16/2017 03:51 AM, Wouter Verhelst wrote:

>> I also remember from talking with Vladimir during KVM Forum last month
>> that one of the shortfalls of the NBD protocol is that you can only ever
>> send a length of up to 32 bits on the command side (unless we introduce
>> structured commands in addition to our current work to add structured
>> replies);
> 
> Yes, and I'm thinking we should do so. This will obviously require more
> negotiation.
> 
> Can be done fairly easily though:
> - Client negotiates structured replies (don't think it makes sense to do
>   structured requests without structured replies)
> - Server sets an extra transmission flag to say "I am capable of
>   receiving extended requests"
> - Extended requests have a different magic number, and should have a
>   "request length" field as well. I'm thinking we make it:
> 
> magic          (32b)
> request length (16b)
> type           (16b)
> flags          (64b)
> handle         (64b)
> from           (64b)
> data length    (64b)
> (extra data)
> 
> Request length in this proposal should always be at least 320.

320 bits, but we pass length in bytes, so 40.

> 
> I made flags 64 bits rather than 16 as per the current format, because
> that way everything is aligned on a 4-byte boundary, which makes things
> a bit easier on some architectures (e.g., I know that sparc doesn't like
> unaligned 64-bit access). 64 bits for flags looks like a bit of a waste,
> but then if we're going to waste some bits somewhere, I guess it's best
> to assign them to flags.
> 
> The idea is that "request length" is the length of the data that the
> client is sending, and "data length" is the length of the range that
> we're trying to deal with.

Yep, sounds reasonable.

> 
> A write request would thus have to have request length be (data length +
> 320); a read request would have request length be 320, and expect data
> to be returned of data length bytes.

Umm, if data length is 64 bits, but request length is 16 bits, we can't
properly support write requests with that much payload.  I think it goes
back to the idea of block sizes: the maximum request that a server is
willing to accept (advertised via INFO during NBD_OPT_GO) still applies,
so you can't NBD_CMD_WRITE with more than that maximum size (and
therefore your maximum request size can't be more than that size plus
the overhead of the 40 byte header), but OTHER commands that CAN operate
on larger chunks of the export (WRITE_ZEROES, BLOCK_STATUS, maybe a new
extension LOCK for allowing fcntl()-like locking semantics) can now be
accommodated, because they don't have to send large amounts of payload
along with the request.

Keeping request length at 16 bits may therefore be too small (if we want
to allow 32M write payloads), and may require us laying out the
structured write header slightly differently (maybe fewer flag bits
after all).  Or we can say that NBD_CMD_WRITE still has to use old-style
requests.

> 
> A metadata request could then tack on extra data, where request length
> of 320 implies "all negotiated metadata contexts", but anything more
> than that would imply there are some metadata IDs passed along.

Again, 40 bytes (not 320 bits), but yeah, that would be the idea.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to