On Thu, Nov 16, 2017 at 09:30:41AM -0600, Eric Blake wrote: > On 11/16/2017 03:51 AM, Wouter Verhelst wrote: > > >> I also remember from talking with Vladimir during KVM Forum last month > >> that one of the shortfalls of the NBD protocol is that you can only ever > >> send a length of up to 32 bits on the command side (unless we introduce > >> structured commands in addition to our current work to add structured > >> replies); > > > > Yes, and I'm thinking we should do so. This will obviously require more > > negotiation. > > > > Can be done fairly easily though: > > - Client negotiates structured replies (don't think it makes sense to do > > structured requests without structured replies) > > - Server sets an extra transmission flag to say "I am capable of > > receiving extended requests" > > - Extended requests have a different magic number, and should have a > > "request length" field as well. I'm thinking we make it: > > > > magic (32b) > > request length (16b) > > type (16b) > > flags (64b) > > handle (64b) > > from (64b) > > data length (64b) > > (extra data) > > > > Request length in this proposal should always be at least 320. > > 320 bits, but we pass length in bytes, so 40.
Umm. Yes. I didn't sleep very well last night, sorry. > > I made flags 64 bits rather than 16 as per the current format, because > > that way everything is aligned on a 4-byte boundary, which makes things > > a bit easier on some architectures (e.g., I know that sparc doesn't like > > unaligned 64-bit access). 64 bits for flags looks like a bit of a waste, > > but then if we're going to waste some bits somewhere, I guess it's best > > to assign them to flags. > > > > The idea is that "request length" is the length of the data that the > > client is sending, and "data length" is the length of the range that > > we're trying to deal with. > > Yep, sounds reasonable. > > > A write request would thus have to have request length be (data length + > > 320); a read request would have request length be 320, and expect data > > to be returned of data length bytes. > > Umm, if data length is 64 bits, but request length is 16 bits, we can't > properly support write requests with that much payload. mumble mumble. Yes, of course. See above ;-) > I think it goes > back to the idea of block sizes: the maximum request that a server is > willing to accept (advertised via INFO during NBD_OPT_GO) still applies, > so you can't NBD_CMD_WRITE with more than that maximum size (and > therefore your maximum request size can't be more than that size plus > the overhead of the 40 byte header), but OTHER commands that CAN operate > on larger chunks of the export (WRITE_ZEROES, BLOCK_STATUS, maybe a new > extension LOCK for allowing fcntl()-like locking semantics) can now be > accommodated, because they don't have to send large amounts of payload > along with the request. > > Keeping request length at 16 bits may therefore be too small (if we want > to allow 32M write payloads), and may require us laying out the > structured write header slightly differently (maybe fewer flag bits > after all). Right, so let's make it: magic (32b) flags (16b) type (16b) handle (64b) from (64b) data length (64b) request length (64b) (extra data) That actually copies the same structure as the current request header, but: - data length is extended from 32 bits to 64 bits, so that long requests can be made where they make sense. We should make it clear though that block size limitations still apply, and that a request to read 2^63 bytes of data is likely to be refused by a server. - request lenght is added, also as a 64-bit value. I could be persuaded to make that a 32-bit value rather than a 64-bit one, though. This way, common code can be used to handle all data up to and including the "from" field; what's beyond that would need to be decided based upon the value of the magic field (32-byte data length in case of old requests, 2 64-byte length values in the other case). Maybe, by the time we get here, it makes sense to also have "request length" be 0 for a command that doesn't have any extra data beyond that header. That would also make one particular case easier to read: if(req.type == NBD_CMD_WRITE && req.data_length != req.request_length) return EINVAL; However, the downside of that is that we include header length in all other cases (negotiation, structured replies), and this one would then be the odd one out; I could imagine that might cause too much confusion. Not sure whether the minor advantage above is worth that. > Or we can say that NBD_CMD_WRITE still has to use old-style requests. Let's not :-) > > A metadata request could then tack on extra data, where request length > > of 320 implies "all negotiated metadata contexts", but anything more > > than that would imply there are some metadata IDs passed along. > > Again, 40 bytes (not 320 bits), but yeah, that would be the idea. Right. -- Could you people please use IRC like normal people?!? -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008 Hacklab