Ping. Should I write this up as a proper proposal? On Thu, Nov 16, 2017 at 05:20:29PM +0100, Wouter Verhelst wrote: > On Thu, Nov 16, 2017 at 09:30:41AM -0600, Eric Blake wrote: > > On 11/16/2017 03:51 AM, Wouter Verhelst wrote: > > > > >> I also remember from talking with Vladimir during KVM Forum last month > > >> that one of the shortfalls of the NBD protocol is that you can only ever > > >> send a length of up to 32 bits on the command side (unless we introduce > > >> structured commands in addition to our current work to add structured > > >> replies); > > > > > > Yes, and I'm thinking we should do so. This will obviously require more > > > negotiation. > > > > > > Can be done fairly easily though: > > > - Client negotiates structured replies (don't think it makes sense to do > > > structured requests without structured replies) > > > - Server sets an extra transmission flag to say "I am capable of > > > receiving extended requests" > > > - Extended requests have a different magic number, and should have a > > > "request length" field as well. I'm thinking we make it: > > > > > > magic (32b) > > > request length (16b) > > > type (16b) > > > flags (64b) > > > handle (64b) > > > from (64b) > > > data length (64b) > > > (extra data) > > > > > > Request length in this proposal should always be at least 320. > > > > 320 bits, but we pass length in bytes, so 40. > > Umm. Yes. I didn't sleep very well last night, sorry. > > > > I made flags 64 bits rather than 16 as per the current format, because > > > that way everything is aligned on a 4-byte boundary, which makes things > > > a bit easier on some architectures (e.g., I know that sparc doesn't like > > > unaligned 64-bit access). 64 bits for flags looks like a bit of a waste, > > > but then if we're going to waste some bits somewhere, I guess it's best > > > to assign them to flags. > > > > > > The idea is that "request length" is the length of the data that the > > > client is sending, and "data length" is the length of the range that > > > we're trying to deal with. > > > > Yep, sounds reasonable. > > > > > A write request would thus have to have request length be (data length + > > > 320); a read request would have request length be 320, and expect data > > > to be returned of data length bytes. > > > > Umm, if data length is 64 bits, but request length is 16 bits, we can't > > properly support write requests with that much payload. > > mumble mumble. Yes, of course. See above ;-) > > > I think it goes > > back to the idea of block sizes: the maximum request that a server is > > willing to accept (advertised via INFO during NBD_OPT_GO) still applies, > > so you can't NBD_CMD_WRITE with more than that maximum size (and > > therefore your maximum request size can't be more than that size plus > > the overhead of the 40 byte header), but OTHER commands that CAN operate > > on larger chunks of the export (WRITE_ZEROES, BLOCK_STATUS, maybe a new > > extension LOCK for allowing fcntl()-like locking semantics) can now be > > accommodated, because they don't have to send large amounts of payload > > along with the request. > > > > Keeping request length at 16 bits may therefore be too small (if we want > > to allow 32M write payloads), and may require us laying out the > > structured write header slightly differently (maybe fewer flag bits > > after all). > > Right, so let's make it: > > magic (32b) > flags (16b) > type (16b) > handle (64b) > from (64b) > data length (64b) > request length (64b) > (extra data) > > That actually copies the same structure as the current request header, but: > - data length is extended from 32 bits to 64 bits, so that long requests can > be > made where they make sense. We should make it clear though that block > size limitations still apply, and that a request to read 2^63 bytes of > data is likely to be refused by a server. > - request lenght is added, also as a 64-bit value. I could be persuaded > to make that a 32-bit value rather than a 64-bit one, though. > > This way, common code can be used to handle all data up to and including > the "from" field; what's beyond that would need to be decided based upon > the value of the magic field (32-byte data length in case of old > requests, 2 64-byte length values in the other case). > > Maybe, by the time we get here, it makes sense to also have "request > length" be 0 for a command that doesn't have any extra data beyond that > header. That would also make one particular case easier to read: > > if(req.type == NBD_CMD_WRITE && req.data_length != req.request_length) > return EINVAL; > > However, the downside of that is that we include header length in all > other cases (negotiation, structured replies), and this one would then > be the odd one out; I could imagine that might cause too much confusion. > Not sure whether the minor advantage above is worth that. > > > Or we can say that NBD_CMD_WRITE still has to use old-style requests. > > Let's not :-) > > > > A metadata request could then tack on extra data, where request length > > > of 320 implies "all negotiated metadata contexts", but anything more > > > than that would imply there are some metadata IDs passed along. > > > > Again, 40 bytes (not 320 bits), but yeah, that would be the idea. > > Right. > > -- > Could you people please use IRC like normal people?!? > > -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008 > Hacklab > >
-- Could you people please use IRC like normal people?!? -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008 Hacklab