Wouter, >> However, in this instance the block sizing info comes through (and >> *only* comes through) NBD_OPT_INFO. So in order to support block sizes >> you need to support NBD_OPT_INFO because there is no other way to >> obtain the block sizes anyway (*). The converse is really what we >> are arguing about. Can you legitimately support NBD_OPT_INFO (as a >> client) then ignore block sizes? > > Right, that's the argument I think matters most. I think that yes, you > can. We provide more information about exports than *just* the block > sizes they support (the export size, whether the export requires TLS, > possibly whether future extensions are supported as well), and there is > no reason to require that this one arbitrary bit of information is > assumed to be abided by just because you asked for general information.
I think that's a fair enough point. >> I'm arguing that we can legitimately make block size support part of the info >> spec, and say if you ask for info, you must be prepared to deal with it. The >> alternate view is that info and blocksizes are different things, in which >> case the server needs some way of knowing whether the client will respect >> block sizes if given. > > Right. I think they are different things. They look similar today, > because we mostly add INFO because of the block sizes -- but we also add > it for STARTTLS, and a client which doesn't care about block sizes but > which does care about STARTTLS should not have to suddenly worry about > block-aligning requests, etc. OK. I suppose my counterargument would be 'but blocksize constraints exist today, just like STARTTLS did before it was a formal option'. However, I agree the asymmetry of treatment of BLOCKSIZE and new bits is not good. >> An alternative here would be to take a leaf out of DHCP's (dubious) >> book. When asking for NBD_OPT_INFO you'd supply a list of extensions >> you understanderstand (i.e. a list of the NBD_REP_INFO codes), and >> you only get sent back the ones you ask for. The fact you've explicitly >> asked for it CAN be taken to mean you've understood it. > > Yes; that makes a lot more more sense. I had in fact been thinking about > something along those lines, just hadn't suggested it yet. OK. I will knock something up. >> I guess up to this point in the email, we are talking about how best >> for a client to signal to the server that it can respect the >> server's block size constraints. I proposed but then decided I >> didn't much like NBD_OPT_BLOCK_SIZE. There's another idea above. > > I haven't really understood why you dislike it, but perhaps that's just > me. A few reasons: * Every further info type necessarily needs a specific new option with its own custom documentation * There is an ordering issue (less so now the option haggling phase is synchronous). EG I might want to tell a client that supports blocksize constraints that my minimum block size is 512 bytes, but tell one that is 'only asking for info' that the minimum block size is 1. What happens if NBD_OPT_BLOCK_SIZE comes after the NBD_OPT_INFO request? I suspect the answer is 'well then, things have changed so I can reply differently with NBD_OPT_GO'. * Something in me feels the NBD_OPT stuff is asking the server about what it does, rather than the client telling the server about what it does. You almost want the server to be able to ask the client things. This might be just in my head. >> Yet another idea would be a single 'I understand block sizes' >> bit in the flags the client sends right at the start of the negotiation >> (perhaps mirrored by a server bit - I have something useful to >> say about block sizes); I think I'd prefer that. > > I don't. There's no need to burn flags for this. Yeah, true. >> But we have more contentiousness in the following. > > Not really. I meant I disagreed more about what you wrote below this point. >>> In addition, while I understand the reasoning behind announcing block >>> sizes, I do not want to make "abiding by announced block sizes" a >>> prerequisite of "using the more modern way of choosing an export". The >>> protocol currently has no requirement for a server to use a particular >>> block size, and I think this is a feature. Sure, most clients *do* >>> currently request data in block-sized units, but that doesn't mean we >>> need to make that a requirement. A simple user-space client which wants >>> to inspect (and possibly modify) some parts of an export might not care >>> about block sizes; we should make writing such clients not harder than >>> needs be. >> >> Well, actually every server I know of ALREADY has block size constraints. >> nbd-server.c is weakest, in that it won't support operations >= 1>>31 >> bytes. Qemu will fail stuff >32Mb. gonbdserver.c fails on stuff >128Mb >> (and I won't be removing that - the DoS opportunity is too great). >> Many servers perform substantially better if reads/writes are aligned. > > Sure, but that's not my point. > > A server is already allowed to reject 'too large' requests, even if > block sizes aren't negotiated. That's fine, and we should keep that. They aren't allowed to per the standard, but they do. > However, servers *aren't* currently allowed to reject requests that are > too small, or that are not block-aligned. They aren't allowed to per the standard, but they do. I don't see the difference. > I think that's the only case > in which NBD_REP_ERR_BLOCK_SIZE_REQD should be sent; it signals that the > server can't support a certain type of request, and that the client > should be prepared for that. Right. So if you take the case of a server that can't (because it is impossible / impractical to do so, it carries other disadvantages, or the author is lazy, or whatever) support block writes < 512 bytes (say), that's no different from one that can't (because it is impossible / impractical to do so, or because the author is lazy or whatever) support >32Mb (cough, qemu) or >128Mb (cough, gonbdserver) or >allocatable memory (cough, nbdserver.c) requests. The way the block size stuff is currently written, there are a default set of block constraints - for these purposes the minimum size of 1 byte and the maximum size of 0xffffffff are the only thing that matters. If you support those, you shouldn't send NBD_REP_ERR_BLOCK_SIZE_REQD (currently the text is slightly wonky as it says you shouldn't send it if you can agree them externally, whereas obviously if they are the default you don't need to agree them externally - I will fix that). IE you should only be sending NBD_REP_ERR_BLOCK_SIZE_REQD if you can't tell the client that you don't support the block sizes it is going to expect (in the absence of asking you). > Block-aligning requests on the client side may be a bit much work > though, and some clients may not have the ability to abide by that > request. Therefore, if yoyu're going to require block-aligned requests, > you're effectively passing work to clients. Well that's one way of looking at it. Another is you are otherwise requiring the server to do it. In the general case I suspect it's easier on the client, because most client stacks (kernel, qemu being my 2 examples) are already capable of dealing with backends that have fixed block sizes (normal block devices, hard disks etc.) so will already do this. >> If you write a server which has arcane block size constraints it >> will fail if some dumb clients talk to it. I don't think that's an >> issue. It's better they fail early than try to connect and fail on >> the 37th write. >> >>> In that same light, I think we should discourage (but allow) the use of >>> NBD_REP_ERR_BLOCK_SIZE_REQD (although we might want to make that error's >>> name a bit shorter -- boy that's a lot of typing) in the same way that >>> we currently allow but discourage ENOMEM. >> >> Don't agree. >> >> If don't support block sizes under 4096 bytes (say) it's pretty >> non-negotiable, or you risk something that fails obscurely later. > > Sure. My point isn't that you shouldn't announce that; it's that you > should add code that makes it possible for you to talk to older clients, > even if that requires a performance cost. So this is my area of disagreement. I have no problem adding code. However, I don't believe there are (really) older clients that rely on this and if it made my life 1% harder to support them, I wouldn't bother and don't see why I should (in gonbdserver's case it turns out to be easy - so far - to support unaligned requests and minimum write size, but difficult for complicated reasons to support requests greater than a maximum size - difficult in the sense of 'to support arbitrary sizes, I would need to change the threading and memory management model significantly'). Actually nbd-server.c is in exactly the same position (and would be more so if you put in some DoS protection!) But despite the fact it's easy for gonbdserver, it wouldn't be easy for a server that uses mmap, O_DIRECT, or various other things. These are reasonable things for a server author to want to do. I think all (or nearly all) reasonable clients end up with the ability to support devices with fixed block size anyway. So it's reasonable for a server author to say 'you know what, if a client can't deal with my blocksize constraints, I want to cleanly refuse a connection' (which is what NBD_REP_ERR_BLOCK_SIZE_REQD is for). I think you agree with this, but ... >> Much better to fail at negotiation stage if you really can't >> support (or can't risk pretending to support) the client. >> >> If what you are saying is 'it is unreasonable to use the nbd >> protocol unless you can always somehow support byte-aligned >> writes' I don't really agree. > > I'm not saying that. > > A small server could certainly use NBD_REP_ERR_BLOCK_SIZE_REQD and > refuse to talk to clients who don't (announce they) know how to > block-align requests. There's nothing wrong with that. > > However, a server written for full interoperability and maximum > usefulness should not do so. It may issue a warning that it will be > slower, but it should be able to operate at a basic level. ... you are characterising such a server as a server with less than full interoperability. I don't agree. It's a perfectly reasonable server choice. In fact I'd say 'not supporting block sizes is an issue with client interoperability, not a server issue'. NBD is fundamentally to me a block device, not a file seeking device (the clue being in the name). If a hard disk vendor said "well, I'm sorry our disk only supports reads and writes by whole sectors", the response "well that's not very interoperable" would not be seen as sensible. Nor would a 'iSCSI vendors SHOULD support byte-wise writes to block devices'. However, if we found in the iSCSI protocol 'clients MUST respect block sizes between X and Y as returned by the iSCSI target' we'd be pretty unsurprised. That's the bit I think you may have backwards. > For that reason, I think we should add some language to discourage the > use of that option, with the understanding that "discourage" does not > mean "forbid". If it was me, I would make it the other way around, that clients SHOULD ask for blocksize info and respect it. And were it not for the fact we haven't had blocksize in there from day 1, I'd make it a 'MUST' (obviously that's impractical now). >> Just as you are arguing it should be possible to write a simple >> client, I'd argue it should be possible to write a simple server. > > And I agree with that :-) > > Perhaps we should add a section to the spec outlining requirements for > "minimum" implementations that are still interoperable with other > implementations...? Mmmm.... well I think RFC2119 wording mostly does that. I guess one general problem is that we are now adding a pile of options and it's not obvious what the consequences of not supporting them are. Let's take an uncontroversial one like STARTTLS. When we put it in, we made sure we did not make NBD_OPT_STARTTLS mandatory to support either side (we said the client MAY send it and the server can operate in one of 3 modes, in one of which it can refuse to process the option, which is what nbd-server.c does today); so that in a sense is easy, but only came after a lot of word wrangling. What about a simpler extension like NBD_CMD_WRITEZEROES? Assuming it's been promoted from an extension, what status has it got? Of course you can not send NBD_FLAG_SEND_WRITEZEROES which is what older servers will do, but is is a server that doesn't support it (assuming it is promoted) compliant with the *current* spec? IE is NBD_FLAG_SEND_WRITEZEROES there for back compatibility, or is there a 'new NBD protocol version' which is back compatible, but to be compatible with the *new* version, you need to implement certain changes? Certainly I feel that should be the case with (e.g.) fixed-Newstyle. Perhaps we need an 'NBD protocol version' that says what features people are expected to implement (one side or the other). Incidentally, I'd love to move everything bar fixed-Newstyle out to a section marked 'deprecated - historical interest only - if you want to interoperate with these servers/clients here's what you might find'. -- Alex Bligh ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z _______________________________________________ Nbd-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nbd-general
