On Thu, Apr 14, 2016 at 09:09:20PM -0600, Eric Blake wrote:
> Existing NBD servers often have limitations, such as requiring
> actions to be aligned to block sizes or limiting maximum
> transactions to avoid denial of service attacks; for example,
> qemu's NBD server refuses any transaction larger than 32M.  But
> to date, clients have to learn these limitations via out-of-band
> means, and nothing in the spec allowed for alignment limitations.
> 
> Add a section to the document describing overall block size
> constraints, and rules for what defaults to use if there is no
> communication (whether out of band, or by the new options added
> here).
> 
> Also, add a new client option NBD_OPT_BLOCK_SIZE (a promise that
> the client will obey any advertised block sizes, to let a server
> optimize to use O_DIRECT without worrying about how it would have
> to report errors), and extend NBD_REP_INFO (to allow the server
> to advertise block sizes in band, for a new enough client that
> uses NBD_OPT_GO).
> 
> Design decision: a client that wants to learn block sizes MUST
> use NBD_OPT_GO, rather than the old NBD_OPT_EXPORT_NAME, even
> though we could have repurposed some of the reserved zeroes when
> NBD_FLAG_C_NO_ZEROES is not in effect, because we don't want to
> encourage any further abuse of NBD_OPT_EXPORT_NAME.
> 
> Signed-off-by: Eric Blake <ebl...@redhat.com>
> ---
>  doc/proto.md | 184 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 162 insertions(+), 22 deletions(-)
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index 09066e2..30d170e 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -590,6 +590,84 @@ This functionality has not yet been implemented by the 
> reference
>  implementation, but was implemented by qemu and subsequently
>  by other users, so has been moved out of the "experimental" section.
> 
> +## Block sizes
> +
> +During transmission phase, several operations are constrained by the
> +export size sent by the final `NBD_OPT_EXPORT_NAME` or `NBD_OPT_GO`,
> +as well as by three block sizes defined here (minimum, preferred, and
> +maximum).  During the handshake phase, a client MAY announce its
> +intention to honor server block sizes via the experimental
> +`BLOCK_SIZE` extension; see below.  A client that is worried about
> +block size constraints SHOULD use `NBD_OPT_GO` rather than
> +`NBD_OPT_EXPORT_NAME`.  A server SHOULD advertise the block size
> +contraints during handshake phase via the experimental `INFO`
> +extension; see below.  A server and client MAY agree on block sizes
> +via out of band means.
> +
> +If block sizes have not been advertised or agreed on externally, then
> +a client SHOULD assume a default minimum block size of 1, a preferred
> +block size of 4,096, and no inherent maximum block size.

the size of the device as the maximum block size?

> A server
> +that wants to enforce block sizes other than the defaults specified
> +here MUST support the experimental `INFO` extension, and MAY refuse to
> +go into transmission phase for a client that uses
> +`NBD_OPT_EXPORT_NAME` or failed to use `NBD_OPT_BLOCK_SIZE`, although
> +a server SHOULD permit such clients if block sizes can be agreed on
> +externally.  When allowing such clients, the server MUST cleanly error
> +commands that fall outside block size parameters without corrupting
> +data.

Maybe also make a note that doing so will limit interoperability?

> +A client MAY choose to operate as if tighter block sizes had been
> +specified (for example, even when the server advertises the default
> +minimum block size of 1, a client may safely use a minimum block size
> +of 512, a preferred block size of 65,536, and a maximum block size of
> +3,355,442 (32M)).  Notwithstanding any maximum block size advertised,

32M is 33554432 (you missed the 3 at the 10^1 position). Maybe use powers of
two here? 32M is 2^25.

> +either the server or the client MAY initiate a hard disconnect if the
> +size of a request or a reply is large enough to be deemed a denial of
> +service attack.
> +
> +The minimum block size represents the smallest addressable length and
> +alignment within the export, although writing to an area that small
> +may require the server to use a less-efficient read-modify-write
> +action.  If advertised, this value MUST be a power of 2, MUST NOT be
> +larger than 65,536, and MAY be as small as 1 for an export backed by a
> +regular file, although the values of 512 or 4,096 are more typical for
> +an export backed by a block device.  If a server advertises a minimum
> +block size, the advertised export size MUST be an integer multiple of
> +that block size.

I think this can be a SHOULD without problem?

> +The preferred block size represents the minimum size at which aligned
> +requests will have efficient I/O, avoiding behaviour such as
> +read-modify-write.  If advertised, this MUST be a power of 2 at least
> +as large as the smaller of the minimum block size and 4,096, although
> +larger values (such as the minimum granularity of a hole) are also
> +appropriate.  The preferred block size MAY be larger than the export
> +size, in which case the client is unable to utilize the preferred
> +block size for that export.
> +
> +The maximum block size represents the maximum length that the server
> +is willing to handle in one request.  If advertised, it MUST be either
> +an integer multiple of the minimum block size or the value
> +4,294,967,295 (that is, (uint32_t)-1) for no inherent limit, MUST be

Casting a negative value into an unsigned integer is actually not
defined in the C language (there are systems that don't use two's
complement for negative values), although arithmetic on them is
(strangely enough). Maybe express it in hex instead? 0xFFFFFFFF. That
ought to make clear what's meant, too.

> +at least as large as the preferred block size, and SHOULD be at least
> +3,355,442 (32M), but MAY be something other than a power of 2.  For

Incorrect 32M value again.

[...]

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Nbd-general mailing list
Nbd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nbd-general

Reply via email to