On 11/10/22 00:26, Eric Blake wrote: > Modern qemu tends to advertise a strict 32M payload limit. Yet we > default to allowing the user to attempt up to 64M in pwrite. For > pread, qemu will reply with EINVAL but keep the connection up; but for > pwrite, an overlarge buffer is fatal. It's time to teach libnbd to > honor qemu's max buffer by default, while still allowing the user to > disable the strict mode setting if they want to try 64M anyways. > > This also involves advertising a new LIBNBD_SIZE_PAYLOAD via > nbd_get_block_size, which is more reliable than asking the user to > manually obey LIBNBD_SIZE_MAXIMUM which may not be set or may be > larger than 64M. > --- > lib/internal.h | 4 +++ > generator/API.ml | 39 ++++++++++++++++----- > generator/states-newstyle-opt-export-name.c | 1 + > generator/states-newstyle-opt-go.c | 1 + > generator/states-oldstyle.c | 1 + > lib/flags.c | 25 +++++++++++++ > lib/rw.c | 8 ++++- > 7 files changed, 70 insertions(+), 9 deletions(-) > > diff --git a/lib/internal.h b/lib/internal.h > index 22f500b4..bbbd2639 100644 > --- a/lib/internal.h > +++ b/lib/internal.h > @@ -146,6 +146,8 @@ struct nbd_handle { > uint32_t block_minimum; > uint32_t block_preferred; > uint32_t block_maximum; > + /* Payload cap, always non-zero, based on block_maximum when feasible */ > + uint32_t payload_maximum; > > /* Server canonical name and description, or NULL if not advertised. */ > char *canonical_name; > @@ -448,6 +450,8 @@ extern int nbd_internal_set_size_and_flags (struct > nbd_handle *h, > extern int nbd_internal_set_block_size (struct nbd_handle *h, uint32_t min, > uint32_t pref, uint32_t max) > LIBNBD_ATTRIBUTE_NONNULL(1); > +extern void nbd_internal_set_payload (struct nbd_handle *h) > + LIBNBD_ATTRIBUTE_NONNULL(1); > > /* is-state.c */ > extern bool nbd_internal_is_state_created (enum state state); > diff --git a/generator/API.ml b/generator/API.ml > index 8d0d9d15..0ebc9298 100644 > --- a/generator/API.ml > +++ b/generator/API.ml > @@ -184,6 +184,7 @@ let block_size_enum = > "MINIMUM", 0; > "PREFERRED", 1; > "MAXIMUM", 2; > + "PAYLOAD", 3; > ] > } > let all_enums = [ tls_enum; block_size_enum ] > @@ -221,6 +222,7 @@ let strict_flags = > "BOUNDS", 1 lsl 2; > "ZERO_SIZE", 1 lsl 3; > "ALIGN", 1 lsl 4; > + "PAYLOAD", 1 lsl 5; > ] > } > let allow_transport_flags = { > @@ -1060,10 +1062,21 @@ "set_strict_mode", { > =item C<LIBNBD_STRICT_ALIGN> = 5 > > If set, and the server provided minimum block sizes (see > -L<nbd_get_block_size(3)>, this flag rejects client requests that > -do not have length and offset aligned to the server's minimum > -requirements. If clear, unaligned requests are sent to the server, > -where it is up to the server whether to honor or reject the request. > +C<LIBNBD_SIZE_MINIMUM> for L<nbd_get_block_size(3)>), this > +flag rejects client requests that do not have length and offset > +aligned to the server's minimum requirements. If clear, > +unaligned requests are sent to the server, where it is up to > +the server whether to honor or reject the request. > + > +=item C<LIBNBD_STRICT_PAYLOAD> = 6 > + > +If set, the client refuses to send a command to the server > +with more than libnbd's outgoing payload maximum (see > +C<LIBNBD_SIZE_PAYLOAD> for L<nbd_get_block_size(3)>), whether > +or not the server advertised a block size maximum. If clear, > +oversize requests up to 64MiB may be attempted, although > +requests larger than 32MiB are liable to cause some servers to > +disconnect. > > =back > > @@ -2286,6 +2299,15 @@ "get_block_size", { > itself imposes a maximum of 64M. If zero, some NBD servers will > abruptly disconnect if a transaction involves more than 32M. > > +=item C<LIBNBD_SIZE_PAYLOAD> = 3 > + > +This value is not advertised by the server, but rather represents > +the maximum outgoing payload size for a given connection that > +libnbd will enforce unless C<LIBNBD_STRICT_PAYLOAD> is cleared > +in C<nbd_set_strict_mode(3)>. It is always non-zero: never > +smaller than 1M, never larger than 64M, and matches > +C<LIBNBD_SIZE_MAXIMUM> when possible. > + > =back > > Future NBD extensions may result in additional C<size_type> values. > @@ -2449,10 +2471,11 @@ "pwrite", { > acknowledged by the server, or there is an error. Note this will > generally return an error if L<nbd_is_read_only(3)> is true. > > -Note that libnbd currently enforces a maximum write buffer of 64MiB, > -even if the server would permit a larger buffer in a single transaction; > -attempts to exceed this will result in an C<ERANGE> error. The server > -may enforce a smaller limit, which can be learned with > +Note that libnbd defaults to enforcing a maximum write buffer > +of the lesser of 64MiB or any maximum payload size advertised > +by the server; attempts to exceed this will generally result in > +a client-side C<ERANGE> error, rather than a server-side > +disconnection. The actual limit can be learned with > L<nbd_get_block_size(3)>. > > The C<flags> parameter may be C<0> for no flags, or may contain > diff --git a/generator/states-newstyle-opt-export-name.c > b/generator/states-newstyle-opt-export-name.c > index 398aa680..88296297 100644 > --- a/generator/states-newstyle-opt-export-name.c > +++ b/generator/states-newstyle-opt-export-name.c > @@ -70,6 +70,7 @@ NEWSTYLE.OPT_EXPORT_NAME.CHECK_REPLY: > SET_NEXT_STATE (%.DEAD); > return 0; > } > + nbd_internal_set_payload (h); > SET_NEXT_STATE (%^FINISHED); > CALL_CALLBACK (h->opt_cb.completion, &err); > nbd_internal_free_option (h); > diff --git a/generator/states-newstyle-opt-go.c > b/generator/states-newstyle-opt-go.c > index f2d9f846..ac9613d4 100644 > --- a/generator/states-newstyle-opt-go.c > +++ b/generator/states-newstyle-opt-go.c > @@ -276,6 +276,7 @@ NEWSTYLE.OPT_GO.CHECK_REPLY: > err = nbd_get_errno () ? : ENOTSUP; > break; > case NBD_REP_ACK: > + nbd_internal_set_payload (h); > err = 0; > break; > } > diff --git a/generator/states-oldstyle.c b/generator/states-oldstyle.c > index 7f668936..1eb5e940 100644 > --- a/generator/states-oldstyle.c > +++ b/generator/states-oldstyle.c > @@ -68,6 +68,7 @@ OLDSTYLE.CHECK: > SET_NEXT_STATE (%.DEAD); > return 0; > } > + nbd_internal_set_payload (h); > > h->protocol = "oldstyle"; > > diff --git a/lib/flags.c b/lib/flags.c > index 5def16e8..eaf4ff85 100644 > --- a/lib/flags.c > +++ b/lib/flags.c > @@ -26,6 +26,7 @@ > #include <errno.h> > > #include "internal.h" > +#include "minmax.h" > > /* Reset connection data. Called after swapping export name, after > * failed OPT_GO/OPT_INFO, or after successful OPT_STARTTLS. > @@ -38,6 +39,7 @@ nbd_internal_reset_size_and_flags (struct nbd_handle *h) > h->block_minimum = 0; > h->block_preferred = 0; > h->block_maximum = 0; > + h->payload_maximum = 0; > free (h->canonical_name); > h->canonical_name = NULL; > free (h->description); > @@ -118,6 +120,27 @@ nbd_internal_set_block_size (struct nbd_handle *h, > uint32_t min, > return 0; /* Use return -1 if we want to reject such servers */ > } > > +/* Set the payload size constraint on the handle. > + * This is called from the state machine after all other information > + * about an export is available, regardless of oldstyle or newstyle. > + */ > +void > +nbd_internal_set_payload (struct nbd_handle *h) > +{ > + /* The NBD protocol says we should not assume the server will allow > + * more than 32M payload if it did not advertise anything. If it > + * advertised more than 64M, we still cap at the maximum buffer size > + * we are willing to allocate as a client; if it advertised smaller > + * than 1M (presumably because the export itself was small), the NBD > + * protocol says we can still send payloads that large. > + */ > + if (h->block_maximum) > + h->payload_maximum = MIN (MAX_REQUEST_SIZE, > + MAX (1024 * 1024, h->block_maximum)); > + else > + h->payload_maximum = 32 * 1024 * 1024; > +} > + > static int > get_flag (struct nbd_handle *h, uint16_t flag) > { > @@ -237,6 +260,8 @@ nbd_unlocked_get_block_size (struct nbd_handle *h, int > type) > return h->block_preferred; > case LIBNBD_SIZE_MAXIMUM: > return h->block_maximum; > + case LIBNBD_SIZE_PAYLOAD: > + return h->payload_maximum; > } > return 0; > } > diff --git a/lib/rw.c b/lib/rw.c > index 2ab85f3d..6120edd0 100644 > --- a/lib/rw.c > +++ b/lib/rw.c > @@ -207,8 +207,14 @@ nbd_internal_command_common (struct nbd_handle *h, > > switch (type) { > /* Commands which send or receive data are limited to MAX_REQUEST_SIZE. > */ > - case NBD_CMD_READ: > case NBD_CMD_WRITE: > + if (h->strict & LIBNBD_STRICT_PAYLOAD && count > h->payload_maximum) { > + set_error (ERANGE, "request too large: maximum payload size is %d",
For pedantry's sake, I think we shouldn't print a uint32_t value with a %d format specifier (I understand the value is limited to MAX_REQUEST_SIZE, which would be fine to print with %d *if* passed in as an "unsigned int"; but for pedantry we may not want to assume that uint32_t is actually "unsigned int".) PRIu32 would fit better. > + h->payload_maximum); > + goto err; > + } > + /* fallthrough */ > + case NBD_CMD_READ: > if (count > MAX_REQUEST_SIZE) { > set_error (ERANGE, "request too large: maximum request size is %d", > MAX_REQUEST_SIZE); > Otherwise the patch looks OK to me, but I don't understand where it really matters. * If a client program using libnbd connects to an NBD server that advertises LIBNBD_SIZE_MAXIMUM, then it is up to the client application (not libnbd) to adhere to that maximum, or face the consequences. Is that right? * If a client program using libnbd connects to an NBD server that does *not* advertise LIBNBD_SIZE_MAXIMUM, then the client can (per libnbd's own limit) attempt writes up to 64MB. If the NBD server then abruptly disconnects, *without having advertised* a lower limit (and IIUC this applies to *old* qemu-nbd), then -- per NBD spec / protocol -- the fault is with either the client program or libnbd, because the request size should not have exceeded 32MB. The question is who is responsible for enforcing this protocol-default-limit: the client application, or libnbd? Before the patch, the server might abruptly disconnect, after the patch, the client app will get a local error -- the request will not succeed either way, and the client application will have to be modified (= reprogrammed by a human) anyways, to consider such a 32MB default limit explicitly. So what use case / failure scenario is improved by this patch? Is it that the client application's programmer gets a friendlier / more understandable error message, rather than just the client app being silently kicked out by (old) qemu-nbd? Anyway, the patch looks OK; feel free to extend (or not) the commit message and/or to change (or not) the %d format specifier, from my end: Reviewed-by: Laszlo Ersek <ler...@redhat.com> Laszlo _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs