Re: [Libguestfs] [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS

2023-03-05 Thread Wouter Verhelst
On Fri, Mar 03, 2023 at 04:36:41PM -0600, Eric Blake wrote:
> On Wed, Feb 22, 2023 at 11:49:18AM +0200, Wouter Verhelst wrote:
> > On Mon, Nov 14, 2022 at 04:46:52PM -0600, Eric Blake wrote:
[...]
> > > +  Note that even when extended headers are in use, the client MUST be
> > > +  prepared for the server to use either the compact or extended chunk
> > > +  type, regardless of whether the client's hinted effect length was
> > > +  more or less than 32 bits; but the server MUST use exactly one of
> > > +  the two chunk types per negotiated metacontext ID.
> > 
> > Is this last paragraph really a good idea? I would think it makes more
> > sense to require the new format if we're already required to support it
> > on both sides anyway.
> 
> My proof of implementation was easier to code when I didn't have to
> resize the block status reply sizing in the same patch as adding the
> 64-bit headers.  But if you think requiring the 64-bit reply type
> always (and forbidding the 32-bit reply) when extended headers are in
> force, that's also possible.

Intuitively, this sounds off. It would seem to me that it's easier to do
if you don't have to have a conditional on each received data packet.
But maybe I'm missing something -- I haven't done an implementation yet,
anyway.

-- 
 w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}

I will have a Tin-Actinium-Potassium mixture, thanks.

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS

2023-03-03 Thread Eric Blake
On Wed, Feb 22, 2023 at 11:49:18AM +0200, Wouter Verhelst wrote:
> On Mon, Nov 14, 2022 at 04:46:52PM -0600, Eric Blake wrote:
> [...]
> > @@ -1370,9 +1475,10 @@ of the newstyle negotiation.
> >  Return a list of `NBD_REP_META_CONTEXT` replies, one per context,
> >  followed by an `NBD_REP_ACK` or an error.
> > 
> > -This option SHOULD NOT be requested unless structured replies have
> > -been negotiated first. If a client attempts to do so, a server
> > -MAY send `NBD_REP_ERR_INVALID`.
> > +This option SHOULD NOT be requested unless structured replies or
> > +extended headers have been negotiated first. If a client attempts
> > +to do so, a server MAY send `NBD_REP_ERR_INVALID` or
> > +`NBD_REP_ERR_EXT_HEADER_REQD`.
> 
> Is it the intent that NBD_REP_ERR_EXT_HEADER_REQD means structured
> replies are not supported by this server? I think that could be
> clarified here.

My intent here was that a newer server that ONLY wants to talk to
clients that understand extended headers can use
NBD_REP_ERR_EXT_HEADER_REQD as its error message to clients that have
not requested extended headers yet.  Older clients may not know what
that error code means, but our spec already has reasonable constraints
that the client should at least recognize it as an error code.

That way, a server only has to implement extended headers, rather than
maintaining the structured header code in parallel.

> 
> (this occurs twice)
> 
> [...]
> > +* `NBD_OPT_EXTENDED_HEADERS` (11)
> > +
> > +The client wishes to use extended headers during the transmission
> > +phase.  The client MUST NOT send any additional data with the
> > +option, and the server SHOULD reject a request that includes data
> > +with `NBD_REP_ERR_INVALID`.
> > +
> > +When successful, this option takes precedence over structured
> > +replies.  A client MAY request structured replies first, although
> > +a server SHOULD support this option even if structured replies are
> > +not negotiated.
> > +
> > +It is envisioned that future extensions will add other new
> > +requests that support a data payload in the request or reply.  A
> > +server that supports such extensions SHOULD NOT advertise those
> > +extensions until the client has negotiated extended headers; and a
> > +client MUST NOT make use of those extensions without first
> > +enabling support for reply payloads.
> > +
> > +The server replies with the following, or with an error permitted
> > +elsewhere in this document:
> > +
> > +- `NBD_REP_ACK`: Extended headers have been negotiated; the client
> > +  MUST use the 32-byte extended request header, with proper use of
> > +  `NBD_CMD_FLAG_PAYLOAD_LEN` for all commands sending a payload;
> > +  and the server MUST use the 32-byte extended reply header.
> > +- For backwards compatibility, clients SHOULD be prepared to also
> > +  handle `NBD_REP_ERR_UNSUP`; in this case, only the compact
> > +  transmission headers will be used.
> > +
> > +Note that a response of `NBD_REP_ERR_BLOCK_SIZE_REQD` does not
> > +make sense in response to this command, but a server MAY fail with
> > +that error for a later `NBD_OPT_GO` without a client request for
> > +`NBD_INFO_BLOCK_SIZE`, since the use of extended headers provides
> > +more incentive for a client to promise to obey block size
> > +constraints.
> > +
> > +If the client requests `NBD_OPT_STARTTLS` after this option, it
> > +MUST renegotiate extended headers.
> > +
> 
> Does it make sense here to also forbid use of NBD_OPT_EXPORT_NAME? I
> think the sooner we get rid of that, the better ;-)

I hadn't thought of that, but it does indeed sound desirable.  I can
touch up the spec to state that NBD_OPT_EXPORT_NAME MUST NOT be used
by a client that requested extended headers.

> 
> [...]
> > @@ -1746,13 +1914,15 @@ unrecognized flags.
> > 
> >   Structured reply types
> > 
> > -These values are used in the "type" field of a structured reply.
> > -Some chunk types can additionally be categorized by role, such as
> > -*error chunks* or *content chunks*.  Each type determines how to
> > -interpret the "length" bytes of payload.  If the client receives
> > -an unknown or unexpected type, other than an *error chunk*, it
> > -MUST initiate a hard disconnect.  A server MUST NOT send a chunk
> > -larger than any advertised maximum block payload size.
> > +These values are used in the "type" field of a structured reply.  Some
> > +chunk types can additionally be categorized by role, such as *error
> > +chunks*, *content chunks*, or *status chunks*.  Each type determines
> > +how to interpret the "length" bytes of payload.  If the client
> > +receives an unknown or unexpected type, other than an *error chunk*,
> > +it MAY initiate a hard disconnect on the grounds that the client is
> > +uncertain whether the server handled the request as desired.  A server
> > +MUST NOT send a chu

Re: [Libguestfs] [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS

2023-02-22 Thread Wouter Verhelst
On Mon, Nov 14, 2022 at 04:46:52PM -0600, Eric Blake wrote:
[...]
> @@ -1370,9 +1475,10 @@ of the newstyle negotiation.
>  Return a list of `NBD_REP_META_CONTEXT` replies, one per context,
>  followed by an `NBD_REP_ACK` or an error.
> 
> -This option SHOULD NOT be requested unless structured replies have
> -been negotiated first. If a client attempts to do so, a server
> -MAY send `NBD_REP_ERR_INVALID`.
> +This option SHOULD NOT be requested unless structured replies or
> +extended headers have been negotiated first. If a client attempts
> +to do so, a server MAY send `NBD_REP_ERR_INVALID` or
> +`NBD_REP_ERR_EXT_HEADER_REQD`.

Is it the intent that NBD_REP_ERR_EXT_HEADER_REQD means structured
replies are not supported by this server? I think that could be
clarified here.

(this occurs twice)

[...]
> +* `NBD_OPT_EXTENDED_HEADERS` (11)
> +
> +The client wishes to use extended headers during the transmission
> +phase.  The client MUST NOT send any additional data with the
> +option, and the server SHOULD reject a request that includes data
> +with `NBD_REP_ERR_INVALID`.
> +
> +When successful, this option takes precedence over structured
> +replies.  A client MAY request structured replies first, although
> +a server SHOULD support this option even if structured replies are
> +not negotiated.
> +
> +It is envisioned that future extensions will add other new
> +requests that support a data payload in the request or reply.  A
> +server that supports such extensions SHOULD NOT advertise those
> +extensions until the client has negotiated extended headers; and a
> +client MUST NOT make use of those extensions without first
> +enabling support for reply payloads.
> +
> +The server replies with the following, or with an error permitted
> +elsewhere in this document:
> +
> +- `NBD_REP_ACK`: Extended headers have been negotiated; the client
> +  MUST use the 32-byte extended request header, with proper use of
> +  `NBD_CMD_FLAG_PAYLOAD_LEN` for all commands sending a payload;
> +  and the server MUST use the 32-byte extended reply header.
> +- For backwards compatibility, clients SHOULD be prepared to also
> +  handle `NBD_REP_ERR_UNSUP`; in this case, only the compact
> +  transmission headers will be used.
> +
> +Note that a response of `NBD_REP_ERR_BLOCK_SIZE_REQD` does not
> +make sense in response to this command, but a server MAY fail with
> +that error for a later `NBD_OPT_GO` without a client request for
> +`NBD_INFO_BLOCK_SIZE`, since the use of extended headers provides
> +more incentive for a client to promise to obey block size
> +constraints.
> +
> +If the client requests `NBD_OPT_STARTTLS` after this option, it
> +MUST renegotiate extended headers.
> +

Does it make sense here to also forbid use of NBD_OPT_EXPORT_NAME? I
think the sooner we get rid of that, the better ;-)

[...]
> @@ -1746,13 +1914,15 @@ unrecognized flags.
> 
>   Structured reply types
> 
> -These values are used in the "type" field of a structured reply.
> -Some chunk types can additionally be categorized by role, such as
> -*error chunks* or *content chunks*.  Each type determines how to
> -interpret the "length" bytes of payload.  If the client receives
> -an unknown or unexpected type, other than an *error chunk*, it
> -MUST initiate a hard disconnect.  A server MUST NOT send a chunk
> -larger than any advertised maximum block payload size.
> +These values are used in the "type" field of a structured reply.  Some
> +chunk types can additionally be categorized by role, such as *error
> +chunks*, *content chunks*, or *status chunks*.  Each type determines
> +how to interpret the "length" bytes of payload.  If the client
> +receives an unknown or unexpected type, other than an *error chunk*,
> +it MAY initiate a hard disconnect on the grounds that the client is
> +uncertain whether the server handled the request as desired.  A server
> +MUST NOT send a chunk larger than any advertised maximum block payload
> +size.

Why do we make this a MAY rather than a MUST?

Also, should this section say "structured or extended reply"? We use the
same types for both.

[...]
> +* `NBD_REPLY_TYPE_BLOCK_STATUS_EXT` (6)
> +
> +  This chunk type is in the status chunk category.  *length* MUST be
> +  8 + (a positive multiple of 16).  The semantics of this chunk mirror
> +  those of `NBD_REPLY_TYPE_BLOCK_STATUS`, other than the use of a
> +  larger *extent length* field, added padding in each descriptor to
> +  ease alignment, and the addition of a *descriptor count* field that
> +  can be used for easier client processing.  This chunk type MUST NOT
> +  be used unless extended headers were negotiated with
> +  `NBD_OPT_EXTENDED_HEADERS`.
> +
> +  If the *descriptor count* field contains 0, the number of subsequent
> +  descriptors is determined solely by the *length* field of the reply
> +

Re: [Libguestfs] [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS

2022-12-19 Thread Vladimir Sementsov-Ogievskiy

On 11/15/22 01:46, Eric Blake wrote:

Add a new negotiation feature where the client and server agree to use
larger packet headers on every packet sent during transmission phase.
This has two purposes: first, it makes it possible to perform
operations like trim, write zeroes, and block status on more than 2^32
bytes in a single command.  For NBD_CMD_READ, replies are still
implicitly capped by the maximum block payload limits (generally 32M);
if you want to know if a hole larger than 32 bits can represent,
you'll use BLOCK_STATUS instead of hoping that a large READ will
either return a hole or report overflow.  But for
NBD_CMD_BLOCK_STATUS, it is very useful to be able to report a status
extent with a size larger than 32-bits, in some cases even if the
client's request was for smaller than 32-bits (such as when it is
known that the entire image is not sparse).  Thus, the wording chosen
here is careful to permit a server to use either flavor status chunk
type in its reply, and clients requesting extended headers must be
prepared for both reply types.

Second, when structured replies are active, clients have to deal with
the difference between 16- and 20-byte headers of simple
vs. structured replies, which impacts performance if the client must
perform multiple syscalls to first read the magic before knowing if
there are even additional bytes to read to learn a payload length.  In
extended header mode, all headers are the same width and there are no
simple replies permitted.  The layout of the reply header is more like
the request header; and including the client's offset in the reply
makes it easier to convert between absolute and relative offsets for
replies to NBD_CMD_READ.


Hmm.. Can't imagine, probably will see it in code patches.


 Similarly, by having extended mode use a
power-of-2 sizing, it becomes easier to manipulate arrays of headers
without worrying about an individual header crossing a cache line.


Do we really manipulate with such arrays in Qemu or Libnbd or Nbdkit?


However, note that this change only affects the headers; data payloads
can still be unaligned (for example, a client performing 1-byte reads
or writes).  We would need to negotiate yet another extension if we
wanted to ensure that all NBD transmission packets started on an
8-byte boundary after option haggling has completed.

This spec addition was done in parallel with proof of concept
implementations in qemu (server and client), libnbd (client), and
nbdkit (server).

Signed-off-by: Eric Blake 


In general looks good to me.

I'm not sure in extra bytes to get 8bytes-alignment.. Maybe that's a right way.


---
  doc/proto.md | 481 ++-
  1 file changed, 358 insertions(+), 123 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 53c334a..fde1e70 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -280,34 +280,53 @@ a soft disconnect.

  ### Transmission

-There are three message types in the transmission phase: the request,
-the simple reply, and the structured reply chunk.  The
+There are two general message types in the transmission phase: the
+request (simple or extended), and the reply (simple, structured, or
+extended).  The determination of which message headers to use is
+determined during handshaking phase, based on whether


"determination is determined" sounds strange

should we also define "compact" in this paragraph? [*]


+`NBD_OPT_STRUCTURED_REPLY` or `NBD_OPT_EXTENDED_HEADERS` was requested
+by the client and given a successful response by the server.  The
  transmission phase consists of a series of transactions, where the
  client submits requests and the server sends corresponding replies
  with either a single simple reply or a series of one or more
-structured reply chunks per request.  The phase continues until
-either side terminates transmission; this can be performed cleanly
-only by the client.
+structured or extended reply chunks per request.  The phase continues
+until either side terminates transmission; this can be performed
+cleanly only by the client.

  Note that without client negotiation, the server MUST use only simple
  replies, and that it is impossible to tell by reading the server
  traffic in isolation whether a data field will be present; the simple
  reply is also problematic for error handling of the `NBD_CMD_READ`
-request.  Therefore, structured replies can be used to create a
-a context-free server stream; see below.
+request.  Therefore, structured or extended replies can be used to
+create a a context-free server stream; see below.


s/a a/a/


+
+The results of client negotiation also determine whether the client
+and server will utilize only compact requests and replies, or whether


first place, where "compact" term appears and it is not defined. [*]


+both sides will use only extended packets.  Compact messages are the
+default, but inherently limit single transactions to a 32-bit window
+starting at a 64-bit offset.  Exten

[Libguestfs] [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS

2022-11-14 Thread Eric Blake
Add a new negotiation feature where the client and server agree to use
larger packet headers on every packet sent during transmission phase.
This has two purposes: first, it makes it possible to perform
operations like trim, write zeroes, and block status on more than 2^32
bytes in a single command.  For NBD_CMD_READ, replies are still
implicitly capped by the maximum block payload limits (generally 32M);
if you want to know if a hole larger than 32 bits can represent,
you'll use BLOCK_STATUS instead of hoping that a large READ will
either return a hole or report overflow.  But for
NBD_CMD_BLOCK_STATUS, it is very useful to be able to report a status
extent with a size larger than 32-bits, in some cases even if the
client's request was for smaller than 32-bits (such as when it is
known that the entire image is not sparse).  Thus, the wording chosen
here is careful to permit a server to use either flavor status chunk
type in its reply, and clients requesting extended headers must be
prepared for both reply types.

Second, when structured replies are active, clients have to deal with
the difference between 16- and 20-byte headers of simple
vs. structured replies, which impacts performance if the client must
perform multiple syscalls to first read the magic before knowing if
there are even additional bytes to read to learn a payload length.  In
extended header mode, all headers are the same width and there are no
simple replies permitted.  The layout of the reply header is more like
the request header; and including the client's offset in the reply
makes it easier to convert between absolute and relative offsets for
replies to NBD_CMD_READ.  Similarly, by having extended mode use a
power-of-2 sizing, it becomes easier to manipulate arrays of headers
without worrying about an individual header crossing a cache line.
However, note that this change only affects the headers; data payloads
can still be unaligned (for example, a client performing 1-byte reads
or writes).  We would need to negotiate yet another extension if we
wanted to ensure that all NBD transmission packets started on an
8-byte boundary after option haggling has completed.

This spec addition was done in parallel with proof of concept
implementations in qemu (server and client), libnbd (client), and
nbdkit (server).

Signed-off-by: Eric Blake 
---
 doc/proto.md | 481 ++-
 1 file changed, 358 insertions(+), 123 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 53c334a..fde1e70 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -280,34 +280,53 @@ a soft disconnect.

 ### Transmission

-There are three message types in the transmission phase: the request,
-the simple reply, and the structured reply chunk.  The
+There are two general message types in the transmission phase: the
+request (simple or extended), and the reply (simple, structured, or
+extended).  The determination of which message headers to use is
+determined during handshaking phase, based on whether
+`NBD_OPT_STRUCTURED_REPLY` or `NBD_OPT_EXTENDED_HEADERS` was requested
+by the client and given a successful response by the server.  The
 transmission phase consists of a series of transactions, where the
 client submits requests and the server sends corresponding replies
 with either a single simple reply or a series of one or more
-structured reply chunks per request.  The phase continues until
-either side terminates transmission; this can be performed cleanly
-only by the client.
+structured or extended reply chunks per request.  The phase continues
+until either side terminates transmission; this can be performed
+cleanly only by the client.

 Note that without client negotiation, the server MUST use only simple
 replies, and that it is impossible to tell by reading the server
 traffic in isolation whether a data field will be present; the simple
 reply is also problematic for error handling of the `NBD_CMD_READ`
-request.  Therefore, structured replies can be used to create a
-a context-free server stream; see below.
+request.  Therefore, structured or extended replies can be used to
+create a a context-free server stream; see below.
+
+The results of client negotiation also determine whether the client
+and server will utilize only compact requests and replies, or whether
+both sides will use only extended packets.  Compact messages are the
+default, but inherently limit single transactions to a 32-bit window
+starting at a 64-bit offset.  Extended messages make it possible to
+perform 64-bit transactions (although typically only for commands that
+do not include a data payload).  Furthermore, when only structured
+replies have been negotiated, compact messages require the client to
+perform partial reads to determine which reply packet style (16-byte
+simple or 20-byte structured) is on the wire before knowing the length
+of the rest of the reply, which can reduce client performance.  With
+extended messages, all packet heade