Re: [Libguestfs] [libnbd PATCH v4 2/4] generator: Finish parsing structured header in states-reply.c

Laszlo Ersek Fri, 09 Jun 2023 08:22:51 -0700

On 6/9/23 04:17, Eric Blake wrote:
> Splitting the parse of a 20-byte structured reply header across two
> source files (16 bytes overlapping with simple replies in
> states-reply.c, the remaining 4 bytes in states-reply-structured.c) is
> confusing.  The upcoming addition of extended headers will reuse the
> payload parsing portion of structured replies, but will parse its
> 32-byte header completely in states-reply.c.  So it is better to have
> a single file in charge of parsing all three header sizes (once the
> third type is introduced), where we then farm out to the payload
> parsers.


"Farm out" is a transitive phrasal verb, but we don't have an object for
it. "Fan out" perhaps? (Or add an object.)

> 
> To do that, rename REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY to
> REPLY.CHECK_REPLY_MAGIC and have it pick up the setup from
> REPLY.STRUCTURED_REPLY.START, rename REPLY

The last REPLY is superfluous

> REPLY.STRUCTURED_REPLY.RECV_REMAINING to
> REPLY.RECV_STRUCTURED_REMAINING across files, and merge
> REPLY.STRUCTURED_REPLY.CHECK into what would otherwise be the
> now-empty REPLY.STRUCTURED_REPLY.START.

A different way to describe the same restructuring (I needed this
wording for explaining the code movement to myself):

- We have (prepare, receive, parse) triplets.

- Before the patch, our starter triplet is (REPLY.START,
REPLY.RECV_REPLY, REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY). These are all
in "states-reply.c". In the last component of that triplet (the "parse"
component), we branch either to SIMPLE_REPLY.START
(states-reply-simple.c) for actual parsing, or we start another triplet,
for the remaining bytes of a structured reply:
(REPLY.STRUCTURED_REPLY.START, REPLY.STRUCTURED_REPLY.RECV_REMAINING,
REPLY.STRUCTURED_REPLY.CHECK); these are all in

- The patch effectively redistributes the second triplet. The prepare
stage REPLY.STRUCTURED_REPLY.START is merged into the last (parse) stage
of the first triplet, i.e., into REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY
(and it is renamed to REPLY.CHECK_REPLY_MAGIC). The receive stage of the
second triplet is moved to "states-reply.c", and renamed to
REPLY.RECV_STRUCTURED_REMAINING. The last (parse) stage of the second
triplet is kept in "states-reply-structured.c", but renamed to
REPLY.STRUCTURED_REPLY.START.

- Thus "states-reply-structured.c" loses two states (prepare and
receive); the first is absorbed by "states-reply.c" into an existing
state, and the second is gained by "states-reply.c" as a new state.

I agree this is an improvement, because now we finish reading all reply
headers in a single C file.

> 
> Signed-off-by: Eric Blake <ebl...@redhat.com>
> ---
>  generator/state_machine.ml          | 29 ++++++++-----------
>  generator/states-reply.c            | 43 ++++++++++++++++++++++-------
>  generator/states-reply-structured.c | 26 -----------------
>  3 files changed, 44 insertions(+), 54 deletions(-)
> 
> diff --git a/generator/state_machine.ml b/generator/state_machine.ml
> index 126159b9..b5485aec 100644
> --- a/generator/state_machine.ml
> +++ b/generator/state_machine.ml
> @@ -769,8 +769,15 @@ and
> 
>    State {
>      default_state with
> -    name = "CHECK_SIMPLE_OR_STRUCTURED_REPLY";
> -    comment = "Check if the reply is a simple or structured reply";
> +    name = "CHECK_REPLY_MAGIC";
> +    comment = "Check if the reply has expected magic";
> +    external_events = [];
> +  };

So this remains in "states-reply.c" and absorbs the original prep logic
of REPLY.STRUCTURED_REPLY.START, from "states-reply-structured.c".

> +
> +  State {
> +    default_state with
> +    name = "RECV_STRUCTURED_REMAINING";
> +    comment = "Receiving the remaining part of a structured reply header";
>      external_events = [];
>    };
> 

This is effectively moved from "states-reply-structured.c" (originally
REPLY.STRUCTURED_REPLY.RECV_REMAINING) to "states-reply.c".

> @@ -804,28 +811,14 @@ and
>    };
>  ]
> 
> -(* Receiving a structured reply from the server.
> +(* Receiving a structured reply payload from the server.
>   * Implementation: generator/states-reply-structured.c
>   *)
>  and structured_reply_state_machine = [
>    State {
>      default_state with
>      name = "START";
> -    comment = "Prepare to receive the remaining part of a structured reply";
> -    external_events = [];
> -  };
> -

Yes, moved & merged into CHECK_REPLY_MAGIC.

> -  State {
> -    default_state with
> -    name = "RECV_REMAINING";
> -    comment = "Receiving the remaining part of a structured reply";
> -    external_events = [];
> -  };

Yes, moved and renamed to RECV_STRUCTURED_REMAINING.

> -
> -  State {
> -    default_state with
> -    name = "CHECK";
> -    comment = "Parse a structured reply from the server";
> +    comment = "Start parsing a structured reply payload from the server";
>      external_events = [];
>    };
> 

Stays in "states-reply-structured.c", renamed from CHECK to START.

> diff --git a/generator/states-reply.c b/generator/states-reply.c
> index 2c77658b..87f17bae 100644
> --- a/generator/states-reply.c
> +++ b/generator/states-reply.c
> @@ -64,11 +64,11 @@  REPLY.START:
>    ssize_t r;
> 
>    /* We read all replies initially as if they are simple replies, but
> -   * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below.  This
> -   * works because the structured_reply header is larger, and because
> -   * the last member of a simple reply, cookie, is coincident between
> -   * the two structs (an intentional design decision in the NBD spec
> -   * when structured replies were added).
> +   * check the magic in CHECK_REPLY_MAGIC below.  This works because
> +   * the structured_reply header is larger, and because the last
> +   * member of a simple reply, cookie, is coincident between the two
> +   * structs (an intentional design decision in the NBD spec when
> +   * structured replies were added).
>     */
>    STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) ==
>                   offsetof (struct nbd_handle, 
> sbuf.sr.structured_reply.cookie),

OK, only the state name is updated, and the comment is reflowed.

> @@ -113,11 +113,11 @@  REPLY.RECV_REPLY:
>    switch (recv_into_rbuf (h)) {
>    case -1: SET_NEXT_STATE (%.DEAD); return 0;
>    case 1: SET_NEXT_STATE (%.READY); return 0;
> -  case 0: SET_NEXT_STATE (%CHECK_SIMPLE_OR_STRUCTURED_REPLY);
> +  case 0: SET_NEXT_STATE (%CHECK_REPLY_MAGIC);
>    }
>    return 0;

OK, state rename.

> 
> - REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
> + REPLY.CHECK_REPLY_MAGIC:
>    struct command *cmd;
>    uint32_t magic;
>    uint64_t cookie;

Ditto.

> @@ -127,7 +127,18 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
>      SET_NEXT_STATE (%SIMPLE_REPLY.START);
>    }
>    else if (magic == NBD_STRUCTURED_REPLY_MAGIC) {
> -    SET_NEXT_STATE (%STRUCTURED_REPLY.START);
> +    /* We've only read the bytes that fill simple_reply.  The
> +     * structured_reply is longer, so prepare to read the remaining
> +     * bytes.  We depend on the memory aliasing in union sbuf to
> +     * overlay the two reply types.
> +     */
> +    STATIC_ASSERT (sizeof h->sbuf.simple_reply ==
> +                   offsetof (struct nbd_structured_reply, length),
> +                   simple_structured_overlap);
> +    assert (h->rbuf == (char *)&h->sbuf + sizeof h->sbuf.simple_reply);
> +    h->rlen = sizeof h->sbuf.sr.structured_reply;
> +    h->rlen -= sizeof h->sbuf.simple_reply;

This is being absorbed from REPLY.STRUCTURED_REPLY.START. The comment's
wording is slightly updated. OK.

The code is identical.

> +    SET_NEXT_STATE (%RECV_STRUCTURED_REMAINING);

Having dealt with the "prepare" stage right here, we can now read the
remaining bytes; that gets renamed from
REPLY.STRUCTURED_REPLY.RECV_REMAINING, and moved to this file.

>    }
>    else {
>      /* We've probably lost synchronization. */
> @@ -141,8 +152,9 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
>      return 0;
>    }
> 
> -  /* NB: This works for both simple and structured replies because the
> -   * handle (our cookie) is stored at the same offset.  See the
> +  /* NB: This works for both simple and structured replies, even
> +   * though we haven't finished reading the structured header yet,
> +   * because the cookie is stored at the same offset.  See the
>     * STATIC_ASSERT above in state REPLY.START that confirmed this.
>     */
>    h->chunks_received++;

Good!

> @@ -157,6 +169,17 @@  REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY:
>    h->reply_cmd = cmd;
>    return 0;
> 
> + REPLY.RECV_STRUCTURED_REMAINING:
> +  switch (recv_into_rbuf (h)) {
> +  case -1: SET_NEXT_STATE (%.DEAD); return 0;
> +  case 1:
> +    save_reply_state (h);
> +    SET_NEXT_STATE (%.READY);
> +    return 0;
> +  case 0: SET_NEXT_STATE (%STRUCTURED_REPLY.START);
> +  }
> +  return 0;
> +
>   REPLY.FINISH_COMMAND:
>    struct command *prev_cmd, *cmd;
>    uint64_t cookie;

Moved nearly identically from REPLY.STRUCTURED_REPLY.RECV_REMAINING. The
state itself is renamed, and the state we jump to (for parsing) is also
renamed. Good.

"Nearly" identically: the code movement sneakily fixes a whitespace
error in the original! :) Double space after "case 0:".


> diff --git a/generator/states-reply-structured.c 
> b/generator/states-reply-structured.c
> index 205a236d..3a7a03fd 100644
> --- a/generator/states-reply-structured.c
> +++ b/generator/states-reply-structured.c
> @@ -46,32 +46,6 @@ structured_reply_in_bounds (uint64_t offset, uint32_t 
> length,
> 
>  STATE_MACHINE {
>   REPLY.STRUCTURED_REPLY.START:
> -  /* We've only read the bytes that fill simple_reply.  The
> -   * structured_reply is longer, so read the remaining part.  We
> -   * depend on the memory aliasing in union sbuf to overlay the two
> -   * reply types.
> -   */
> -  STATIC_ASSERT (sizeof h->sbuf.simple_reply ==
> -                 offsetof (struct nbd_structured_reply, length),
> -                 simple_structured_overlap);
> -  assert (h->rbuf == (char *)&h->sbuf + sizeof h->sbuf.simple_reply);
> -  h->rlen = sizeof h->sbuf.sr.structured_reply;
> -  h->rlen -= sizeof h->sbuf.simple_reply;
> -  SET_NEXT_STATE (%RECV_REMAINING);
> -  return 0;
> -
> - REPLY.STRUCTURED_REPLY.RECV_REMAINING:
> -  switch (recv_into_rbuf (h)) {
> -  case -1: SET_NEXT_STATE (%.DEAD); return 0;
> -  case 1:
> -    save_reply_state (h);
> -    SET_NEXT_STATE (%.READY);
> -    return 0;
> -  case 0:  SET_NEXT_STATE (%CHECK);
> -  }
> -  return 0;
> -
> - REPLY.STRUCTURED_REPLY.CHECK:
>    struct command *cmd = h->reply_cmd;
>    uint16_t flags, type;
>    uint32_t length;

Yes.

Hellishly difficult to review, but correct, as far as I can tell, and
certainly an improvement for the codebase.

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

I'll continue with the rest of the series sometime later.

Thanks,
Laszlo
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [libnbd PATCH v4 2/4] generator: Finish parsing structured header in states-reply.c

Reply via email to