Hi,

Thanks Numan for the review. See the replies below.

On Thu, May 19, 2022 at 12:36 AM Numan Siddique <num...@ovn.org> wrote:
>
> On Wed, May 11, 2022 at 11:34 AM <lmart...@redhat.com> wrote:
> >
> > From: Lucas Alvares Gomes <lucasago...@gmail.com>
> >
> > In order to PXE boot a baremetal server using the OVN DHCP server we
> > need to allow users to set the "next-server" (siaddr) [0] field in the
> > DHCP header.
> >
> > While investigating this issue by comparing the DHCPOFFER and DHCPACK
> > packets sent my dnsmasq and OVN we saw that the "next-server" field
> > was the problem for OVN, without it PXE booting was timing out while
> > fetching the iPXE image from the TFTP server (see the bugzilla ticket
> > below for reference).
> >
> > To confirm this problem we created a bogus patch hardcoding the TFTP
> > address in the siaddr of the DHCP header (see the discussion in the
> > maillist below) and with this in place we were able to deploy a
> > baremetal node using the OVN DHCP end-to-end.
> >
> > This patch is a proper implementation that creates a new DHCP
> > configuration option called "next_server" to allow users to set this
> > field dynamically. This patch uses the DHCP code 253 which is a unsed
> > code for DHCP specification as this is not a normal DHCP option but a
> > special use case in OVN.
> >
> > [0]
> > https://github.com/openvswitch/ovs/blob/9dd3031d2e0e9597449e95428320ccaaff7d8b3d/lib/dhcp.h#L42
> >
> > Reported-by: https://bugzilla.redhat.com/show_bug.cgi?id=2083629
> > Reported-by:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051821.html
> > Signed-off-by: Lucas Alvares Gomes <lucasago...@gmail.com>
> > ---
> >  controller/pinctrl.c | 69 ++++++++++++++++++++++++++++++--------------
> >  lib/actions.c        | 13 ++++++++-
> >  lib/ovn-l7.h         |  8 +++++
> >  northd/ovn-northd.c  |  1 +
> >  ovn-nb.xml           |  7 +++++
> >  tests/ovn.at         |  6 ++--
> >  tests/test-ovn.c     |  1 +
> >  7 files changed, 80 insertions(+), 25 deletions(-)
> >
> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> > index ae3da332c..428863293 100644
> > --- a/controller/pinctrl.c
> > +++ b/controller/pinctrl.c
> > @@ -2203,30 +2203,56 @@ pinctrl_handle_put_dhcp_opts(
> >       *| 4 Bytes padding | 1 Byte (option end 0xFF ) | 4 Bytes padding|
> >       * --------------------------------------------------------------
> >       */
> > -    struct dhcp_opt_header *in_dhcp_opt =
> > -        (struct dhcp_opt_header *)reply_dhcp_opts_ptr->data;
> > -    if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_CODE) {
> > -        unsigned char *ptr = (unsigned char *)in_dhcp_opt;
> > -        int len = sizeof *in_dhcp_opt + in_dhcp_opt->len;
> > -        struct dhcp_opt_header *next_dhcp_opt =
> > -            (struct dhcp_opt_header *)(ptr + len);
> > -
> > -        if (next_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
> > -            if (!ipxe_req) {
> > -                ofpbuf_pull(reply_dhcp_opts_ptr, len);
> > -                next_dhcp_opt->code = DHCP_OPT_BOOTFILE_CODE;
> > -            } else {
> > -                char *buf = xmalloc(len);
> > +    ovs_be32 next_server = in_dhcp_data->siaddr;
> > +    bool bootfile_name_set = false;
> > +    in_dhcp_ptr = reply_dhcp_opts_ptr->data;
> > +    end = (const char *)reply_dhcp_opts_ptr->data + 
> > reply_dhcp_opts_ptr->size;
> > +
>
> Hi Lucas,
>
> Thanks for adding this support.
>
> It seems to me this while loop can be avoided since lib/actions.c
> always encodes offer_ip, DHCP_OPT_BOOTFILE_CODE,
> DHCP_OPT_BOOTFILE_ALT_CODE
> and now DHCP_OPT_NEXT_SERVER_CODE  in the userdata buffer before
> encoding other dhcp options.
>
> Since it is deterministic,  can't we do something like below instead
> of the while loop ?
>
> struct dhcp_opt_header *in_dhcp_opt =
> (struct dhcp_opt_header *)reply_dhcp_opts_ptr->data;
> if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_CODE) {
>    ....
>    advance in_dhcp_opt to the next option
> } else if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
>   ..
>  advance in_dhcp_opt to the next option
> }
>
> if (in_dhcp_opt->code == DHCP_OPT_NEXT_SERVER_CODE) {
>     next_server = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> }
>
> Let me know if this gets complicated because of my above suggestion.
> In that case,I'm fine to run the below while loop.
>

That's how I coded it the first time when I was testing the patch, but
I found a problem where if more than one option was set, for example:
bootfile_name and next_server only the first encoded option would be
processed. In order to process all options I needed to offset to the
next one like:

unsigned char *ptr = (unsigned char *)in_dhcp_opt;
int len = sizeof *in_dhcp_opt + in_dhcp_opt->len;
struct dhcp_opt_header *next_dhcp_opt = (struct dhcp_opt_header *)(ptr + len);

But by then the code wasn't really looking great anymore. That's why I
thought that the looping + switch case just looks better. Plus, it's more
future proof as if we need to add another option to this (and we might
with iPXE for IPv6) all we need to do is add another "case <opt>".

>
> Also it would be great if you can enhance the test "dhcpv4 : 1 HV, 2
> LS, 2 LSPs/LS" in ovn.at with the newly added option.
>

Will take a look at it.

> From what I understand, the dhcp response  will also include this new
> option - 253 along with setting the dhcp_header->siaddr if
> CMS has defined next_server in the dhcp_options right ?  If so, the
> dhcp clients would ignore this option gracefully right ?
>

That's right, the option is simply ignored. It won't cause any harm.


> Thanks
> Numan
>
>
> > +    while (in_dhcp_ptr < end) {
> > +        struct dhcp_opt_header *in_dhcp_opt =
> > +            (struct dhcp_opt_header *)in_dhcp_ptr;
> > +
> > +        switch (in_dhcp_opt->code) {
> > +        case DHCP_OPT_NEXT_SERVER_CODE:
> > +            next_server = 
> > get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> > +            break;
> > +        case DHCP_OPT_BOOTFILE_CODE: ;
> > +            unsigned char *ptr = (unsigned char *)in_dhcp_opt;
> > +            int len = sizeof *in_dhcp_opt + in_dhcp_opt->len;
> > +            struct dhcp_opt_header *next_dhcp_opt =
> > +                (struct dhcp_opt_header *)(ptr + len);
> > +
> > +            if (next_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
> > +                if (!ipxe_req) {
> > +                    ofpbuf_pull(reply_dhcp_opts_ptr, len);
> > +                    next_dhcp_opt->code = DHCP_OPT_BOOTFILE_CODE;
> > +                } else {
> > +                    char *buf = xmalloc(len);
> >
> > -                memcpy(buf, in_dhcp_opt, len);
> > -                ofpbuf_pull(reply_dhcp_opts_ptr,
> > -                            sizeof *in_dhcp_opt + next_dhcp_opt->len);
> > -                memcpy(reply_dhcp_opts_ptr->data, buf, len);
> > -                free(buf);
> > +                    memcpy(buf, in_dhcp_opt, len);
> > +                    ofpbuf_pull(reply_dhcp_opts_ptr,
> > +                                sizeof *in_dhcp_opt + next_dhcp_opt->len);
> > +                    memcpy(reply_dhcp_opts_ptr->data, buf, len);
> > +                    free(buf);
> > +                }
> > +            }
> > +            bootfile_name_set = true;
> > +            break;
> > +        case DHCP_OPT_BOOTFILE_ALT_CODE:
> > +            if (!bootfile_name_set) {
> > +                in_dhcp_opt->code = DHCP_OPT_BOOTFILE_CODE;
> >              }
> > +            break;
> > +        }
> > +
> > +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> > +        if (in_dhcp_ptr > end) {
> > +            break;
> > +        }
> > +        in_dhcp_ptr += in_dhcp_opt->len;
> > +        if (in_dhcp_ptr > end) {
> > +            break;
> >          }
> > -    } else if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
> > -        in_dhcp_opt->code = DHCP_OPT_BOOTFILE_CODE;
> >      }
> >
> >      uint16_t new_l4_size = UDP_HEADER_LEN + DHCP_HEADER_LEN + 16;
> > @@ -2259,6 +2285,7 @@ pinctrl_handle_put_dhcp_opts(
> >
> >      if (*in_dhcp_msg_type != OVN_DHCP_MSG_INFORM) {
> >          dhcp_data->yiaddr = (msg_type == DHCP_MSG_NAK) ? 0 : *offer_ip;
> > +        dhcp_data->siaddr = (msg_type == DHCP_MSG_NAK) ? 0 : next_server;
> >      } else {
> >          dhcp_data->yiaddr = 0;
> >      }
> > diff --git a/lib/actions.c b/lib/actions.c
> > index a3cf747db..11b349368 100644
> > --- a/lib/actions.c
> > +++ b/lib/actions.c
> > @@ -2862,10 +2862,21 @@ encode_PUT_DHCPV4_OPTS(const struct ovnact_put_opts 
> > *pdo,
> >          opt_header[1] = strlen(c->string);
> >          ofpbuf_put(ofpacts, c->string, opt_header[1]);
> >      }
> > +    /* Encode next_server opt (253) */
> > +    const struct ovnact_gen_option *next_server_opt = find_opt(
> > +        pdo->options, pdo->n_options, DHCP_OPT_NEXT_SERVER_CODE);
> > +    if (next_server_opt) {
> > +        uint8_t *opt_header = ofpbuf_put_zeros(ofpacts, 2);
> > +        const union expr_constant *c = next_server_opt->value.values;
> > +        opt_header[0] = next_server_opt->option->code;
> > +        opt_header[1] = sizeof(ovs_be32);
> > +        ofpbuf_put(ofpacts, &c->value.ipv4, sizeof(ovs_be32));
> > +    }
> >
> >      for (size_t i = 0; i < pdo->n_options; i++) {
> >          const struct ovnact_gen_option *o = &pdo->options[i];
> > -        if (o != offerip_opt && o != boot_opt && o != boot_alt_opt) {
> > +        if (o != offerip_opt && o != boot_opt && o != boot_alt_opt && \
> > +            o != next_server_opt) {
> >              encode_put_dhcpv4_option(o, ofpacts);
> >          }
> >      }
> > diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
> > index 49ecea81f..0b2da9f7b 100644
> > --- a/lib/ovn-l7.h
> > +++ b/lib/ovn-l7.h
> > @@ -145,6 +145,14 @@ struct gen_opts_map {
> >  #define DHCP_OPT_TCP_KEEPALIVE_INTERVAL \
> >      DHCP_OPTION("tcp_keepalive_interval", 38, "uint32")
> >
> > +/* Use unused 253 option for DHCP next-server DHCP option.
> > + * This option is used for setting the "Next server IP address"
> > + * field in the DHCP header.
> > + */
> > +#define DHCP_OPT_NEXT_SERVER_CODE 253
> > +#define DHCP_OPT_NEXT_SERVER \
> > +    DHCP_OPTION("next_server", DHCP_OPT_NEXT_SERVER_CODE, "ipv4")
> > +
> >  static inline uint32_t
> >  gen_opt_hash(char *opt_name)
> >  {
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 0a0f85010..6e01679e1 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -246,6 +246,7 @@ static struct gen_opts_map supported_dhcp_opts[] = {
> >      DHCP_OPT_BROADCAST_ADDRESS,
> >      DHCP_OPT_NETBIOS_NAME_SERVER,
> >      DHCP_OPT_NETBIOS_NODE_TYPE,
> > +    DHCP_OPT_NEXT_SERVER,
> >  };
> >
> >  static struct gen_opts_map supported_dhcpv6_opts[] = {
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index 9010240a8..a9c784865 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -3579,6 +3579,13 @@
> >            </p>
> >          </column>
> >
> > +        <column name="options" key="next_server">
> > +          <p>
> > +            The DHCPv4 option code for setting the "Next server IP
> > +            address" field in the DHCP header.
> > +          </p>
> > +        </column>
> > +
> >        </group>
> >
> >        <group title="Boolean DHCP Options">
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 799a6aabd..17a8b7d31 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -1438,9 +1438,9 @@ reg0[0] = lookup_arp_ip(inport, eth.dst);
> >  # put_dhcp_opts
> >  reg1[0] = put_dhcp_opts(offerip = 1.2.3.4, router = 10.0.0.1);
> >      encodes as 
> > controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.40.01.02.03.04.03.04.0a.00.00.01,pause)
> > -reg2[5] = 
> > put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.254.0,mtu=1400,domain_name="ovn.org",wpad="https://example.org",bootfile_name="https://127.0.0.1/boot.ipxe",path_prefix="/tftpboot";);
> > -    formats as reg2[5] = put_dhcp_opts(offerip = 10.0.0.4, router = 
> > 10.0.0.1, netmask = 255.255.254.0, mtu = 1400, domain_name = "ovn.org", 
> > wpad = "https://example.org";, bootfile_name = 
> > "https://127.0.0.1/boot.ipxe";, path_prefix = "/tftpboot");
> > -    encodes as 
> > controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.25.0a.00.00.04.43.1b.68.74.74.70.73.3a.2f.2f.31.32.37.2e.30.2e.30.2e.31.2f.62.6f.6f.74.2e.69.70.78.65.03.04.0a.00.00.01.01.04.ff.ff.fe.00.1a.02.05.78.0f.07.6f.76.6e.2e.6f.72.67.fc.13.68.74.74.70.73.3a.2f.2f.65.78.61.6d.70.6c.65.2e.6f.72.67.d2.09.2f.74.66.74.70.62.6f.6f.74,pause)
> > +reg2[5] = 
> > put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.254.0,mtu=1400,domain_name="ovn.org",wpad="https://example.org",bootfile_name="https://127.0.0.1/boot.ipxe",path_prefix="/tftpboot";,
> >  next_server=10.0.0.9);
> > +    formats as reg2[5] = put_dhcp_opts(offerip = 10.0.0.4, router = 
> > 10.0.0.1, netmask = 255.255.254.0, mtu = 1400, domain_name = "ovn.org", 
> > wpad = "https://example.org";, bootfile_name = 
> > "https://127.0.0.1/boot.ipxe";, path_prefix = "/tftpboot", next_server = 
> > 10.0.0.9);
> > +    encodes as 
> > controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.25.0a.00.00.04.43.1b.68.74.74.70.73.3a.2f.2f.31.32.37.2e.30.2e.30.2e.31.2f.62.6f.6f.74.2e.69.70.78.65.fd.04.0a.00.00.09.03.04.0a.00.00.01.01.04.ff.ff.fe.00.1a.02.05.78.0f.07.6f.76.6e.2e.6f.72.67.fc.13.68.74.74.70.73.3a.2f.2f.65.78.61.6d.70.6c.65.2e.6f.72.67.d2.09.2f.74.66.74.70.62.6f.6f.74,pause)
> >  reg0[15] = 
> > put_dhcp_opts(offerip=10.0.0.4,router=10.0.0.1,netmask=255.255.255.0,mtu=1400,ip_forward_enable=1,default_ttl=121,dns_server={8.8.8.8,7.7.7.7},classless_static_route={30.0.0.0/24,10.0.0.4,40.0.0.0/16,10.0.0.6,0.0.0.0/0,10.0.0.1},ethernet_encap=1,router_discovery=0,tftp_server_address={10.0.0.4,10.0.0.5},arp_cache_timeout=10,tcp_keepalive_interval=10);
> >      formats as reg0[15] = put_dhcp_opts(offerip = 10.0.0.4, router = 
> > 10.0.0.1, netmask = 255.255.255.0, mtu = 1400, ip_forward_enable = 1, 
> > default_ttl = 121, dns_server = {8.8.8.8, 7.7.7.7}, classless_static_route 
> > = {30.0.0.0/24, 10.0.0.4, 40.0.0.0/16, 10.0.0.6, 0.0.0.0/0, 10.0.0.1}, 
> > ethernet_encap = 1, router_discovery = 0, tftp_server_address = {10.0.0.4, 
> > 10.0.0.5}, arp_cache_timeout = 10, tcp_keepalive_interval = 10);
> >      encodes as 
> > controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.6f.0a.00.00.04.03.04.0a.00.00.01.01.04.ff.ff.ff.00.1a.02.05.78.13.01.01.17.01.79.06.08.08.08.08.08.07.07.07.07.79.14.18.1e.00.00.0a.00.00.04.10.28.00.0a.00.00.06.00.0a.00.00.01.24.01.01.1f.01.00.96.08.0a.00.00.04.0a.00.00.05.23.04.00.00.00.0a.26.04.00.00.00.0a,pause)
> > diff --git a/tests/test-ovn.c b/tests/test-ovn.c
> > index d79c6a5bc..844905797 100644
> > --- a/tests/test-ovn.c
> > +++ b/tests/test-ovn.c
> > @@ -196,6 +196,7 @@ create_gen_opts(struct hmap *dhcp_opts, struct hmap 
> > *dhcpv6_opts,
> >      dhcp_opt_add(dhcp_opts, "broadcast_address", 28, "ipv4");
> >      dhcp_opt_add(dhcp_opts, "netbios_name_server", 44, "ipv4");
> >      dhcp_opt_add(dhcp_opts, "netbios_node_type", 46, "uint8");
> > +    dhcp_opt_add(dhcp_opts, "next_server", 253, "ipv4");
> >
> >      /* DHCPv6 options. */
> >      hmap_init(dhcpv6_opts);
> > --
> > 2.36.1
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to