On Sun, May 14, 2023 8:27 pm, Dave Voutila wrote:
> In short:
>
>
> 1) vmop_create_params contains settings specific to a vm and the local
> prefix settings are currently global. I don't want to mix semantics if we
> don't have to. Plus, as you call out below, a configuration reload via
> `vmctl reload` will not be handled by your approach.
>
>
> 2) you're adding a new usage of the current_vm extern, which i'm trying
> to phase out as much as possible (there's too much implicit global state
> throughout vmd). Adding another extern to dhcp.c, a source of many
> headaches in the past, is not something I want to do.
>
> Some notes below in the diff, but in general I'd much rather prefer a
> message-based approach that allows us to remove even more dependence on the
> crufty global state (env, current_vm).

This is great feedback - I really appreciate you taking the time to write
this up. I think I understand the direction vmd is moving in a bit better
now, so I took another stab at patching this. Diff inlined below.

> I recommend looking at how the host mac is set asynchronously with all
> virtio network device processes.

I took inspiration from this and did something pretty similar for the
local prefix setting, which is now stored in the vionet_dev struct. When
a VM (IPv4) interface is configured by the privileged process after the
parent calls vm_priv_ifconfig, this process sends a new message
IMSG_VMDOP_PRIV_IFADDR_RESPONSE back up through the parent to the vmm
process.

The vmm process then forwards on the 'struct address' in its'
global config down to each vm process, who update a relevant field in
their virtio_dev structures per your suggestion. This gets propagated
down to the device processes with IMSG_DEVOP_LOCALPREFIX for use in
dhcp.c.

The function signatures for vm_priv_addr + vm_priv_addr6 are slimmed
down to accomodate this change as well, as you mentioned in your
comments. Rather than passing in the entire vmd_cfg, I'm now passing
just the 'struct address' - which helps on the journey to getting rid
of 'env' in dhcp.c.

Additional quality of life changes:
- vm_priv_ifconfig is called upon reloading/loading configuration,
similar to how vm_priv_brconfig is rerun in these situations. This way
we can support changes to local prefix [6] config lines at runtime
with vmctl load.
- config_init_localprefix is now invoked by parse_config. This way, if
we have a 'local prefix' line in our config, comment it out, and then
reload vmd, the local prefix jumps back to the default of
100.64.0.0/10 rather than 'sticking' to its' original value. Perhaps
there's a more elegant way to accomplish this in your framework..


Thanks again for the constructive criticism and the patience. I'm
excited to be able to help the project in whatever way I can.

Index: config.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/config.c,v
retrieving revision 1.71
diff -u -p -u -p -r1.71 config.c
--- config.c    28 Apr 2023 19:46:42 -0000      1.71
+++ config.c    15 May 2023 19:05:28 -0000
@@ -42,9 +42,7 @@
 /* Supported bridge types */
 const char *vmd_descsw[] = { "bridge", "veb", NULL };

-static int      config_init_localprefix(struct vmd_config *);
-
-static int
+int
 config_init_localprefix(struct vmd_config *cfg)
 {
        struct sockaddr_in6     *sin6;
Index: dhcp.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/dhcp.c,v
retrieving revision 1.12
diff -u -p -u -p -r1.12 dhcp.c
--- dhcp.c      27 Apr 2023 22:47:27 -0000      1.12
+++ dhcp.c      15 May 2023 19:05:28 -0000
@@ -147,7 +147,7 @@ dhcp_request(struct virtio_dev *dev, cha
        }

        if ((client_addr.s_addr =
-           vm_priv_addr(&env->vmd_cfg,
+           vm_priv_addr(&dev->vionet.localprefix,
            dev->vm_vmid, vionet->idx, 1)) == 0)
                return (-1);
        memcpy(&resp.yiaddr, &client_addr,
@@ -156,8 +156,8 @@ dhcp_request(struct virtio_dev *dev, cha
            sizeof(client_addr));
        ss2sin(&pc.pc_dst)->sin_port = htons(CLIENT_PORT);

-       if ((server_addr.s_addr = vm_priv_addr(&env->vmd_cfg, dev->vm_vmid,
-           vionet->idx, 0)) == 0)
+       if ((server_addr.s_addr = vm_priv_addr(&dev->vionet.localprefix,
+           dev->vm_vmid, vionet->idx, 0)) == 0)
                return (-1);
        memcpy(&resp.siaddr, &server_addr, sizeof(server_addr));
        memcpy(&ss2sin(&pc.pc_src)->sin_addr, &server_addr,
Index: parse.y
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/parse.y,v
retrieving revision 1.67
diff -u -p -u -p -r1.67 parse.y
--- parse.y     28 Apr 2023 21:22:20 -0000      1.67
+++ parse.y     15 May 2023 19:05:28 -0000
@@ -1188,6 +1188,10 @@ parse_config(const char *filename)
        extern const char        default_conffile[];
        struct sym              *sym, *next;

+       /* Reset local prefix */
+       if (config_init_localprefix(&env->vmd_cfg) == 01)
+               return (-1);
+
        if ((file = pushfile(filename, 0)) == NULL) {
                /* no default config file is fine */
                if (errno == ENOENT && filename == default_conffile) {
Index: priv.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/priv.c,v
retrieving revision 1.22
diff -u -p -u -p -r1.22 priv.c
--- priv.c      28 Jan 2023 14:40:53 -0000      1.22
+++ priv.c      15 May 2023 19:05:28 -0000
@@ -146,7 +146,14 @@ priv_dispatch_parent(int fd, struct priv
                    sizeof(ifbr.ifbr_name));
                strlcpy(ifbr.ifbr_ifsname, vfr.vfr_value,
                    sizeof(ifbr.ifbr_ifsname));
-               if (ioctl(env->vmd_fd, SIOCBRDGADD, &ifbr) == -1 &&
+
+               /* vmd's configuration does not allow us to change
+                * the bridge an interface belongs to at runtime.
+                * If this request is invoked multiple times over a
+                * VM's lifetime, we will get EEXIST and can safely
+                * proceed
+                */
+               if (ioctl(env->vmd_fd, SIOCBRDGADD, &ifbr) == -1 &&
                    errno != EEXIST)
                        log_warn("SIOCBRDGADD");
                break;
@@ -182,11 +189,34 @@ priv_dispatch_parent(int fd, struct priv
                    sizeof(ifgr.ifgr_group)) >= sizeof(ifgr.ifgr_group))
                        fatalx("%s: group name too long", __func__);

+               /*
+                * vmd's configuration doesn't allow us to change the
+                * group an interface lives in while the interface is
+                * up and running. If this request is invoked multiple
+                * times over a VM's lifetime, then, we will get EEXIST
+                */
                if (ioctl(env->vmd_fd, SIOCAIFGROUP, &ifgr) == -1 &&
                    errno != EEXIST)
                        log_warn("SIOCAIFGROUP");
                break;
        case IMSG_VMDOP_PRIV_IFADDR:
+               /*
+                * Get the current address of this interface,
+                * and delete it if it is already assigned
+                * Not strictly needed, but helps ifconfig tapN
+                * to show up-to-date address information
+                */
+               strlcpy(ifr.ifr_name, vfr.vfr_name, sizeof(ifr.ifr_name));
+               if (ioctl(env->vmd_fd, SIOCGIFADDR, &ifr) == 0) {
+                       if (ioctl(env->vmd_fd, SIOCDIFADDR, &ifr) == -1) {
+                               log_warn("SIOCDIFADDR");
+                               break;
+                       }
+               } else if (errno != EADDRNOTAVAIL) {
+                       log_warn("SIOCGIFADDR");
+                       break;
+               }
+
                memset(&ifra, 0, sizeof(ifra));

                if (vfr.vfr_addr.ss_family != AF_INET ||
@@ -207,8 +237,33 @@ priv_dispatch_parent(int fd, struct priv

                if (ioctl(env->vmd_fd, SIOCAIFADDR, &ifra) == -1)
                        log_warn("SIOCAIFADDR");
+
+               /*
+                * Reply to the parent that the address has
+                * been set, prompting it to forward the new
+                * address information to this vm's device process
+                */
+               proc_compose(ps, PROC_PARENT, IMSG_VMDOP_PRIV_IFADDR_RESPONSE,
+                   NULL, 0);
+
                break;
        case IMSG_VMDOP_PRIV_IFADDR6:
+               /*
+                * Get the current address of this interface,
+                * and delete it if it is already assigned
+                */
+               strlcpy(ifr.ifr_name, vfr.vfr_name, sizeof(ifr.ifr_name));
+
+               if (ioctl(env->vmd_fd, SIOCGIFADDR, &ifr) == 0) {
+                       if (ioctl(env->vmd_fd, SIOCDIFADDR, &ifr) == -1) {
+                               log_warn("SIOCDIFADDR");
+                               break;
+                       }
+               } else if (errno != EADDRNOTAVAIL) {
+                       log_warn("SIOCGIFADDR");
+                       break;
+               }
+
                memset(&ifar, 0, sizeof(ifar));
                memset(&in6_ifra, 0, sizeof(in6_ifra));

@@ -466,7 +521,7 @@ vm_priv_ifconfig(struct privsep *ps, str
                        sin4->sin_family = AF_INET;
                        sin4->sin_len = sizeof(*sin4);
                        if ((sin4->sin_addr.s_addr =
-                           vm_priv_addr(&env->vmd_cfg,
+                           vm_priv_addr(&env->vmd_cfg.cfg_localprefix,
                            vm->vm_vmid, i, 0)) == 0)
                                return (-1);

@@ -493,7 +548,7 @@ vm_priv_ifconfig(struct privsep *ps, str
                        sin6 = ss2sin6(&vfr.vfr_addr);
                        sin6->sin6_family = AF_INET6;
                        sin6->sin6_len = sizeof(*sin6);
-                       if (vm_priv_addr6(&env->vmd_cfg,
+                       if (vm_priv_addr6(&env->vmd_cfg.cfg_localprefix6,
                            vm->vm_vmid, i, 0, &sin6->sin6_addr) == -1)
                                return (-1);

@@ -565,9 +620,8 @@ vm_priv_brconfig(struct privsep *ps, str
 }

 uint32_t
-vm_priv_addr(struct vmd_config *cfg, uint32_t vmid, int idx, int isvm)
+vm_priv_addr(struct address *h, uint32_t vmid, int idx, int isvm)
 {
-       struct address          *h = &cfg->cfg_localprefix;
        in_addr_t                prefix, mask, addr;

        /*
@@ -611,10 +665,9 @@ vm_priv_addr(struct vmd_config *cfg, uin
 }

 int
-vm_priv_addr6(struct vmd_config *cfg, uint32_t vmid,
+vm_priv_addr6(struct address *h, uint32_t vmid,
     int idx, int isvm, struct in6_addr *in6_addr)
 {
-       struct address          *h = &cfg->cfg_localprefix6;
        struct in6_addr          addr, mask;
        uint32_t                 addr4;

@@ -626,7 +679,7 @@ vm_priv_addr6(struct vmd_config *cfg, ui
        prefixlen2mask6(h->prefixlen, &mask);

        /* 2. Encode the VM IPv4 address as subnet, fd00::NN:NN:0:0/96. */
-       if ((addr4 = vm_priv_addr(cfg, vmid, idx, 1)) == 0)
+       if ((addr4 = vm_priv_addr(h, vmid, idx, 1)) == 0)
                return (0);
        memcpy(&addr.s6_addr[8], &addr4, sizeof(addr4));

Index: vionet.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/vionet.c,v
retrieving revision 1.3
diff -u -p -u -p -r1.3 vionet.c
--- vionet.c    13 May 2023 23:15:28 -0000      1.3
+++ vionet.c    15 May 2023 19:05:28 -0000
@@ -728,6 +728,12 @@ dev_dispatch_vm(int fd, short event, voi
                            sizeof(vionet->hostmac));
                        log_debug("%s: set hostmac", __func__);
                        break;
+               case IMSG_DEVOP_LOCALPREFIX:
+                       IMSG_SIZE_CHECK(&imsg, &vionet->localprefix);
+                       memcpy(&vionet->localprefix, imsg.data,
+                           sizeof(vionet->localprefix));
+                       log_debug("%s: set localprefix", __func__);
+                       break;
                case IMSG_VMDOP_PAUSE_VM:
                        log_debug("%s: pausing", __func__);
                        event_del(&ev_tap);
Index: virtio.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/virtio.c,v
retrieving revision 1.103
diff -u -p -u -p -r1.103 virtio.c
--- virtio.c    13 May 2023 23:15:28 -0000      1.103
+++ virtio.c    15 May 2023 19:05:28 -0000
@@ -797,6 +797,45 @@ vionet_set_hostmac(struct vmd_vm *vm, un
        }
 }

+/*
+ * vionet_set_dhcp_localprefix
+ *
+ * Sets the IPv4 address prefix used to conduct DHCP
+ * negotiation on all vionet_dev's for the current VM
+ *
+ * This should only be called from the event-loop thread
+ *
+ * vm: pointer to the current vmd_vm instance
+ * localprefix: address structure to use
+ */
+void
+vionet_set_dhcp_localprefix(struct vmd_vm *vm, struct address localprefix)
+{
+       struct virtio_dev       *dev;
+       struct vionet_dev       *vionet;
+       int                      ret;
+
+       SLIST_FOREACH(dev, &virtio_devs, dev_next) {
+               if (dev->dev_type == VMD_DEVTYPE_NET) {
+                       vionet = &dev->vionet;
+
+                       /* Set local vm process copy */
+                       memcpy(&vionet->localprefix, &localprefix,
+                           sizeof(localprefix));
+
+                       /* Forward on to device process */
+                       ret = imsg_compose_event(&dev->async_iev,
+                           IMSG_DEVOP_LOCALPREFIX, 0, 0, -1,
+                           &vionet->localprefix, sizeof(vionet->localprefix));
+
+                       if (ret == -1)
+                               log_warnx("%s: failed to queue localprefix "
+                                   "to vionet dev %u", __func__,
+                                   dev->vionet.idx);
+               }
+       }
+}
+
 void
 virtio_shutdown(struct vmd_vm *vm)
 {
Index: virtio.h
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/virtio.h,v
retrieving revision 1.45
diff -u -p -u -p -r1.45 virtio.h
--- virtio.h    27 Apr 2023 22:47:27 -0000      1.45
+++ virtio.h    15 May 2023 19:05:28 -0000
@@ -255,9 +255,11 @@ struct vionet_dev {
        uint8_t mac[6];
        uint8_t hostmac[6];
        int lockedmac;
-       int local;
        int pxeboot;

+       int local;
+       struct address localprefix;
+
        unsigned int idx;
 };

@@ -369,6 +371,7 @@ int vionet_notify_tx(struct virtio_dev *
 void vionet_process_rx(uint32_t);
 int vionet_enq_rx(struct vionet_dev *, char *, size_t, int *);
 void vionet_set_hostmac(struct vmd_vm *, unsigned int, uint8_t *);
+void vionet_set_dhcp_localprefix(struct vmd_vm *, struct address);

 int vmmci_io(int, uint16_t, uint32_t *, uint8_t *, void *, uint8_t);
 int vmmci_dump(int);
Index: vm.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/vm.c,v
retrieving revision 1.89
diff -u -p -u -p -r1.89 vm.c
--- vm.c        13 May 2023 23:15:28 -0000      1.89
+++ vm.c        15 May 2023 19:05:28 -0000
@@ -520,6 +520,7 @@ vm_dispatch_vmm(int fd, short event, voi
        struct vmd_vm           *vm = arg;
        struct vmop_result       vmr;
        struct vmop_addr_result  var;
+       struct address           localprefix;
        struct imsgev           *iev = &vm->vm_iev;
        struct imsgbuf          *ibuf = &iev->ibuf;
        struct imsg              imsg;
@@ -605,6 +606,12 @@ vm_dispatch_vmm(int fd, short event, voi
                            ether_ntoa((void *)var.var_addr), var.var_nic_idx);

                        vionet_set_hostmac(vm, var.var_nic_idx, var.var_addr);
+                       break;
+               case IMSG_VMDOP_PRIV_IFADDR_RESPONSE:
+                       IMSG_SIZE_CHECK(&imsg, &localprefix);
+                       memcpy(&localprefix, imsg.data, sizeof(localprefix));
+
+                       vionet_set_dhcp_localprefix(vm, localprefix);
                        break;
                default:
                        fatalx("%s: got invalid imsg %d from %s",
Index: vmd.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/vmd.c,v
retrieving revision 1.149
diff -u -p -u -p -r1.149 vmd.c
--- vmd.c       13 May 2023 23:15:28 -0000      1.149
+++ vmd.c       15 May 2023 19:05:28 -0000
@@ -597,6 +597,7 @@ vmd_dispatch_agentx(int fd, struct privs
 int
 vmd_dispatch_priv(int fd, struct privsep_proc *p, struct imsg *imsg)
 {
+       struct privsep          *ps = p->p_ps;
        struct vmop_addr_result  var;

        switch (imsg->hdr.type) {
@@ -605,6 +606,9 @@ vmd_dispatch_priv(int fd, struct privsep
                memcpy(&var, imsg->data, sizeof(var));
                proc_forward_imsg(p->p_ps, imsg, PROC_VMM, -1);
                break;
+       case IMSG_VMDOP_PRIV_IFADDR_RESPONSE:
+               proc_forward_imsg(ps, imsg, PROC_VMM, -1);
+               break;
        default:
                return (-1);
        }
@@ -1005,6 +1009,7 @@ vmd_configure(void)
 {
        int                     ncpus;
        struct vmd_switch       *vsw;
+       struct vmd_vm           *vm;
        int ncpu_mib[] = {CTL_HW, HW_NCPUONLINE};
        size_t ncpus_sz = sizeof(ncpus);

@@ -1058,6 +1063,15 @@ vmd_configure(void)
                }
        }

+       TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) {
+               if (vm_priv_ifconfig(&env->vmd_ps, vm) == -1) {
+                       log_warn("%s: failed to update if for vm %u",
+                           __func__, vm->vm_vmid);
+                       vm_terminate(vm, __func__);
+                       return (-1);
+               }
+       }
+
        if (!(env->vmd_cfg.cfg_flags & VMD_CFG_STAGGERED_START)) {
                env->vmd_cfg.delay.tv_sec = VMD_DEFAULT_STAGGERED_START_DELAY;
                if (sysctl(ncpu_mib, nitems(ncpu_mib), &ncpus, &ncpus_sz, NULL, 
0) == -1)
@@ -1134,6 +1148,17 @@ vmd_reload(unsigned int reset, const cha
                                log_warn("%s: failed to create switch %s",
                                    __func__, vsw->sw_name);
                                switch_remove(vsw);
+                               return (-1);
+                       }
+               }
+
+               TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) {
+                       log_debug("%s: updating config for vm %u",
+                           __func__, vm->vm_vmid);
+                       if (vm_priv_ifconfig(&env->vmd_ps, vm) == -1) {
+                               log_warn("%s: failed to update if for vm %u",
+                                   __func__, vm->vm_vmid);
+                               vm_terminate(vm, __func__);
                                return (-1);
                        }
                }
Index: vmd.h
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/vmd.h,v
retrieving revision 1.122
diff -u -p -u -p -r1.122 vmd.h
--- vmd.h       13 May 2023 23:15:28 -0000      1.122
+++ vmd.h       15 May 2023 19:05:28 -0000
@@ -134,6 +134,7 @@ enum imsg_type {
        IMSG_VMDOP_PRIV_IFDOWN,
        IMSG_VMDOP_PRIV_IFGROUP,
        IMSG_VMDOP_PRIV_IFADDR,
+       IMSG_VMDOP_PRIV_IFADDR_RESPONSE,
        IMSG_VMDOP_PRIV_IFADDR6,
        IMSG_VMDOP_PRIV_IFRDOMAIN,
        IMSG_VMDOP_PRIV_GET_ADDR,
@@ -144,6 +145,7 @@ enum imsg_type {
        IMSG_VMDOP_DONE,
        /* Device Operation Messages */
        IMSG_DEVOP_HOSTMAC,
+       IMSG_DEVOP_LOCALPREFIX,
        IMSG_DEVOP_MSG,
 };

@@ -473,8 +475,8 @@ int  priv_findname(const char *, const c
 int     priv_validgroup(const char *);
 int     vm_priv_ifconfig(struct privsep *, struct vmd_vm *);
 int     vm_priv_brconfig(struct privsep *, struct vmd_switch *);
-uint32_t vm_priv_addr(struct vmd_config *, uint32_t, int, int);
-int     vm_priv_addr6(struct vmd_config *, uint32_t, int, int,
+uint32_t vm_priv_addr(struct address *, uint32_t, int, int);
+int     vm_priv_addr6(struct address *, uint32_t, int, int,
            struct in6_addr *);

 /* vmm.c */
@@ -500,6 +502,7 @@ int  remap_guest_mem(struct vmd_vm *, in

 /* config.c */
 int     config_init(struct vmd *);
+int     config_init_localprefix(struct vmd_config *);
 void    config_purge(struct vmd *, unsigned int);
 int     config_setconfig(struct vmd *);
 int     config_getconfig(struct vmd *, struct imsg *);
Index: vmm.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/vmm.c,v
retrieving revision 1.112
diff -u -p -u -p -r1.112 vmm.c
--- vmm.c       13 May 2023 23:15:28 -0000      1.112
+++ vmm.c       15 May 2023 19:05:28 -0000
@@ -115,6 +115,7 @@ vmm_dispatch_parent(int fd, struct privs
        struct vmop_result       vmr;
        struct vmop_create_params vmc;
        struct vmop_addr_result  var;
+       struct address           localprefix;
        uint32_t                 id = 0, peerid = imsg->hdr.peerid;
        pid_t                    pid = 0;
        unsigned int             mode, flags;
@@ -327,6 +328,16 @@ vmm_dispatch_parent(int fd, struct privs

                /* Get and terminate all running VMs */
                get_info_vm(ps, NULL, 1);
+               break;
+       case IMSG_VMDOP_PRIV_IFADDR_RESPONSE:
+               localprefix = env->vmd_cfg.cfg_localprefix;
+
+               /* Forward interface address details to _every_ guest vm */
+               TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) {
+                       imsg_compose_event(&vm->vm_iev,
+                           imsg->hdr.type, imsg->hdr.peerid, imsg->hdr.pid,
+                           imsg->fd, &localprefix, sizeof(localprefix));
+               }
                break;
        default:
                return (-1);


Reply via email to