On Sun, May 14, 2023 8:27 pm, Dave Voutila wrote: > In short: > > > 1) vmop_create_params contains settings specific to a vm and the local > prefix settings are currently global. I don't want to mix semantics if we > don't have to. Plus, as you call out below, a configuration reload via > `vmctl reload` will not be handled by your approach. > > > 2) you're adding a new usage of the current_vm extern, which i'm trying > to phase out as much as possible (there's too much implicit global state > throughout vmd). Adding another extern to dhcp.c, a source of many > headaches in the past, is not something I want to do. > > Some notes below in the diff, but in general I'd much rather prefer a > message-based approach that allows us to remove even more dependence on the > crufty global state (env, current_vm).
This is great feedback - I really appreciate you taking the time to write this up. I think I understand the direction vmd is moving in a bit better now, so I took another stab at patching this. Diff inlined below. > I recommend looking at how the host mac is set asynchronously with all > virtio network device processes. I took inspiration from this and did something pretty similar for the local prefix setting, which is now stored in the vionet_dev struct. When a VM (IPv4) interface is configured by the privileged process after the parent calls vm_priv_ifconfig, this process sends a new message IMSG_VMDOP_PRIV_IFADDR_RESPONSE back up through the parent to the vmm process. The vmm process then forwards on the 'struct address' in its' global config down to each vm process, who update a relevant field in their virtio_dev structures per your suggestion. This gets propagated down to the device processes with IMSG_DEVOP_LOCALPREFIX for use in dhcp.c. The function signatures for vm_priv_addr + vm_priv_addr6 are slimmed down to accomodate this change as well, as you mentioned in your comments. Rather than passing in the entire vmd_cfg, I'm now passing just the 'struct address' - which helps on the journey to getting rid of 'env' in dhcp.c. Additional quality of life changes: - vm_priv_ifconfig is called upon reloading/loading configuration, similar to how vm_priv_brconfig is rerun in these situations. This way we can support changes to local prefix [6] config lines at runtime with vmctl load. - config_init_localprefix is now invoked by parse_config. This way, if we have a 'local prefix' line in our config, comment it out, and then reload vmd, the local prefix jumps back to the default of 100.64.0.0/10 rather than 'sticking' to its' original value. Perhaps there's a more elegant way to accomplish this in your framework.. Thanks again for the constructive criticism and the patience. I'm excited to be able to help the project in whatever way I can. Index: config.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/config.c,v retrieving revision 1.71 diff -u -p -u -p -r1.71 config.c --- config.c 28 Apr 2023 19:46:42 -0000 1.71 +++ config.c 15 May 2023 19:05:28 -0000 @@ -42,9 +42,7 @@ /* Supported bridge types */ const char *vmd_descsw[] = { "bridge", "veb", NULL }; -static int config_init_localprefix(struct vmd_config *); - -static int +int config_init_localprefix(struct vmd_config *cfg) { struct sockaddr_in6 *sin6; Index: dhcp.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/dhcp.c,v retrieving revision 1.12 diff -u -p -u -p -r1.12 dhcp.c --- dhcp.c 27 Apr 2023 22:47:27 -0000 1.12 +++ dhcp.c 15 May 2023 19:05:28 -0000 @@ -147,7 +147,7 @@ dhcp_request(struct virtio_dev *dev, cha } if ((client_addr.s_addr = - vm_priv_addr(&env->vmd_cfg, + vm_priv_addr(&dev->vionet.localprefix, dev->vm_vmid, vionet->idx, 1)) == 0) return (-1); memcpy(&resp.yiaddr, &client_addr, @@ -156,8 +156,8 @@ dhcp_request(struct virtio_dev *dev, cha sizeof(client_addr)); ss2sin(&pc.pc_dst)->sin_port = htons(CLIENT_PORT); - if ((server_addr.s_addr = vm_priv_addr(&env->vmd_cfg, dev->vm_vmid, - vionet->idx, 0)) == 0) + if ((server_addr.s_addr = vm_priv_addr(&dev->vionet.localprefix, + dev->vm_vmid, vionet->idx, 0)) == 0) return (-1); memcpy(&resp.siaddr, &server_addr, sizeof(server_addr)); memcpy(&ss2sin(&pc.pc_src)->sin_addr, &server_addr, Index: parse.y =================================================================== RCS file: /cvs/src/usr.sbin/vmd/parse.y,v retrieving revision 1.67 diff -u -p -u -p -r1.67 parse.y --- parse.y 28 Apr 2023 21:22:20 -0000 1.67 +++ parse.y 15 May 2023 19:05:28 -0000 @@ -1188,6 +1188,10 @@ parse_config(const char *filename) extern const char default_conffile[]; struct sym *sym, *next; + /* Reset local prefix */ + if (config_init_localprefix(&env->vmd_cfg) == 01) + return (-1); + if ((file = pushfile(filename, 0)) == NULL) { /* no default config file is fine */ if (errno == ENOENT && filename == default_conffile) { Index: priv.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/priv.c,v retrieving revision 1.22 diff -u -p -u -p -r1.22 priv.c --- priv.c 28 Jan 2023 14:40:53 -0000 1.22 +++ priv.c 15 May 2023 19:05:28 -0000 @@ -146,7 +146,14 @@ priv_dispatch_parent(int fd, struct priv sizeof(ifbr.ifbr_name)); strlcpy(ifbr.ifbr_ifsname, vfr.vfr_value, sizeof(ifbr.ifbr_ifsname)); - if (ioctl(env->vmd_fd, SIOCBRDGADD, &ifbr) == -1 && + + /* vmd's configuration does not allow us to change + * the bridge an interface belongs to at runtime. + * If this request is invoked multiple times over a + * VM's lifetime, we will get EEXIST and can safely + * proceed + */ + if (ioctl(env->vmd_fd, SIOCBRDGADD, &ifbr) == -1 && errno != EEXIST) log_warn("SIOCBRDGADD"); break; @@ -182,11 +189,34 @@ priv_dispatch_parent(int fd, struct priv sizeof(ifgr.ifgr_group)) >= sizeof(ifgr.ifgr_group)) fatalx("%s: group name too long", __func__); + /* + * vmd's configuration doesn't allow us to change the + * group an interface lives in while the interface is + * up and running. If this request is invoked multiple + * times over a VM's lifetime, then, we will get EEXIST + */ if (ioctl(env->vmd_fd, SIOCAIFGROUP, &ifgr) == -1 && errno != EEXIST) log_warn("SIOCAIFGROUP"); break; case IMSG_VMDOP_PRIV_IFADDR: + /* + * Get the current address of this interface, + * and delete it if it is already assigned + * Not strictly needed, but helps ifconfig tapN + * to show up-to-date address information + */ + strlcpy(ifr.ifr_name, vfr.vfr_name, sizeof(ifr.ifr_name)); + if (ioctl(env->vmd_fd, SIOCGIFADDR, &ifr) == 0) { + if (ioctl(env->vmd_fd, SIOCDIFADDR, &ifr) == -1) { + log_warn("SIOCDIFADDR"); + break; + } + } else if (errno != EADDRNOTAVAIL) { + log_warn("SIOCGIFADDR"); + break; + } + memset(&ifra, 0, sizeof(ifra)); if (vfr.vfr_addr.ss_family != AF_INET || @@ -207,8 +237,33 @@ priv_dispatch_parent(int fd, struct priv if (ioctl(env->vmd_fd, SIOCAIFADDR, &ifra) == -1) log_warn("SIOCAIFADDR"); + + /* + * Reply to the parent that the address has + * been set, prompting it to forward the new + * address information to this vm's device process + */ + proc_compose(ps, PROC_PARENT, IMSG_VMDOP_PRIV_IFADDR_RESPONSE, + NULL, 0); + break; case IMSG_VMDOP_PRIV_IFADDR6: + /* + * Get the current address of this interface, + * and delete it if it is already assigned + */ + strlcpy(ifr.ifr_name, vfr.vfr_name, sizeof(ifr.ifr_name)); + + if (ioctl(env->vmd_fd, SIOCGIFADDR, &ifr) == 0) { + if (ioctl(env->vmd_fd, SIOCDIFADDR, &ifr) == -1) { + log_warn("SIOCDIFADDR"); + break; + } + } else if (errno != EADDRNOTAVAIL) { + log_warn("SIOCGIFADDR"); + break; + } + memset(&ifar, 0, sizeof(ifar)); memset(&in6_ifra, 0, sizeof(in6_ifra)); @@ -466,7 +521,7 @@ vm_priv_ifconfig(struct privsep *ps, str sin4->sin_family = AF_INET; sin4->sin_len = sizeof(*sin4); if ((sin4->sin_addr.s_addr = - vm_priv_addr(&env->vmd_cfg, + vm_priv_addr(&env->vmd_cfg.cfg_localprefix, vm->vm_vmid, i, 0)) == 0) return (-1); @@ -493,7 +548,7 @@ vm_priv_ifconfig(struct privsep *ps, str sin6 = ss2sin6(&vfr.vfr_addr); sin6->sin6_family = AF_INET6; sin6->sin6_len = sizeof(*sin6); - if (vm_priv_addr6(&env->vmd_cfg, + if (vm_priv_addr6(&env->vmd_cfg.cfg_localprefix6, vm->vm_vmid, i, 0, &sin6->sin6_addr) == -1) return (-1); @@ -565,9 +620,8 @@ vm_priv_brconfig(struct privsep *ps, str } uint32_t -vm_priv_addr(struct vmd_config *cfg, uint32_t vmid, int idx, int isvm) +vm_priv_addr(struct address *h, uint32_t vmid, int idx, int isvm) { - struct address *h = &cfg->cfg_localprefix; in_addr_t prefix, mask, addr; /* @@ -611,10 +665,9 @@ vm_priv_addr(struct vmd_config *cfg, uin } int -vm_priv_addr6(struct vmd_config *cfg, uint32_t vmid, +vm_priv_addr6(struct address *h, uint32_t vmid, int idx, int isvm, struct in6_addr *in6_addr) { - struct address *h = &cfg->cfg_localprefix6; struct in6_addr addr, mask; uint32_t addr4; @@ -626,7 +679,7 @@ vm_priv_addr6(struct vmd_config *cfg, ui prefixlen2mask6(h->prefixlen, &mask); /* 2. Encode the VM IPv4 address as subnet, fd00::NN:NN:0:0/96. */ - if ((addr4 = vm_priv_addr(cfg, vmid, idx, 1)) == 0) + if ((addr4 = vm_priv_addr(h, vmid, idx, 1)) == 0) return (0); memcpy(&addr.s6_addr[8], &addr4, sizeof(addr4)); Index: vionet.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/vionet.c,v retrieving revision 1.3 diff -u -p -u -p -r1.3 vionet.c --- vionet.c 13 May 2023 23:15:28 -0000 1.3 +++ vionet.c 15 May 2023 19:05:28 -0000 @@ -728,6 +728,12 @@ dev_dispatch_vm(int fd, short event, voi sizeof(vionet->hostmac)); log_debug("%s: set hostmac", __func__); break; + case IMSG_DEVOP_LOCALPREFIX: + IMSG_SIZE_CHECK(&imsg, &vionet->localprefix); + memcpy(&vionet->localprefix, imsg.data, + sizeof(vionet->localprefix)); + log_debug("%s: set localprefix", __func__); + break; case IMSG_VMDOP_PAUSE_VM: log_debug("%s: pausing", __func__); event_del(&ev_tap); Index: virtio.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/virtio.c,v retrieving revision 1.103 diff -u -p -u -p -r1.103 virtio.c --- virtio.c 13 May 2023 23:15:28 -0000 1.103 +++ virtio.c 15 May 2023 19:05:28 -0000 @@ -797,6 +797,45 @@ vionet_set_hostmac(struct vmd_vm *vm, un } } +/* + * vionet_set_dhcp_localprefix + * + * Sets the IPv4 address prefix used to conduct DHCP + * negotiation on all vionet_dev's for the current VM + * + * This should only be called from the event-loop thread + * + * vm: pointer to the current vmd_vm instance + * localprefix: address structure to use + */ +void +vionet_set_dhcp_localprefix(struct vmd_vm *vm, struct address localprefix) +{ + struct virtio_dev *dev; + struct vionet_dev *vionet; + int ret; + + SLIST_FOREACH(dev, &virtio_devs, dev_next) { + if (dev->dev_type == VMD_DEVTYPE_NET) { + vionet = &dev->vionet; + + /* Set local vm process copy */ + memcpy(&vionet->localprefix, &localprefix, + sizeof(localprefix)); + + /* Forward on to device process */ + ret = imsg_compose_event(&dev->async_iev, + IMSG_DEVOP_LOCALPREFIX, 0, 0, -1, + &vionet->localprefix, sizeof(vionet->localprefix)); + + if (ret == -1) + log_warnx("%s: failed to queue localprefix " + "to vionet dev %u", __func__, + dev->vionet.idx); + } + } +} + void virtio_shutdown(struct vmd_vm *vm) { Index: virtio.h =================================================================== RCS file: /cvs/src/usr.sbin/vmd/virtio.h,v retrieving revision 1.45 diff -u -p -u -p -r1.45 virtio.h --- virtio.h 27 Apr 2023 22:47:27 -0000 1.45 +++ virtio.h 15 May 2023 19:05:28 -0000 @@ -255,9 +255,11 @@ struct vionet_dev { uint8_t mac[6]; uint8_t hostmac[6]; int lockedmac; - int local; int pxeboot; + int local; + struct address localprefix; + unsigned int idx; }; @@ -369,6 +371,7 @@ int vionet_notify_tx(struct virtio_dev * void vionet_process_rx(uint32_t); int vionet_enq_rx(struct vionet_dev *, char *, size_t, int *); void vionet_set_hostmac(struct vmd_vm *, unsigned int, uint8_t *); +void vionet_set_dhcp_localprefix(struct vmd_vm *, struct address); int vmmci_io(int, uint16_t, uint32_t *, uint8_t *, void *, uint8_t); int vmmci_dump(int); Index: vm.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/vm.c,v retrieving revision 1.89 diff -u -p -u -p -r1.89 vm.c --- vm.c 13 May 2023 23:15:28 -0000 1.89 +++ vm.c 15 May 2023 19:05:28 -0000 @@ -520,6 +520,7 @@ vm_dispatch_vmm(int fd, short event, voi struct vmd_vm *vm = arg; struct vmop_result vmr; struct vmop_addr_result var; + struct address localprefix; struct imsgev *iev = &vm->vm_iev; struct imsgbuf *ibuf = &iev->ibuf; struct imsg imsg; @@ -605,6 +606,12 @@ vm_dispatch_vmm(int fd, short event, voi ether_ntoa((void *)var.var_addr), var.var_nic_idx); vionet_set_hostmac(vm, var.var_nic_idx, var.var_addr); + break; + case IMSG_VMDOP_PRIV_IFADDR_RESPONSE: + IMSG_SIZE_CHECK(&imsg, &localprefix); + memcpy(&localprefix, imsg.data, sizeof(localprefix)); + + vionet_set_dhcp_localprefix(vm, localprefix); break; default: fatalx("%s: got invalid imsg %d from %s", Index: vmd.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/vmd.c,v retrieving revision 1.149 diff -u -p -u -p -r1.149 vmd.c --- vmd.c 13 May 2023 23:15:28 -0000 1.149 +++ vmd.c 15 May 2023 19:05:28 -0000 @@ -597,6 +597,7 @@ vmd_dispatch_agentx(int fd, struct privs int vmd_dispatch_priv(int fd, struct privsep_proc *p, struct imsg *imsg) { + struct privsep *ps = p->p_ps; struct vmop_addr_result var; switch (imsg->hdr.type) { @@ -605,6 +606,9 @@ vmd_dispatch_priv(int fd, struct privsep memcpy(&var, imsg->data, sizeof(var)); proc_forward_imsg(p->p_ps, imsg, PROC_VMM, -1); break; + case IMSG_VMDOP_PRIV_IFADDR_RESPONSE: + proc_forward_imsg(ps, imsg, PROC_VMM, -1); + break; default: return (-1); } @@ -1005,6 +1009,7 @@ vmd_configure(void) { int ncpus; struct vmd_switch *vsw; + struct vmd_vm *vm; int ncpu_mib[] = {CTL_HW, HW_NCPUONLINE}; size_t ncpus_sz = sizeof(ncpus); @@ -1058,6 +1063,15 @@ vmd_configure(void) } } + TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) { + if (vm_priv_ifconfig(&env->vmd_ps, vm) == -1) { + log_warn("%s: failed to update if for vm %u", + __func__, vm->vm_vmid); + vm_terminate(vm, __func__); + return (-1); + } + } + if (!(env->vmd_cfg.cfg_flags & VMD_CFG_STAGGERED_START)) { env->vmd_cfg.delay.tv_sec = VMD_DEFAULT_STAGGERED_START_DELAY; if (sysctl(ncpu_mib, nitems(ncpu_mib), &ncpus, &ncpus_sz, NULL, 0) == -1) @@ -1134,6 +1148,17 @@ vmd_reload(unsigned int reset, const cha log_warn("%s: failed to create switch %s", __func__, vsw->sw_name); switch_remove(vsw); + return (-1); + } + } + + TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) { + log_debug("%s: updating config for vm %u", + __func__, vm->vm_vmid); + if (vm_priv_ifconfig(&env->vmd_ps, vm) == -1) { + log_warn("%s: failed to update if for vm %u", + __func__, vm->vm_vmid); + vm_terminate(vm, __func__); return (-1); } } Index: vmd.h =================================================================== RCS file: /cvs/src/usr.sbin/vmd/vmd.h,v retrieving revision 1.122 diff -u -p -u -p -r1.122 vmd.h --- vmd.h 13 May 2023 23:15:28 -0000 1.122 +++ vmd.h 15 May 2023 19:05:28 -0000 @@ -134,6 +134,7 @@ enum imsg_type { IMSG_VMDOP_PRIV_IFDOWN, IMSG_VMDOP_PRIV_IFGROUP, IMSG_VMDOP_PRIV_IFADDR, + IMSG_VMDOP_PRIV_IFADDR_RESPONSE, IMSG_VMDOP_PRIV_IFADDR6, IMSG_VMDOP_PRIV_IFRDOMAIN, IMSG_VMDOP_PRIV_GET_ADDR, @@ -144,6 +145,7 @@ enum imsg_type { IMSG_VMDOP_DONE, /* Device Operation Messages */ IMSG_DEVOP_HOSTMAC, + IMSG_DEVOP_LOCALPREFIX, IMSG_DEVOP_MSG, }; @@ -473,8 +475,8 @@ int priv_findname(const char *, const c int priv_validgroup(const char *); int vm_priv_ifconfig(struct privsep *, struct vmd_vm *); int vm_priv_brconfig(struct privsep *, struct vmd_switch *); -uint32_t vm_priv_addr(struct vmd_config *, uint32_t, int, int); -int vm_priv_addr6(struct vmd_config *, uint32_t, int, int, +uint32_t vm_priv_addr(struct address *, uint32_t, int, int); +int vm_priv_addr6(struct address *, uint32_t, int, int, struct in6_addr *); /* vmm.c */ @@ -500,6 +502,7 @@ int remap_guest_mem(struct vmd_vm *, in /* config.c */ int config_init(struct vmd *); +int config_init_localprefix(struct vmd_config *); void config_purge(struct vmd *, unsigned int); int config_setconfig(struct vmd *); int config_getconfig(struct vmd *, struct imsg *); Index: vmm.c =================================================================== RCS file: /cvs/src/usr.sbin/vmd/vmm.c,v retrieving revision 1.112 diff -u -p -u -p -r1.112 vmm.c --- vmm.c 13 May 2023 23:15:28 -0000 1.112 +++ vmm.c 15 May 2023 19:05:28 -0000 @@ -115,6 +115,7 @@ vmm_dispatch_parent(int fd, struct privs struct vmop_result vmr; struct vmop_create_params vmc; struct vmop_addr_result var; + struct address localprefix; uint32_t id = 0, peerid = imsg->hdr.peerid; pid_t pid = 0; unsigned int mode, flags; @@ -327,6 +328,16 @@ vmm_dispatch_parent(int fd, struct privs /* Get and terminate all running VMs */ get_info_vm(ps, NULL, 1); + break; + case IMSG_VMDOP_PRIV_IFADDR_RESPONSE: + localprefix = env->vmd_cfg.cfg_localprefix; + + /* Forward interface address details to _every_ guest vm */ + TAILQ_FOREACH(vm, env->vmd_vms, vm_entry) { + imsg_compose_event(&vm->vm_iev, + imsg->hdr.type, imsg->hdr.peerid, imsg->hdr.pid, + imsg->fd, &localprefix, sizeof(localprefix)); + } break; default: return (-1);