hshoexer <[email protected]> writes:

> Hi,
>
> On Tue, Apr 21, 2026 at 01:25:31AM +0900, Yoshihiro Kawamata wrote:
>> I encountered an issue where vmd terminates unexpectedly when multiple
>> VMs are rebooted at the same time.
>
> I'm seeing a similar issue with rebooting multiple VMs at once.
> Can you try this diff?
>
> Take care,
> HJ.
>
> --------------------------------------------------------------------------
> commit a797a331a8d5986ca7b5d81a3df6afadfe6c4f62
> Author: hshoexer <[email protected]>
> Date:   Thu Apr 23 18:39:41 2026 +0200
>
>     vmd(8): Avoid reuse of dead filedescriptor
>
>     When the vmd process sends a kernfd to the vmm process, that
>     descriptor will be closed in msgbuf_write() after a successful
>     sendmsg().  However, that descriptor number is still stored in
>     vm->vm_kernel.
>
>     When termination of one VM is interleaved with lauch of another VM,
>     that number might be reassigned to a _new_ kernfd of the launching
>     VM.  Now we have a race:
>
>      - the vmd process queues an imsg with that descriptor in config_setvm()
>        (for the launching VM)
>      - the vmd process calls in vm_stop() close() on that descriptor
>        (for the terminating VM)
>      - when the vmd process calls proc_dispatch() imsgbuf_send() for
>        imsg queued in config_setvm(), sendmsg() will return EBADF (the
>        descriptor in the control message is invalid)
>
>     By dupping kernfd we can avoid this race.

This makes sense. Do we need to do this to any others?

In general this is ok dv@

>
> diff --git a/usr.sbin/vmd/config.c b/usr.sbin/vmd/config.c
> index 5f23f70f465..55ae589640d 100644
> --- a/usr.sbin/vmd/config.c
> +++ b/usr.sbin/vmd/config.c
> @@ -297,8 +297,11 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, 
> uint32_t peerid, uid_t uid)
>                       goto fail;
>               }
>
> -             vm->vm_kernel = kernfd;
> -             vmc->vmc_kernel = kernfd;
> +             if ((vm->vm_kernel = dup(kernfd)) == -1) {
> +                     ret = errno;
> +                     goto fail;
> +             }
> +             vmc->vmc_kernel = vm->vm_kernel;
>       }
>
>       /* Open CDROM image for child */
> @@ -482,7 +485,7 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, 
> uint32_t peerid, uid_t uid)
>       /* Send VM information */
>       /* XXX check proc_compose_imsg return values */
>       proc_compose_imsg(ps, PROC_VMM, IMSG_VMDOP_START_VM_REQUEST,
> -         vm->vm_vmid, vm->vm_kernel, vmc, sizeof(*vmc));
> +         vm->vm_vmid, kernfd, vmc, sizeof(*vmc));
>
>       if (strlen(vmc->vmc_cdrom))
>               proc_compose_imsg(ps, PROC_VMM, IMSG_VMDOP_START_VM_CDROM,
> @@ -528,6 +531,8 @@ config_setvm(struct privsep *ps, struct vmd_vm *vm, 
> uint32_t peerid, uid_t uid)
>       log_warnx("failed to start vm %s", vmc->vmc_name);
>
>       if (vm->vm_kernel != -1)
> +             close(vm->vm_kernel);
> +     if (kernfd != -1)
>               close(kernfd);
>       if (cdromfd != -1)
>               close(cdromfd);

Reply via email to