Re: vmd(8): fix vmctl wait state corruption
> On 24 Apr 2021, at 20:56, Dave Voutila wrote:
>
> Dave Voutila writes:
>
>> Dave Voutila writes:
>>
>>> vmd(8) users of tech@,
>>>
>>> NOTE: I have no intention to try to commit this prior to 6.9's release
>>> due to its complexity, but I didn't want to "wait" to solicit testers or
>>> potential feedback.
>>
>> Freeze is over, so bumping this thread with an updated diff below.
>
> Now that there's been some testing and snaps are building once again,
> anyone willing to review & OK?

Wanted to confirm here as well. The patch works well.

Ran this patch against -current with ~30 VMs owned by a user account.
Issued vmctl stop -aw, ctrl-c-ed every 3-4 VMs, and every time the last
VM stopped shut down properly. Even in rapid succession of vmctl stop -aw
+ ctrl-c, resulting in multiple VMs in "stopping" state, all worked well.
All VMs also started properly without any fsck needed.

Mischa

>>> I noticed recently that I could not have two vmctl(8) clients "wait" for
>>> the same vm to shutdown as one would cancel the other. Worse yet, if you
>>> cancel a wait (^C) you can effectively corrupt the state being used for
>>> tracking the waiting client, preventing future clients from waiting on
>>> the vm.
>>>
>>> It turns out the socket fd of the vmctl(8) client is being sent by the
>>> control process as the peerid in the imsg. This fd is being stored on
>>> the vmd_vm structure in the vm_peerid member, but this fd only has
>>> meaning in the scope of the control process. Consequently:
>>>
>>> - only 1 value can be stored at a time, meaning only 1 waiting client
>>>   can exist at a time
>>> - since vm_peerid is used for storing if another vmd(8) process is
>>>   waiting on the vm, vm_peerid can be corrupted by vmctl(8)
>>> - the control process cannot update waiting state on vmctl disconnects,
>>>   and since fd's are reused it's possible the message could be sent to
>>>   a vmctl(8) client performing an operation other than a "wait"
>>>
>>> The below diff:
>>>
>>> 1. enables support for multiple vmctl(8) clients to wait on the same vm
>>>    to terminate
>>> 2. keeps the wait state in the control process and out of the parent's
>>>    global vm state, tracking waiting parties in a TAILQ
>>> 3. removes the waiting client state on client disconnect/cancellation
>>> 4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
>>>    from the vmm process, which isn't needed (and was partially
>>>    responsible for the corruption)
>>
>> Above design still stands, but I've fixed some messaging issues related
>> to the fact the parent process was forwarding
>> IMSG_VMDOP_TERMINATE_VM_RESPONSE messages directly to the control
>> process, resulting in duplicate messages. This broke doing a `vmctl stop`
>> for all vms (-a) and waiting (-w). It now only forwards errors.
>>
>>> There are some subsequent tweaks that may follow this diff, specifically
>>> one related to the fact I've switched the logic to send
>>> IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
>>> makes sense to me) but control relays an IMSG_VMDOP_TERMINATE_VM_RESPONSE
>>> message to the waiting vmctl(8) client. I'd need to update vmctl(8) to
>>> look for the other event and don't want to complicate the diff further.
>>>
>>> If any testers out there can try to break this for me it would be much
>>> appreciated. :-)
>>
>> Testers? I'd like to give people a few days to kick the tires before
>> asking for OK to commit.
>>
>> -dv
>>
>> Index: control.c
>> ===================================================================
>> RCS file: /cvs/src/usr.sbin/vmd/control.c,v
>> retrieving revision 1.34
>> diff -u -p -r1.34 control.c
>> --- control.c	20 Apr 2021 21:11:56 -0000	1.34
>> +++ control.c	21 Apr 2021 17:17:04 -0000
>> @@ -41,6 +41,13 @@
>>
>>  struct ctl_connlist ctl_conns = TAILQ_HEAD_INITIALIZER(ctl_conns);
>>
>> +struct ctl_notify {
>> +	int			ctl_fd;
>> +	uint32_t		ctl_vmid;
>> +	TAILQ_ENTRY(ctl_notify)	entry;
>> +};
>> +TAILQ_HEAD(ctl_notify_q, ctl_notify) ctl_notify_q =
>> +	TAILQ_HEAD_INITIALIZER(ctl_notify_q);
>>
>>  void	 control_accept(int, short, void *);
>>  struct ctl_conn
>> @@ -78,7 +85,10 @@ int
>>  control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
>>  {
>>  	struct ctl_conn		*c;
>> +	struct ctl_notify	*notify = NULL, *notify_next;
>>  	struct privsep		*ps = p->p_ps;
>> +	struct vmop_result	 vmr;
>> +	int			 waiting = 0;
>>
>>  	switch (imsg->hdr.type) {
>>  	case IMSG_VMDOP_START_VM_RESPONSE:
>> @@ -86,11 +96,11 @@ control_dispatch_vmd(int fd, struct priv
>>  	case IMSG_VMDOP_SEND_VM_RESPONSE:
>>  	case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
>>  	case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
>> -	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
>>  	case IMSG_VMDOP_GET_INFO_VM_DATA:
>>  	case IMSG_VMDOP_GET_INFO_VM_END_DATA:
>>  	case
Re: vmd(8): fix vmctl wait state corruption
Dave Voutila writes:

> Dave Voutila writes:
>
>> vmd(8) users of tech@,
>>
>> NOTE: I have no intention to try to commit this prior to 6.9's release
>> due to its complexity, but I didn't want to "wait" to solicit testers or
>> potential feedback.
>
> Freeze is over, so bumping this thread with an updated diff below.

Now that there's been some testing and snaps are building once again,
anyone willing to review & OK?

>> I noticed recently that I could not have two vmctl(8) clients "wait" for
>> the same vm to shutdown as one would cancel the other. Worse yet, if you
>> cancel a wait (^C) you can effectively corrupt the state being used for
>> tracking the waiting client, preventing future clients from waiting on
>> the vm.
>>
>> It turns out the socket fd of the vmctl(8) client is being sent by the
>> control process as the peerid in the imsg. This fd is being stored on
>> the vmd_vm structure in the vm_peerid member, but this fd only has
>> meaning in the scope of the control process. Consequently:
>>
>> - only 1 value can be stored at a time, meaning only 1 waiting client
>>   can exist at a time
>> - since vm_peerid is used for storing if another vmd(8) process is
>>   waiting on the vm, vm_peerid can be corrupted by vmctl(8)
>> - the control process cannot update waiting state on vmctl disconnects,
>>   and since fd's are reused it's possible the message could be sent to
>>   a vmctl(8) client performing an operation other than a "wait"
>>
>> The below diff:
>>
>> 1. enables support for multiple vmctl(8) clients to wait on the same vm
>>    to terminate
>> 2. keeps the wait state in the control process and out of the parent's
>>    global vm state, tracking waiting parties in a TAILQ
>> 3. removes the waiting client state on client disconnect/cancellation
>> 4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
>>    from the vmm process, which isn't needed (and was partially
>>    responsible for the corruption)

Above design still stands, but I've fixed some messaging issues related
to the fact the parent process was forwarding
IMSG_VMDOP_TERMINATE_VM_RESPONSE messages directly to the control
process, resulting in duplicate messages. This broke doing a `vmctl stop`
for all vms (-a) and waiting (-w). It now only forwards errors.

>> There are some subsequent tweaks that may follow this diff, specifically
>> one related to the fact I've switched the logic to send
>> IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
>> makes sense to me) but control relays an IMSG_VMDOP_TERMINATE_VM_RESPONSE
>> message to the waiting vmctl(8) client. I'd need to update vmctl(8) to
>> look for the other event and don't want to complicate the diff further.
>>
>> If any testers out there can try to break this for me it would be much
>> appreciated. :-)

Testers? I'd like to give people a few days to kick the tires before
asking for OK to commit.

> -dv
>
> Index: control.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/vmd/control.c,v
> retrieving revision 1.34
> diff -u -p -r1.34 control.c
> --- control.c	20 Apr 2021 21:11:56 -0000	1.34
> +++ control.c	21 Apr 2021 17:17:04 -0000
> @@ -41,6 +41,13 @@
>
>  struct ctl_connlist ctl_conns = TAILQ_HEAD_INITIALIZER(ctl_conns);
>
> +struct ctl_notify {
> +	int			ctl_fd;
> +	uint32_t		ctl_vmid;
> +	TAILQ_ENTRY(ctl_notify)	entry;
> +};
> +TAILQ_HEAD(ctl_notify_q, ctl_notify) ctl_notify_q =
> +	TAILQ_HEAD_INITIALIZER(ctl_notify_q);
>
>  void	 control_accept(int, short, void *);
>  struct ctl_conn
> @@ -78,7 +85,10 @@ int
>  control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
>  {
>  	struct ctl_conn		*c;
> +	struct ctl_notify	*notify = NULL, *notify_next;
>  	struct privsep		*ps = p->p_ps;
> +	struct vmop_result	 vmr;
> +	int			 waiting = 0;
>
>  	switch (imsg->hdr.type) {
>  	case IMSG_VMDOP_START_VM_RESPONSE:
> @@ -86,11 +96,11 @@ control_dispatch_vmd(int fd, struct priv
>  	case IMSG_VMDOP_SEND_VM_RESPONSE:
>  	case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
>  	case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
> -	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
>  	case IMSG_VMDOP_GET_INFO_VM_DATA:
>  	case IMSG_VMDOP_GET_INFO_VM_END_DATA:
>  	case IMSG_CTL_FAIL:
>  	case IMSG_CTL_OK:
> +		/* Provide basic response back to a specific control client */
>  		if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
>  			log_warnx("%s: lost control connection: fd %d",
>  			    __func__, imsg->hdr.peerid);
> @@ -99,6 +109,61 @@ control_dispatch_vmd(int fd, struct priv
>  		imsg_compose_event(&c->iev, imsg->hdr.type,
>  		    0, 0, imsg->fd, imsg->data, IMSG_DATA_SIZE(imsg));
>  		break;
> +	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
Re: vmd(8): fix vmctl wait state corruption
Dave Voutila writes:

> vmd(8) users of tech@,
>
> NOTE: I have no intention to try to commit this prior to 6.9's release
> due to its complexity, but I didn't want to "wait" to solicit testers or
> potential feedback.

Freeze is over, so bumping this thread with an updated diff below.

> I noticed recently that I could not have two vmctl(8) clients "wait" for
> the same vm to shutdown as one would cancel the other. Worse yet, if you
> cancel a wait (^C) you can effectively corrupt the state being used for
> tracking the waiting client, preventing future clients from waiting on
> the vm.
>
> It turns out the socket fd of the vmctl(8) client is being sent by the
> control process as the peerid in the imsg. This fd is being stored on
> the vmd_vm structure in the vm_peerid member, but this fd only has
> meaning in the scope of the control process. Consequently:
>
> - only 1 value can be stored at a time, meaning only 1 waiting client
>   can exist at a time
> - since vm_peerid is used for storing if another vmd(8) process is
>   waiting on the vm, vm_peerid can be corrupted by vmctl(8)
> - the control process cannot update waiting state on vmctl disconnects,
>   and since fd's are reused it's possible the message could be sent to
>   a vmctl(8) client performing an operation other than a "wait"
>
> The below diff:
>
> 1. enables support for multiple vmctl(8) clients to wait on the same vm
>    to terminate
> 2. keeps the wait state in the control process and out of the parent's
>    global vm state, tracking waiting parties in a TAILQ
> 3. removes the waiting client state on client disconnect/cancellation
> 4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
>    from the vmm process, which isn't needed (and was partially
>    responsible for the corruption)

Above design still stands, but I've fixed some messaging issues related
to the fact the parent process was forwarding
IMSG_VMDOP_TERMINATE_VM_RESPONSE messages directly to the control
process, resulting in duplicate messages. This broke doing a `vmctl stop`
for all vms (-a) and waiting (-w). It now only forwards errors.

> There are some subsequent tweaks that may follow this diff, specifically
> one related to the fact I've switched the logic to send
> IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
> makes sense to me) but control relays an IMSG_VMDOP_TERMINATE_VM_RESPONSE
> message to the waiting vmctl(8) client. I'd need to update vmctl(8) to
> look for the other event and don't want to complicate the diff further.
>
> If any testers out there can try to break this for me it would be much
> appreciated. :-)

Testers? I'd like to give people a few days to kick the tires before
asking for OK to commit.

-dv

Index: control.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/control.c,v
retrieving revision 1.34
diff -u -p -r1.34 control.c
--- control.c	20 Apr 2021 21:11:56 -0000	1.34
+++ control.c	21 Apr 2021 17:17:04 -0000
@@ -41,6 +41,13 @@

 struct ctl_connlist ctl_conns = TAILQ_HEAD_INITIALIZER(ctl_conns);

+struct ctl_notify {
+	int			ctl_fd;
+	uint32_t		ctl_vmid;
+	TAILQ_ENTRY(ctl_notify)	entry;
+};
+TAILQ_HEAD(ctl_notify_q, ctl_notify) ctl_notify_q =
+	TAILQ_HEAD_INITIALIZER(ctl_notify_q);

 void	 control_accept(int, short, void *);
 struct ctl_conn
@@ -78,7 +85,10 @@ int
 control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
 {
 	struct ctl_conn		*c;
+	struct ctl_notify	*notify = NULL, *notify_next;
 	struct privsep		*ps = p->p_ps;
+	struct vmop_result	 vmr;
+	int			 waiting = 0;

 	switch (imsg->hdr.type) {
 	case IMSG_VMDOP_START_VM_RESPONSE:
@@ -86,11 +96,11 @@ control_dispatch_vmd(int fd, struct priv
 	case IMSG_VMDOP_SEND_VM_RESPONSE:
 	case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
 	case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
-	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
 	case IMSG_VMDOP_GET_INFO_VM_DATA:
 	case IMSG_VMDOP_GET_INFO_VM_END_DATA:
 	case IMSG_CTL_FAIL:
 	case IMSG_CTL_OK:
+		/* Provide basic response back to a specific control client */
 		if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
 			log_warnx("%s: lost control connection: fd %d",
 			    __func__, imsg->hdr.peerid);
@@ -99,6 +109,61 @@ control_dispatch_vmd(int fd, struct priv
 		imsg_compose_event(&c->iev, imsg->hdr.type,
 		    0, 0, imsg->fd, imsg->data, IMSG_DATA_SIZE(imsg));
 		break;
+	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
+		IMSG_SIZE_CHECK(imsg, &vmr);
+		memcpy(&vmr, imsg->data, sizeof(vmr));
+
+		if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
+			log_warnx("%s: lost control connection: fd
vmd(8): fix vmctl wait state corruption
vmd(8) users of tech@,

NOTE: I have no intention to try to commit this prior to 6.9's release
due to its complexity, but I didn't want to "wait" to solicit testers or
potential feedback.

I noticed recently that I could not have two vmctl(8) clients "wait" for
the same vm to shutdown as one would cancel the other. Worse yet, if you
cancel a wait (^C) you can effectively corrupt the state being used for
tracking the waiting client, preventing future clients from waiting on
the vm.

It turns out the socket fd of the vmctl(8) client is being sent by the
control process as the peerid in the imsg. This fd is being stored on
the vmd_vm structure in the vm_peerid member, but this fd only has
meaning in the scope of the control process. Consequently:

- only 1 value can be stored at a time, meaning only 1 waiting client
  can exist at a time
- since vm_peerid is used for storing if another vmd(8) process is
  waiting on the vm, vm_peerid can be corrupted by vmctl(8)
- the control process cannot update waiting state on vmctl disconnects,
  and since fd's are reused it's possible the message could be sent to
  a vmctl(8) client performing an operation other than a "wait"

The below diff:

1. enables support for multiple vmctl(8) clients to wait on the same vm
   to terminate
2. keeps the wait state in the control process and out of the parent's
   global vm state, tracking waiting parties in a TAILQ
3. removes the waiting client state on client disconnect/cancellation
4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
   from the vmm process, which isn't needed (and was partially
   responsible for the corruption)

There are some subsequent tweaks that may follow this diff, specifically
one related to the fact I've switched the logic to send
IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
makes sense to me) but control relays an IMSG_VMDOP_TERMINATE_VM_RESPONSE
message to the waiting vmctl(8) client. I'd need to update vmctl(8) to
look for the other event and don't want to complicate the diff further.

If any testers out there can try to break this for me it would be much
appreciated. :-)

-Dave

Index: control.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmd/control.c,v
retrieving revision 1.33
diff -u -p -r1.33 control.c
--- control.c	11 Apr 2021 18:53:23 -0000	1.33
+++ control.c	12 Apr 2021 11:20:08 -0000
@@ -39,7 +39,15 @@

 #define	CONTROL_BACKLOG	5

-struct ctl_connlist ctl_conns;
+struct ctl_notify {
+	int			ctl_fd;
+	uint32_t		ctl_vmid;
+	TAILQ_ENTRY(ctl_notify)	entry;
+};
+TAILQ_HEAD(ctl_notifylist, ctl_notify);
+
+struct ctl_notifylist	notifylist;
+struct ctl_connlist	ctl_conns;

 void	 control_accept(int, short, void *);
@@ -78,7 +86,10 @@ int
 control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
 {
 	struct ctl_conn		*c;
+	struct ctl_notify	*notify, *notify_next;
 	struct privsep		*ps = p->p_ps;
+	struct vmop_result	 vmr;
+	int			 waiting = 0;

 	switch (imsg->hdr.type) {
 	case IMSG_VMDOP_START_VM_RESPONSE:
@@ -86,11 +97,11 @@ control_dispatch_vmd(int fd, struct priv
 	case IMSG_VMDOP_SEND_VM_RESPONSE:
 	case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
 	case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
-	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
 	case IMSG_VMDOP_GET_INFO_VM_DATA:
 	case IMSG_VMDOP_GET_INFO_VM_END_DATA:
 	case IMSG_CTL_FAIL:
 	case IMSG_CTL_OK:
+		/* Provide basic response back to a specific control client */
 		if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
 			log_warnx("%s: lost control connection: fd %d",
 			    __func__, imsg->hdr.peerid);
@@ -99,6 +110,50 @@ control_dispatch_vmd(int fd, struct priv
 		imsg_compose_event(&c->iev, imsg->hdr.type,
 		    0, 0, imsg->fd, imsg->data, IMSG_DATA_SIZE(imsg));
 		break;
+	case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
+		IMSG_SIZE_CHECK(imsg, &vmr);
+		memcpy(&vmr, imsg->data, sizeof(vmr));
+
+		if ((c = control_connbyfd(imsg->hdr.peerid)) != NULL) {
+			TAILQ_FOREACH(notify, &notifylist, entry) {
+				if (notify->ctl_fd == (int) imsg->hdr.peerid) {
+					/* Update if waiting by vm name */
+					if (notify->ctl_vmid < 1)
+						notify->ctl_vmid = vmr.vmr_id;
+					waiting = 1;
+					break;
+				}
+			}
+		} else {
+			log_warnx("%s: lost control connection: fd %d",
+			    __func__,
Re: vmctl wait
On Mon, Dec 03, 2018 at 06:22:14PM +0100, Claudio Jeker wrote:
> This adds a feature to vmctl/vmd to wait for a VM to stop.
> It is a feature usable in many situations where you wait for a VM to
> halt after work is done. This is more or less vmctl stop -w without
> sending the termination to the VM.
>
> There is only one vmctl that can wait, so if a second one comes in the
> previous one will terminate with an error. This is why I modified the
> error path a bit in vmctl.
>
> --
> :wq Claudio

Ok ccardenas@

Carlos
vmctl wait
This adds a feature to vmctl/vmd to wait for a VM to stop.
It is a feature usable in many situations where you wait for a VM to
halt after work is done. This is more or less vmctl stop -w without
sending the termination to the VM.

There is only one vmctl that can wait, so if a second one comes in the
previous one will terminate with an error. This is why I modified the
error path a bit in vmctl.

--
:wq Claudio

Index: vmctl/main.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmctl/main.c,v
retrieving revision 1.48
diff -u -p -r1.48 main.c
--- vmctl/main.c	26 Nov 2018 10:39:30 -0000	1.48
+++ vmctl/main.c	3 Dec 2018 17:11:08 -0000
@@ -63,6 +63,7 @@ int	 ctl_reset(struct parse_result *, i
 int		 ctl_start(struct parse_result *, int, char *[]);
 int		 ctl_status(struct parse_result *, int, char *[]);
 int		 ctl_stop(struct parse_result *, int, char *[]);
+int		 ctl_waitfor(struct parse_result *, int, char *[]);
 int		 ctl_pause(struct parse_result *, int, char *[]);
 int		 ctl_unpause(struct parse_result *, int, char *[]);
 int		 ctl_send(struct parse_result *, int, char *[]);
@@ -82,6 +83,7 @@ struct ctl_command ctl_commands[] = {
 	    "\t\t[-n switch] [-i count] [-d disk]* [-t name]" },
 	{ "status",	CMD_STATUS,	ctl_status,	"[id]" },
 	{ "stop",	CMD_STOP,	ctl_stop,	"[id|-a] [-fw]" },
+	{ "wait",	CMD_WAITFOR,	ctl_waitfor,	"id" },
 	{ "pause",	CMD_PAUSE,	ctl_pause,	"id" },
 	{ "unpause",	CMD_UNPAUSE,	ctl_unpause,	"id" },
 	{ "send",	CMD_SEND,	ctl_send,	"id", 1},
@@ -178,7 +180,7 @@ parse(int argc, char *argv[])
 			err(1, "pledge");
 	}
 	if (ctl->main(&res, argc, argv) != 0)
-		err(1, "failed");
+		exit(1);

 	if (ctl_sock != -1) {
 		close(ibuf->fd);
@@ -251,6 +253,9 @@ vmmaction(struct parse_result *res)
 		imsg_compose(ibuf, IMSG_CTL_RESET, 0, 0, -1,
 		    &res->mode, sizeof(res->mode));
 		break;
+	case CMD_WAITFOR:
+		waitfor_vm(res->id, res->name);
+		break;
 	case CMD_PAUSE:
 		pause_vm(res->id, res->name);
 		break;
@@ -310,6 +315,9 @@ vmmaction(struct parse_result *res)
 			done = vm_start_complete(&imsg, &ret,
 			    tty_autoconnect);
 			break;
+		case CMD_WAITFOR:
+			flags = VMOP_WAIT;
+			/* FALLTHROUGH */
 		case CMD_STOP:
 			done = terminate_vm_complete(&imsg, &ret, flags);
@@ -337,7 +345,10 @@ vmmaction(struct parse_result *res)
 		}
 	}

-	return (0);
+	if (ret)
+		return (1);
+	else
+		return (0);
 }

 void
@@ -941,6 +952,18 @@ ctl_stop(struct parse_result *res, int a

 int
 ctl_console(struct parse_result *res, int argc, char *argv[])
+{
+	if (argc == 2) {
+		if (parse_vmid(res, argv[1], 0) == -1)
+			errx(1, "invalid id: %s", argv[1]);
+	} else if (argc != 2)
+		ctl_usage(res->ctl);
+
+	return (vmmaction(res));
+}
+
+int
+ctl_waitfor(struct parse_result *res, int argc, char *argv[])
 {
 	if (argc == 2) {
 		if (parse_vmid(res, argv[1], 0) == -1)
Index: vmctl/vmctl.8
===================================================================
RCS file: /cvs/src/usr.sbin/vmctl/vmctl.8,v
retrieving revision 1.54
diff -u -p -r1.54 vmctl.8
--- vmctl/vmctl.8	20 Nov 2018 12:48:16 -0000	1.54
+++ vmctl/vmctl.8	3 Dec 2018 17:11:08 -0000
@@ -234,6 +234,8 @@ Stop all running VMs.
 .It Cm unpause Ar id
 Unpause (resume from a paused state) a VM with the specified
 .Ar id .
+.It Cm wait Ar id
+Wait until the specified VM has stopped.
 .El
 .Pp
 If the
Index: vmctl/vmctl.c
===================================================================
RCS file: /cvs/src/usr.sbin/vmctl/vmctl.c,v
retrieving revision 1.63
diff -u -p -r1.63 vmctl.c
--- vmctl/vmctl.c	26 Nov 2018 10:39:30 -0000	1.63
+++ vmctl/vmctl.c	3 Dec 2018 17:11:08 -0000
@@ -496,6 +496,10 @@ terminate_vm_complete(struct imsg *imsg,
 		fprintf(stderr, "vm not found\n");
 		*ret = EIO;
 		break;
+	case EINTR:
+		fprintf(stderr, "interrupted call\n");
+		*ret = EIO;
+		break;
 	default:
 		errno = res;
 		fprintf(stderr, "failed: %s\n",