On Mon, Mar 04, 2019 at 01:25:16PM +0200, Wictor Lund wrote:
> Hi misc@!
>
> I have figured out that it is possible to get vmd(8) into a state where
>  1) com1_dev.rcv_pending != 0
>  2) there is data pending on com1_dev.fd
>  3) the guest doesn't seem to care
>
> This results in a locked up situation where com_rcv_event() is called
> indefinitely.  It seems to me that an interrupt is lost somewhere, leading
> to a situation where the guest OS is happily ignorant of the available data,
> while the vmm is waiting for the guest to eat it up.
>
> This has made it impossible to install Linux via the serial console on
> vmm(4).  It seems that people have previously reported "freezing" problems
> in vmm(4) from time to time, but when reported no one else has been able to
> reproduce it.
>
> I have solved the problem for myself by changing com_rcv_event() to the
> following:
>
> static void
> com_rcv_event(int fd, short kind, void *arg)
> {
> 	mutex_lock(&com1_dev.mutex);
>
> 	/*
> 	 * We already have other data pending to be received. The data that
> 	 * has become available now will be moved to the com port later.
> 	 */
> 	if (com1_dev.rcv_pending) {
> 		/* If pending interrupt, inject */
> 		if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
> 			utrace("comrcv injintr", &com1_dev.regs.lsr,
> 			    sizeof(com1_dev.regs.lsr));
> 			/* XXX: vcpu_id */
> 			vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
> 			vcpu_deassert_pic_irq((uintptr_t) arg, 0,
> 			    com1_dev.irq);
> 		}
> 		mutex_unlock(&com1_dev.mutex);
> 		return;
> 	}
> 	if (com1_dev.regs.lsr & LSR_RXRDY)
> 		com1_dev.rcv_pending = 1;
> 	else {
> 		com_rcv(&com1_dev, (uintptr_t) arg, 0);
>
> 		/* If pending interrupt, inject */
> 		if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
> 			/* XXX: vcpu_id */
> 			vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
> 			vcpu_deassert_pic_irq((uintptr_t) arg, 0,
> 			    com1_dev.irq);
> 		}
> 	}
>
> 	mutex_unlock(&com1_dev.mutex);
> }
>
> However, I have little experience with interrupt behaviour on x86.  I'm
> also aware that there has been an attempt to fix this behaviour [1].
>
> I think the problem is that when com_rcv() is called from
> vcpu_process_com_data(), the interrupt is triggered using vcpu_exit_inout(),
> which was not touched in the previous attempt [1] to fix the "freezing"
> problem.  vcpu_exit_inout() still uses a simple vcpu_assert_pic_irq() call
> to trigger the interrupt, while for example com_rcv_event() uses the
> vcpu_assert_pic_irq(); vcpu_deassert_pic_irq() sequence to trigger it.
>
> With my modifications to com_rcv_event() I was able to install not only
> Alpine Linux, but even Debian using the serial console.  Without the
> modification I can't even install Alpine Linux via the serial console.
>
> Any thoughts on this?  If people think my change is a sound one, I can make
> a proper patch for it.  If people think the change is unsound, I would have
> to look into changing vcpu_exit_inout() and probably extend its interface
> to decide how the interrupt should be triggered.
>
> 1. https://marc.info/?l=openbsd-cvs&m=153115270302514&w=2
>
> --
> Wictor Lund
>
Thanks Wictor! Can you make a proper diff and resend please?

-ml
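
For readers following the thread, the difference between a bare assert and the assert/deassert sequence mentioned in the quoted message can be illustrated with a toy model. This is only a sketch under the assumption that the interrupt input behaves as edge-triggered (a request is latched on a low-to-high transition); whether that is the actual cause of the lockup in vmd(8) is not established here. All `toy_*` names are hypothetical and are not part of vmd(8) or its i8259 emulation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy edge-triggered interrupt input; NOT vmd's i8259 emulation. */
struct toy_line {
	bool		level;		/* current state of the input pin */
	unsigned int	latched;	/* requests latched on rising edges */
};

/* Raise the pin; a request is latched only on a low-to-high transition. */
static void
toy_assert(struct toy_line *l)
{
	assert(l != NULL);
	if (!l->level)
		l->latched++;
	l->level = true;
}

/* Lower the pin so that the next assert produces a fresh rising edge. */
static void
toy_deassert(struct toy_line *l)
{
	assert(l != NULL);
	l->level = false;
}

/* Assert-only style: the pin is never lowered between events. */
static unsigned int
toy_count_assert_only(unsigned int events)
{
	struct toy_line l = { false, 0 };
	unsigned int i;

	for (i = 0; i < events; i++)
		toy_assert(&l);
	return (l.latched);
}

/* Pulsed style: every event is an assert followed by a deassert. */
static unsigned int
toy_count_pulsed(unsigned int events)
{
	struct toy_line l = { false, 0 };
	unsigned int i;

	for (i = 0; i < events; i++) {
		toy_assert(&l);
		toy_deassert(&l);
	}
	return (l.latched);
}
```

In this model, three receive events signalled with assert only latch a single request, because the pin stays high after the first call, while three pulsed events latch three. That matches the symptom described in the quoted message (data pending, guest never notified again), but it is a simplification and says nothing about masking, priorities, or end-of-interrupt handling.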