On Mon, Mar 04, 2019 at 01:25:16PM +0200, Wictor Lund wrote:
> Hi misc@!
> 
> I have figured out that it is possible to get vmd(8) into a state where
> 1) com1_dev.rcv_pending != 0 
> 2) there is data pending on com1_dev.fd
> 3) the guest doesn't seem to care
> 
> This results in a lockup where com_rcv_event() is called
> indefinitely.  It seems to me that an interrupt is lost somewhere, leading
> to a situation where the guest OS is happily ignorant of the available data,
> while the vmm is waiting for the guest to eat it up.
> 
> This has made it impossible to install Linux via the serial console on
> vmm(4).  It seems that people have previously reported "freezing" problems
> in vmm(4) from time to time, but when reported no one else has been able to
> reproduce them.
> 
> I have solved the problem for myself by changing com_rcv_event() to the
> following:
> 
> static void
> com_rcv_event(int fd, short kind, void *arg)
> {
>         mutex_lock(&com1_dev.mutex);
> 
>         /*
>          * We already have other data pending to be received. The data that
>          * has become available now will be moved to the com port later.
>          */
>         if (com1_dev.rcv_pending) {
>                 /* If pending interrupt, inject */
>                 if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
>                         utrace("comrcv injintr", &com1_dev.regs.lsr,
>                             sizeof(com1_dev.regs.lsr));
>                         /* XXX: vcpu_id */
>                         vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
>                         vcpu_deassert_pic_irq((uintptr_t) arg, 0,
>                             com1_dev.irq);
>                 }
>                 mutex_unlock(&com1_dev.mutex);
>                 return;
>         }
>         if (com1_dev.regs.lsr & LSR_RXRDY)
>                 com1_dev.rcv_pending = 1;
>         else {
>                 com_rcv(&com1_dev, (uintptr_t) arg, 0);
> 
>                 /* If pending interrupt, inject */
>                 if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
>                         /* XXX: vcpu_id */
>                         vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
>                         vcpu_deassert_pic_irq((uintptr_t) arg, 0,
>                             com1_dev.irq);
>                 }
>         }
> 
>         mutex_unlock(&com1_dev.mutex);
> }
> 
> However, I have little experience with interrupt behaviour on x86.  I'm
> also aware that there has been an attempt to fix this behaviour [1].
> 
> I think the problem is that when com_rcv() is called from
> vcpu_process_com_data(), the interrupt is triggered using vcpu_exit_inout(),
> which was not touched in the previous attempt [1] to fix the "freezing"
> problem.  vcpu_exit_inout() still uses a simple vcpu_assert_pic_irq() call
> to trigger the interrupt while for example com_rcv_event() uses the
> vcpu_assert_pic_irq(); vcpu_deassert_pic_irq() sequence to trigger it.
> 
> With my modifications to com_rcv_event() I was able to install not only
> Alpine Linux, but even Debian using the serial console.  Without the
> modification I can't even install Alpine Linux via the serial console.
> 
> Any thoughts on this?  If people think my change is a sound one, I can make
> a proper patch for it.  If people think the change is unsound, I would have
> to look into changing vcpu_exit_inout() and probably extend the interface to
> it to decide how the interrupt should be triggered.
> 
> 1. https://marc.info/?l=openbsd-cvs&m=153115270302514&w=2
> 
> --
> Wictor Lund
> 

Thanks Wictor!

Can you make a proper diff and resend please?

-ml
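
For readers following the thread: the difference Wictor points at, between a
bare vcpu_assert_pic_irq() and the assert/deassert pair used in
com_rcv_event(), can be illustrated with a toy model of an edge-triggered
interrupt line.  This is only a sketch under assumptions: the toy_* names
are hypothetical, nothing here uses or reproduces vmd(8)'s i8259 code, and
whether a missing falling edge is in fact what causes the hang is not
confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one edge-triggered interrupt line. */
struct toy_line {
	bool	level;	/* current level of the line */
	int	edges;	/* rising edges seen, i.e. interrupts latched */
};

/* Raise the line; only a low-to-high transition counts as a request. */
static void
toy_assert(struct toy_line *l)
{
	if (!l->level)
		l->edges++;
	l->level = true;
}

/* Lower the line so that the next assert produces a fresh edge. */
static void
toy_deassert(struct toy_line *l)
{
	l->level = false;
}

/*
 * Signal n events on a fresh line and return how many edges were seen.
 * With pulse == false the line is only ever asserted; with pulse == true
 * each event is an assert followed by a deassert.
 */
static int
toy_count_edges(int n, bool pulse)
{
	struct toy_line l = { false, 0 };
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++) {
		toy_assert(&l);
		if (pulse)
			toy_deassert(&l);
	}
	return l.edges;
}
```

In this model toy_count_edges(3, false) returns 1, because the line stays
high after the first assert, while toy_count_edges(3, true) returns 3.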
