On Fri, Oct 15, 2021 at 11:54 AM Alistair Francis <[email protected]> wrote: > > On Thu, Sep 16, 2021 at 11:42 PM Anup Patel <[email protected]> wrote: > > > > On Wed, Sep 15, 2021 at 6:19 AM Alistair Francis <[email protected]> > > wrote: > > > > > > On Tue, Sep 14, 2021 at 2:33 AM Anup Patel <[email protected]> wrote: > > > > > > > > On Thu, Sep 9, 2021 at 12:14 PM Alistair Francis <[email protected]> > > > > wrote: > > > > > > > > > > On Thu, Sep 2, 2021 at 9:26 PM Anup Patel <[email protected]> wrote: > > > > > > > > > > > > The guest external interrupts for external interrupt controller are > > > > > > not delivered to the guest running under hypervisor on time. This > > > > > > results in a guest having sluggish response to serial console input > > > > > > and other I/O events. To improve timely delivery of guest external > > > > > > interrupts, we check and inject interrupt upon every sret > > > > > > instruction. > > > > > > > > > > > > Signed-off-by: Anup Patel <[email protected]> > > > > > > --- > > > > > > target/riscv/op_helper.c | 9 +++++++++ > > > > > > 1 file changed, 9 insertions(+) > > > > > > > > > > > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c > > > > > > index ee7c24efe7..4c995c239e 100644 > > > > > > --- a/target/riscv/op_helper.c > > > > > > +++ b/target/riscv/op_helper.c > > > > > > @@ -129,6 +129,15 @@ target_ulong helper_sret(CPURISCVState *env, > > > > > > target_ulong cpu_pc_deb) > > > > > > > > > > > > riscv_cpu_set_mode(env, prev_priv); > > > > > > > > > > > > + /* > > > > > > + * QEMU does not promptly deliver guest external interrupts > > > > > > + * to a guest running on a hypervisor which in-turn is running > > > > > > + * on QEMU. We make dummy call to riscv_cpu_update_mip() upon > > > > > > + * every sret instruction so that QEMU pickup guest external > > > > > > + * interrupts sooner. > > > > > > + */ > > > > > > + riscv_cpu_update_mip(env_archcpu(env), 0, 0); > > > > > > > > > > This doesn't seem right. I don't understand why we need this? > > > > > > > > > > riscv_cpu_update_mip() is called when an interrupt is delivered to the > > > > > CPU, if we are missing interrupts then that is a bug somewhere else. > > > > > > > > I have finally figured out the cause of guest external interrupts being > > > > missed by Guest/VM. > > > > > > > > The riscv_cpu_set_irq() which handles guest external interrupt lines > > > > of each CPU is called asynchronously. This function in-turn calls > > > > riscv_cpu_update_mip() but the CPU might be in host mode (V=0) > > > > or in Guest/VM mode (V=1). If the CPU is in host mode (V=0) when > > > > > > The IRQ being raised should just directly call riscv_cpu_update_mip() > > > so the IRQ should happen straight away. > > > > That's not true for guest external interrupts. The target Guest/VM might > > not be running on the receiving HART. > > > > Let say IMSIC injected guest external IRQ1 to HARTx which is meant > > for a Guest/VM, so the riscv_cpu_set_irq() will call riscv_cpu_update_mip(). > > If HARTx might be in HS-mode or HARTx might be running some > > other Guest/VM then cpu_interrupt() request queued by riscv_cpu_update_mip() > > will not result in any interrupt being injected. This further means that > > QEMU has to check and inject guest external interrupts to target > > Guest/VM when HARTx makes a switch from HS-mode to VS-mode. By > > calling riscv_cpu_update_mip() upon SRET instruction we are ensuring > > that if any guest external interrupt was missed then it is injected ot > > VS-mode. > > Ah ok. > > So the problem is that an interrupt can occur for a guest, while that > guest isn't executing.
Yes, that's right. > > So for example a CPU is executing with V=0. `riscv_cpu_update_mip()` > is called, which triggers a hard interrupt. That in turn calls > `riscv_cpu_exec_interrupt()` and `riscv_cpu_local_irq_pending()`. In this case, the hard interrupt is actually a guest external interrupt which is tracked via hgeip CSR. The hgeip CSR is updated immediately but `riscv_cpu_local_irq_pending()` might be called while V=0 hence no interrupt. > > This results in the guest Hypervisor receiving the interrupt, which it > then doesn't act on? Or is MIP set but `riscv_cpu_local_irq_pending()` > returns false due to the enable checks? Here, hgeip CSR will be set and it will be reflected in mip.VSEIP bit only when hypervisor schedules/runs the Guest (i.e. V=1 and hstatus.VGEIN pointing to the appropriate bit of hgeip csr). > > That either seems like a guest bug or that we need some functionality > in `riscv_cpu_swap_hypervisor_regs()` to trigger an interrupt on > context swap. This certainly is not a bug with Guest or Hypervisor but rather an issue of invoking `riscv_cpu_exec_interrupt()` and `riscv_cpu_local_irq_pending()` at the right time. Your suggestion of checking and triggering guest external interrupt in `riscv_cpu_swap_hypervisor_regs()` is a better approach. If you are okay then I will update this patch in the next revision. Regards, Anup > > Alistair > > > > > > > > > Even from MTTCG I see this: > > > > > > """ > > > Currently thanks to KVM work any access to IO memory is automatically > > > protected by the global iothread mutex, also known as the BQL (Big > > > Qemu Lock). Any IO region that doesn't use global mutex is expected to > > > do its own locking. > > > > > > However IO memory isn't the only way emulated hardware state can be > > > modified. Some architectures have model specific registers that > > > trigger hardware emulation features. Generally any translation helper > > > that needs to update more than a single vCPUs of state should take the > > > BQL. > > > """ > > > > > > So we should be fine here as well. > > > > > > Can you supply a test case to reproduce the bug? > > > > Just boot Linux Guest using my QEMU, OpenSBI, Linux, and KVMTOOL > > having AIA support patches. If this patch is not there then lot of Guest > > external interrupts are missed and Guest gets stuck at random places. > > > > Regards, > > Anup > > > > > > > > > the riscv_cpu_set_irq() is called, then the CPU interrupt requested by > > > > riscv_cpu_update_mip() has no effect because the CPU can't take > > > > the interrupt until it enters Guest/VM mode. > > > > > > > > This patch does the right thing by doing a dummy call to > > > > riscv_cpu_update_mip() upon SRET instruction so that if the CPU > > > > had missed a guest interrupt previously then the CPU can take it now. > > > > > > This still doesn't look like the right fix. > > > > > > Alistair > > > > > > > > > > > Regards, > > > > Anup
