On 10/24/18 10:38 AM, Andrii Anisov wrote:
Hello Julien,
Hi Andrii,
Thank you for the review.
On 23.10.18 21:17, Julien Grall wrote:
Devices that expose their interrupt status registers via system
registers (e.g. Statistical profiling, CPU PMU, DynamIQ PMU, arch timer,
vgic (although unused by Linux), ...)
I guess under vgic you mean GICH registers, is it correct?
I would speculate that GIC v2 registers are memory mapped, not exposed
via system registers.
vGIC is not only about GICv2. It also covers GICv3 that is accessible
via system registers.
rely on a context synchronising
operation on the CPU to ensure that the updated status register is
visible to the CPU when handling the interrupt. This usually happens as
a result of taking the IRQ exception in the first place, but there are
two race scenarios where this isn't the case.
For example, let's say we have two peripherals (X and Y), where Y uses a
system register for its interrupt status.
Case 1:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. Y then raises its interrupt line, but the update to its system
register is not yet visible to the CPU
3. The GIC decides to expose Y's interrupt number first in the Ack
register
4. The CPU runs the IRQ handler for Y, but the status register is stale
But this scenario somehow explains a strange thing I saw during IRQ
latency investigation (optimization attempts) during last weeks.
The strange sequence is as following:
1. CPU takes an IRQ exception from the guest context
2. In `enter_hypervisor_head()` function (i.e. in `gic_clear_lrs()`
as for mainline) from some LR registers interrupt statuses are read as
PENDING
3. Performing further code, without returning to the guest (i.e.
inside vgic_vcpu_inject_irq()), it happens that we read status INVALID
from the LR we read PENDING before, in step 2. >
Please note that we are tailoring xen based on RELEASE-4.10.0.
As you tailor Xen, I need more details on your modification to be able
to provide feedback on the oddness you encounter.
But you are using GICv2, right? If so, you are not using system
registers for the vGIC. This is not covered by this patch.
Case 2:
1. CPU takes an IRQ exception as a result of X raising an interrupt
2. CPU reads the interrupt number for X from the Ack register and runs
its IRQ handler
3. Y raises its interrupt line and the Ack register is updated, but
again, the update to its system register is not yet visible to the
CPU.
4. Since the GIC drivers poll the Ack register, we read Y's interrupt
number and run its handler without a context synchronisation
operation, therefore seeing the stale register value.
In either case, we run the risk of missing an IRQ. This patch solves the
problem by ensuring that we execute an ISB in the GIC drivers prior
to invoking the interrupt handler.
Based on Linux commit 39a06b67c2c1256bcf2361a1f67d2529f70ab206
"irqchip/gic: Ensure we have an ISB between ack and ->handle_irq".
Signed-off-by: Julien Grall <julien.gr...@arm.com>
---
This patch is a candidate for backporting up to Xen 4.9.
---
xen/arch/arm/gic.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 8d7e491060..305fbd66dd 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -388,12 +388,14 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
if ( likely(irq >= 16 && irq < 1020) )
{
local_irq_enable();
+ isb();
Taking in account that the first GICH accesses are from
`gic_clear_lrs()`, called from `enter_hypervisor_head`, I would suggest
moving isb() there.
do_IRQ(regs, irq, is_fiq);
local_irq_disable();
}
else if ( is_lpi(irq) )
{
local_irq_enable();
+ isb();
gic_hw_ops->do_LPI(irq);
local_irq_disable();
}
Cheers,
--
Julien Grall
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel