Last month, we experienced several guests crash(6cores-8cores), qemu logs display the following messages:
qemu-system-x86_64: /build/qemu-2.1.2/kvm-all.c:976: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. After analysis and verification, we can confirm it's irq-balance daemon(in guest) leads to the assertion failure. Start a 8 core guest with two disks, execute the following scripts will reproduce the BUG quickly: irq_affinity.sh ======================================================================== #!/bin/sh vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq > /proc/irq/$vda_irq_num/smp_affinity echo $irq > /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done ======================================================================== QEMU setup static irq route entries in kvm_pc_setup_irq_routing(), PIC and IOAPIC share the first 15 GSI numbers, take up 23 GSI numbers, but take up 38 irq route entries. When change irq smp_affinity in guest, a dynamic route entry may be setup, the current logic is: if allocate GSI number succeeds, a new route entry can be added. The available dynamic GSI numbers is 1021(KVM_MAX_IRQ_ROUTES-23), but available irq route entries is only 986(KVM_MAX_IRQ_ROUTES-38), GSI numbers greater than route entries. irq-balance's behavior will eventually leads to total irq route entries exceed KVM_MAX_IRQ_ROUTES, ioctl(KVM_SET_GSI_ROUTING) fail and kvm_irqchip_commit_routes() trigger assertion failure. This patch fix the BUG. Signed-off-by: Wenshuang Ma <kevin...@tencent.com> --- Changes v1 -> v2: Take into account Paolo's comments: * extra call to kvm_flush_dynamic_msi_routes make retry code useless, delete it. kvm-all.c | 18 +++++++++++------- 1 files changed, 11 insertions(+), 7 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 18cc6b4..8c73bda 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1121,9 +1121,18 @@ static int kvm_irqchip_get_virq(KVMState *s) uint32_t *word = s->used_gsi_bitmap; int max_words = ALIGN(s->gsi_count, 32) / 32; int i, bit; - bool retry = true; -again: + /* + * PIC and IOAPIC share the first 15 GSI numbers,available GSI + * numbers greater than IRQ route entries. If allocate GSI number + * succeeds, a new route entry can be added, so total IRQ route + * enties can exceed gsi_count, flush dynamic MSI entries when + * IRQ route entries arrive gsi_count. + */ + if (!s->direct_msi && s->irq_routes->nr == s->gsi_count) { + kvm_flush_dynamic_msi_routes(s); + } + /* Return the lowest unused GSI in the bitmap */ for (i = 0; i < max_words; i++) { bit = ffs(~word[i]); @@ -1133,11 +1142,6 @@ again: return bit - 1 + i * 32; } - if (!s->direct_msi && retry) { - retry = false; - kvm_flush_dynamic_msi_routes(s); - goto again; - } return -ENOSPC; } -- 1.7.1