Re: [PATCH 2/2] KVM: arm/arm64: vgic: Fix irq refcount leak in kvm_vgic_set_owner()
On Thu, Jun 06, 2019 at 01:06:33PM +0100, Marc Zyngier wrote:
> On 06/06/2019 11:58, Dave Martin wrote:
> > kvm_vgic_set_owner() leaks a reference on the vgic_irq descriptor,
> > which does not seem to match up with any vgic_put_irq() that I can
> > find.
> >
> > Since the irq pointer is not passed out and the caller must anyway
> > subsequently use vgic_get_irq() when it wants a pointer, it is not
> > clear why we should have a dangling refcount here.
> >
> > The refcount is still needed inside kvm_vgic_set_owner() to prevent
> > the vgic_irq struct from disappearing while it is manipulated.
> >
> > So, keep the vgic_get_irq() here, but add the matching
> > vgic_put_irq() before returning.
> >
> > unreferenced object 0x800b6365ab80 (size 128):
> >   comm "qemu-system-aar", pid 14414, jiffies 4300822606 (age 84.436s)
> >   hex dump (first 32 bytes):
> >     00 00 00 00 00 00 00 00 b0 e1 e0 38 00 00 ff ff  ...8
> >     b0 e1 e0 38 00 00 ff ff 78 e6 ad dd 0a 80 ff ff  ...8x...
> >   backtrace:
> >     [] kmem_cache_alloc+0x178/0x208
> >     [<114591cb>] vgic_add_lpi.part.5+0x34/0x190
> >     [ ] vgic_its_cmd_handle_mapi+0x320/0x348
> >     [<935c5c32>] vgic_its_process_commands.part.14+0x350/0x8b8
> >     [ ] vgic_mmio_write_its_cwriter+0x78/0x98
> >     [<8659acd2>] dispatch_mmio_write+0xd4/0x120
> >
> > [...]
> >
> > Cc: Christoffer Dall
> > Fixes: c6ccd30e0de3 ("KVM: arm/arm64: Introduce an allocator for in-kernel
> > irq lines")
> > Signed-off-by: Dave Martin
> >
> > ---
> >
> > Based on the limited testing I've done so far, the patch _appears_ to
> > fix the bug.
> >
> > However, I still don't understand why the bug is intermittent, or why
> > the arch_timer or pmu (the only apparent users of kvm_vgic_set_owner())
> > are claiming an LPI in the first place.
> >
> > So there may be other bugs in the mix, or I may have misunderstood
> > something...
>
> Yeah, this doesn't make much sense. Both timer and PMU are using PPIs,
> which are not refcounted, so this vgic_put_irq() is effectively a NOP.
> It doesn't invalidate the patch itself, it is just that I seriously
> doubt it fixes anything.
>
> LPIs do not use the owner field so far, so we must have another get/put
> mismatch somewhere.

No argument from me. As I say, this change _appeared_ to make this leak
go away, but I couldn't understand why, and didn't kick it very
thoroughly. So it may well be a red herring.

Cheers
---Dave
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 2/2] KVM: arm/arm64: vgic: Fix irq refcount leak in kvm_vgic_set_owner()
On 06/06/2019 11:58, Dave Martin wrote:
> kvm_vgic_set_owner() leaks a reference on the vgic_irq descriptor,
> which does not seem to match up with any vgic_put_irq() that I can
> find.
>
> Since the irq pointer is not passed out and the caller must anyway
> subsequently use vgic_get_irq() when it wants a pointer, it is not
> clear why we should have a dangling refcount here.
>
> The refcount is still needed inside kvm_vgic_set_owner() to prevent
> the vgic_irq struct from disappearing while it is manipulated.
>
> So, keep the vgic_get_irq() here, but add the matching
> vgic_put_irq() before returning.
>
> unreferenced object 0x800b6365ab80 (size 128):
>   comm "qemu-system-aar", pid 14414, jiffies 4300822606 (age 84.436s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 b0 e1 e0 38 00 00 ff ff  ...8
>     b0 e1 e0 38 00 00 ff ff 78 e6 ad dd 0a 80 ff ff  ...8x...
>   backtrace:
>     [] kmem_cache_alloc+0x178/0x208
>     [<114591cb>] vgic_add_lpi.part.5+0x34/0x190
>     [ ] vgic_its_cmd_handle_mapi+0x320/0x348
>     [<935c5c32>] vgic_its_process_commands.part.14+0x350/0x8b8
>     [ ] vgic_mmio_write_its_cwriter+0x78/0x98
>     [<8659acd2>] dispatch_mmio_write+0xd4/0x120
>
> [...]
>
> Cc: Christoffer Dall
> Fixes: c6ccd30e0de3 ("KVM: arm/arm64: Introduce an allocator for in-kernel
> irq lines")
> Signed-off-by: Dave Martin
>
> ---
>
> Based on the limited testing I've done so far, the patch _appears_ to
> fix the bug.
>
> However, I still don't understand why the bug is intermittent, or why
> the arch_timer or pmu (the only apparent users of kvm_vgic_set_owner())
> are claiming an LPI in the first place.
>
> So there may be other bugs in the mix, or I may have misunderstood
> something...

Yeah, this doesn't make much sense. Both timer and PMU are using PPIs,
which are not refcounted, so this vgic_put_irq() is effectively a NOP.
It doesn't invalidate the patch itself, it is just that I seriously
doubt it fixes anything.

LPIs do not use the owner field so far, so we must have another get/put
mismatch somewhere.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
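
The "effectively a NOP" remark above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions, not the
kernel source: struct model_irq, model_get_irq(), model_put_irq() and
MODEL_MIN_LPI are hypothetical stand-ins, and the early return for
INTIDs below the LPI range is inferred from the statement that PPIs
are not refcounted.

```c
#include <assert.h>
#include <stdbool.h>

/* GICv3 LPI INTIDs start at 8192; lower INTIDs are SGIs, PPIs and SPIs. */
#define MODEL_MIN_LPI 8192u

/* Hypothetical stand-in for struct vgic_irq; only the fields used here. */
struct model_irq {
	unsigned int intid;
	int refcount;
	bool freed;
};

/* Take a reference. In this model only LPIs are counted. */
static void model_get_irq(struct model_irq *irq)
{
	if (irq->intid < MODEL_MIN_LPI)
		return;
	irq->refcount++;
}

/* Drop a reference. A no-op for anything below the LPI range. */
static void model_put_irq(struct model_irq *irq)
{
	if (irq->intid < MODEL_MIN_LPI)
		return;
	assert(irq->refcount > 0);
	if (--irq->refcount == 0)
		irq->freed = true;	/* the real code would free the descriptor */
}
```

In this model an extra put on a PPI changes nothing, so adding one
cannot by itself account for a leaked LPI descriptor; the unbalanced
get would have to be on a path that actually handles LPIs.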
[PATCH 2/2] KVM: arm/arm64: vgic: Fix irq refcount leak in kvm_vgic_set_owner()
kvm_vgic_set_owner() leaks a reference on the vgic_irq descriptor,
which does not seem to match up with any vgic_put_irq() that I can
find.

Since the irq pointer is not passed out and the caller must anyway
subsequently use vgic_get_irq() when it wants a pointer, it is not
clear why we should have a dangling refcount here.

The refcount is still needed inside kvm_vgic_set_owner() to prevent
the vgic_irq struct from disappearing while it is manipulated.

So, keep the vgic_get_irq() here, but add the matching
vgic_put_irq() before returning.

unreferenced object 0x800b6365ab80 (size 128):
  comm "qemu-system-aar", pid 14414, jiffies 4300822606 (age 84.436s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 b0 e1 e0 38 00 00 ff ff  ...8
    b0 e1 e0 38 00 00 ff ff 78 e6 ad dd 0a 80 ff ff  ...8x...
  backtrace:
    [] kmem_cache_alloc+0x178/0x208
    [<114591cb>] vgic_add_lpi.part.5+0x34/0x190
    [ ] vgic_its_cmd_handle_mapi+0x320/0x348
    [<935c5c32>] vgic_its_process_commands.part.14+0x350/0x8b8
    [ ] vgic_mmio_write_its_cwriter+0x78/0x98
    [<8659acd2>] dispatch_mmio_write+0xd4/0x120

[...]

Cc: Christoffer Dall
Fixes: c6ccd30e0de3 ("KVM: arm/arm64: Introduce an allocator for in-kernel irq lines")
Signed-off-by: Dave Martin
---

Based on the limited testing I've done so far, the patch _appears_ to
fix the bug.

However, I still don't understand why the bug is intermittent, or why
the arch_timer or pmu (the only apparent users of kvm_vgic_set_owner())
are claiming an LPI in the first place.

So there may be other bugs in the mix, or I may have misunderstood
something...

The bug (and fix) were observed with native qemu on ThunderX2, on a
merge of v5.1 with kvmarm/next commit 9eecfc22e0bf ("KVM: arm64: Fix
ptrauth ID register masking logic").

My qemu invocation was:

$ qemu-system-aarch64 -machine virt,accel=kvm,gic_version=3 -cpu host \
	-smp 4 -nographic \
	-drive id=vblock,file=block.qcow2,format=qcow2,if=none \
	-device virtio-blk-device,drive=vblock \
	-kernel Image -append 'root=/dev/vda1 ro'

---
 virt/kvm/arm/vgic/vgic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 191decc..930319c 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -599,6 +599,7 @@ int kvm_vgic_set_owner(struct kvm_vcpu *vcpu, unsigned int intid, void *owner)
 	else
 		irq->owner = owner;
 	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+	vgic_put_irq(vcpu->kvm, irq);
 
 	return ret;
 }
-- 
2.1.4
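
The get/put pairing that the patch restores can be illustrated with a
small user-space model. This is a sketch under stated assumptions
rather than the kernel function: struct sketch_irq and the sketch_*()
helpers are hypothetical, every descriptor is refcounted here (unlike
the real vgic), the irq_lock is left out, and the ownership check with
its -EEXIST result is assumed; only the trailing "else irq->owner =
owner;" branch is visible in the diff context.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted descriptor; in this model every irq is counted. */
struct sketch_irq {
	int refcount;
	void *owner;
};

/* Take a reference and hand the descriptor back to the caller. */
static struct sketch_irq *sketch_get_irq(struct sketch_irq *irq)
{
	irq->refcount++;
	return irq;
}

/* Drop a reference taken with sketch_get_irq(). */
static void sketch_put_irq(struct sketch_irq *irq)
{
	assert(irq->refcount > 0);
	irq->refcount--;
}

/* Claim desc for owner: 0 on success, -EEXIST if someone else owns it. */
static int sketch_set_owner(struct sketch_irq *desc, void *owner)
{
	struct sketch_irq *irq = sketch_get_irq(desc);
	int ret = 0;

	/* The real function holds irq->irq_lock around this; omitted here. */
	if (irq->owner && irq->owner != owner)
		ret = -EEXIST;	/* assumed error code */
	else
		irq->owner = owner;

	/* Counterpart of the added line: drop the reference before returning. */
	sketch_put_irq(irq);
	return ret;
}
```

With the put in place, the reference count is the same before and after
the call on both the success path and the error path, which is the
property the commit message asks for.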