Re: KVM Arm Device passthrough and linux-rt
On 2019-06-04 13:58:51 [+0100], Julien Grall wrote: > Hi, Hi, > This is happening because vgic_v2_fold_lr_state() is expected > to be called with interrupt disabled. However, some of the path > (e.g eventfd) will take a spinlock. > > The spinlock is from the waitqueue, so using a raw_spin_lock cannot > even be considered. > > Do you have any input on how this could be solved? There is swair (init_swait_queue_head() and friends) in case that works for you. > Cheers, Sebastian ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: KVM Arm Device passthrough and linux-rt
On Tue, 4 Jun 2019 14:53:26 +0100 Marc Zyngier wrote: > That's to prevent the injection of an interrupt firing on the same CPU > while we're saving the corresponding vcpu interrupt context, among other > things (the whole guest exit path runs with interrupt disabled in order > to avoid this kind of thing). Can't we use a per_cpu local lock for this? -- Steve ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: KVM Arm Device passthrough and linux-rt
On 04/06/2019 14:53, Marc Zyngier wrote: > On 04/06/2019 14:16, Steven Rostedt wrote: >> On Tue, 4 Jun 2019 13:58:51 +0100 >> Julien Grall wrote: >> >>> This is happening because vgic_v2_fold_lr_state() is expected >>> to be called with interrupt disabled. However, some of the path >>> (e.g eventfd) will take a spinlock. >>> >>> The spinlock is from the waitqueue, so using a raw_spin_lock cannot >>> even be considered. >>> >>> Do you have any input on how this could be solved? >> >> What's the reason that vgic_v2_fold_lr_state() expects interrupts to be >> disabled? > > That's to prevent the injection of an interrupt firing on the same CPU > while we're saving the corresponding vcpu interrupt context, among other > things (the whole guest exit path runs with interrupt disabled in order > to avoid this kind of thing). > > One possibility would be to accumulate the set of interrupt that require > resampling (which is bound to be small, the number of LRs at most) and > call kvm_notify_acked_irq on that set once interrupts are re-enabled. > > I'll have a look... Julien, Here's what I have in mind, compile-tested only. Please let me know how it fares on your setup... Thanks, M. diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index c36c86f1ec9a..9e587f5d90fa 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -307,6 +307,10 @@ struct vgic_cpu { unsigned int used_lrs; struct vgic_irq private_irqs[VGIC_NR_PRIVATE_IRQS]; + /* SPIs to be resampled by SW at EOI time */ + u16 spi_notify[VGIC_V2_MAX_LRS]; + int spi_notify_count; + raw_spinlock_t ap_list_lock;/* Protects the ap_list */ /* @@ -371,6 +375,7 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu); bool kvm_vcpu_has_pending_irqs(struct kvm_vcpu *vcpu); void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu); void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu); +void kvm_vgic_process_resample(struct kvm_vcpu *vcpu); void kvm_vgic_reset_mapped_irq(struct kvm_vcpu *vcpu, u32 vintid); void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, bool allow_group1); diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index 7eeebe5e9da2..a95c46b2d723 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -827,6 +827,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) guest_exit(); trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); + /* +* Now that interrupts are enabled, signal interrupts +* that require SW resampling. +*/ + kvm_vgic_process_resample(vcpu); + /* Exit types that need handling before we can be preempted */ handle_exit_early(vcpu, run, ret); diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c index d91a8938aa7c..9953e1742d65 100644 --- a/virt/kvm/arm/vgic/vgic-v2.c +++ b/virt/kvm/arm/vgic/vgic-v2.c @@ -79,8 +79,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu) /* Notify fds when the guest EOI'ed a level-triggered SPI */ if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid)) - kvm_notify_acked_irq(vcpu->kvm, 0, -intid - VGIC_NR_PRIVATE_IRQS); + vgic_cpu->spi_notify[vgic_cpu->spi_notify_count++] = intid; irq = vgic_get_irq(vcpu->kvm, vcpu, intid); diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c index 9f87e58dbd4a..54b1e4ea5cfa 100644 --- a/virt/kvm/arm/vgic/vgic-v3.c +++ b/virt/kvm/arm/vgic/vgic-v3.c @@ -69,8 +69,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu) /* Notify fds when the guest EOI'ed a level-triggered IRQ */ if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid)) - kvm_notify_acked_irq(vcpu->kvm, 0, -intid - VGIC_NR_PRIVATE_IRQS); + vgic_cpu->spi_notify[vgic_cpu->spi_notify_count++] = intid; irq = vgic_get_irq(vcpu->kvm, vcpu, intid); if (!irq) /* An LPI could have been unmapped. */ diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c index 191deccf60bf..baa6aa494e86 100644 --- a/virt/kvm/arm/vgic/vgic.c +++ b/virt/kvm/arm/vgic/vgic.c @@ -880,6 +880,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu) { WARN_ON(vgic_v4_flush_hwstate(vcpu)); + vcpu->arch.vgic_cpu.spi_notify_count = 0; + /* * If there are no virtual interrupts active or pending for this * VCPU, then there is no work to do and we can bail out without @@ -908,6 +910,16 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu) vgic_restore_state(vcpu); } +void kvm_vgic_process_resample(struct kvm_vcpu *vcpu) +{ + struct vgic_cpu
Re: KVM Arm Device passthrough and linux-rt
On 04/06/2019 14:16, Steven Rostedt wrote: > On Tue, 4 Jun 2019 13:58:51 +0100 > Julien Grall wrote: > >> This is happening because vgic_v2_fold_lr_state() is expected >> to be called with interrupt disabled. However, some of the path >> (e.g eventfd) will take a spinlock. >> >> The spinlock is from the waitqueue, so using a raw_spin_lock cannot >> even be considered. >> >> Do you have any input on how this could be solved? > > What's the reason that vgic_v2_fold_lr_state() expects interrupts to be > disabled? That's to prevent the injection of an interrupt firing on the same CPU while we're saving the corresponding vcpu interrupt context, among other things (the whole guest exit path runs with interrupt disabled in order to avoid this kind of thing). One possibility would be to accumulate the set of interrupt that require resampling (which is bound to be small, the number of LRs at most) and call kvm_notify_acked_irq on that set once interrupts are re-enabled. I'll have a look... M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: KVM Arm Device passthrough and linux-rt
On Tue, 4 Jun 2019 13:58:51 +0100 Julien Grall wrote: > This is happening because vgic_v2_fold_lr_state() is expected > to be called with interrupt disabled. However, some of the path > (e.g eventfd) will take a spinlock. > > The spinlock is from the waitqueue, so using a raw_spin_lock cannot > even be considered. > > Do you have any input on how this could be solved? What's the reason that vgic_v2_fold_lr_state() expects interrupts to be disabled? -- Steve ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
KVM Arm Device passthrough and linux-rt
Hi, While trying device passthrough on Linux-rt with KVM Arm, I had the following splat. [ 363.410141] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974 [ 363.410150] 000: in_atomic(): 0, irqs_disabled(): 128, pid: 2916, name: qemu-system-aar [ 363.410153] 000: 4 locks held by qemu-system-aar/2916: [ 363.410157] 000: #0: 8007bd248100 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0 [ 363.410171] 000: #1: 8007bd1e2b20 (&kvm->irq_srcu){}, at: kvm_notify_acked_irq+0x7c/0x300 [ 363.410179] 000: #2: 8007bd1e2b20 (&kvm->irq_srcu){}, at: irqfd_resampler_ack+0x0/0xd8 [ 363.410187] 000: #3: 8007c2b27d28 (&ctx->wqh#2){+.+.}, at: eventfd_signal+0x24/0x78 [ 363.410196] 000: irq event stamp: 4033894 [ 363.410197] 000: hardirqs last enabled at (4033893): [] _raw_spin_unlock_irqrestore+0x88/0x90 [ 363.410203] 000: hardirqs last disabled at (4033894): [] kvm_arch_vcpu_ioctl_run+0x2a8/0xc08 [ 363.410207] 000: softirqs last enabled at (0): [] copy_process.isra.1.part.2+0x8d8/0x1958 [ 363.410212] 000: softirqs last disabled at (0): [<>] (null) [ 363.410216] 000: CPU: 0 PID: 2916 Comm: qemu-system-aar Tainted: GW 5.0.14-rt9-00013-g4b2a13c8a804 #84 [ 363.410219] 000: Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT) [ 363.410221] 000: Call trace: [ 363.410222] 000: dump_backtrace+0x0/0x158 [ 363.410225] 000: show_stack+0x14/0x20 [ 363.410227] 000: dump_stack+0xa0/0xd4 [ 363.410230] 000: ___might_sleep+0x16c/0x1f8 [ 363.410234] 000: rt_spin_lock+0x5c/0x70 [ 363.410237] 000: eventfd_signal+0x24/0x78 [ 363.410238] 000: irqfd_resampler_ack+0x94/0xd8 [ 363.410241] 000: kvm_notify_acked_irq+0xf8/0x300 [ 363.410243] 000: vgic_v2_fold_lr_state+0x174/0x1e0 [ 363.410246] 000: kvm_vgic_sync_hwstate+0x5c/0x2b8 [ 363.410249] 000: kvm_arch_vcpu_ioctl_run+0x624/0xc08 [ 363.410250] 000: kvm_vcpu_ioctl+0x3a0/0xae0 [ 363.410252] 000: do_vfs_ioctl+0xbc/0x910 [ 363.410255] 000: ksys_ioctl+0x78/0xa8 [ 363.410257] 000: __arm64_sys_ioctl+0x1c/0x28 [ 363.410260] 000: el0_svc_common+0x90/0x118 [ 363.410263] 000: el0_svc_handler+0x2c/0x80 [ 363.410265] 000: el0_svc+0x8/0xc This is happening because vgic_v2_fold_lr_state() is expected to be called with interrupt disabled. However, some of the path (e.g eventfd) will take a spinlock. The spinlock is from the waitqueue, so using a raw_spin_lock cannot even be considered. Do you have any input on how this could be solved? Cheers, -- Julien Grall ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm