Re: [PATCH 3/6] irqchip: GICv3: Skip LPI deactivation
On 08/12/2015 03:34 PM, Marc Zyngier wrote: On 11/08/15 10:42, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Contrary to other GICv3 interrupts, LPIs do not have an active state by virtue of being edge-triggered only (they only have a pending state). Given this, there is no point trying to deactivate them, and we can skip the ICC_DIR_EL1 entierely. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 49768fc..e02592b 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -295,10 +295,14 @@ static int gic_irq_get_irqchip_state(struct irq_data *d, static void gic_eoi_irq(struct irq_data *d) { - if (static_key_true(supports_deactivate)) + if (static_key_true(supports_deactivate)) { + /* No need to deactivate an LPI */ + if (gic_irq(d) = 8192) In case of EOIMode == 0, we do not call EOI. I can't understand whether it is an issue. What do you mean? We definitely perform an EOI in both EOImodes... In 4.8.3 Properties of LPI, in 2d note it is written: SW must issue a write to EOI to clear the active priorities register, hence the CPU interface still requires an active state for LPIs, even through this is not necessary within the redistributor Eric + return; gic_write_dir(gic_irq(d)); - else + } else { gic_write_eoir(gic_irq(d)); ... right here. Of am I missing something completely obvious? yes sorry please forget this. I think I meant EOImode == 1 instead and anyway the EOI is done in gic_handle_irq. Apologies Eric Thanks, M. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] irqchip: GICv3: Don't deactivate interrupts forwarded to a guest
On 08/12/2015 04:20 PM, Marc Zyngier wrote: On 11/08/15 11:03, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Commit 0a4377de3056 (genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU) added just what we needed at the lowest level to allow an interrupt to be deactivated by a guest. When such a request reaches the GIC, it knows it doesn't need to perform the deactivation anymore, and can safely leave the guest do its magic. This of course requires additional support in both VFIO and KVM. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e02592b..a1ca9e6 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -70,6 +70,11 @@ static inline int gic_irq_in_rdist(struct irq_data *d) return gic_irq(d) 32; } +static inline bool forwarded_irq(struct irq_data *d) +{ + return d-handler_data != NULL; +} + static inline void __iomem *gic_dist_base(struct irq_data *d) { if (gic_irq_in_rdist(d))/* SGI+PPI - SGI_base for this CPU */ @@ -231,6 +236,12 @@ static void gic_poke_irq(struct irq_data *d, u32 offset) static void gic_mask_irq(struct irq_data *d) { gic_poke_irq(d, GICD_ICENABLER); + /* +* When masking a forwarded interrupt, make sure it is +* deactivated as well. To me it is not straightforward to understand why a forwarded IRQ would need to be DIR'ed when masked. This is needed because of the disable_irq optimisation, I would add a related comment. The lazy disable_irq is just an optimization. yes that's true but it causes a real problem here since although you disabled the IRQ, a new one can show up, we do not call the actual handler (that was supposed to forward it to the guest) but still you expect the guest to complete it. Practically that's why the host must take in charge the deactivation in that case, and this happens during the masking with this implementation. The real reason is that if we mask an interrupt on the host, it is because we don't want the guest to process it at all. There is three cases: 1) The interrupt was inactive: no problem 2) The interrupt was active, but not presented to the guest yet: no problem either. The interrupt will be taken again on unmask. 3) The interrupt was active and presented to the guest: we might get a double deactivate, which shouldn't be a big deal (but mostly should not occur at all). Would something like this make sense? yes makes sense. The only thing that scares me a bit is 3: when masking/DIR an edge irq (#n) we can have the same new physical IRQ showing up when unmasking (#n+1); when guest deactivates the #nth virtual IRQ it is going to DIR #n+1 physical IRQ. Also with VGIC state machine, we must be attention not to inject the second forwarded edge irq while there is one programmed in the LR. We said it comes from the HW so it must be true? Not safe anymore here ... On a related note, I wonder if we need to mark the interrupt pending if it is configured as edge. Otherwise, we could loose an interrupt in case 2 (mask, deactivate, unmask, oh look nothing triggers). Thoughts? Yes I think this makes sense indeed. Definitively this one will be lost. Eric Thanks, M. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] irqchip: GICv3: Don't deactivate interrupts forwarded to a guest
On 12/08/15 16:09, Eric Auger wrote: On 08/12/2015 04:20 PM, Marc Zyngier wrote: On 11/08/15 11:03, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Commit 0a4377de3056 (genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU) added just what we needed at the lowest level to allow an interrupt to be deactivated by a guest. When such a request reaches the GIC, it knows it doesn't need to perform the deactivation anymore, and can safely leave the guest do its magic. This of course requires additional support in both VFIO and KVM. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e02592b..a1ca9e6 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -70,6 +70,11 @@ static inline int gic_irq_in_rdist(struct irq_data *d) return gic_irq(d) 32; } +static inline bool forwarded_irq(struct irq_data *d) +{ + return d-handler_data != NULL; +} + static inline void __iomem *gic_dist_base(struct irq_data *d) { if (gic_irq_in_rdist(d))/* SGI+PPI - SGI_base for this CPU */ @@ -231,6 +236,12 @@ static void gic_poke_irq(struct irq_data *d, u32 offset) static void gic_mask_irq(struct irq_data *d) { gic_poke_irq(d, GICD_ICENABLER); + /* + * When masking a forwarded interrupt, make sure it is + * deactivated as well. To me it is not straightforward to understand why a forwarded IRQ would need to be DIR'ed when masked. This is needed because of the disable_irq optimisation, I would add a related comment. The lazy disable_irq is just an optimization. yes that's true but it causes a real problem here since although you disabled the IRQ, a new one can show up, we do not call the actual handler (that was supposed to forward it to the guest) but still you expect the guest to complete it. Practically that's why the host must take in charge the deactivation in that case, and this happens during the masking with this implementation. Yeah, I see what you mean. If we end-up here with an active interrupt, that's because the lazy interrupt masking has been used, and we need to fix up things. The real reason is that if we mask an interrupt on the host, it is because we don't want the guest to process it at all. There is three cases: 1) The interrupt was inactive: no problem 2) The interrupt was active, but not presented to the guest yet: no problem either. The interrupt will be taken again on unmask. 3) The interrupt was active and presented to the guest: we might get a double deactivate, which shouldn't be a big deal (but mostly should not occur at all). Would something like this make sense? yes makes sense. The only thing that scares me a bit is 3: when masking/DIR an edge irq (#n) we can have the same new physical IRQ showing up when unmasking (#n+1); when guest deactivates the #nth virtual IRQ it is going to DIR #n+1 physical IRQ. That bit is not worrying me too much for a few reasons reasons: - You normally don't mask a forwarded interrupt. You only do that on specific events like guest termination. At that point, it doesn't matter anymore. - Edge interrupts can always be coalesced. So getting one event instead of two is not a problem. - Deactivation (specially on EOI from a guest) is not refcounted. It just clears the active bit. Also with VGIC state machine, we must be attention not to inject the second forwarded edge irq while there is one programmed in the LR. We said it comes from the HW so it must be true? Not safe anymore here ... I don't believe this is a problem. You should never end-up masking the interrupt if you don't intend to tear it down (this is why we have the active bit - to avoid masking thing in normal operations). On a related note, I wonder if we need to mark the interrupt pending if it is configured as edge. Otherwise, we could loose an interrupt in case 2 (mask, deactivate, unmask, oh look nothing triggers). Thoughts? Yes I think this makes sense indeed. Definitively this one will be lost. Depends. If we are to restore a working interrupt flow, then we need it. If we just mask to allow an interrupt to be unforwarded, then we do not have to care. Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] irqchip: GICv3: Don't deactivate interrupts forwarded to a guest
On 08/12/2015 05:40 PM, Marc Zyngier wrote: On 12/08/15 16:09, Eric Auger wrote: On 08/12/2015 04:20 PM, Marc Zyngier wrote: On 11/08/15 11:03, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Commit 0a4377de3056 (genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU) added just what we needed at the lowest level to allow an interrupt to be deactivated by a guest. When such a request reaches the GIC, it knows it doesn't need to perform the deactivation anymore, and can safely leave the guest do its magic. This of course requires additional support in both VFIO and KVM. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e02592b..a1ca9e6 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -70,6 +70,11 @@ static inline int gic_irq_in_rdist(struct irq_data *d) return gic_irq(d) 32; } +static inline bool forwarded_irq(struct irq_data *d) +{ + return d-handler_data != NULL; +} + static inline void __iomem *gic_dist_base(struct irq_data *d) { if (gic_irq_in_rdist(d))/* SGI+PPI - SGI_base for this CPU */ @@ -231,6 +236,12 @@ static void gic_poke_irq(struct irq_data *d, u32 offset) static void gic_mask_irq(struct irq_data *d) { gic_poke_irq(d, GICD_ICENABLER); + /* + * When masking a forwarded interrupt, make sure it is + * deactivated as well. To me it is not straightforward to understand why a forwarded IRQ would need to be DIR'ed when masked. This is needed because of the disable_irq optimisation, I would add a related comment. The lazy disable_irq is just an optimization. yes that's true but it causes a real problem here since although you disabled the IRQ, a new one can show up, we do not call the actual handler (that was supposed to forward it to the guest) but still you expect the guest to complete it. Practically that's why the host must take in charge the deactivation in that case, and this happens during the masking with this implementation. Yeah, I see what you mean. If we end-up here with an active interrupt, that's because the lazy interrupt masking has been used, and we need to fix up things. The real reason is that if we mask an interrupt on the host, it is because we don't want the guest to process it at all. There is three cases: 1) The interrupt was inactive: no problem 2) The interrupt was active, but not presented to the guest yet: no problem either. The interrupt will be taken again on unmask. 3) The interrupt was active and presented to the guest: we might get a double deactivate, which shouldn't be a big deal (but mostly should not occur at all). Would something like this make sense? yes makes sense. The only thing that scares me a bit is 3: when masking/DIR an edge irq (#n) we can have the same new physical IRQ showing up when unmasking (#n+1); when guest deactivates the #nth virtual IRQ it is going to DIR #n+1 physical IRQ. That bit is not worrying me too much for a few reasons reasons: - You normally don't mask a forwarded interrupt. You only do that on specific events like guest termination. At that point, it doesn't matter anymore. - Edge interrupts can always be coalesced. So getting one event instead of two is not a problem. - Deactivation (specially on EOI from a guest) is not refcounted. It just clears the active bit. Also with VGIC state machine, we must be attention not to inject the second forwarded edge irq while there is one programmed in the LR. We said it comes from the HW so it must be true? Not safe anymore here ... I don't believe this is a problem. You should never end-up masking the interrupt if you don't intend to tear it down (this is why we have the active bit - to avoid masking thing in normal operations). OK On a related note, I wonder if we need to mark the interrupt pending if it is configured as edge. Otherwise, we could loose an interrupt in case 2 (mask, deactivate, unmask, oh look nothing triggers). Thoughts? Yes I think this makes sense indeed. Definitively this one will be lost. Depends. If we are to restore a working interrupt flow, then we need it. If we just mask to allow an interrupt to be unforwarded, then we do not have to care. yes. I am currently focused on unforwarding and effectively that works fine since, as you explained before, I am tearing down the system. Best Regards Eric Thanks, M. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 11/16] KVM: Implement IRQ bypass consumer callbacks for x86
On Tue, 2015-08-11 at 14:03 +0800, Feng Wu wrote: Implement the following callbacks for x86: - kvm_arch_irq_bypass_add_producer - kvm_arch_irq_bypass_del_producer - kvm_arch_irq_bypass_stop: dummy callback - kvm_arch_irq_bypass_resume: dummy callback and set CONFIG_HAVE_KVM_IRQ_BYPASS for x86. Signed-off-by: Feng Wu feng...@intel.com --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/Kconfig| 1 + arch/x86/kvm/x86.c | 39 +++ 3 files changed, 41 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 82d0709..3038c1b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -24,6 +24,7 @@ #include linux/perf_event.h #include linux/pvclock_gtod.h #include linux/clocksource.h +#include linux/irqbypass.h #include asm/pvclock-abi.h #include asm/desc.h diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index c951d44..b90776f 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -30,6 +30,7 @@ config KVM select HAVE_KVM_IRQCHIP select HAVE_KVM_IRQFD select IRQ_BYPASS_MANAGER + select HAVE_KVM_IRQ_BYPASS select HAVE_KVM_IRQ_ROUTING select HAVE_KVM_EVENTFD select KVM_APIC_ARCHITECTURE diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8f09a76..8df7b0d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -50,6 +50,8 @@ #include linux/pci.h #include linux/timekeeper_internal.h #include linux/pvclock_gtod.h +#include linux/kvm_irqfd.h +#include linux/irqbypass.h #include trace/events/kvm.h #define CREATE_TRACE_POINTS @@ -8321,6 +8323,43 @@ out: return ret; } +int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons, + struct irq_bypass_producer *prod) +{ + struct kvm_kernel_irqfd *irqfd = + container_of(cons, struct kvm_kernel_irqfd, consumer); + + irqfd-producer = prod; + + return kvm_arch_update_pi_irte(irqfd-kvm, prod-irq, irqfd-gsi, 1); +} + +void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons, + struct irq_bypass_producer *prod) +{ + int ret; + struct kvm_kernel_irqfd *irqfd = + container_of(cons, struct kvm_kernel_irqfd, consumer); + + irqfd-producer = NULL; + + /* + * When producer of consumer is unregistered, we change back to + * remapped mode, so we can re-use the current implementation + * when the irq is masked/disabed or the consumer side (KVM + * int this case doesn't want to receive the interrupts. + */ + ret = kvm_arch_update_pi_irte(irqfd-kvm, prod-irq, irqfd-gsi, 0); + WARN_ON(ret); +} Some tracing support would be nice here so we have some way to determine whether we've made a successful connection. + +void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons) +{ +} +void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons) +{ +} Can't we define a common version of these with __attribute__((weak)) so that archs that don't need them don't need to add this cruft? + EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] vfio: Enable VFIO device for powerpc
ec53500f kvm: Add VFIO device added a special KVM pseudo-device which is used to handle any necessary interactions between KVM and VFIO. Currently that device is built on x86 and ARM, but not powerpc, although powerpc does support both KVM and VFIO. This makes things awkward in userspace Currently qemu prints an alarming error message if you attempt to use VFIO and it can't initialize the KVM VFIO device. We don't want to remove the warning, because lack of the KVM VFIO device could mean coherency problems on x86. On powerpc, however, the error is harmless but looks disturbing, and a test based on host architecture in qemu would be ugly, and break if we do need the KVM VFIO device for something important in future. There's nothing preventing the KVM VFIO device from being built for powerpc, so this patch turns it on. It won't actually do anything, since we don't define any of the arch_*() hooks, but it will make qemu happy and we can extend it in future if we need to. Signed-off-by: David Gibson da...@gibson.dropbear.id.au Reviewed-by: Eric Auger eric.au...@linaro.org --- arch/powerpc/kvm/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Alex (Graf), sorry, forgot to send this to you on my first posting of this patch, but I'm guessing it's your tree it should go in via. Resending, with Eric's reviewed-by line folded in. Please apply. diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile index 0570eef..7f7b6d8 100644 --- a/arch/powerpc/kvm/Makefile +++ b/arch/powerpc/kvm/Makefile @@ -8,7 +8,7 @@ ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm KVM := ../../../virt/kvm common-objs-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ - $(KVM)/eventfd.o + $(KVM)/eventfd.o $(KVM)/vfio.o CFLAGS_e500_mmu.o := -I. CFLAGS_e500_mmu_host.o := -I. -- 2.4.3 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 01/15] KVM: arm/arm64: VGIC: don't track used LRs in the distributor
Hi Andre, On 07/10/2015 04:21 PM, Andre Przywara wrote: Currently we track which IRQ has been mapped to which VGIC list register and also have to synchronize both. We used to do this to hold some extra state (for instance the active bit). It turns out that this extra state in the LRs is no longer needed and this extra tracking causes some pain later. Remove the tracking feature (lr_map and lr_used) and get rid of quite some code on the way. On a guest exit we pick up all still pending IRQs from the LRs and put them back in the distributor. We don't care about active-only IRQs, so we keep them in the LRs. They will be retired either by our vgic_process_maintenance() routine or by the GIC hardware in case of edge triggered interrupts. In places where we scan LRs we now use our shadow copy of the ELRSR register directly. This code change means we lose the piggy-back optimization, which would re-use an active-only LR to inject the pending state on top of it. Tracing with various workloads shows that this actually occurred very rarely, the ballpark figure is about once every 10,000 exits in a disk I/O heavy workload. Also the list registers don't seem to as scarce as assumed, with all 4 LRs on the popular implementations used less than once every 100,000 exits. This has been briefly tested on Midway, Juno and the model (the latter both with GICv2 and GICv3 guests). Signed-off-by: Andre Przywara andre.przyw...@arm.com --- include/kvm/arm_vgic.h | 6 --- virt/kvm/arm/vgic-v2.c | 1 + virt/kvm/arm/vgic-v3.c | 1 + virt/kvm/arm/vgic.c| 143 ++--- 4 files changed, 66 insertions(+), 85 deletions(-) diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 133ea00..2ccfa9a 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -279,9 +279,6 @@ struct vgic_v3_cpu_if { }; struct vgic_cpu { - /* per IRQ to LR mapping */ - u8 *vgic_irq_lr_map; - /* Pending/active/both interrupts on this VCPU */ DECLARE_BITMAP( pending_percpu, VGIC_NR_PRIVATE_IRQS); DECLARE_BITMAP( active_percpu, VGIC_NR_PRIVATE_IRQS); @@ -292,9 +289,6 @@ struct vgic_cpu { unsigned long *active_shared; unsigned long *pend_act_shared; - /* Bitmap of used/free list registers */ - DECLARE_BITMAP( lr_used, VGIC_V2_MAX_LRS); - /* Number of list registers on this CPU */ int nr_lr; diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c index f9b9c7c..f723710 100644 --- a/virt/kvm/arm/vgic-v2.c +++ b/virt/kvm/arm/vgic-v2.c @@ -144,6 +144,7 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu) * anyway. */ vcpu-arch.vgic_cpu.vgic_v2.vgic_vmcr = 0; + vcpu-arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0; /* Get the show on the road... */ vcpu-arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN; diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c index dff0602..21e5d28 100644 --- a/virt/kvm/arm/vgic-v3.c +++ b/virt/kvm/arm/vgic-v3.c @@ -178,6 +178,7 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu) * anyway. */ vgic_v3-vgic_vmcr = 0; + vgic_v3-vgic_elrsr = ~0; /* * If we are emulating a GICv3, we do it in an non-GICv2-compatible diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index bc40137..394622c 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -79,7 +79,6 @@ #include vgic.h static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu); -static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu); static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr); static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc); @@ -647,6 +646,17 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio, return false; } +static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr, +struct vgic_lr vlr) +{ + vgic_ops-sync_lr_elrsr(vcpu, lr, vlr); +} why not renaming this into vgic_set_elrsr. This would be homogeneous with other virtual interface control register setters? + +static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu) +{ + return vgic_ops-get_elrsr(vcpu); +} If I am not wrong, each time you manipulate the elrsr you handle the bitmap. why not directly returning an unsigned long * then (elrsr_ptr)? + /** * vgic_unqueue_irqs - move pending/active IRQs from LRs to the distributor * @vgic_cpu: Pointer to the vgic_cpu struct holding the LRs @@ -658,9 +668,11 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio, void vgic_unqueue_irqs(struct kvm_vcpu *vcpu) { struct vgic_cpu *vgic_cpu = vcpu-arch.vgic_cpu; + u64 elrsr = vgic_get_elrsr(vcpu); + unsigned long *elrsr_ptr = u64_to_bitmask(elrsr); int i; - for_each_set_bit(i, vgic_cpu-lr_used,
Re: [PATCH v4 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
On Tue, 11 Aug 2015 08:42:37 +0100 Christoffer Dall christoffer.d...@linaro.org wrote: On Fri, Aug 07, 2015 at 04:45:42PM +0100, Marc Zyngier wrote: In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mappings are kept in a set of RCU lists, indexed by virtual interrupts. Signed-off-by: Marc Zyngier marc.zyng...@arm.com Thanks for addressing all my concerns: Reviewed-by: Christoffer Dall christoffer.d...@linaro.org Thanks a lot for all the reviewing, much appreciated. Unless someone has an objection, I'm going put this series (minus the last patch) into -next so it gets a bit more exposure. M. -- Jazz is not dead. It just smells funny. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 06/11] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts
On Wed, Aug 12, 2015 at 10:56:38AM +0100, Marc Zyngier wrote: On Tue, 11 Aug 2015 08:42:37 +0100 Christoffer Dall christoffer.d...@linaro.org wrote: On Fri, Aug 07, 2015 at 04:45:42PM +0100, Marc Zyngier wrote: In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mappings are kept in a set of RCU lists, indexed by virtual interrupts. Signed-off-by: Marc Zyngier marc.zyng...@arm.com Thanks for addressing all my concerns: Reviewed-by: Christoffer Dall christoffer.d...@linaro.org Thanks a lot for all the reviewing, much appreciated. Unless someone has an objection, I'm going put this series (minus the last patch) into -next so it gets a bit more exposure. Sounds good to me, when I'm back from this paper writing business tomorrow/friday, I'll give it a spin and test the UEFI reboot issue with it as well. -Christoffer -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:x86:Fix error handling in the function ioapic_write_indirect
On 11/08/2015 19:39, Nicholas Krause wrote: This fixes error handling in the function ioapic_write_indirect to properly check if the call to the function ioapoc_service service has failed by not returning the value zero to indicate success to check if this occurs when calling this particular function and if so return immediately to the caller of ioapic_write_indirect due to us no longer being able to continue the function ioapic_write_indirect after this call fails. This is not true. You are back to the older problem that you are not trying to understand the functions you are modifying. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/12] HyperV equivalent of pvpanic driver
On 12/08/2015 13:54, Denis V. Lunev wrote: guys? we are going to move forward with other HyperV bits. Wait a second, 2.4 was released only a few hours ago... Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 05/15] KVM: arm/arm64: make GIC frame address initialization model specific
On 07/10/2015 04:21 PM, Andre Przywara wrote: Currently we initialize all the possible GIC frame addresses in one function, without looking at the specific GIC model we instantiate for the guest. As this gets confusing when adding another VGIC model later, lets move these initializations into the respective model's init nit: tobe more precise the init emulation function (not the vgic_v2/v3_init_model model's init function). pfouh?! ;-) functions. Signed-off-by: Andre Przywara andre.przyw...@arm.com --- virt/kvm/arm/vgic-v2-emul.c | 3 +++ virt/kvm/arm/vgic-v3-emul.c | 3 +++ virt/kvm/arm/vgic.c | 3 --- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c index 1390797..8faa28c 100644 --- a/virt/kvm/arm/vgic-v2-emul.c +++ b/virt/kvm/arm/vgic-v2-emul.c @@ -567,6 +567,9 @@ void vgic_v2_init_emulation(struct kvm *kvm) dist-vm_ops.init_model = vgic_v2_init_model; dist-vm_ops.map_resources = vgic_v2_map_resources; + dist-vgic_cpu_base = VGIC_ADDR_UNDEF; + dist-vgic_dist_base = VGIC_ADDR_UNDEF; Looks strange to see the common dist_base here. Why don't you leave it in common part, kvm_vgic_create; all the more so you left kvm-arch.vgic.vctrl_base = vgic-vctrl_base in kvm_vgic_create. Eric + kvm-arch.max_vcpus = VGIC_V2_MAX_CPUS; } diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c index d2eeb20..1f42348 100644 --- a/virt/kvm/arm/vgic-v3-emul.c +++ b/virt/kvm/arm/vgic-v3-emul.c @@ -885,6 +885,9 @@ void vgic_v3_init_emulation(struct kvm *kvm) dist-vm_ops.destroy_model = vgic_v3_destroy_model; dist-vm_ops.map_resources = vgic_v3_map_resources; + dist-vgic_dist_base = VGIC_ADDR_UNDEF; + dist-vgic_redist_base = VGIC_ADDR_UNDEF; + kvm-arch.max_vcpus = KVM_MAX_VCPUS; } diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index cc8f5ed..59f1801 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -1830,9 +1830,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type) kvm-arch.vgic.in_kernel = true; kvm-arch.vgic.vgic_model = type; kvm-arch.vgic.vctrl_base = vgic-vctrl_base; - kvm-arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF; - kvm-arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF; - kvm-arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF; out_unlock: for (; vcpu_lock_idx = 0; vcpu_lock_idx--) { -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 04/15] KVM: arm/arm64: extend arch CAP checks to allow per-VM capabilities
Reviewed-by: Eric Auger eric.au...@linaro.org Best Regards Eric On 07/10/2015 04:21 PM, Andre Przywara wrote: KVM capabilities can be a per-VM property, though ARM/ARM64 currently does not pass on the VM pointer to the architecture specific capability handlers. Add a struct kvm* parameter to those function to later allow proper per-VM capability reporting. Signed-off-by: Andre Przywara andre.przyw...@arm.com --- arch/arm/include/asm/kvm_host.h | 2 +- arch/arm/kvm/arm.c| 2 +- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/kvm/reset.c| 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index e896d2c..56cac05 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -213,7 +213,7 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr, kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr); } -static inline int kvm_arch_dev_ioctl_check_extension(long ext) +static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) { return 0; } diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index bc738d2..7c65353 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -196,7 +196,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = KVM_MAX_VCPUS; break; default: - r = kvm_arch_dev_ioctl_check_extension(ext); + r = kvm_arch_dev_ioctl_check_extension(kvm, ext); break; } return r; diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 2709db2..8d78a72 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -47,7 +47,7 @@ int __attribute_const__ kvm_target_cpu(void); int kvm_reset_vcpu(struct kvm_vcpu *vcpu); -int kvm_arch_dev_ioctl_check_extension(long ext); +int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext); struct kvm_arch { /* The VMID generation used for the virt. memory system */ diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 0b43265..866502b 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -56,7 +56,7 @@ static bool cpu_has_32bit_el1(void) return !!(pfr0 0x20); } -int kvm_arch_dev_ioctl_check_extension(long ext) +int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) { int r; -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] irqchip: GICv3: Convert to EOImode == 1
On 11/08/15 10:14, Eric Auger wrote: Hi Marc, On 07/09/2015 03:19 PM, Marc Zyngier wrote: So far, GICv3 has been used in with EOImode == 0. The effect of this mode is to perform the priority drop and the deactivation of the interrupt at the same time. While this works perfectly for Linux (we only have a single priority), it causes issues when an interrupt is forwarded to a guest, and when we want the guest to perform the EOI itself. For this case, the GIC architecture provides EOImode == 1, where: - A write to ICC_EOIR1_EL1 drops the priority of the interrupt and leaves it active. Other interrupts at the same priority level can now be taken, but the active interrupt cannot be taken again - A write to ICC_DIR_EL1 marks the interrupt as inactive, meaning it can now be taken again. This patch converts the driver to be able to use this new mode, depending on whether or not the kernel can behave as a hypervisor. No feature change. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 28 +--- include/linux/irqchip/arm-gic-v3.h | 9 + 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index c52f7ba..49768fc 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -30,6 +30,7 @@ #include asm/cputype.h #include asm/exception.h #include asm/smp_plat.h +#include asm/virt.h #include irq-gic-common.h #include irqchip.h @@ -50,6 +51,7 @@ struct gic_chip_data { }; static struct gic_chip_data gic_data __read_mostly; +static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE; #define gic_data_rdist()(this_cpu_ptr(gic_data.rdists.rdist)) #define gic_data_rdist_rd_base()(gic_data_rdist()-rd_base) @@ -293,7 +295,10 @@ static int gic_irq_get_irqchip_state(struct irq_data *d, static void gic_eoi_irq(struct irq_data *d) { -gic_write_eoir(gic_irq(d)); +if (static_key_true(supports_deactivate)) +gic_write_dir(gic_irq(d)); +else +gic_write_eoir(gic_irq(d)); } static int gic_set_type(struct irq_data *d, unsigned int type) @@ -343,6 +348,10 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs if (likely(irqnr 15 irqnr 1020) || irqnr = 8192) { int err; + +if (static_key_true(supports_deactivate)) +gic_write_eoir(irqnr); + err = handle_domain_irq(gic_data.domain, irqnr, regs); if (err) { WARN_ONCE(true, Unexpected interrupt received!\n); shouldn't we DIR here as well in case of err (we did EOI before)? Yes, we should, very good point. I'll fix that up. Besides Reviewed-by: Eric Auger eric.au...@linaro.org if it can help. Thanks! M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problem with error paths in various functions calling kvm_io_bus_unregister_dev
On 11/08/2015 16:02, nick wrote: Our we to just assume that calls to kvm_io_bug_unregister_dev always succeed as I disagree due to it allocating memory in kernel space that can easily fail. In additon I was wondering how the maintainers would like me to handle these calls as it's difficult due to it already being in error paths. Further more below is a link to the files and how the calls to kvm_io_bus_register_dev are executed in the respective error paths. http://lxr.free-electrons.com/ident?i=kvm_io_bus_unregister_dev You're right; the fix is not simple. The best way is to introduce a new API to register or unregister multiple devices atomically. This gets rid of calls in the error paths. Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/6] irqchip: GIC: Convert to EOImode == 1
Hi Eric, On 11/08/15 10:15, Eric Auger wrote: Hi Marc, On 07/09/2015 03:19 PM, Marc Zyngier wrote: So far, GICv2 has been used in with EOImode == 0. The effect of this mode is to perform the priority drop and the deactivation of the interrupt at the same time. While this works perfectly for Linux (we only have a single priority), it causes issues when an interrupt is forwarded to a guest, and when we want the guest to perform the EOI itself. For this case, the GIC architecture provides EOImode == 1, where: - A write to the EOI register drops the priority of the interrupt and leaves it active. Other interrupts at the same priority level can now be taken, but the active interrupt cannot be taken again - A write to the DIR marks the interrupt as inactive, meaning it can now be taken again. We only enable this feature when booted in HYP mode and that the device-tree reporte reported a suitable CPU interface. Observable behaviour should remain unchanged. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic.c | 32 +--- include/linux/irqchip/arm-gic.h | 4 2 files changed, 33 insertions(+), 3 deletions(-) diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 8d7e1c8..e264675 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -46,6 +46,7 @@ #include asm/irq.h #include asm/exception.h #include asm/smp_plat.h +#include asm/virt.h #include irq-gic-common.h #include irqchip.h @@ -82,6 +83,8 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); #define NR_GIC_CPU_IF 8 static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly; +static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE; + #ifndef MAX_GIC_NR #define MAX_GIC_NR 1 #endif @@ -164,7 +167,10 @@ static void gic_unmask_irq(struct irq_data *d) static void gic_eoi_irq(struct irq_data *d) { -writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_EOI); +if (static_key_true(supports_deactivate)) +writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_DEACTIVATE); +else +writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_EOI); } static int gic_irq_set_irqchip_state(struct irq_data *d, @@ -272,11 +278,15 @@ static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs) irqnr = irqstat GICC_IAR_INT_ID_MASK; if (likely(irqnr 15 irqnr 1021)) { shouldn't we have 1020? Looks like you have unearthed a very long standing (though not fatal) bug - I can trace it back to 2005 and the inclusion of the Realview support (see include/asm-arm/arch-realview/entry-macro.S in 8ad68bbf for the details). It may be that the original GIC didn't make number 1020 a special one, though the earliest spec I have access to (GICv1) is already making 1020 a reserved interrupt number. And looking at the pre-existing code (arch/arm/common/gic.c), 1020 seems to already be considered an invalid number. CC-ing Catalin, as he was the one who introduced it... ;-) Unless he says otherwise, I'll cook a patch for that. +if (static_key_true(supports_deactivate)) +writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); handle_domain_irq(gic-domain, irqnr, regs); why don't we handle the returned value of handle_domain_irq as we do in GICv3? Because I wrote the GICv3 code with my paranoia hat on... This really should never fail. continue; } if (irqnr 16) { writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); +if (static_key_true(supports_deactivate)) +writel_relaxed(irqstat, cpu_base + GIC_CPU_DEACTIVATE); #ifdef CONFIG_SMP handle_IPI(irqnr, regs); #endif @@ -358,7 +368,11 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(gic_data[0]); -u32 bypass = 0; +u32 bypass; +u32 mode = 0; + +if (static_key_true(supports_deactivate)) +mode = GIC_CPU_CTRL_EOImodeNS; /* * Preserve bypass disable bits to be written back later @@ -366,7 +380,7 @@ static void gic_cpu_if_up(void) bypass = readl(cpu_base + GIC_CPU_CTRL); bypass = GICC_DIS_BYPASS_MASK; -writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); +writel_relaxed(bypass | mode | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); } @@ -986,6 +1000,9 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start, register_cpu_notifier(gic_cpu_notifier); #endif set_handle_irq(gic_handle_irq); +pr_info (GIC: Using EOImode == %d\n, + static_key_true(supports_deactivate)); + } gic_dist_init(gic); @@ -1001,6 +1018,7 @@
Re: [PATCH 4/6] irqchip: GIC: Use chip_data instead of handler_data for cascaded interrupts
On 11/08/15 10:45, Eric Auger wrote: Hi Marc, On 07/09/2015 03:19 PM, Marc Zyngier wrote: When used as a primary interrupt controller, the GIC driver uses irq_data-chip_data to extract controller-specific information. When used as a secondary interrupt controller, it uses handler_data instead. As this difference is relatively pointless and only creates confusion, change the secondary path to match what the rest of the driver does. not really related to this series' topic, is it? It is. Or was. We use the handler_data field to hold the vcpu information, but that's also used by the chained interrupt handler to find out which secondary it is. But this patch is wrong anyway, as it entirely breaks secondary GICs (I found out when I merged this with a patch from Jon Hunter fixing some other related stuff). Took me hours to get an EB up and running! I have a working version now, which makes this patch irrelevant. Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 0/12] HyperV equivalent of pvpanic driver
On 08/12/2015 03:47 PM, Paolo Bonzini wrote: On 12/08/2015 13:54, Denis V. Lunev wrote: guys? we are going to move forward with other HyperV bits. Wait a second, 2.4 was released only a few hours ago... Paolo sure :) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] irqchip: GICv3: Skip LPI deactivation
On 11/08/15 10:42, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Contrary to other GICv3 interrupts, LPIs do not have an active state by virtue of being edge-triggered only (they only have a pending state). Given this, there is no point trying to deactivate them, and we can skip the ICC_DIR_EL1 entierely. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 49768fc..e02592b 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -295,10 +295,14 @@ static int gic_irq_get_irqchip_state(struct irq_data *d, static void gic_eoi_irq(struct irq_data *d) { -if (static_key_true(supports_deactivate)) +if (static_key_true(supports_deactivate)) { +/* No need to deactivate an LPI */ +if (gic_irq(d) = 8192) In case of EOIMode == 0, we do not call EOI. I can't understand whether it is an issue. What do you mean? We definitely perform an EOI in both EOImodes... In 4.8.3 Properties of LPI, in 2d note it is written: SW must issue a write to EOI to clear the active priorities register, hence the CPU interface still requires an active state for LPIs, even through this is not necessary within the redistributor Eric +return; gic_write_dir(gic_irq(d)); -else +} else { gic_write_eoir(gic_irq(d)); ... right here. Of am I missing something completely obvious? Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] irqchip: GICv3: Don't deactivate interrupts forwarded to a guest
On 11/08/15 11:03, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: Commit 0a4377de3056 (genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU) added just what we needed at the lowest level to allow an interrupt to be deactivated by a guest. When such a request reaches the GIC, it knows it doesn't need to perform the deactivation anymore, and can safely leave the guest do its magic. This of course requires additional support in both VFIO and KVM. Signed-off-by: Marc Zyngier marc.zyng...@arm.com --- drivers/irqchip/irq-gic-v3.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e02592b..a1ca9e6 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -70,6 +70,11 @@ static inline int gic_irq_in_rdist(struct irq_data *d) return gic_irq(d) 32; } +static inline bool forwarded_irq(struct irq_data *d) +{ +return d-handler_data != NULL; +} + static inline void __iomem *gic_dist_base(struct irq_data *d) { if (gic_irq_in_rdist(d))/* SGI+PPI - SGI_base for this CPU */ @@ -231,6 +236,12 @@ static void gic_poke_irq(struct irq_data *d, u32 offset) static void gic_mask_irq(struct irq_data *d) { gic_poke_irq(d, GICD_ICENABLER); +/* + * When masking a forwarded interrupt, make sure it is + * deactivated as well. To me it is not straightforward to understand why a forwarded IRQ would need to be DIR'ed when masked. This is needed because of the disable_irq optimisation, I would add a related comment. The lazy disable_irq is just an optimization. The real reason is that if we mask an interrupt on the host, it is because we don't want the guest to process it at all. There is three cases: 1) The interrupt was inactive: no problem 2) The interrupt was active, but not presented to the guest yet: no problem either. The interrupt will be taken again on unmask. 3) The interrupt was active and presented to the guest: we might get a double deactivate, which shouldn't be a big deal (but mostly should not occur at all). Would something like this make sense? On a related note, I wonder if we need to mark the interrupt pending if it is configured as edge. Otherwise, we could loose an interrupt in case 2 (mask, deactivate, unmask, oh look nothing triggers). Thoughts? Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] irqchip: GICv2/v3: Add support for irq_vcpu_affinity
On 11/08/15 11:06, Eric Auger wrote: Hi Marc, The series works fine with KVM platform device passthrough use case (Calxeda Midway, GICv2, SPI assignment with VFIO). My integration branch can be found at https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.2-rc6-irq-forward-v3 if useful. Tested-by: Eric Auger eric.au...@linaro.org Many thanks for trying it (and for the review). I'll post an updated version soon. Thanks, M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/6] irqchip: GIC: Convert to EOImode == 1
On Wed, Aug 12, 2015 at 02:31:47PM +0100, Marc Zyngier wrote: On 11/08/15 10:15, Eric Auger wrote: On 07/09/2015 03:19 PM, Marc Zyngier wrote: static int gic_irq_set_irqchip_state(struct irq_data *d, @@ -272,11 +278,15 @@ static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs) irqnr = irqstat GICC_IAR_INT_ID_MASK; if (likely(irqnr 15 irqnr 1021)) { shouldn't we have 1020? Looks like you have unearthed a very long standing (though not fatal) bug - I can trace it back to 2005 and the inclusion of the Realview support (see include/asm-arm/arch-realview/entry-macro.S in 8ad68bbf for the details). It may be that the original GIC didn't make number 1020 a special one, though the earliest spec I have access to (GICv1) is already making 1020 a reserved interrupt number. And looking at the pre-existing code (arch/arm/common/gic.c), 1020 seems to already be considered an invalid number. CC-ing Catalin, as he was the one who introduced it... ;-) Unless he says otherwise, I'll cook a patch for that. I really have no idea where it came from. The code probably pre-dates the existence of a GIC architecture spec (the GIC spec used to be part of the board or CPU TRM). I don't see any problem with using 1020 here. -- Catalin -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 03/10] VFIO: platform: single handler using function pointer
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote: A single handler now is registered whatever the use case: automasked or not. A function pointer is set according to the wished behavior and the handler calls this function. The irq lock is taken/released in the root handler. eventfd_signal can be called in regions not allowed to sleep. Signed-off-by: Eric Auger eric.au...@linaro.org --- v4: creation --- drivers/vfio/platform/vfio_platform_irq.c | 21 +++-- drivers/vfio/platform/vfio_platform_private.h | 1 + 2 files changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 40f057a..b31b1f0 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; - unsigned long flags; int ret = IRQ_NONE; - spin_lock_irqsave(irq_ctx-lock, flags); - if (!irq_ctx-masked) { ret = IRQ_HANDLED; @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) irq_ctx-masked = true; } - spin_unlock_irqrestore(irq_ctx-lock, flags); - if (ret == IRQ_HANDLED) eventfd_signal(irq_ctx-trigger, 1); Has this been run with lockdep to check whether this is safe to call with spinlock_irqsave held? @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id) return IRQ_HANDLED; } +static irqreturn_t vfio_handler(int irq, void *dev_id) +{ + struct vfio_platform_irq *irq_ctx = dev_id; + unsigned long flags; + irqreturn_t ret; + + spin_lock_irqsave(irq_ctx-lock, flags); + ret = irq_ctx-handler(irq, dev_id); + spin_unlock_irqrestore(irq_ctx-lock, flags); + + return ret; +} + static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod) { } @@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, } irq-trigger = trigger; + irq-handler = handler; irq_set_status_flags(irq-hwirq, IRQ_NOAUTOEN); - ret = request_irq(irq-hwirq, handler, 0, irq-name, irq); + ret = request_irq(irq-hwirq, vfio_handler, 0, irq-name, irq); if (ret) { kfree(irq-name); eventfd_ctx_put(trigger); diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h index 8b4f814..f848a6b 100644 --- a/drivers/vfio/platform/vfio_platform_private.h +++ b/drivers/vfio/platform/vfio_platform_private.h @@ -40,6 +40,7 @@ struct vfio_platform_irq { struct virqfd *mask; struct irq_bypass_producer producer; boolforwarded; + irqreturn_t (*handler)(int irq, void *dev_id); }; struct vfio_platform_region { -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 06/10] VFIO: platform: add irq bypass producer management
On Mon, 2015-08-10 at 15:21 +0200, Eric Auger wrote: This patch populates the IRQ bypass callacks: - stop/start producer simply consist in disabling/enabling the host irq - add/del consumer: basically set the automasked flag to false/true Signed-off-by: Eric Auger eric.au...@linaro.org --- v2 - v3: - vfio_platform_irq_bypass_add_consumer now returns an error in case the IRQ is recognized as active - active boolean not set anymore - do not VFIO mask the IRQ anymore on unforward v1 - v2: - device handle caching in vfio_platform_device is introduced in a separate patch - use container_of --- drivers/vfio/platform/vfio_platform_irq.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index efaee58..400a188 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq) static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod) { + disable_irq(prod-irq); } static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod) { + enable_irq(prod-irq); } static int vfio_platform_irq_bypass_add_consumer( struct irq_bypass_producer *prod, struct irq_bypass_consumer *cons) { - return 0; + struct vfio_platform_irq *irq = + container_of(prod, struct vfio_platform_irq, producer); + + /* + * if the IRQ is active at irqchip level or VFIO (auto)masked + * this means the host IRQ is already under injection in the + * guest and this not safe to change the forwarding state at + * that stage. + * It is not possible to differentiate user-space masking + * from auto-masking, leading to possible false detection of + * active state. + */ + if (vfio_platform_is_active(irq)) + return -EAGAIN; Here's an example of why we don't want WARN_ON if a registration fails, this is effectively random. When and how is a re-try going to happen? + + return vfio_platform_set_automasked(irq, false); set_forwarded just seems so much more logical here. } static void vfio_platform_irq_bypass_del_consumer( struct irq_bypass_producer *prod, struct irq_bypass_consumer *cons) { + struct vfio_platform_irq *irq = + container_of(prod, struct vfio_platform_irq, producer); + + vfio_platform_set_automasked(irq, true); } static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 1/1] KVM: PPC: Book3S: correct width in XER handling
On 06.08.15 12:16, Laurent Vivier wrote: Hi, I'd also like to see this patch in the mainstream as it fixes a bug appearing when we switch from vCPU context to hypervisor context (guest crash). Thanks, applied to kvm-ppc-queue. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 05/10] VFIO: platform: add vfio_platform_is_active
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote: This function returns whether the IRQ is active at irqchip level or VFIO masked. If either is true, it is considered the IRQ is active. Currently there is no way to differentiate userspace masked IRQ from automasked IRQ. There might be false detection of activity. However it is currently acceptable to have false detection. Signed-off-by: Eric Auger eric.au...@linaro.org --- --- drivers/vfio/platform/vfio_platform_irq.c | 17 + 1 file changed, 17 insertions(+) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index a285384..efaee58 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq, return 0; } +static int vfio_platform_is_active(struct vfio_platform_irq *irq) vfio_platform_irq_is_active()? +{ + unsigned long flags; + bool active, masked, outstanding; + int ret; + + spin_lock_irqsave(irq-lock, flags); + + ret = irq_get_irqchip_state(irq-hwirq, IRQCHIP_STATE_ACTIVE, active); + BUG_ON(ret); Why can't we propagate this error to the caller and let them decide? + masked = irq-masked; + outstanding = active || masked; + + spin_unlock_irqrestore(irq-lock, flags); + return outstanding; +} + static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod) { } -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 01/10] VFIO: platform: registration of a dummy IRQ bypass producer
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote: Register a dummy producer with void callbacks Signed-off-by: Eric Auger eric.au...@linaro.org --- v2 - v3: - rename vfio_platform_irq_bypass_resume into *_start --- drivers/vfio/platform/vfio_platform_irq.c | 32 +++ drivers/vfio/platform/vfio_platform_private.h | 2 ++ 2 files changed, 34 insertions(+) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 88bba57..b5cb8c7 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -20,6 +20,7 @@ #include linux/types.h #include linux/vfio.h #include linux/irq.h +#include linux/irqbypass.h #include vfio_platform_private.h @@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id) return IRQ_HANDLED; } +static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod) +{ +} + +static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod) +{ +} + +static int vfio_platform_irq_bypass_add_consumer( + struct irq_bypass_producer *prod, + struct irq_bypass_consumer *cons) +{ + return 0; +} + +static void vfio_platform_irq_bypass_del_consumer( + struct irq_bypass_producer *prod, + struct irq_bypass_consumer *cons) +{ +} + static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, int fd, irq_handler_t handler) { @@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, if (irq-trigger) { free_irq(irq-hwirq, irq); + irq_bypass_unregister_producer(irq-producer); kfree(irq-name); eventfd_ctx_put(irq-trigger); irq-trigger = NULL; @@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, return ret; } + irq-producer.token = (void *)trigger; + irq-producer.irq = irq-hwirq; + irq-producer.add_consumer = vfio_platform_irq_bypass_add_consumer; + irq-producer.del_consumer = vfio_platform_irq_bypass_del_consumer; + irq-producer.stop = vfio_platform_irq_bypass_stop; + irq-producer.start = vfio_platform_irq_bypass_start; + ret = irq_bypass_register_producer(irq-producer); + WARN_ON(ret); For what purpose? + if (!irq-masked) enable_irq(irq-hwirq); diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h index 1c9b3d5..1d2d4d6 100644 --- a/drivers/vfio/platform/vfio_platform_private.h +++ b/drivers/vfio/platform/vfio_platform_private.h @@ -17,6 +17,7 @@ #include linux/types.h #include linux/interrupt.h +#include linux/irqbypass.h #define VFIO_PLATFORM_OFFSET_SHIFT 40 #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) VFIO_PLATFORM_OFFSET_SHIFT) - 1) @@ -37,6 +38,7 @@ struct vfio_platform_irq { spinlock_t lock; struct virqfd *unmask; struct virqfd *mask; + struct irq_bypass_producer producer; }; struct vfio_platform_region { -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 04/10] VFIO: platform: add vfio_platform_set_automasked
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote: This function makes possible to change the automasked mode. Signed-off-by: Eric Auger eric.au...@linaro.org --- v1 - v2: - set forwarded flag --- drivers/vfio/platform/vfio_platform_irq.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index b31b1f0..a285384 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id) return ret; } +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq, +bool automasked) +{ + unsigned long flags; + + spin_lock_irqsave(irq-lock, flags); + if (automasked) { + irq-forwarded = true; + irq-flags |= VFIO_IRQ_INFO_AUTOMASKED; + irq-handler = vfio_automasked_irq_handler; + } else { + irq-forwarded = false; + irq-flags = ~VFIO_IRQ_INFO_AUTOMASKED; + irq-handler = vfio_irq_handler; + } + spin_unlock_irqrestore(irq-lock, flags); + return 0; In vfio-speak, automasked means level and we're not magically changing the IRQ from level to edge, we're simply able to handle level differently based on a hardware optimization. Should the user visible flags therefore change based on this? Aren't we really setting the forwarded state rather than the automasked state? +} + static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod) { } -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing
On 07.08.15 17:54, Nicholas Krause wrote: This fixes the incorrect return statement in the function mpic_set_default_irq_routing from always returning zero to signal success to this function's caller to instead return the return value of kvm_set_irq_routing as this function can fail and we need to correctly signal the caller of mpic_set_default_irq_routing when the call to this particular function has failed. Signed-off-by: Nicholas Krause xerofo...@gmail.com I like the patch, but I don't see it on the kvm-ppc mailing list. It doesn't show up on patchwork or spinics. Did something go wrong while sending it out? Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing
On 07.08.15 17:54, Nicholas Krause wrote: This fixes the incorrect return statement in the function mpic_set_default_irq_routing from always returning zero to signal success to this function's caller to instead return the return value of kvm_set_irq_routing as this function can fail and we need to correctly signal the caller of mpic_set_default_irq_routing when the call to this particular function has failed. Signed-off-by: Nicholas Krause xerofo...@gmail.com I like the patch, but I don't see it on the kvm-ppc mailing list. It doesn't show up on patchwork or spinics. Did something go wrong while sending it out? Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix return statements for wrapper functions in the file book3s_64_mmu_hv.c
On 10.08.15 17:27, Nicholas Krause wrote: This fixes the wrapper functions kvm_umap_hva_hv and the function kvm_unmap_hav_range_hv to return the return value of the function kvm_handle_hva or kvm_handle_hva_range that they are wrapped to call internally rather then always making the caller of these wrapper functions think they always run successfully by returning the value of zero directly. Signed-off-by: Nicholas Krause xerofo...@gmail.com Paul, could you please take on this one? Thanks, Alex --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index dab68b7..0905c8f 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -774,14 +774,12 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp, int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva) { - kvm_handle_hva(kvm, hva, kvm_unmap_rmapp); - return 0; + return kvm_handle_hva(kvm, hva, kvm_unmap_rmapp); } int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end) { - kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp); - return 0; + return kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp); } void kvmppc_core_flush_memslot_hv(struct kvm *kvm, -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix return statements for wrapper functions in the file book3s_64_mmu_hv.c
On 10.08.15 17:27, Nicholas Krause wrote: This fixes the wrapper functions kvm_umap_hva_hv and the function kvm_unmap_hav_range_hv to return the return value of the function kvm_handle_hva or kvm_handle_hva_range that they are wrapped to call internally rather then always making the caller of these wrapper functions think they always run successfully by returning the value of zero directly. Signed-off-by: Nicholas Krause xerofo...@gmail.com Paul, could you please take on this one? Thanks, Alex --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index dab68b7..0905c8f 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -774,14 +774,12 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp, int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva) { - kvm_handle_hva(kvm, hva, kvm_unmap_rmapp); - return 0; + return kvm_handle_hva(kvm, hva, kvm_unmap_rmapp); } int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end) { - kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp); - return 0; + return kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp); } void kvmppc_core_flush_memslot_hv(struct kvm *kvm, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 5/5] KVM: eventfd: add irq bypass consumer management
On Mon, 2015-08-10 at 15:31 +0200, Eric Auger wrote: This patch adds the registration/unregistration of an irq_bypass_consumer on irqfd assignment/deassignment. Signed-off-by: Eric Auger eric.au...@linaro.org Signed-off-by: Feng Wu feng...@intel.com --- v4 - v5: - due to removal of static inline stubs, add #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS around consumer registration/unregistration - add pr_info when registration fails v2 - v3 (Feng Wu): - Use kvm_arch_irq_bypass_start - Remove kvm_arch_irq_bypass_update - Add member 'struct irq_bypass_producer *producer' in 'struct kvm_kernel_irqfd', it is needed by posted interrupt. - Remove 'irq_bypass_unregister_consumer' in kvm_irqfd_deassign() v1 - v2: - populate of kvm and gsi removed - unregister the consumer on irqfd_shutdown --- include/linux/kvm_irqfd.h | 2 ++ virt/kvm/eventfd.c| 15 +++ 2 files changed, 17 insertions(+) diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h index f926b39..0c1de05 100644 --- a/include/linux/kvm_irqfd.h +++ b/include/linux/kvm_irqfd.h @@ -64,6 +64,8 @@ struct kvm_kernel_irqfd { struct list_head list; poll_table pt; struct work_struct shutdown; + struct irq_bypass_consumer consumer; + struct irq_bypass_producer *producer; }; #endif /* __LINUX_KVM_IRQFD_H */ diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 647ffb8..d7a230f 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -35,6 +35,7 @@ #include linux/srcu.h #include linux/slab.h #include linux/seqlock.h +#include linux/irqbypass.h #include trace/events/kvm.h #include kvm/iodev.h @@ -140,6 +141,9 @@ irqfd_shutdown(struct work_struct *work) /* * It is now safe to release the object's resources */ +#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS + irq_bypass_unregister_consumer(irqfd-consumer); +#endif eventfd_ctx_put(irqfd-eventfd); kfree(irqfd); } @@ -379,6 +383,17 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args) * we might race against the POLLHUP */ fdput(f); +#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS + irqfd-consumer.token = (void *)irqfd-eventfd; + irqfd-consumer.add_producer = kvm_arch_irq_bypass_add_producer; + irqfd-consumer.del_producer = kvm_arch_irq_bypass_del_producer; + irqfd-consumer.stop = kvm_arch_irq_bypass_stop; + irqfd-consumer.start = kvm_arch_irq_bypass_start; + ret = irq_bypass_register_consumer(irqfd-consumer); + if (ret) + pr_info(irq bypass consumer (token %p) registration fails: %d\n, + irqfd-consumer.token, ret); +#endif Does this series compile on its own? Aren't all these arch function unresolved? return 0; -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing
On 12.08.15 21:06, nick wrote: On 2015-08-12 03:05 PM, Alexander Graf wrote: On 07.08.15 17:54, Nicholas Krause wrote: This fixes the incorrect return statement in the function mpic_set_default_irq_routing from always returning zero to signal success to this function's caller to instead return the return value of kvm_set_irq_routing as this function can fail and we need to correctly signal the caller of mpic_set_default_irq_routing when the call to this particular function has failed. Signed-off-by: Nicholas Krause xerofo...@gmail.com I like the patch, but I don't see it on the kvm-ppc mailing list. It doesn't show up on patchwork or spinics. Did something go wrong while sending it out? Alex Alex, Ask Paolo about it as he would be able to explain it better then I. Well, whatever the reason, I can only apply patches that actually appeared on the public mailing list. Otherwise people may not get the chance to review them ;). Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing
On 12.08.15 21:06, nick wrote: On 2015-08-12 03:05 PM, Alexander Graf wrote: On 07.08.15 17:54, Nicholas Krause wrote: This fixes the incorrect return statement in the function mpic_set_default_irq_routing from always returning zero to signal success to this function's caller to instead return the return value of kvm_set_irq_routing as this function can fail and we need to correctly signal the caller of mpic_set_default_irq_routing when the call to this particular function has failed. Signed-off-by: Nicholas Krause xerofo...@gmail.com I like the patch, but I don't see it on the kvm-ppc mailing list. It doesn't show up on patchwork or spinics. Did something go wrong while sending it out? Alex Alex, Ask Paolo about it as he would be able to explain it better then I. Well, whatever the reason, I can only apply patches that actually appeared on the public mailing list. Otherwise people may not get the chance to review them ;). Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[GIT PULL 0/2] KVM: s390: fix and feature for kvm/next (4.3)
Paolo, the following changes since commit c92ea7b9f7d256cabf7ee08a7627a5227e356dec: KVM: s390: log capability enablement and vm attribute changes (2015-07-29 11:02:36 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git tags/kvm-s390-next-20150812 for you to fetch changes up to 152b28392a8d9dd08e789b48b602eb75eef436fa: KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully (2015-08-07 12:15:23 +0200) KVM: s390: fix and feature for kvm/next (4.3) 1. error handling for irq routes 2. Gracefully handle STP time changes s390 supports a protocol for syncing different systems via the stp protocol that will steer the TOD clocks to keep all participating clocks below the round trip time between the system. In case of specific out of sync event Linux can opt-in to accept sync checks. This will result in non-monotonic jumps of the TOD clock, which Linux will correct via time offsets to keep the wall clock time monotonic. Now: KVM guests also base their time on the host TOD, so we need to fixup the offset for them as well. Fan Zhang (1): KVM: s390: host STP toleration for VMs Nicholas Krause (1): KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully Paolo Bonzini (1): Merge tag 'kvm-s390-next-20150728' of git://git.kernel.org/.../kvms390/linux into kvm-next arch/s390/include/asm/etr.h | 3 +++ arch/s390/kernel/time.c | 16 +--- arch/s390/kvm/interrupt.c | 10 +- arch/s390/kvm/kvm-s390.c| 41 +++-- arch/s390/kvm/priv.c| 2 ++ 5 files changed, 66 insertions(+), 6 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[GIT PULL 1/2] KVM: s390: host STP toleration for VMs
From: Fan Zhang zhang...@linux.vnet.ibm.com If the host has STP enabled, the TOD of the host will be changed during synchronization phases. These are performed during a stop_machine() call. As the guest TOD is based on the host TOD, we have to make sure that: - no VCPU is in the SIE (implicitly guaranteed via stop_machine()) - manual guest TOD calculations are not affected Epoch is the guest TOD clock delta to the host TOD clock. We have to adjust that value during the STP synchronization and make sure that code that accesses the epoch won't get interrupted in between (via disabling preemption). Signed-off-by: Fan Zhang zhang...@linux.vnet.ibm.com Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Reviewed-by: Christian Borntraeger borntrae...@de.ibm.com Acked-by: Martin Schwidefsky schwidef...@de.ibm.com Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com --- arch/s390/include/asm/etr.h | 3 +++ arch/s390/kernel/time.c | 16 +--- arch/s390/kvm/interrupt.c | 10 +- arch/s390/kvm/kvm-s390.c| 38 ++ arch/s390/kvm/priv.c| 2 ++ 5 files changed, 65 insertions(+), 4 deletions(-) diff --git a/arch/s390/include/asm/etr.h b/arch/s390/include/asm/etr.h index 629b79a..f7e5c36 100644 --- a/arch/s390/include/asm/etr.h +++ b/arch/s390/include/asm/etr.h @@ -214,6 +214,9 @@ static inline int etr_ptff(void *ptff_block, unsigned int func) void etr_switch_to_local(void); void etr_sync_check(void); +/* notifier for syncs */ +extern struct atomic_notifier_head s390_epoch_delta_notifier; + /* STP interruption parameter */ struct stp_irq_parm { unsigned int _pad0 : 14; diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c index 9e733d9..627887b 100644 --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -58,6 +58,9 @@ EXPORT_SYMBOL_GPL(sched_clock_base_cc); static DEFINE_PER_CPU(struct clock_event_device, comparators); +ATOMIC_NOTIFIER_HEAD(s390_epoch_delta_notifier); +EXPORT_SYMBOL(s390_epoch_delta_notifier); + /* * Scheduler clock - returns current time in nanosec units. */ @@ -752,7 +755,7 @@ static void clock_sync_cpu(struct clock_sync_data *sync) static int etr_sync_clock(void *data) { static int first; - unsigned long long clock, old_clock, delay, delta; + unsigned long long clock, old_clock, clock_delta, delay, delta; struct clock_sync_data *etr_sync; struct etr_aib *sync_port, *aib; int port; @@ -789,6 +792,9 @@ static int etr_sync_clock(void *data) delay = (unsigned long long) (aib-edf2.etv - sync_port-edf2.etv) 32; delta = adjust_time(old_clock, clock, delay); + clock_delta = clock - old_clock; + atomic_notifier_call_chain(s390_epoch_delta_notifier, 0, + clock_delta); etr_sync-fixup_cc = delta; fixup_clock_comparator(delta); /* Verify that the clock is properly set. */ @@ -1526,7 +1532,7 @@ void stp_island_check(void) static int stp_sync_clock(void *data) { static int first; - unsigned long long old_clock, delta; + unsigned long long old_clock, delta, new_clock, clock_delta; struct clock_sync_data *stp_sync; int rc; @@ -1551,7 +1557,11 @@ static int stp_sync_clock(void *data) old_clock = get_tod_clock(); rc = chsc_sstpc(stp_page, STP_OP_SYNC, 0); if (rc == 0) { - delta = adjust_time(old_clock, get_tod_clock(), 0); + new_clock = get_tod_clock(); + delta = adjust_time(old_clock, new_clock, 0); + clock_delta = new_clock - old_clock; + atomic_notifier_call_chain(s390_epoch_delta_notifier, + 0, clock_delta); fixup_clock_comparator(delta); rc = chsc_sstpi(stp_page, stp_info, sizeof(struct stp_sstpi)); diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index a578140..b277d50 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -71,9 +71,13 @@ static int ckc_interrupts_enabled(struct kvm_vcpu *vcpu) static int ckc_irq_pending(struct kvm_vcpu *vcpu) { + preempt_disable(); if (!(vcpu-arch.sie_block-ckc - get_tod_clock_fast() + vcpu-arch.sie_block-epoch)) + get_tod_clock_fast() + vcpu-arch.sie_block-epoch)) { + preempt_enable(); return 0; + } + preempt_enable(); return ckc_interrupts_enabled(vcpu); } @@ -856,7 +860,9 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu) goto no_timer; } + preempt_disable(); now = get_tod_clock_fast() + vcpu-arch.sie_block-epoch; +
[GIT PULL 2/2] KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully
From: Nicholas Krause xerofo...@gmail.com This fixes the assumption that kvm_set_irq_routing is always run successfully by instead making it equal to the variable r which we use for returning in the function kvm_arch_vm_ioctl instead of making r equal to zero when calling this particular function and incorrectly making the caller of kvm_arch_vm_ioctl think the function has run successfully. Signed-off-by: Nicholas Krause xerofo...@gmail.com Message-Id: 1438880754-27149-1-git-send-email-xerofo...@gmail.com Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com --- arch/s390/kvm/kvm-s390.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 4bdb860..397b88d 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -962,8 +962,7 @@ long kvm_arch_vm_ioctl(struct file *filp, if (kvm-arch.use_irqchip) { /* Set up dummy routing. */ memset(routing, 0, sizeof(routing)); - kvm_set_irq_routing(kvm, routing, 0, 0); - r = 0; + r = kvm_set_irq_routing(kvm, routing, 0, 0); } break; } -- 2.3.0 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/5] KVM: add kvm_has_request wrapper
Am 05.08.2015 um 18:32 schrieb Radim Krčmář: We want to have requests abstracted from bit operations. Signed-off-by: Radim Krčmář rkrc...@redhat.com --- kvm_check_request is now somewhat a misnomer (what is the difference between test and check?) but still Acked-by: Christian Borntraeger borntrae...@de.ibm.com for the new interface. maybe we can rename kvm_check_request in a separate patch somewhen. arch/x86/kvm/vmx.c | 2 +- include/linux/kvm_host.h | 7 ++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 217f66343dc8..17514fe7d2cb 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -5879,7 +5879,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu) if (intr_window_requested vmx_interrupt_allowed(vcpu)) return handle_interrupt_window(vmx-vcpu); - if (test_bit(KVM_REQ_EVENT, vcpu-requests)) + if (kvm_has_request(KVM_REQ_EVENT, vcpu)) return 1; err = emulate_instruction(vcpu, EMULTYPE_NO_REEXECUTE); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 27ccdf91a465..52e388367a26 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1089,9 +1089,14 @@ static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu) set_bit(req, vcpu-requests); } +static inline bool kvm_has_request(int req, struct kvm_vcpu *vcpu) +{ + return test_bit(req, vcpu-requests); +} + static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu) { - if (test_bit(req, vcpu-requests)) { + if (kvm_has_request(req, vcpu)) { clear_bit(req, vcpu-requests); return true; } else { -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 5/5] KVM: refactor asynchronous vcpu ioctl dispatch
Am 05.08.2015 um 18:33 schrieb Radim Krčmář: I find the switch easier to read and modify. yes. Signed-off-by: Radim Krčmář rkrc...@redhat.com --- v2: new virt/kvm/kvm_main.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d7ffe6090520..71598554deed 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2252,12 +2252,15 @@ static long kvm_vcpu_ioctl(struct file *filp, * Special cases: vcpu ioctls that are asynchronous to vcpu execution, * so vcpu_load() would break it. */ + switch (ioctl) { #if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS) - if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_S390_IRQ || ioctl == KVM_INTERRUPT) - return kvm_arch_vcpu_ioctl(filp, ioctl, arg); + case KVM_S390_INTERRUPT: + case KVM_S390_IRQ: + case KVM_INTERRUPT: When you are it you might want to put the KVM_S390* withing CONFIG_S390 and KVM_INTERRUPT within CONFIG_PPC || CONFIG_MIPS This might speed up the switch statement for s390/ppc/mips a tiny bit. It will add another ifdef, though. Paolo? #endif - if (ioctl == KVM_USER_EXIT) + case KVM_USER_EXIT: return kvm_arch_vcpu_ioctl(filp, ioctl, arg); + } r = vcpu_load(vcpu); if (r) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html