Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
On Fri, 2015-09-18 at 16:58 +0200, Paolo Bonzini wrote:
>
> On 18/09/2015 16:29, Feng Wu wrote:
> > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> > With VT-d Posted-Interrupts enabled, external interrupts from
> > direct-assigned devices can be delivered to guests without VMM
> > intervention when guest is running in non-root mode.
> >
> > You can find the VT-d Posted-Interrupts Spec. in the following URL:
> > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
>
> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
>
> Signed-off-bys are missing in patch 1 and 4.  The patches exist
> elsewhere in the mailing list archives, so not a big deal.  Or just
> reply to them with the S-o-b line.
>
> Alex, can you ack the series and review patch 12?

I sent an ack for 12 separately, I got a bit lost in 16 & 17, but for
all the others that don't already have some tag from me,

Reviewed-by: Alex Williamson

> Joerg, can you ack patch 18?
>
> Paolo
>
> > v9:
> > - Include the whole series:
> >   [01/18]: irq bypasser manager
> >   [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side
> >                      forwarded irq
> >   [07/18] - [18/18]: VT-d PI part
> >
> > v8:
> > refer to the changelog in each patch
> >
> > v7:
> > * Define two weak irq bypass callbacks:
> >   - kvm_arch_irq_bypass_start()
> >   - kvm_arch_irq_bypass_stop()
> > * Remove the x86 dummy implementation of the above two functions.
> > * Print some useful information instead of WARN_ON() when the
> >   irq bypass consumer unregistration fails.
> > * Fix an issue when calling pi_pre_block and pi_post_block.
> >
> > v6:
> > * Rebase on 4.2.0-rc6
> > * Rebase on https://lkml.org/lkml/2015/8/6/526 and
> >   http://www.gossamer-threads.com/lists/linux/kernel/2235623
> > * Make the add_consumer and del_consumer callbacks static
> > * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
> > * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> > * Remove optional dummy callbacks for irq producer
> >
> > v4:
> > * For lowest-priority interrupt, only support single-CPU destination
> >   interrupts at the current stage, more common lowest priority support
> >   will be added later.
> > * According to Marcelo's suggestion, when vCPU is blocked, we handle
> >   the posted-interrupts in the HLT emulation path.
> > * Some small changes (coding style, typo, add some code comments)
> >
> > v3:
> > * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
> >   preempted or blocked.
> > * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
> > * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
> > * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
> >   can be used to change back to remapping mode.
> > * Fix typo
> >
> > v2:
> > * Use VFIO framework to enable this feature, the VFIO part of this series is
> >   based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
> > * Rebase this patchset on
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
> >   then revise some irq logic based on the new hierarchy irqdomain patches
> >   provided by Jiang Liu
> >
> >
> > *** BLURB HERE ***
> >
> > Alex Williamson (1):
> >   virt: IRQ bypass manager
> >
> > Eric Auger (4):
> >   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
> >   KVM: create kvm_irqfd.h
> >   KVM: introduce kvm_arch functions for IRQ bypass
> >   KVM: eventfd: add irq bypass consumer management
> >
> > Feng Wu (13):
> >   KVM: x86: select IRQ_BYPASS_MANAGER
> >   KVM: Extend struct pi_desc for VT-d Posted-Interrupts
> >   KVM: Add some helper functions for Posted-Interrupts
> >   KVM: Define a new interface kvm_intr_is_single_vcpu()
> >   KVM: Make struct kvm_irq_routing_table accessible
> >   KVM: make kvm_set_msi_irq() public
> >   vfio: Register/unregister irq_bypass_producer
> >   KVM: x86: Update IRTE for posted-interrupts
> >   KVM: Implement IRQ bypass consumer callbacks for x86
> >   KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> >   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
> >
> >  Documentation/kernel-parameters.txt | 1 +
> >  Documentation/virtual/kvm/locking.txt | 12 ++
> >  MAINTAINERS | 7 +
> >  arch/arm/kvm/Kconfig | 2 +
> >  arch/arm/kvm/Makefile | 1 +
> >  arch/arm64/kvm/Kconfig| 2 +
> >  arch/arm64/kvm/Makefile | 1 +
> >  arch/x86/include/asm/kvm_host.h | 24 +++
> >  arch/x86/kvm/Kconfig | 3 +
> >  arch/x86/kvm/Makefile | 3 +
> >  arch/x86/kvm/irq_comm.c | 32 ++-
> >  arch/x86/kvm/lapic.c | 59 ++
> >  arch/x8
Re: [PATCH v9 12/18] vfio: Register/unregister irq_bypass_producer
On Fri, 2015-09-18 at 22:29 +0800, Feng Wu wrote:
> This patch adds the registration/unregistration of an
> irq_bypass_producer for MSI/MSIx on vfio pci devices.
>
> Signed-off-by: Feng Wu

One nit, Paolo could you please fix the spelling of "registration" in
the dev_info, otherwise:

Acked-by: Alex Williamson

> ---
> v8:
> - Merge "[PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI
>   devices" into this patch.
>
> v6:
> - Make the add_consumer and del_consumer callbacks static
> - Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
> - Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> - Remove optional dummy callbacks for irq producer
>
>  drivers/vfio/pci/Kconfig            | 1 +
>  drivers/vfio/pci/vfio_pci_intrs.c   | 9 +++++++++
>  drivers/vfio/pci/vfio_pci_private.h | 2 ++
>  3 files changed, 12 insertions(+)
>
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 579d83b..02912f1 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -2,6 +2,7 @@ config VFIO_PCI
>  	tristate "VFIO support for PCI devices"
>  	depends on VFIO && PCI && EVENTFD
>  	select VFIO_VIRQFD
> +	select IRQ_BYPASS_MANAGER
>  	help
>  	  Support for the PCI VFIO bus driver.  This is required to make
>  	  use of PCI drivers using the VFIO framework.
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 1f577b4..c65299d 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -319,6 +319,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
>
>  	if (vdev->ctx[vector].trigger) {
>  		free_irq(irq, vdev->ctx[vector].trigger);
> +		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>  		kfree(vdev->ctx[vector].name);
>  		eventfd_ctx_put(vdev->ctx[vector].trigger);
>  		vdev->ctx[vector].trigger = NULL;
> @@ -360,6 +361,14 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
>  		return ret;
>  	}
>
> +	vdev->ctx[vector].producer.token = trigger;
> +	vdev->ctx[vector].producer.irq = irq;
> +	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
> +	if (unlikely(ret))
> +		dev_info(&pdev->dev,
> +		"irq bypass producer (token %p) registeration fails: %d\n",
> +		vdev->ctx[vector].producer.token, ret);
> +
>  	vdev->ctx[vector].trigger = trigger;
>
>  	return 0;
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index ae0e1b4..0e7394f 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -13,6 +13,7 @@
>
>  #include
>  #include
> +#include
>
>  #ifndef VFIO_PCI_PRIVATE_H
>  #define VFIO_PCI_PRIVATE_H
> @@ -29,6 +30,7 @@ struct vfio_pci_irq_ctx {
>  	struct virqfd		*mask;
>  	char			*name;
>  	bool			masked;
> +	struct irq_bypass_producer	producer;
>  };
>
>  struct vfio_pci_device {

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
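
For readers following along, here is a rough user-space sketch of why the patch sets producer.token to the eventfd trigger: the bypass manager pairs a producer with a consumer only when both carry the same opaque token. The struct and function names below (model_producer, model_consumer, model_connect) are made up for illustration and are not the kernel's irqbypass API; the real manager also runs the stop/add/start callbacks, which this model leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the kernel's structures. */
struct model_producer {
	const void *token;	/* opaque match key, e.g. the eventfd trigger */
	int irq;		/* host IRQ number carried by the producer */
	int connected;		/* 1 once paired with a consumer */
};

struct model_consumer {
	const void *token;	/* same kind of opaque key on the consumer side */
	int connected;		/* 1 once paired with a producer */
};

/*
 * Pair the consumer with the first unconnected producer that carries
 * the same token. Returns the producer's irq, or -1 if nothing matches.
 */
static int model_connect(struct model_producer *prods, size_t n,
			 struct model_consumer *cons)
{
	size_t i;

	assert(prods != NULL && cons != NULL);

	for (i = 0; i < n; i++) {
		if (prods[i].connected || prods[i].token != cons->token)
			continue;
		prods[i].connected = 1;
		cons->connected = 1;
		return prods[i].irq;
	}
	return -1;
}
```

In this model a consumer whose token matches no registered producer simply stays unconnected, which loosely mirrors the patch treating a failed producer registration as informational (dev_info) rather than fatal.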
Re: [PATCH v8 03/13] KVM: Define a new interface kvm_intr_is_single_vcpu()
On 18/09/2015 18:16, Radim Krčmář wrote:
>>> >> Ok, I was wondering whether this was the correct interpretation.  Thanks!
>> >
>> > Paolo, I don't think Radim clarify your concern, right? Since mda is
>> > 8-bit, it is wrong with mda >> 16, this is your concern, right?
> In case it was:  mda is u32 so the bitshift is defined by C.
> (xAPIC destinations in KVM's x2APIC mode are stored in lowest 8 bits of
> mda, hence the cluster is always 0.)
>
> Or am I still missing the point?

Yes, remembering that the cluster is always 0 solved my doubt.

Paolo
Re: [PATCH v8 03/13] KVM: Define a new interface kvm_intr_is_single_vcpu()
2015-09-17 23:18+, Wu, Feng:
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>> On 17/09/2015 17:58, Radim Krčmář wrote:
>>> xAPIC address are only 8 bit long so they always get delivered to x2APIC
>>> cluster 0, where first 16 bits work like xAPIC flat logical mode.
>>
>> Ok, I was wondering whether this was the correct interpretation.  Thanks!
>
> Paolo, I don't think Radim clarify your concern, right? Since mda is 8-bit, it
> is wrong with mda >> 16, this is your concern, right?

In case it was:  mda is u32 so the bitshift is defined by C.
(xAPIC destinations in KVM's x2APIC mode are stored in lowest 8 bits of
mda, hence the cluster is always 0.)

Or am I still missing the point?
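
To make the point about the shift concrete, here is a small stand-alone sketch. It assumes only what is said in this thread: mda is a 32-bit value, the cluster is taken as mda >> 16, and the low 16 bits act as the logical bitmap. The helper names are invented for illustration and are not KVM functions.

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative, not taken from KVM. */

/* Cluster part of a 32-bit destination: the upper 16 bits. */
static uint32_t mda_cluster(uint32_t mda)
{
	/* Shifting a 32-bit unsigned value by 16 is well defined in C. */
	return mda >> 16;
}

/* Logical bitmap part of a 32-bit destination: the lower 16 bits. */
static uint32_t mda_logical_bits(uint32_t mda)
{
	return mda & 0xffffu;
}
```

An 8-bit xAPIC destination such as 0x81 only occupies the lowest 8 bits, so mda_cluster(0x81) is 0 and the destination bits are interpreted within cluster 0.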
Re: [PATCH v9 17/18] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
On 18/09/2015 16:29, Feng Wu wrote: > This patch updates the Posted-Interrupts Descriptor when vCPU > is blocked. > > pre-block: > - Add the vCPU to the blocked per-CPU list > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR > > post-block: > - Remove the vCPU from the per-CPU list > > Signed-off-by: Feng Wu > --- > v9: > - Add description for blocked_vcpu_on_cpu_lock in > Documentation/virtual/kvm/locking.txt > - Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then > !irq_remapping_cap(IRQ_POSTING_CAP) > > v8: > - Rename 'pi_pre_block' to 'pre_block' > - Rename 'pi_post_block' to 'post_block' > - Change some comments > - Only add the vCPU to the blocking list when the VM has assigned devices. > > Documentation/virtual/kvm/locking.txt | 12 +++ > arch/x86/include/asm/kvm_host.h | 13 +++ > arch/x86/kvm/vmx.c| 153 > ++ > arch/x86/kvm/x86.c| 53 +--- > include/linux/kvm_host.h | 3 + > virt/kvm/kvm_main.c | 3 + > 6 files changed, 227 insertions(+), 10 deletions(-) > > diff --git a/Documentation/virtual/kvm/locking.txt > b/Documentation/virtual/kvm/locking.txt > index d68af4d..19f94a6 100644 > --- a/Documentation/virtual/kvm/locking.txt > +++ b/Documentation/virtual/kvm/locking.txt > @@ -166,3 +166,15 @@ Comment: The srcu read lock must be held while accessing > memslots (e.g. > MMIO/PIO address->device structure mapping (kvm->buses). > The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu > if it is needed by multiple functions. > + > +Name:blocked_vcpu_on_cpu_lock > +Type:spinlock_t > +Arch:x86 > +Protects:blocked_vcpu_on_cpu > +Comment: This is a per-CPU lock and it is used for VT-d > posted-interrupts. > + When VT-d posted-interrupts is supported and the VM has assigned > + devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu > + protected by blocked_vcpu_on_cpu_lock, when VT-d hardware issues > + wakeup notification event since external interrupts from the > + assigned devices happens, we will find the vCPU on the list to > + wakeup. 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 0ddd353..304fbb5 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -552,6 +552,8 @@ struct kvm_vcpu_arch { >*/ > bool write_fault_to_shadow_pgtable; > > + bool halted; > + > /* set at EPT violation at this point */ > unsigned long exit_qualification; > > @@ -864,6 +866,17 @@ struct kvm_x86_ops { > /* pmu operations of sub-arch */ > const struct kvm_pmu_ops *pmu_ops; > > + /* > + * Architecture specific hooks for vCPU blocking due to > + * HLT instruction. > + * Returns for .pre_block(): > + *- 0 means continue to block the vCPU. > + *- 1 means we cannot block the vCPU since some event > + *happens during this period, such as, 'ON' bit in > + *posted-interrupts descriptor is set. > + */ > + int (*pre_block)(struct kvm_vcpu *vcpu); > + void (*post_block)(struct kvm_vcpu *vcpu); > int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq, > uint32_t guest_irq, bool set); > }; > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 902a67d..9968896 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -879,6 +879,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs); > static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu); > static DEFINE_PER_CPU(struct desc_ptr, host_gdt); > > +/* > + * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we > + * can find which vCPU should be waken up. 
> + */ > +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu); > +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock); > + > static unsigned long *vmx_io_bitmap_a; > static unsigned long *vmx_io_bitmap_b; > static unsigned long *vmx_msr_bitmap_legacy; > @@ -2985,6 +2992,8 @@ static int hardware_enable(void) > return -EBUSY; > > INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu)); > + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu)); > + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu)); > > /* >* Now we can enable the vmclear operation in kdump > @@ -6121,6 +6130,25 @@ static void update_ple_window_actual_max(void) > ple_window_grow, INT_MIN); > } > > +/* > + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR. > + */ > +static void wakeup_handler(void) > +{ > + struct kvm_vcpu *vcpu; > + int cpu = smp_processor_id(); > + > + spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu)); > + list_for_each_entry(vcpu, &per_cpu(b
RE: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Friday, September 18, 2015 11:21 PM
> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
> mtosa...@redhat.com
> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including
> prerequisite series
>
> On 18/09/2015 17:08, Wu, Feng wrote:
> >
> >> -----Original Message-----
> >> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> >> Sent: Friday, September 18, 2015 10:59 PM
> >> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
> >> mtosa...@redhat.com
> >> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
> >> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> >> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support -
> >> including prerequisite series
> >>
> >> On 18/09/2015 16:29, Feng Wu wrote:
> >>> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> >>> With VT-d Posted-Interrupts enabled, external interrupts from
> >>> direct-assigned devices can be delivered to guests without VMM
> >>> intervention when guest is running in non-root mode.
> >>>
> >>> You can find the VT-d Posted-Interrupts Spec. in the following URL:
> >>> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> >>
> >> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
> >>
> >> Signed-off-bys are missing in patch 1 and 4.  The patches exist
> >> elsewhere in the mailing list archives, so not a big deal.  Or just
> >> reply to them with the S-o-b line.
> >>
> >
> > Thanks for your quick response, Paolo! I didn't change the code
> > in patch 1 and 4, do I need to add s-o-b, if needed, I can reply
> > the patches.
>
> Yes, the s-o-b just means that the code passed through your hands.

Done.

> Note that I replied to patch 17, but no need to resend that one
> either---just mailing list discussion is enough.

Do you mean you replied to patch 17 just now? I can't find your reply
on the mailing list.

Thanks,
Feng

> Paolo
>
> > Thanks,
> > Feng
> >
> >> Alex, can you ack the series and review patch 12?
> >>
> >> Joerg, can you ack patch 18?
> >>
> >> Paolo
RE: [PATCH v9 04/18] KVM: create kvm_irqfd.h
Signed-off-by: Feng Wu > -Original Message- > From: iommu-boun...@lists.linux-foundation.org > [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of Feng Wu > Sent: Friday, September 18, 2015 10:30 PM > To: pbonz...@redhat.com; alex.william...@redhat.com; j...@8bytes.org; > mtosa...@redhat.com > Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org; > k...@vger.kernel.org; eric.au...@linaro.org > Subject: [PATCH v9 04/18] KVM: create kvm_irqfd.h > > From: Eric Auger > > Move _irqfd_resampler and _irqfd struct declarations in a new > public header: kvm_irqfd.h. They are respectively renamed into > kvm_kernel_irqfd_resampler and kvm_kernel_irqfd. Those datatypes > will be used by architecture specific code, in the context of > IRQ bypass manager integration. > > Signed-off-by: Eric Auger > --- > include/linux/kvm_irqfd.h | 69 ++ > virt/kvm/eventfd.c| 95 > --- > 2 files changed, 92 insertions(+), 72 deletions(-) > create mode 100644 include/linux/kvm_irqfd.h > > diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h > new file mode 100644 > index 000..f926b39 > --- /dev/null > +++ b/include/linux/kvm_irqfd.h > @@ -0,0 +1,69 @@ > +/* > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * irqfd: Allows an fd to be used to inject an interrupt to the guest > + * Credit goes to Avi Kivity for the original idea. 
> + */ > + > +#ifndef __LINUX_KVM_IRQFD_H > +#define __LINUX_KVM_IRQFD_H > + > +#include > +#include > + > +/* > + * Resampling irqfds are a special variety of irqfds used to emulate > + * level triggered interrupts. The interrupt is asserted on eventfd > + * trigger. On acknowledgment through the irq ack notifier, the > + * interrupt is de-asserted and userspace is notified through the > + * resamplefd. All resamplers on the same gsi are de-asserted > + * together, so we don't need to track the state of each individual > + * user. We can also therefore share the same irq source ID. > + */ > +struct kvm_kernel_irqfd_resampler { > + struct kvm *kvm; > + /* > + * List of resampling struct _irqfd objects sharing this gsi. > + * RCU list modified under kvm->irqfds.resampler_lock > + */ > + struct list_head list; > + struct kvm_irq_ack_notifier notifier; > + /* > + * Entry in list of kvm->irqfd.resampler_list. Use for sharing > + * resamplers among irqfds on the same gsi. > + * Accessed and modified under kvm->irqfds.resampler_lock > + */ > + struct list_head link; > +}; > + > +struct kvm_kernel_irqfd { > + /* Used for MSI fast-path */ > + struct kvm *kvm; > + wait_queue_t wait; > + /* Update side is protected by irqfds.lock */ > + struct kvm_kernel_irq_routing_entry irq_entry; > + seqcount_t irq_entry_sc; > + /* Used for level IRQ fast-path */ > + int gsi; > + struct work_struct inject; > + /* The resampler used by this irqfd (resampler-only) */ > + struct kvm_kernel_irqfd_resampler *resampler; > + /* Eventfd notified on resample (resampler-only) */ > + struct eventfd_ctx *resamplefd; > + /* Entry in list of irqfds for a resampler (resampler-only) */ > + struct list_head resampler_link; > + /* Used for setup/shutdown */ > + struct eventfd_ctx *eventfd; > + struct list_head list; > + poll_table pt; > + struct work_struct shutdown; > +}; > + > +#endif /* __LINUX_KVM_IRQFD_H */ > diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c > index 9ff4193..647ffb8 100644 > --- 
a/virt/kvm/eventfd.c > +++ b/virt/kvm/eventfd.c > @@ -23,6 +23,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -39,68 +40,14 @@ > #include > > #ifdef CONFIG_HAVE_KVM_IRQFD > -/* > - * > - * irqfd: Allows an fd to be used to inject an interrupt to the guest > - * > - * Credit goes to Avi Kivity for the original idea. > - * > - */ > - > -/* > - * Resampling irqfds are a special variety of irqfds used to emulate > - * level triggered interrupts. The interrupt is asserted on eventfd > - * trigger. On acknowledgement through the irq ack notifier, the > - * interrupt is de-asserted and userspace is notified through the > - * resamplefd. All resamplers on the same gsi are de-asserted > - * together, so we don't need to track the state of each individual > -
RE: [PATCH v9 01/18] virt: IRQ bypass manager
Signed-off-by: Feng Wu > -Original Message- > From: iommu-boun...@lists.linux-foundation.org > [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of Feng Wu > Sent: Friday, September 18, 2015 10:30 PM > To: pbonz...@redhat.com; alex.william...@redhat.com; j...@8bytes.org; > mtosa...@redhat.com > Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org; > k...@vger.kernel.org; eric.au...@linaro.org > Subject: [PATCH v9 01/18] virt: IRQ bypass manager > > From: Alex Williamson > > When a physical I/O device is assigned to a virtual machine through > facilities like VFIO and KVM, the interrupt for the device generally > bounces through the host system before being injected into the VM. > However, hardware technologies exist that often allow the host to be > bypassed for some of these scenarios. Intel Posted Interrupts allow > the specified physical edge interrupts to be directly injected into a > guest when delivered to a physical processor while the vCPU is > running. ARM IRQ Forwarding allows forwarded physical interrupts to > be directly deactivated by the guest. > > The IRQ bypass manager here is meant to provide the shim to connect > interrupt producers, generally the host physical device driver, with > interrupt consumers, generally the hypervisor, in order to configure > these bypass mechanism. To do this, we base the connection on a > shared, opaque token. For KVM-VFIO this is expected to be an > eventfd_ctx since this is the connection we already use to connect an > eventfd to an irqfd on the in-kernel path. When a producer and > consumer with matching tokens is found, callbacks via both registered > participants allow the bypass facilities to be automatically enabled. > > Signed-off-by: Alex Williamson > Reviewed-by: Eric Auger > Tested-by: Eric Auger > Tested-by: Feng Wu > --- > v4: All producer callbacks are optional, as with Intel PI, it's > possible for the producer to be blissfully unaware of the bypass. 
> > MAINTAINERS | 7 ++ > include/linux/irqbypass.h | 90 > virt/lib/Kconfig | 2 + > virt/lib/Makefile | 1 + > virt/lib/irqbypass.c | 257 > ++ > 5 files changed, 357 insertions(+) > create mode 100644 include/linux/irqbypass.h > create mode 100644 virt/lib/Kconfig > create mode 100644 virt/lib/Makefile > create mode 100644 virt/lib/irqbypass.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index a9ae6c1..10c8b2f 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -10963,6 +10963,13 @@ L: net...@vger.kernel.org > S: Maintained > F: drivers/net/ethernet/via/via-velocity.* > > +VIRT LIB > +M: Alex Williamson > +M: Paolo Bonzini > +L: k...@vger.kernel.org > +S: Supported > +F: virt/lib/ > + > VIVID VIRTUAL VIDEO DRIVER > M: Hans Verkuil > L: linux-me...@vger.kernel.org > diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h > new file mode 100644 > index 000..1551b5b > --- /dev/null > +++ b/include/linux/irqbypass.h > @@ -0,0 +1,90 @@ > +/* > + * IRQ offload/bypass manager > + * > + * Copyright (C) 2015 Red Hat, Inc. > + * Copyright (c) 2015 Linaro Ltd. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + */ > +#ifndef IRQBYPASS_H > +#define IRQBYPASS_H > + > +#include > + > +struct irq_bypass_consumer; > + > +/* > + * Theory of operation > + * > + * The IRQ bypass manager is a simple set of lists and callbacks that allows > + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ > + * consumers (ex. virtualization hardware that allows IRQ bypass or offload) > + * via a shared token (ex. eventfd_ctx). Producers and consumers register > + * independently. When a token match is found, the optional @stop callback > + * will be called for each participant. 
The pair will then be connected via > + * the @add_* callbacks, and finally the optional @start callback will allow > + * any final coordination. When either participant is unregistered, the > + * process is repeated using the @del_* callbacks in place of the @add_* > + * callbacks. Match tokens must be unique per producer/consumer, 1:N > pairings > + * are not supported. > + */ > + > +/** > + * struct irq_bypass_producer - IRQ bypass producer definition > + * @node: IRQ bypass manager private list management > + * @token: opaque token to match between producer and consumer > + * @irq: Linux IRQ number for the producer device > + * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional) > + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer > (optional) > + * @stop: Perform any quiesce operations necessary prior to add/del > (optional) > + * @start: Perform any startup operations necessary after add/del (optional) > + * > + * The IRQ bypass producer
Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
On 18/09/2015 17:08, Wu, Feng wrote:
>
>> -----Original Message-----
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>> Sent: Friday, September 18, 2015 10:59 PM
>> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
>> mtosa...@redhat.com
>> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
>> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
>> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including
>> prerequisite series
>>
>> On 18/09/2015 16:29, Feng Wu wrote:
>>> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
>>> With VT-d Posted-Interrupts enabled, external interrupts from
>>> direct-assigned devices can be delivered to guests without VMM
>>> intervention when guest is running in non-root mode.
>>>
>>> You can find the VT-d Posted-Interrupts Spec. in the following URL:
>>> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
>>
>> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
>>
>> Signed-off-bys are missing in patch 1 and 4.  The patches exist
>> elsewhere in the mailing list archives, so not a big deal.  Or just
>> reply to them with the S-o-b line.
>>
>
> Thanks for your quick response, Paolo! I didn't change the code
> in patch 1 and 4, do I need to add s-o-b, if needed, I can reply
> the patches.

Yes, the s-o-b just means that the code passed through your hands.

Note that I replied to patch 17, but no need to resend that one
either---just mailing list discussion is enough.

Paolo

> Thanks,
> Feng
>
>> Alex, can you ack the series and review patch 12?
>>
>> Joerg, can you ack patch 18?
>>
>> Paolo
[PATCH] iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA
Stage-2 TLBI by IPA takes a 48-bit address field, as opposed to the
64-bit field used by the VA-based invalidation commands.

This patch re-jigs the SMMUv3 command construction code so that the
address field is correctly masked.

Signed-off-by: Will Deacon
---
 drivers/iommu/arm-smmu-v3.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index a24f359fa0d0..286e890e7d64 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -343,7 +343,8 @@
 #define CMDQ_TLBI_0_VMID_SHIFT		32
 #define CMDQ_TLBI_0_ASID_SHIFT		48
 #define CMDQ_TLBI_1_LEAF		(1UL << 0)
-#define CMDQ_TLBI_1_ADDR_MASK		~0xfffUL
+#define CMDQ_TLBI_1_VA_MASK		~0xfffUL
+#define CMDQ_TLBI_1_IPA_MASK		0xfffffffff000UL
 
 #define CMDQ_PRI_0_SSID_SHIFT		12
 #define CMDQ_PRI_0_SSID_MASK		0xfffffUL
@@ -771,11 +772,13 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 		break;
 	case CMDQ_OP_TLBI_NH_VA:
 		cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT;
-		/* Fallthrough */
+		cmd[1] |= ent->tlbi.leaf ? CMDQ_TLBI_1_LEAF : 0;
+		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
+		break;
 	case CMDQ_OP_TLBI_S2_IPA:
 		cmd[0] |= (u64)ent->tlbi.vmid << CMDQ_TLBI_0_VMID_SHIFT;
 		cmd[1] |= ent->tlbi.leaf ? CMDQ_TLBI_1_LEAF : 0;
-		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_ADDR_MASK;
+		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
 		break;
 	case CMDQ_OP_TLBI_NH_ASID:
 		cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT;
-- 
2.1.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
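For readers who want to see the effect of the two masks outside the driver, here is a minimal standalone sketch. The helper names (tlbi_va_field, tlbi_ipa_field, tlbi_mask_selfcheck) are made up for illustration and do not exist in arm-smmu-v3.c; the mask widths are assumed from the commit message (64-bit field for VA, 48-bit field for IPA, page offset dropped in both).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative masks defined locally for a standalone build; the widths
 * are an assumption taken from the commit message, not from the driver.
 */
#define TLBI_VA_MASK	(~UINT64_C(0xfff))		/* address bits [63:12] */
#define TLBI_IPA_MASK	UINT64_C(0x0000fffffffff000)	/* address bits [47:12] */

/* Hypothetical helper: address bits as encoded for a VA-based TLBI. */
uint64_t tlbi_va_field(uint64_t addr)
{
	return addr & TLBI_VA_MASK;
}

/* Hypothetical helper: address bits as encoded for a stage-2 TLBI by IPA. */
uint64_t tlbi_ipa_field(uint64_t addr)
{
	return addr & TLBI_IPA_MASK;
}

/* Self-check: the IPA encoding must not carry bits above bit 47. */
void tlbi_mask_selfcheck(void)
{
	uint64_t addr = UINT64_C(0xffff800012345678);

	assert(tlbi_va_field(addr) == UINT64_C(0xffff800012345000));
	assert(tlbi_ipa_field(addr) == UINT64_C(0x0000800012345000));
}
```

With the old shared mask, the upper 16 bits of an out-of-range address would have leaked into the stage-2 command; with the narrower mask they are dropped.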
RE: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
> -Original Message- > From: Paolo Bonzini [mailto:pbonz...@redhat.com] > Sent: Friday, September 18, 2015 10:59 PM > To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org; > mtosa...@redhat.com > Cc: eric.au...@linaro.org; k...@vger.kernel.org; > iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org > Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including > prerequisite series > > > > On 18/09/2015 16:29, Feng Wu wrote: > > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt. > > With VT-d Posted-Interrupts enabled, external interrupts from > > direct-assigned devices can be delivered to guests without VMM > > intervention when guest is running in non-root mode. > > > > You can find the VT-d Posted-Interrtups Spec. in the following URL: > > > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technolog > y/vt-directed-io-spec.html > > Thanks. I will squash patches 2 and 14 together, and drop patch 3. > > Signed-off-bys are missing in patch 1 and 4. The patches exist > elsewhere in the mailing list archives, so not a big deal. Or just > reply to them with the S-o-b line. > Thanks for your quick response, Paolo! I didn't change the code in patch 1 and 4, do I need to add s-o-b, if needed, I can reply the patches. Thanks, Feng > Alex, can you ack the series and review patch 12? > > Joerg, can you ack patch 18? > > Paolo > > > v9: > > - Include the whole series: > > [01/18]: irq bypasser manager > > [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side > forwarded irq > > [07/18] - [18/18]: VT-d PI part > > > > v8: > > refer to the changelog in each patch > > > > v7: > > * Define two weak irq bypass callbacks: > > - kvm_arch_irq_bypass_start() > > - kvm_arch_irq_bypass_stop() > > * Remove the x86 dummy implementation of the above two functions. > > * Print some useful information instead of WARN_ON() when the > > irq bypass consumer unregistration fails. 
> > * Fix an issue when calling pi_pre_block and pi_post_block. > > > > v6: > > * Rebase on 4.2.0-rc6 > > * Rebase on https://lkml.org/lkml/2015/8/6/526 and > http://www.gossamer-threads.com/lists/linux/kernel/2235623 > > * Make the add_consumer and del_consumer callbacks static > > * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)' > > * Use dev_info instead of WARN_ON() when irq_bypass_register_producer > fails > > * Remove optional dummy callbacks for irq producer > > > > v4: > > * For lowest-priority interrupt, only support single-CPU destination > > interrupts at the current stage, more common lowest priority support > > will be added later. > > * Accoring to Marcelo's suggestion, when vCPU is blocked, we handle > > the posted-interrupts in the HLT emulation path. > > * Some small changes (coding style, typo, add some code comments) > > > > v3: > > * Adjust the Posted-interrupts Descriptor updating logic when vCPU is > > preempted or blocked. > > * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> > KVM_DEV_VFIO_DEVICE_POST_IRQ > > * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> > __KVM_HAVE_ARCH_KVM_VFIO_POST > > * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which > > can be used to change back to remapping mode. 
> > * Fix typo > > > > v2: > > * Use VFIO framework to enable this feature, the VFIO part of this series is > > base on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control" > > * Rebase this patchset on > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, > > then revise some irq logic based on the new hierarchy irqdomain patches > provided > > by Jiang Liu > > > > > > *** BLURB HERE *** > > > > Alex Williamson (1): > > virt: IRQ bypass manager > > > > Eric Auger (4): > > KVM: arm/arm64: select IRQ_BYPASS_MANAGER > > KVM: create kvm_irqfd.h > > KVM: introduce kvm_arch functions for IRQ bypass > > KVM: eventfd: add irq bypass consumer management > > > > Feng Wu (13): > > KVM: x86: select IRQ_BYPASS_MANAGER > > KVM: Extend struct pi_desc for VT-d Posted-Interrupts > > KVM: Add some helper functions for Posted-Interrupts > > KVM: Define a new interface kvm_intr_is_single_vcpu() > > KVM: Make struct kvm_irq_routing_table accessible > > KVM: make kvm_set_msi_irq() public > > vfio: Register/unregister irq_bypass_producer > > KVM: x86: Update IRTE for posted-interrupts > > KVM: Implement IRQ bypass consumer callbacks for x86 > > KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' > > KVM: Update Posted-Interrupts Descriptor when vCPU is preempted > > KVM: Update Posted-Interrupts Descriptor when vCPU is blocked > > iommu/vt-d: Add a command line parameter for VT-d posted-interrupts > > > > Documentation/kernel-parameters.txt | 1 + > > Documentation/virtual/kvm/locking.txt | 12 ++ > > MAINTAINERS | 7 + > > arch/arm/kvm/Kconfig | 2 + > > arch/arm/kvm/Makefile | 1 + > > arch/arm64/kv
Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
On 18/09/2015 16:29, Feng Wu wrote: > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt. > With VT-d Posted-Interrupts enabled, external interrupts from > direct-assigned devices can be delivered to guests without VMM > intervention when guest is running in non-root mode. > > You can find the VT-d Posted-Interrtups Spec. in the following URL: > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html Thanks. I will squash patches 2 and 14 together, and drop patch 3. Signed-off-bys are missing in patch 1 and 4. The patches exist elsewhere in the mailing list archives, so not a big deal. Or just reply to them with the S-o-b line. Alex, can you ack the series and review patch 12? Joerg, can you ack patch 18? Paolo > v9: > - Include the whole series: > [01/18]: irq bypasser manager > [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side > forwarded irq > [07/18] - [18/18]: VT-d PI part > > v8: > refer to the changelog in each patch > > v7: > * Define two weak irq bypass callbacks: > - kvm_arch_irq_bypass_start() > - kvm_arch_irq_bypass_stop() > * Remove the x86 dummy implementation of the above two functions. > * Print some useful information instead of WARN_ON() when the > irq bypass consumer unregistration fails. > * Fix an issue when calling pi_pre_block and pi_post_block. > > v6: > * Rebase on 4.2.0-rc6 > * Rebase on https://lkml.org/lkml/2015/8/6/526 and > http://www.gossamer-threads.com/lists/linux/kernel/2235623 > * Make the add_consumer and del_consumer callbacks static > * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)' > * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails > * Remove optional dummy callbacks for irq producer > > v4: > * For lowest-priority interrupt, only support single-CPU destination > interrupts at the current stage, more common lowest priority support > will be added later. 
> * Accoring to Marcelo's suggestion, when vCPU is blocked, we handle > the posted-interrupts in the HLT emulation path. > * Some small changes (coding style, typo, add some code comments) > > v3: > * Adjust the Posted-interrupts Descriptor updating logic when vCPU is > preempted or blocked. > * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ > * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST > * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which > can be used to change back to remapping mode. > * Fix typo > > v2: > * Use VFIO framework to enable this feature, the VFIO part of this series is > base on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control" > * Rebase this patchset on > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, > then revise some irq logic based on the new hierarchy irqdomain patches > provided > by Jiang Liu > > > *** BLURB HERE *** > > Alex Williamson (1): > virt: IRQ bypass manager > > Eric Auger (4): > KVM: arm/arm64: select IRQ_BYPASS_MANAGER > KVM: create kvm_irqfd.h > KVM: introduce kvm_arch functions for IRQ bypass > KVM: eventfd: add irq bypass consumer management > > Feng Wu (13): > KVM: x86: select IRQ_BYPASS_MANAGER > KVM: Extend struct pi_desc for VT-d Posted-Interrupts > KVM: Add some helper functions for Posted-Interrupts > KVM: Define a new interface kvm_intr_is_single_vcpu() > KVM: Make struct kvm_irq_routing_table accessible > KVM: make kvm_set_msi_irq() public > vfio: Register/unregister irq_bypass_producer > KVM: x86: Update IRTE for posted-interrupts > KVM: Implement IRQ bypass consumer callbacks for x86 > KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' > KVM: Update Posted-Interrupts Descriptor when vCPU is preempted > KVM: Update Posted-Interrupts Descriptor when vCPU is blocked > iommu/vt-d: Add a command line parameter for VT-d posted-interrupts > > Documentation/kernel-parameters.txt | 1 + > Documentation/virtual/kvm/locking.txt | 12 ++ > 
MAINTAINERS | 7 + > arch/arm/kvm/Kconfig | 2 + > arch/arm/kvm/Makefile | 1 + > arch/arm64/kvm/Kconfig| 2 + > arch/arm64/kvm/Makefile | 1 + > arch/x86/include/asm/kvm_host.h | 24 +++ > arch/x86/kvm/Kconfig | 3 + > arch/x86/kvm/Makefile | 3 + > arch/x86/kvm/irq_comm.c | 32 ++- > arch/x86/kvm/lapic.c | 59 ++ > arch/x86/kvm/lapic.h | 2 + > arch/x86/kvm/trace.h | 33 > arch/x86/kvm/vmx.c| 361 > +- > arch/x86/kvm/x86.c| 108 +- > drivers/iommu/irq_remapping.c | 12 +- > drivers/vfio/pci/Kconfig | 1 + > drivers/vfio/pci/vfio_pci_intrs.c | 9 + > drivers/vfio/pci/vfio_pci_private.h | 2 + > include/linux
[PATCH v9 09/18] KVM: Define a new interface kvm_intr_is_single_vcpu()
This patch defines a new interface kvm_intr_is_single_vcpu(),
which returns whether the interrupt targets a single vCPU.

It is used by VT-d PI, since for now we only support single-CPU
interrupts. For lowest-priority interrupts, if the user configures
them via /proc/irq or uses irqbalance to make them single-CPU,
we can use PI to deliver the interrupts.

Full functionality of lowest-priority support will be added later.

Signed-off-by: Feng Wu
---
v9:
- Move kvm_intr_is_single_vcpu_fast() to lapic.c
- Remove incorrect WARN_ON_ONCE()

v8:
- Some optimizations in kvm_intr_is_single_vcpu().
- Expose kvm_intr_is_single_vcpu() so we can use it in vmx code.
- Add kvm_intr_is_single_vcpu_fast() as the fast path to find
  the target vCPU for the single-destination interrupt

 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/irq_comm.c         | 27 +++
 arch/x86/kvm/lapic.c            | 59 +
 arch/x86/kvm/lapic.h            |  2 ++
 4 files changed, 91 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 49ec903..af11bca 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1204,4 +1204,7 @@ int __x86_set_memory_region(struct kvm *kvm,
 int x86_set_memory_region(struct kvm *kvm,
 			  const struct kvm_userspace_memory_region *mem);
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			     struct kvm_vcpu **dest_vcpu);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 9efff9e..f86a0da 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -297,6 +297,33 @@ out:
 	return r;
 }
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			     struct kvm_vcpu **dest_vcpu)
+{
+	int i, r = 0;
+	struct kvm_vcpu *vcpu;
+
+	if (kvm_intr_is_single_vcpu_fast(kvm, irq, dest_vcpu))
+		return true;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_apic_present(vcpu))
+			continue;
+
+		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
+					irq->dest_id, irq->dest_mode))
+			continue;
+
+		if (++r == 2)
+			return false;
+
+		*dest_vcpu = vcpu;
+	}
+
+	return r == 1;
+}
+EXPORT_SYMBOL_GPL(kvm_intr_is_single_vcpu);
+
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } }
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2a5ca97..3c8fc71 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -764,6 +764,65 @@ out:
 	return ret;
 }
 
+bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			struct kvm_vcpu **dest_vcpu)
+{
+	struct kvm_apic_map *map;
+	bool ret = false;
+	struct kvm_lapic *dst = NULL;
+
+	if (irq->shorthand)
+		return false;
+
+	rcu_read_lock();
+	map = rcu_dereference(kvm->arch.apic_map);
+
+	if (!map)
+		goto out;
+
+	if (irq->dest_mode == APIC_DEST_PHYSICAL) {
+		if (irq->dest_id == 0xFF)
+			goto out;
+
+		if (irq->dest_id >= ARRAY_SIZE(map->phys_map))
+			goto out;
+
+		dst = map->phys_map[irq->dest_id];
+		if (dst && kvm_apic_present(dst->vcpu))
+			*dest_vcpu = dst->vcpu;
+		else
+			goto out;
+	} else {
+		u16 cid;
+		unsigned long bitmap = 1;
+		int i, r = 0;
+
+		if (!kvm_apic_logical_map_valid(map))
+			goto out;
+
+		apic_logical_id(map, irq->dest_id, &cid, (u16 *)&bitmap);
+
+		if (cid >= ARRAY_SIZE(map->logical_map))
+			goto out;
+
+		for_each_set_bit(i, &bitmap, 16) {
+			dst = map->logical_map[cid][i];
+			if (++r == 2)
+				goto out;
+		}
+
+		if (dst && kvm_apic_present(dst->vcpu))
+			*dest_vcpu = dst->vcpu;
+		else
+			goto out;
+	}
+
+	ret = true;
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
 /*
  * Add a pending IRQ into lapic.
  * Return 1 if successfully added and 0 if discarded.
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 7195274..032fe2d 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -169,4 +169,6 @@ bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
 void wait_lapic_expire(struct kvm_vcpu *vcpu);
 
+bool kvm_intr_is_single_vcp
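The slow path in kvm_intr_is_single_vcpu() boils down to "count the matching destinations and give up at the second one". A minimal userspace sketch of that idea follows. Everything here (struct demo_vcpu, demo_is_single_dest, the logical_mask field) is hypothetical and only stands in for the KVM types; the match test is a plain bitmask AND rather than kvm_apic_match_dest().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; only the fields the sketch needs. */
struct demo_vcpu {
	int id;
	bool apic_present;
	unsigned int logical_mask;
};

/*
 * Return true only if exactly one present vCPU matches dest_mask, and
 * store it in *dest. As in the patch's slow path, *dest may already have
 * been written when the function returns false.
 */
bool demo_is_single_dest(struct demo_vcpu *vcpus, size_t n,
			 unsigned int dest_mask, struct demo_vcpu **dest)
{
	size_t i;
	int matches = 0;

	for (i = 0; i < n; i++) {
		if (!vcpus[i].apic_present)
			continue;
		if (!(vcpus[i].logical_mask & dest_mask))
			continue;
		if (++matches == 2)
			return false;	/* second match: not single-destination */
		*dest = &vcpus[i];
	}

	return matches == 1;
}

/* Self-check covering one match, two matches and no match. */
void demo_single_dest_selfcheck(void)
{
	struct demo_vcpu v[3] = {
		{ 0, true, 0x1 }, { 1, true, 0x2 }, { 2, false, 0x4 },
	};
	struct demo_vcpu *dest = NULL;

	assert(demo_is_single_dest(v, 3, 0x2, &dest) && dest == &v[1]);
	assert(!demo_is_single_dest(v, 3, 0x3, &dest));
	assert(!demo_is_single_dest(v, 3, 0x4, &dest));
}
```

Callers should only trust *dest when the function returns true, which mirrors how the patch's return value is meant to be used.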
[PATCH v9 17/18] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
This patch updates the Posted-Interrupts Descriptor when vCPU is blocked. pre-block: - Add the vCPU to the blocked per-CPU list - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR post-block: - Remove the vCPU from the per-CPU list Signed-off-by: Feng Wu --- v9: - Add description for blocked_vcpu_on_cpu_lock in Documentation/virtual/kvm/locking.txt - Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then !irq_remapping_cap(IRQ_POSTING_CAP) v8: - Rename 'pi_pre_block' to 'pre_block' - Rename 'pi_post_block' to 'post_block' - Change some comments - Only add the vCPU to the blocking list when the VM has assigned devices. Documentation/virtual/kvm/locking.txt | 12 +++ arch/x86/include/asm/kvm_host.h | 13 +++ arch/x86/kvm/vmx.c| 153 ++ arch/x86/kvm/x86.c| 53 +--- include/linux/kvm_host.h | 3 + virt/kvm/kvm_main.c | 3 + 6 files changed, 227 insertions(+), 10 deletions(-) diff --git a/Documentation/virtual/kvm/locking.txt b/Documentation/virtual/kvm/locking.txt index d68af4d..19f94a6 100644 --- a/Documentation/virtual/kvm/locking.txt +++ b/Documentation/virtual/kvm/locking.txt @@ -166,3 +166,15 @@ Comment: The srcu read lock must be held while accessing memslots (e.g. MMIO/PIO address->device structure mapping (kvm->buses). The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu if it is needed by multiple functions. + +Name: blocked_vcpu_on_cpu_lock +Type: spinlock_t +Arch: x86 +Protects: blocked_vcpu_on_cpu +Comment: This is a per-CPU lock and it is used for VT-d posted-interrupts. + When VT-d posted-interrupts is supported and the VM has assigned + devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu + protected by blocked_vcpu_on_cpu_lock, when VT-d hardware issues + wakeup notification event since external interrupts from the + assigned devices happens, we will find the vCPU on the list to + wakeup. 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0ddd353..304fbb5 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -552,6 +552,8 @@ struct kvm_vcpu_arch { */ bool write_fault_to_shadow_pgtable; + bool halted; + /* set at EPT violation at this point */ unsigned long exit_qualification; @@ -864,6 +866,17 @@ struct kvm_x86_ops { /* pmu operations of sub-arch */ const struct kvm_pmu_ops *pmu_ops; + /* +* Architecture specific hooks for vCPU blocking due to +* HLT instruction. +* Returns for .pre_block(): +*- 0 means continue to block the vCPU. +*- 1 means we cannot block the vCPU since some event +*happens during this period, such as, 'ON' bit in +*posted-interrupts descriptor is set. +*/ + int (*pre_block)(struct kvm_vcpu *vcpu); + void (*post_block)(struct kvm_vcpu *vcpu); int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, bool set); }; diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 902a67d..9968896 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -879,6 +879,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs); static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu); static DEFINE_PER_CPU(struct desc_ptr, host_gdt); +/* + * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we + * can find which vCPU should be waken up. 
+ */ +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu); +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock); + static unsigned long *vmx_io_bitmap_a; static unsigned long *vmx_io_bitmap_b; static unsigned long *vmx_msr_bitmap_legacy; @@ -2985,6 +2992,8 @@ static int hardware_enable(void) return -EBUSY; INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu)); + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu)); + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu)); /* * Now we can enable the vmclear operation in kdump @@ -6121,6 +6130,25 @@ static void update_ple_window_actual_max(void) ple_window_grow, INT_MIN); } +/* + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR. + */ +static void wakeup_handler(void) +{ + struct kvm_vcpu *vcpu; + int cpu = smp_processor_id(); + + spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu)); + list_for_each_entry(vcpu, &per_cpu(blocked_vcpu_on_cpu, cpu), + blocked_vcpu_list) { + struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); + + if (pi_test_on(pi_desc) == 1) + kvm_vcpu_kick(vcpu)
[PATCH v9 15/18] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
This patch adds an arch-specific hook 'arch_update' in
'struct kvm_kernel_irqfd'. On the Intel side, it is used to
update the IRTE when VT-d posted-interrupts is used.

Signed-off-by: Feng Wu
---
v9:
- Use 'if' instead of "? :" in kvm_arch_update_irqfd_routing()
- coding style

v8:
- Remove callback .arch_update()
- Remove kvm_arch_irqfd_init()
- Call kvm_arch_update_irqfd_routing() instead.

 arch/x86/kvm/x86.c       |  9 +++++++++
 include/linux/kvm_host.h |  2 ++
 virt/kvm/eventfd.c       | 20 +++++++++++++++++++-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 79dac02..58688aa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8293,6 +8293,15 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
 		       " fails: %d\n", irqfd->consumer.token, ret);
 }
 
+int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
+				  uint32_t guest_irq, bool set)
+{
+	if (!kvm_x86_ops->update_pi_irte)
+		return -EINVAL;
+
+	return kvm_x86_ops->update_pi_irte(kvm, host_irq, guest_irq, set);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5f183fb..feba1fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1174,6 +1174,8 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *,
 			   struct irq_bypass_producer *);
 void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *);
 void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *);
+int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
+				  uint32_t guest_irq, bool set);
 #endif /* CONFIG_HAVE_KVM_IRQ_BYPASS */
 
 #endif
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index c0a56a1..94306a3 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -266,6 +266,13 @@ void __attribute__((weak)) kvm_arch_irq_bypass_start(
 				struct irq_bypass_consumer *cons)
 {
 }
+
+int  __attribute__((weak)) kvm_arch_update_irqfd_routing(
+				struct kvm *kvm, unsigned int host_irq,
+				uint32_t guest_irq, bool set)
+{
+	return 0;
+}
 #endif
 
 static int
@@ -582,13 +589,24 @@ kvm_irqfd_release(struct kvm *kvm)
  */
 void kvm_irq_routing_update(struct kvm *kvm)
 {
+	int ret;
 	struct kvm_kernel_irqfd *irqfd;
 
 	spin_lock_irq(&kvm->irqfds.lock);
 
-	list_for_each_entry(irqfd, &kvm->irqfds.items, list)
+	list_for_each_entry(irqfd, &kvm->irqfds.items, list) {
 		irqfd_update(kvm, irqfd);
 
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+		if (irqfd->producer) {
+			ret = kvm_arch_update_irqfd_routing(
+					irqfd->kvm, irqfd->producer->irq,
+					irqfd->gsi, 1);
+			WARN_ON(ret);
+		}
+#endif
+	}
+
 	spin_unlock_irq(&kvm->irqfds.lock);
 }
 
-- 
2.1.0
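The generic code in this patch relies on a weak default that an architecture may override with a strong definition. A small sketch of that pattern, assuming a GCC or Clang toolchain; demo_arch_update_routing and demo_routing_update are invented names, not kernel symbols:

```c
#include <assert.h>

/*
 * Weak default: used only when no other object file provides a strong
 * definition of the same symbol. Here it does nothing and reports success.
 */
int __attribute__((weak)) demo_arch_update_routing(unsigned int host_irq,
						   unsigned int guest_irq,
						   int set)
{
	(void)host_irq;
	(void)guest_irq;
	(void)set;
	return 0;
}

/* Generic caller: it does not know which definition the linker picked. */
int demo_routing_update(unsigned int host_irq, unsigned int guest_irq)
{
	return demo_arch_update_routing(host_irq, guest_irq, 1);
}

/* Self-check: with no strong override linked in, the default is a no-op. */
void demo_weak_selfcheck(void)
{
	assert(demo_routing_update(40, 5) == 0);
}
```

An architecture that needs real work would supply a non-weak demo_arch_update_routing() in another translation unit, and the linker would prefer it over the default.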
[PATCH v9 12/18] vfio: Register/unregister irq_bypass_producer
This patch adds the registration/unregistration of an
irq_bypass_producer for MSI/MSI-X on vfio pci devices.

Signed-off-by: Feng Wu
---
v8:
- Merge "[PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI
  devices" into this patch.

v6:
- Make the add_consumer and del_consumer callbacks static
- Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
- Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
- Remove optional dummy callbacks for irq producer

 drivers/vfio/pci/Kconfig            | 1 +
 drivers/vfio/pci/vfio_pci_intrs.c   | 9 +++++++++
 drivers/vfio/pci/vfio_pci_private.h | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 579d83b..02912f1 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PCI
 	tristate "VFIO support for PCI devices"
 	depends on VFIO && PCI && EVENTFD
 	select VFIO_VIRQFD
+	select IRQ_BYPASS_MANAGER
 	help
 	  Support for the PCI VFIO bus driver.  This is required to make
 	  use of PCI drivers using the VFIO framework.
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1f577b4..c65299d 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -319,6 +319,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 
 	if (vdev->ctx[vector].trigger) {
 		free_irq(irq, vdev->ctx[vector].trigger);
+		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
 		kfree(vdev->ctx[vector].name);
 		eventfd_ctx_put(vdev->ctx[vector].trigger);
 		vdev->ctx[vector].trigger = NULL;
@@ -360,6 +361,14 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 		return ret;
 	}
 
+	vdev->ctx[vector].producer.token = trigger;
+	vdev->ctx[vector].producer.irq = irq;
+	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registeration fails: %d\n",
+		vdev->ctx[vector].producer.token, ret);
+
 	vdev->ctx[vector].trigger = trigger;
 
 	return 0;
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index ae0e1b4..0e7394f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -13,6 +13,7 @@
 
 #include
 #include
+#include
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -29,6 +30,7 @@ struct vfio_pci_irq_ctx {
 	struct virqfd		*mask;
 	char			*name;
 	bool			masked;
+	struct irq_bypass_producer	producer;
 };
 
 struct vfio_pci_device {
-- 
2.1.0
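The producer registered above carries a token (the eventfd trigger) and a host IRQ. As I understand the bypass manager, a producer and a consumer are paired when their tokens are equal. A toy sketch of that pairing, with invented types and an invented demo_connect() helper that is not the irqbypass API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the producer and consumer. */
struct demo_producer {
	void *token;	/* identifies the interrupt source, e.g. an eventfd */
	int irq;	/* host IRQ number */
};

struct demo_consumer {
	void *token;
	struct demo_producer *producer;	/* set once a match is found */
};

/*
 * Attach the first producer whose token equals the consumer's token.
 * Returns 0 on success and -1 if no producer matches.
 */
int demo_connect(struct demo_consumer *cons,
		 struct demo_producer *prods, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (prods[i].token == cons->token) {
			cons->producer = &prods[i];
			return 0;
		}
	}

	return -1;
}

/* Self-check: a consumer finds the producer sharing its token. */
void demo_connect_selfcheck(void)
{
	int t1, t2;
	struct demo_producer prods[2] = { { &t1, 40 }, { &t2, 41 } };
	struct demo_consumer cons = { &t2, NULL };

	assert(demo_connect(&cons, prods, 2) == 0);
	assert(cons.producer->irq == 41);
}
```

The real manager also handles locking, duplicate tokens and the add/del callbacks; none of that is modelled here.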
[PATCH v9 13/18] KVM: x86: Update IRTE for posted-interrupts
This patch adds the routine to update IRTE for posted-interrupts when guest changes the interrupt configuration. Signed-off-by: Feng Wu --- v9: - Check !kvm_arch_has_assigned_device(kvm) first then !irq_remapping_cap(IRQ_POSTING_CAP) v8: - Move 'kvm_arch_update_pi_irte' to vmx.c as a callback - Only update the PI irte when VM has assigned devices - Add a trace point for VT-d posted-interrupts when we update or disable it for a specific irq. arch/x86/include/asm/kvm_host.h | 3 ++ arch/x86/kvm/trace.h| 33 arch/x86/kvm/vmx.c | 83 + arch/x86/kvm/x86.c | 2 + 4 files changed, 121 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index daa6126..8c44286 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -862,6 +862,9 @@ struct kvm_x86_ops { gfn_t offset, unsigned long mask); /* pmu operations of sub-arch */ const struct kvm_pmu_ops *pmu_ops; + + int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq, + uint32_t guest_irq, bool set); }; struct kvm_arch_async_pf { diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index 4eae7c3..539a9e4 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -974,6 +974,39 @@ TRACE_EVENT(kvm_enter_smm, __entry->smbase) ); +/* + * Tracepoint for VT-d posted-interrupts. + */ +TRACE_EVENT(kvm_pi_irte_update, + TP_PROTO(unsigned int vcpu_id, unsigned int gsi, +unsigned int gvec, u64 pi_desc_addr, bool set), + TP_ARGS(vcpu_id, gsi, gvec, pi_desc_addr, set), + + TP_STRUCT__entry( + __field(unsigned int, vcpu_id ) + __field(unsigned int, gsi ) + __field(unsigned int, gvec) + __field(u64,pi_desc_addr) + __field(bool, set ) + ), + + TP_fast_assign( + __entry->vcpu_id= vcpu_id; + __entry->gsi= gsi; + __entry->gvec = gvec; + __entry->pi_desc_addr = pi_desc_addr; + __entry->set= set; + ), + + TP_printk("VT-d PI is %s for this irq, vcpu %u, gsi: 0x%x, " + "gvec: 0x%x, pi_desc_addr: 0x%llx", + __entry->set ? 
"enabled and being updated" : "disabled", + __entry->vcpu_id, + __entry->gsi, + __entry->gvec, + __entry->pi_desc_addr) +); + #endif /* _TRACE_KVM_H */ #undef TRACE_INCLUDE_PATH diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 316f9bf..11bda72 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -45,6 +45,7 @@ #include #include #include +#include #include "trace.h" #include "pmu.h" @@ -605,6 +606,11 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_vmx, vcpu); } +struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu) +{ + return &(to_vmx(vcpu)->pi_desc); +} + #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x) #define FIELD(number, name)[number] = VMCS12_OFFSET(name) #define FIELD64(number, name) [number] = VMCS12_OFFSET(name), \ @@ -10344,6 +10350,81 @@ static void vmx_enable_log_dirty_pt_masked(struct kvm *kvm, kvm_mmu_clear_dirty_pt_masked(kvm, memslot, offset, mask); } +/* + * vmx_update_pi_irte - set IRTE for Posted-Interrupts + * + * @kvm: kvm + * @host_irq: host irq of the interrupt + * @guest_irq: gsi of the interrupt + * @set: set or unset PI + * returns 0 on success, < 0 on failure + */ +int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq, + uint32_t guest_irq, bool set) +{ + struct kvm_kernel_irq_routing_entry *e; + struct kvm_irq_routing_table *irq_rt; + struct kvm_lapic_irq irq; + struct kvm_vcpu *vcpu; + struct vcpu_data vcpu_info; + int idx, ret = -EINVAL; + + if (!kvm_arch_has_assigned_device(kvm) || + !irq_remapping_cap(IRQ_POSTING_CAP)) + return 0; + + idx = srcu_read_lock(&kvm->irq_srcu); + irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu); + BUG_ON(guest_irq >= irq_rt->nr_rt_entries); + + hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) { + if (e->type != KVM_IRQ_ROUTING_MSI) + continue; + /* +* VT-d PI cannot support posting multicast/broadcast +* interrupts to a vCPU, we still use interrupt remapping +* for these kind of interrupts. 
+* +* For lowest-priority interrupts, we only support +* those with sing
[PATCH v9 14/18] KVM: Implement IRQ bypass consumer callbacks for x86
Implement the following callbacks for x86:

- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer
- kvm_arch_irq_bypass_stop: dummy callback
- kvm_arch_irq_bypass_start: dummy callback

and set CONFIG_HAVE_KVM_IRQ_BYPASS for x86.

Signed-off-by: Feng Wu
---
v8:
- Move the weak irq bypass stop and irq bypass start to this patch.
- Call kvm_x86_ops->update_pi_irte() instead of kvm_arch_update_pi_irte().

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/Kconfig            |  1 +
 arch/x86/kvm/x86.c              | 44 +
 virt/kvm/eventfd.c              | 12 +++
 4 files changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8c44286..0ddd353 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index c951d44..b90776f 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -30,6 +30,7 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQFD
 	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9dcd501..79dac02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -50,6 +50,8 @@
 #include
 #include
 #include
+#include
+#include
 #include

 #define CREATE_TRACE_POINTS
@@ -8249,6 +8251,48 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);

+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	if (kvm_x86_ops->update_pi_irte) {
+		irqfd->producer = prod;
+		return kvm_x86_ops->update_pi_irte(irqfd->kvm,
+				prod->irq, irqfd->gsi, 1);
+	}
+
+	return -EINVAL;
+}
+
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	int ret;
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	if (!kvm_x86_ops->update_pi_irte) {
+		WARN_ON(irqfd->producer != NULL);
+		return;
+	}
+
+	WARN_ON(irqfd->producer != prod);
+	irqfd->producer = NULL;
+
+	/*
+	 * When producer of consumer is unregistered, we change back to
+	 * remapped mode, so we can re-use the current implementation
+	 * when the irq is masked/disabled or the consumer side (KVM
+	 * in this case) doesn't want to receive the interrupts.
+	 */
+	ret = kvm_x86_ops->update_pi_irte(irqfd->kvm, prod->irq, irqfd->gsi, 0);
+	if (ret)
+		printk(KERN_INFO "irq bypass consumer (token %p) unregistration"
+		       " fails: %d\n", irqfd->consumer.token, ret);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index d7a230f..c0a56a1 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -256,6 +256,18 @@ static void irqfd_update(struct kvm *kvm, struct kvm_kernel_irqfd *irqfd)
 	write_seqcount_end(&irqfd->irq_entry_sc);
 }

+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+void __attribute__((weak)) kvm_arch_irq_bypass_stop(
+				struct irq_bypass_consumer *cons)
+{
+}
+
+void __attribute__((weak)) kvm_arch_irq_bypass_start(
+				struct irq_bypass_consumer *cons)
+{
+}
+#endif
+
 static int
 kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 {
--
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
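The add/del producer callbacks in this patch recover the enclosing irqfd from the embedded consumer and then ask an optional arch hook to switch the interrupt between posted and remapped mode. The following is a minimal userspace sketch of that pattern, not the kernel code: every `demo_*` name, the trimmed-down structures, and the recording stub `demo_update_irte()` are hypothetical stand-ins introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down analogues of the irqbypass structures. */
struct demo_producer { int irq; };
struct demo_consumer { void *token; };

struct demo_irqfd {
	int gsi;
	struct demo_consumer consumer;
	struct demo_producer *producer;
};

/* Hypothetical arch hook; a NULL pointer means posting is unsupported. */
typedef int (*demo_update_fn)(int host_irq, int guest_gsi, int set);

/* Records the last request so a caller can observe the mode switch. */
static int demo_last_irq, demo_last_gsi, demo_last_set = -1;

static int demo_update_irte(int host_irq, int guest_gsi, int set)
{
	demo_last_irq = host_irq;
	demo_last_gsi = guest_gsi;
	demo_last_set = set;
	return 0;
}

static int demo_add_producer(struct demo_consumer *cons,
			     struct demo_producer *prod,
			     demo_update_fn update)
{
	struct demo_irqfd *irqfd =
		demo_container_of(cons, struct demo_irqfd, consumer);

	if (!update)
		return -EINVAL;

	irqfd->producer = prod;
	return update(prod->irq, irqfd->gsi, 1);	/* 1: posted mode */
}

static void demo_del_producer(struct demo_consumer *cons,
			      struct demo_producer *prod,
			      demo_update_fn update)
{
	struct demo_irqfd *irqfd =
		demo_container_of(cons, struct demo_irqfd, consumer);

	if (!update)
		return;

	irqfd->producer = NULL;
	(void)update(prod->irq, irqfd->gsi, 0);		/* 0: remapped mode */
}
```

Passing the hook as a parameter keeps the sketch testable; the patch itself reads it from kvm_x86_ops.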
[PATCH v9 16/18] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
This patch updates the Posted-Interrupts Descriptor when vCPU
is preempted.

sched out:
- Set 'SN' to suppress future non-urgent interrupts posted for
  the vCPU.

sched in:
- Clear 'SN'
- Change NDST if vCPU is scheduled to a different CPU
- Set 'NV' to POSTED_INTR_VECTOR

Signed-off-by: Feng Wu
---
v9:
- Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then
  !irq_remapping_cap(IRQ_POSTING_CAP)

v8:
- Add two wrapper functions vmx_vcpu_pi_load() and vmx_vcpu_pi_put().
- Only handle VT-d PI related logic when the VM has assigned devices.

 arch/x86/kvm/vmx.c | 79 ++
 1 file changed, 79 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 11bda72..902a67d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1943,6 +1943,52 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
 	preempt_enable();
 }

+static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+	struct pi_desc old, new;
+	unsigned int dest;
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
+		!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	do {
+		old.control = new.control = pi_desc->control;
+
+		/*
+		 * If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there
+		 * are two possible cases:
+		 * 1. After running 'pre_block', context switch
+		 *    happened. For this case, 'sn' was set in
+		 *    vmx_vcpu_put(), so we need to clear it here.
+		 * 2. After running 'pre_block', we were blocked,
+		 *    and woken up by some other guy. For this case,
+		 *    we don't need to do anything, 'pi_post_block'
+		 *    will do everything for us. However, we cannot
+		 *    check whether it is case #1 or case #2 here
+		 *    (maybe, not needed), so we also clear sn here,
+		 *    I think it is not a big deal.
+		 */
+		if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) {
+			if (vcpu->cpu != cpu) {
+				dest = cpu_physical_id(cpu);
+
+				if (x2apic_enabled())
+					new.ndst = dest;
+				else
+					new.ndst = (dest << 8) & 0xFF00;
+			}
+
+			/* set 'NV' to 'notification vector' */
+			new.nv = POSTED_INTR_VECTOR;
+		}
+
+		/* Allow posting non-urgent interrupts */
+		new.sn = 0;
+	} while (cmpxchg(&pi_desc->control, old.control,
+			new.control) != old.control);
+}
 /*
  * Switches to specified vcpu, until a matching vcpu_put(), but assumes
  * vcpu mutex is already taken.
@@ -1993,10 +2039,27 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
 		vmx->loaded_vmcs->cpu = cpu;
 	}
+
+	vmx_vcpu_pi_load(vcpu, cpu);
+}
+
+static void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
+		!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	/* Set SN when the vCPU is preempted */
+	if (vcpu->preempted)
+		pi_set_sn(pi_desc);
 }

 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	vmx_vcpu_pi_put(vcpu);
+
 	__vmx_load_host_state(to_vmx(vcpu));
 	if (!vmm_exclusive) {
 		__loaded_vmcs_clear(to_vmx(vcpu)->loaded_vmcs);
@@ -4426,6 +4489,22 @@ static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_SMP
 	if (vcpu->mode == IN_GUEST_MODE) {
+		struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+		/*
+		 * Currently, we don't support urgent interrupt,
+		 * all interrupts are recognized as non-urgent
+		 * interrupt, so we cannot post interrupts when
+		 * 'SN' is set.
+		 *
+		 * If the vcpu is in guest mode, it means it is
+		 * running instead of being scheduled out and
+		 * waiting in the run queue, and that's the only
+		 * case when 'SN' is set currently, warning if
+		 * 'SN' is set.
+		 */
+		WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
+
 		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
 				POSTED_INTR_VECTOR);
 		return true;
--
2.1.0
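The sched-in path above rewrites several fields of the 64-bit control word in one atomic step by retrying a compare-and-exchange until no concurrent writer has interfered. Below is a rough userspace sketch of that retry loop using C11 atomics. It assumes the field positions given in patch 07 (SN at bit 1, NV at bits 23:16, NDST at bits 63:32 of the control word); the `DEMO_*` vector values and all `demo_`/`PI_` names are placeholders rather than kernel definitions, and unlike the patch it tests NV on the snapshot rather than re-reading the descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Field positions inside the 64-bit control word (assumed from patch 07). */
#define PI_SN		(UINT64_C(1) << 1)
#define PI_NV_SHIFT	16
#define PI_NV_MASK	(UINT64_C(0xff) << PI_NV_SHIFT)
#define PI_NDST_SHIFT	32
#define PI_NDST_MASK	(UINT64_C(0xffffffff) << PI_NDST_SHIFT)

/* Placeholder vector numbers, chosen only for this sketch. */
#define DEMO_NOTIFY_VECTOR	0xf2
#define DEMO_WAKEUP_VECTOR	0xf1

/*
 * Update the control word as the sched-in path does: unless the wakeup
 * vector is installed, retarget NDST (only if the vCPU migrated) and set
 * NV to the notification vector; in all cases clear SN.
 */
static void demo_pi_load(_Atomic uint64_t *control, int migrated,
			 uint32_t dest, int x2apic)
{
	uint64_t old = atomic_load(control);
	uint64_t new;

	do {
		new = old;

		if (((old & PI_NV_MASK) >> PI_NV_SHIFT) != DEMO_WAKEUP_VECTOR) {
			if (migrated) {
				/* xAPIC keeps the APIC ID in bits 15:8 */
				uint32_t ndst = x2apic ? dest
						       : (dest << 8) & 0xff00;

				new = (new & ~PI_NDST_MASK) |
				      ((uint64_t)ndst << PI_NDST_SHIFT);
			}
			new = (new & ~PI_NV_MASK) |
			      ((uint64_t)DEMO_NOTIFY_VECTOR << PI_NV_SHIFT);
		}

		new &= ~PI_SN;	/* allow posting non-urgent interrupts */

		/* On failure, 'old' is refreshed and the loop recomputes. */
	} while (!atomic_compare_exchange_weak(control, &old, new));
}
```

Recomputing `new` from the refreshed `old` on every iteration is what keeps a concurrent hardware or software update to other bits (for example ON) from being lost.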
[PATCH v9 18/18] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
Enable VT-d Posted-Interrupts and add a command line parameter for it.

Signed-off-by: Feng Wu
Reviewed-by: Paolo Bonzini
---
 Documentation/kernel-parameters.txt |  1 +
 drivers/iommu/irq_remapping.c       | 12
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f045..52aca36 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1547,6 +1547,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 		nosid	disable Source ID checking
 		no_x2apic_optout
 			BIOS x2APIC opt-out request will be ignored
+		nopost	disable Interrupt Posting

 	iomem=		Disable strict checking of access to MMIO memory
 		strict	regions from userspace.
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 2d99930..d8c3997 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -22,7 +22,7 @@ int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;

-int disable_irq_post = 1;
+int disable_irq_post = 0;

 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
@@ -58,14 +58,18 @@ static __init int setup_irqremap(char *str)
 		return -EINVAL;

 	while (*str) {
-		if (!strncmp(str, "on", 2))
+		if (!strncmp(str, "on", 2)) {
 			disable_irq_remap = 0;
-		else if (!strncmp(str, "off", 3))
+			disable_irq_post = 0;
+		} else if (!strncmp(str, "off", 3)) {
 			disable_irq_remap = 1;
-		else if (!strncmp(str, "nosid", 5))
+			disable_irq_post = 1;
+		} else if (!strncmp(str, "nosid", 5))
 			disable_sourceid_checking = 1;
 		else if (!strncmp(str, "no_x2apic_optout", 16))
 			no_x2apic_optout = 1;
+		else if (!strncmp(str, "nopost", 6))
+			disable_irq_post = 1;

 		str += strcspn(str, ",");
 		while (*str == ',')
--
2.1.0
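To make the effect of the new option concrete, here is a small userspace sketch of the comma-separated parsing loop that setup_irqremap() uses after this patch. The struct and function names (`demo_irqremap_opts`, `demo_parse_irqremap`) are invented for the example, and the no_x2apic_optout branch is omitted for brevity.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the flags the kernel keeps as globals. */
struct demo_irqremap_opts {
	int disable_remap;
	int disable_post;
	int disable_sid;
};

/*
 * Walk a comma-separated option string. "on" and "off" toggle remapping
 * and posting together; "nopost" disables posting only.
 */
static void demo_parse_irqremap(const char *str, struct demo_irqremap_opts *o)
{
	while (*str) {
		if (!strncmp(str, "on", 2)) {
			o->disable_remap = 0;
			o->disable_post = 0;
		} else if (!strncmp(str, "off", 3)) {
			o->disable_remap = 1;
			o->disable_post = 1;
		} else if (!strncmp(str, "nosid", 5)) {
			o->disable_sid = 1;
		} else if (!strncmp(str, "nopost", 6)) {
			o->disable_post = 1;
		}

		/* Skip to the next token, tolerating repeated commas. */
		str += strcspn(str, ",");
		while (*str == ',')
			str++;
	}
}
```

Because the comparison is a prefix match, unknown tokens are silently ignored, which matches the behaviour of the loop in the patch.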
[PATCH v9 04/18] KVM: create kvm_irqfd.h
From: Eric Auger Move _irqfd_resampler and _irqfd struct declarations in a new public header: kvm_irqfd.h. They are respectively renamed into kvm_kernel_irqfd_resampler and kvm_kernel_irqfd. Those datatypes will be used by architecture specific code, in the context of IRQ bypass manager integration. Signed-off-by: Eric Auger --- include/linux/kvm_irqfd.h | 69 ++ virt/kvm/eventfd.c| 95 --- 2 files changed, 92 insertions(+), 72 deletions(-) create mode 100644 include/linux/kvm_irqfd.h diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h new file mode 100644 index 000..f926b39 --- /dev/null +++ b/include/linux/kvm_irqfd.h @@ -0,0 +1,69 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * irqfd: Allows an fd to be used to inject an interrupt to the guest + * Credit goes to Avi Kivity for the original idea. + */ + +#ifndef __LINUX_KVM_IRQFD_H +#define __LINUX_KVM_IRQFD_H + +#include +#include + +/* + * Resampling irqfds are a special variety of irqfds used to emulate + * level triggered interrupts. The interrupt is asserted on eventfd + * trigger. On acknowledgment through the irq ack notifier, the + * interrupt is de-asserted and userspace is notified through the + * resamplefd. All resamplers on the same gsi are de-asserted + * together, so we don't need to track the state of each individual + * user. We can also therefore share the same irq source ID. + */ +struct kvm_kernel_irqfd_resampler { + struct kvm *kvm; + /* +* List of resampling struct _irqfd objects sharing this gsi. 
+* RCU list modified under kvm->irqfds.resampler_lock +*/ + struct list_head list; + struct kvm_irq_ack_notifier notifier; + /* +* Entry in list of kvm->irqfd.resampler_list. Use for sharing +* resamplers among irqfds on the same gsi. +* Accessed and modified under kvm->irqfds.resampler_lock +*/ + struct list_head link; +}; + +struct kvm_kernel_irqfd { + /* Used for MSI fast-path */ + struct kvm *kvm; + wait_queue_t wait; + /* Update side is protected by irqfds.lock */ + struct kvm_kernel_irq_routing_entry irq_entry; + seqcount_t irq_entry_sc; + /* Used for level IRQ fast-path */ + int gsi; + struct work_struct inject; + /* The resampler used by this irqfd (resampler-only) */ + struct kvm_kernel_irqfd_resampler *resampler; + /* Eventfd notified on resample (resampler-only) */ + struct eventfd_ctx *resamplefd; + /* Entry in list of irqfds for a resampler (resampler-only) */ + struct list_head resampler_link; + /* Used for setup/shutdown */ + struct eventfd_ctx *eventfd; + struct list_head list; + poll_table pt; + struct work_struct shutdown; +}; + +#endif /* __LINUX_KVM_IRQFD_H */ diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 9ff4193..647ffb8 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -23,6 +23,7 @@ #include #include +#include #include #include #include @@ -39,68 +40,14 @@ #include #ifdef CONFIG_HAVE_KVM_IRQFD -/* - * - * irqfd: Allows an fd to be used to inject an interrupt to the guest - * - * Credit goes to Avi Kivity for the original idea. - * - */ - -/* - * Resampling irqfds are a special variety of irqfds used to emulate - * level triggered interrupts. The interrupt is asserted on eventfd - * trigger. On acknowledgement through the irq ack notifier, the - * interrupt is de-asserted and userspace is notified through the - * resamplefd. All resamplers on the same gsi are de-asserted - * together, so we don't need to track the state of each individual - * user. We can also therefore share the same irq source ID. 
- */ -struct _irqfd_resampler { - struct kvm *kvm; - /* -* List of resampling struct _irqfd objects sharing this gsi. -* RCU list modified under kvm->irqfds.resampler_lock -*/ - struct list_head list; - struct kvm_irq_ack_notifier notifier; - /* -* Entry in list of kvm->irqfd.resampler_list. Use for sharing -* resamplers among irqfds on the same gsi. -* Accessed and modified under kvm->irqfds.resampler_lock -*/ - struct list_head link; -}; - -struct _irqfd { - /* Used for MSI fast-path */ - s
[PATCH v9 06/18] KVM: eventfd: add irq bypass consumer management
From: Eric Auger

This patch adds the registration/unregistration of an irq_bypass_consumer
on irqfd assignment/deassignment.

Signed-off-by: Eric Auger
Signed-off-by: Feng Wu
---
v4 -> v5:
- due to removal of static inline stubs, add
  #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS around consumer
  registration/unregistration
- add pr_info when registration fails

v2 -> v3 (Feng Wu):
- Use kvm_arch_irq_bypass_start
- Remove kvm_arch_irq_bypass_update
- Add member 'struct irq_bypass_producer *producer' in
  'struct kvm_kernel_irqfd', it is needed by posted interrupt.
- Remove 'irq_bypass_unregister_consumer' in kvm_irqfd_deassign()

v1 -> v2:
- populate of kvm and gsi removed
- unregister the consumer on irqfd_shutdown

 include/linux/kvm_irqfd.h |  2 ++
 virt/kvm/eventfd.c        | 15 +++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
index f926b39..0c1de05 100644
--- a/include/linux/kvm_irqfd.h
+++ b/include/linux/kvm_irqfd.h
@@ -64,6 +64,8 @@ struct kvm_kernel_irqfd {
 	struct list_head list;
 	poll_table pt;
 	struct work_struct shutdown;
+	struct irq_bypass_consumer consumer;
+	struct irq_bypass_producer *producer;
 };

 #endif /* __LINUX_KVM_IRQFD_H */
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 647ffb8..d7a230f 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -140,6 +141,9 @@ irqfd_shutdown(struct work_struct *work)
 	/*
 	 * It is now safe to release the object's resources
 	 */
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+	irq_bypass_unregister_consumer(&irqfd->consumer);
+#endif
 	eventfd_ctx_put(irqfd->eventfd);
 	kfree(irqfd);
 }
@@ -379,6 +383,17 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	 * we might race against the POLLHUP
 	 */
 	fdput(f);
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+	irqfd->consumer.token = (void *)irqfd->eventfd;
+	irqfd->consumer.add_producer = kvm_arch_irq_bypass_add_producer;
+	irqfd->consumer.del_producer = kvm_arch_irq_bypass_del_producer;
+	irqfd->consumer.stop = kvm_arch_irq_bypass_stop;
+	irqfd->consumer.start = kvm_arch_irq_bypass_start;
+	ret = irq_bypass_register_consumer(&irqfd->consumer);
+	if (ret)
+		pr_info("irq bypass consumer (token %p) registration fails: %d\n",
+				irqfd->consumer.token, ret);
+#endif

 	return 0;

--
2.1.0
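The consumer registered here is keyed by the eventfd context, and the bypass manager pairs it with whichever producer carries the same token. As a loose illustration of that matching step only, the sketch below keeps fixed-size arrays of registered participants and connects a pair when tokens match. It is an assumption-laden toy: the `reg_*` names, the array-based storage, the `-1` error value, and the `connected` flag are all invented, and it performs none of the locking or callback sequencing a real manager would need.

```c
#include <assert.h>
#include <stddef.h>

#define REG_MAX 8

/* Hypothetical participants; 'connected' stands in for the add_* callbacks. */
struct reg_producer { void *token; int connected; };
struct reg_consumer { void *token; int connected; };

struct reg_manager {
	struct reg_producer *producers[REG_MAX];
	struct reg_consumer *consumers[REG_MAX];
	size_t nprod, ncons;
};

/* Register a consumer; connect it if a producer with the same token exists. */
static int reg_add_consumer(struct reg_manager *m, struct reg_consumer *c)
{
	size_t i;

	if (m->ncons == REG_MAX)
		return -1;

	/* Tokens must be unique per side: 1:N pairings are rejected. */
	for (i = 0; i < m->ncons; i++)
		if (m->consumers[i]->token == c->token)
			return -1;

	m->consumers[m->ncons++] = c;

	for (i = 0; i < m->nprod; i++) {
		if (m->producers[i]->token == c->token) {
			m->producers[i]->connected = 1;
			c->connected = 1;
			break;
		}
	}
	return 0;
}

/* Register a producer; connect it if a consumer with the same token exists. */
static int reg_add_producer(struct reg_manager *m, struct reg_producer *p)
{
	size_t i;

	if (m->nprod == REG_MAX)
		return -1;

	for (i = 0; i < m->nprod; i++)
		if (m->producers[i]->token == p->token)
			return -1;

	m->producers[m->nprod++] = p;

	for (i = 0; i < m->ncons; i++) {
		if (m->consumers[i]->token == p->token) {
			m->consumers[i]->connected = 1;
			p->connected = 1;
			break;
		}
	}
	return 0;
}
```

Registration order does not matter in this sketch: whichever side arrives second finds its partner and completes the connection.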
[PATCH v9 11/18] KVM: make kvm_set_msi_irq() public
Make kvm_set_msi_irq() public so that it can be used outside irq_comm.c.

Signed-off-by: Feng Wu
Reviewed-by: Paolo Bonzini
---
v8:
- Export kvm_set_msi_irq() so we can use it in vmx code

 arch/x86/include/asm/kvm_host.h | 4
 arch/x86/kvm/irq_comm.c         | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index af11bca..daa6126 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -175,6 +175,8 @@ enum {
  */
 #define KVM_APIC_PV_EOI_PENDING	1

+struct kvm_kernel_irq_routing_entry;
+
 /*
  * We don't want allocation failures within the mmu code, so we preallocate
  * enough memory for a single page fault in a cache.
@@ -1207,4 +1209,6 @@ int x86_set_memory_region(struct kvm *kvm,
 bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 			     struct kvm_vcpu **dest_vcpu);

+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+		     struct kvm_lapic_irq *irq);
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index f86a0da..4f6fa67 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -91,8 +91,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 	return r;
 }

-static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
-				   struct kvm_lapic_irq *irq)
+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+		     struct kvm_lapic_irq *irq)
 {
 	trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);

@@ -108,6 +108,7 @@ static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
 	irq->level = 1;
 	irq->shorthand = 0;
 }
+EXPORT_SYMBOL_GPL(kvm_set_msi_irq);

 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 		struct kvm *kvm, int irq_source_id, int level, bool line_status)
--
2.1.0
[PATCH v9 08/18] KVM: Add some helper functions for Posted-Interrupts
This patch adds some helper functions to manipulate the
Posted-Interrupts Descriptor.

Signed-off-by: Feng Wu
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 271dd70..316f9bf 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -443,6 +443,8 @@ struct nested_vmx {
 };

 #define POSTED_INTR_ON  0
+#define POSTED_INTR_SN  1
+
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
 	u32 pir[8];     /* Posted interrupt requested */
@@ -483,6 +485,30 @@ static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }

+static void pi_clear_sn(struct pi_desc *pi_desc)
+{
+	return clear_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
+static void pi_set_sn(struct pi_desc *pi_desc)
+{
+	return set_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_on(struct pi_desc *pi_desc)
+{
+	return test_bit(POSTED_INTR_ON,
+			(unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_sn(struct pi_desc *pi_desc)
+{
+	return test_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu vcpu;
 	unsigned long host_rsp;
--
2.1.0
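For readers without a kernel tree at hand, the helpers above amount to atomic single-bit operations on the descriptor's control word. A possible userspace equivalent, written with C11 atomics instead of the kernel's set_bit()/clear_bit()/test_bit(), could look like the sketch below; the `demo_` names and the use of a bare `_Atomic uint64_t` in place of struct pi_desc are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_INTR_ON	0	/* Outstanding Notification bit */
#define DEMO_INTR_SN	1	/* Suppress Notification bit */

/* Atomically set SN without disturbing the other control bits. */
static void demo_set_sn(_Atomic uint64_t *control)
{
	atomic_fetch_or(control, UINT64_C(1) << DEMO_INTR_SN);
}

/* Atomically clear SN without disturbing the other control bits. */
static void demo_clear_sn(_Atomic uint64_t *control)
{
	atomic_fetch_and(control, ~(UINT64_C(1) << DEMO_INTR_SN));
}

/* Return 1 if SN is currently set, 0 otherwise. */
static int demo_test_sn(_Atomic uint64_t *control)
{
	return (int)((atomic_load(control) >> DEMO_INTR_SN) & 1);
}

/* Return 1 if ON is currently set, 0 otherwise. */
static int demo_test_on(_Atomic uint64_t *control)
{
	return (int)((atomic_load(control) >> DEMO_INTR_ON) & 1);
}
```

Read-modify-write atomics matter here because the IOMMU hardware may update ON in the same word while software toggles SN.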
[PATCH v9 10/18] KVM: Make struct kvm_irq_routing_table accessible
Move struct kvm_irq_routing_table from irqchip.c to kvm_host.h,
so we can use it outside of irqchip.c.

Signed-off-by: Feng Wu
Reviewed-by: Paolo Bonzini
---
 include/linux/kvm_host.h | 14 ++
 virt/kvm/irqchip.c       | 10 --
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5ac8d21..5f183fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -328,6 +328,20 @@ struct kvm_kernel_irq_routing_entry {
 	struct hlist_node link;
 };

+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+
+struct kvm_irq_routing_table {
+	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
+	u32 nr_rt_entries;
+	/*
+	 * Array indexed by gsi. Each entry contains list of irq chips
+	 * the gsi is connected to.
+	 */
+	struct hlist_head map[0];
+};
+
+#endif
+
 #ifndef KVM_PRIVATE_MEM_SLOTS
 #define KVM_PRIVATE_MEM_SLOTS 0
 #endif
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 21c1424..2cf45d3 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -31,16 +31,6 @@
 #include
 #include "irq.h"

-struct kvm_irq_routing_table {
-	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
-	u32 nr_rt_entries;
-	/*
-	 * Array indexed by gsi. Each entry contains list of irq chips
-	 * the gsi is connected to.
-	 */
-	struct hlist_head map[0];
-};
-
 int kvm_irq_map_gsi(struct kvm *kvm,
 		    struct kvm_kernel_irq_routing_entry *entries, int gsi)
 {
--
2.1.0
[PATCH v9 05/18] KVM: introduce kvm_arch functions for IRQ bypass
From: Eric Auger

This patch introduces
- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer
- kvm_arch_irq_bypass_stop
- kvm_arch_irq_bypass_start

They make it possible to specialize the KVM IRQ bypass consumer when
CONFIG_HAVE_KVM_IRQ_BYPASS is set.

Signed-off-by: Eric Auger
Signed-off-by: Feng Wu
---
v4 -> v5:
- remove static inline stub functions

v2 -> v3 (Feng Wu):
- use 'kvm_arch_irq_bypass_start' instead of 'kvm_arch_irq_bypass_resume'
- Remove 'kvm_arch_irq_bypass_update', which does not need to be
  an irqbypass callback per Alex's comments.
- Make kvm_arch_irq_bypass_add_producer return 'int'

v1 -> v2:
- use CONFIG_KVM_HAVE_IRQ_BYPASS instead of CONFIG_IRQ_BYPASS_MANAGER
- rename all functions according to Paolo's proposal
- add kvm_arch_irq_bypass_update according to Feng's need

 include/linux/kvm_host.h | 10 ++
 virt/kvm/Kconfig         |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 05e99b8..5ac8d21 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -1151,5 +1152,14 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
 {
 }
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
+
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *,
+			   struct irq_bypass_producer *);
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *,
+			   struct irq_bypass_producer *);
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *);
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *);
+#endif /* CONFIG_HAVE_KVM_IRQ_BYPASS */

 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index e2c876d..9f8014d 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -47,3 +47,6 @@ config KVM_GENERIC_DIRTYLOG_READ_PROTECT
 config KVM_COMPAT
 	def_bool y
 	depends on COMPAT && !S390
+
+config HAVE_KVM_IRQ_BYPASS
+	bool
--
2.1.0
[PATCH v9 07/18] KVM: Extend struct pi_desc for VT-d Posted-Interrupts
Extend struct pi_desc for VT-d Posted-Interrupts.

Signed-off-by: Feng Wu
---
 arch/x86/kvm/vmx.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 83b7b5c..271dd70 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -446,8 +446,24 @@ struct nested_vmx {
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
 	u32 pir[8];     /* Posted interrupt requested */
-	u32 control;	/* bit 0 of control is outstanding notification bit */
-	u32 rsvd[7];
+	union {
+		struct {
+				/* bit 256 - Outstanding Notification */
+			u16	on	: 1,
+				/* bit 257 - Suppress Notification */
+				sn	: 1,
+				/* bit 271:258 - Reserved */
+				rsvd_1	: 14;
+				/* bit 279:272 - Notification Vector */
+			u8	nv;
+				/* bit 287:280 - Reserved */
+			u8	rsvd_2;
+				/* bit 319:288 - Notification Destination */
+			u32	ndst;
+		};
+		u64 control;
+	};
+	u32 rsvd[6];
 } __aligned(64);

 static bool pi_test_and_set_on(struct pi_desc *pi_desc)
--
2.1.0
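The bit numbers in the comments above can be cross-checked against byte offsets at compile time. The following sketch mirrors the extended layout in standard C11 (`_Alignas` instead of the kernel's `__aligned`, fixed-width types instead of `u16`/`u32`) and asserts the offsets implied by the comments. `struct demo_pi_desc` and `demo_bit_to_byte()` are illustrative names, and the checks assume a common ABI such as GCC or Clang on x86-64, where a 16-bit bit-field group occupies exactly two bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the extended descriptor layout. */
struct demo_pi_desc {
	_Alignas(64) uint32_t pir[8];	/* bits 255:0 - posted requests */
	union {
		struct {
			uint16_t on     : 1,	/* bit 256 */
				 sn     : 1,	/* bit 257 */
				 rsvd_1 : 14;	/* bits 271:258 */
			uint8_t  nv;		/* bits 279:272 */
			uint8_t  rsvd_2;	/* bits 287:280 */
			uint32_t ndst;		/* bits 319:288 */
		};
		uint64_t control;		/* bits 319:256 as one word */
	};
	uint32_t rsvd[6];			/* bits 511:320 */
};

/* Convert a descriptor bit index from the comments to a byte offset. */
static size_t demo_bit_to_byte(unsigned int bit)
{
	return bit / 8;
}

/* The descriptor must fill exactly one 64-byte, 64-byte-aligned block. */
_Static_assert(sizeof(struct demo_pi_desc) == 64, "descriptor is 64 bytes");
_Static_assert(_Alignof(struct demo_pi_desc) == 64, "descriptor is 64-byte aligned");
_Static_assert(offsetof(struct demo_pi_desc, control) == 32, "control at bit 256");
_Static_assert(offsetof(struct demo_pi_desc, nv) == 34, "nv at bit 272");
_Static_assert(offsetof(struct demo_pi_desc, ndst) == 36, "ndst at bit 288");
```

Shrinking the trailing reserved array from seven to six words is what keeps the total at 64 bytes once the control field grows from 32 to 64 bits.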
[PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series
VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

You can find the VT-d Posted-Interrupts Spec. in the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

v9:
- Include the whole series:
  [01/18]: irq bypass manager
  [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side
                     forwarded irq
  [07/18] - [18/18]: VT-d PI part

v8:
refer to the changelog in each patch

v7:
* Define two weak irq bypass callbacks:
  - kvm_arch_irq_bypass_start()
  - kvm_arch_irq_bypass_stop()
* Remove the x86 dummy implementation of the above two functions.
* Print some useful information instead of WARN_ON() when the
  irq bypass consumer unregistration fails.
* Fix an issue when calling pi_pre_block and pi_post_block.

v6:
* Rebase on 4.2.0-rc6
* Rebase on https://lkml.org/lkml/2015/8/6/526 and
  http://www.gossamer-threads.com/lists/linux/kernel/2235623
* Make the add_consumer and del_consumer callbacks static
* Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
* Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
* Remove optional dummy callbacks for irq producer

v4:
* For lowest-priority interrupt, only support single-CPU destination
  interrupts at the current stage, more common lowest priority support
  will be added later.
* According to Marcelo's suggestion, when vCPU is blocked, we handle
  the posted-interrupts in the HLT emulation path.
* Some small changes (coding style, typo, add some code comments)

v3:
* Adjust the Posted-interrupts Descriptor updating logic when vCPU is
  preempted or blocked.
* KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
* __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
* Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
  can be used to change back to remapping mode.
* Fix typo

v2:
* Use VFIO framework to enable this feature, the VFIO part of this series is
  based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
* Rebase this patchset on git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
  then revise some irq logic based on the new hierarchy irqdomain patches
  provided by Jiang Liu

Alex Williamson (1):
  virt: IRQ bypass manager

Eric Auger (4):
  KVM: arm/arm64: select IRQ_BYPASS_MANAGER
  KVM: create kvm_irqfd.h
  KVM: introduce kvm_arch functions for IRQ bypass
  KVM: eventfd: add irq bypass consumer management

Feng Wu (13):
  KVM: x86: select IRQ_BYPASS_MANAGER
  KVM: Extend struct pi_desc for VT-d Posted-Interrupts
  KVM: Add some helper functions for Posted-Interrupts
  KVM: Define a new interface kvm_intr_is_single_vcpu()
  KVM: Make struct kvm_irq_routing_table accessible
  KVM: make kvm_set_msi_irq() public
  vfio: Register/unregister irq_bypass_producer
  KVM: x86: Update IRTE for posted-interrupts
  KVM: Implement IRQ bypass consumer callbacks for x86
  KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
  KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
  KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
  iommu/vt-d: Add a command line parameter for VT-d posted-interrupts

 Documentation/kernel-parameters.txt   |   1 +
 Documentation/virtual/kvm/locking.txt |  12 ++
 MAINTAINERS                           |   7 +
 arch/arm/kvm/Kconfig                  |   2 +
 arch/arm/kvm/Makefile                 |   1 +
 arch/arm64/kvm/Kconfig                |   2 +
 arch/arm64/kvm/Makefile               |   1 +
 arch/x86/include/asm/kvm_host.h       |  24 +++
 arch/x86/kvm/Kconfig                  |   3 +
 arch/x86/kvm/Makefile                 |   3 +
 arch/x86/kvm/irq_comm.c               |  32 ++-
 arch/x86/kvm/lapic.c                  |  59 ++
 arch/x86/kvm/lapic.h                  |   2 +
 arch/x86/kvm/trace.h                  |  33
 arch/x86/kvm/vmx.c                    | 361 +-
 arch/x86/kvm/x86.c                    | 108 +-
 drivers/iommu/irq_remapping.c         |  12 +-
 drivers/vfio/pci/Kconfig              |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c     |   9 +
 drivers/vfio/pci/vfio_pci_private.h   |   2 +
 include/linux/irqbypass.h             |  90 +
 include/linux/kvm_host.h              |  29 +++
 include/linux/kvm_irqfd.h             |  71 +++
 virt/kvm/Kconfig                      |   3 +
 virt/kvm/eventfd.c                    | 142 +++--
 virt/kvm/irqchip.c                    |  10 -
 virt/kvm/kvm_main.c                   |   3 +
 virt/lib/Kconfig                      |   2 +
 virt/lib/Makefile                     |   1 +
 virt/lib/irqbypass.c                  | 257
 30 files changed, 1182 insertions(+), 101 deletions(-)
 create mode 100644 i
[PATCH v9 01/18] virt: IRQ bypass manager
From: Alex Williamson When a physical I/O device is assigned to a virtual machine through facilities like VFIO and KVM, the interrupt for the device generally bounces through the host system before being injected into the VM. However, hardware technologies exist that often allow the host to be bypassed for some of these scenarios. Intel Posted Interrupts allow the specified physical edge interrupts to be directly injected into a guest when delivered to a physical processor while the vCPU is running. ARM IRQ Forwarding allows forwarded physical interrupts to be directly deactivated by the guest. The IRQ bypass manager here is meant to provide the shim to connect interrupt producers, generally the host physical device driver, with interrupt consumers, generally the hypervisor, in order to configure these bypass mechanism. To do this, we base the connection on a shared, opaque token. For KVM-VFIO this is expected to be an eventfd_ctx since this is the connection we already use to connect an eventfd to an irqfd on the in-kernel path. When a producer and consumer with matching tokens is found, callbacks via both registered participants allow the bypass facilities to be automatically enabled. Signed-off-by: Alex Williamson Reviewed-by: Eric Auger Tested-by: Eric Auger Tested-by: Feng Wu --- v4: All producer callbacks are optional, as with Intel PI, it's possible for the producer to be blissfully unaware of the bypass. 
MAINTAINERS | 7 ++ include/linux/irqbypass.h | 90 virt/lib/Kconfig | 2 + virt/lib/Makefile | 1 + virt/lib/irqbypass.c | 257 ++ 5 files changed, 357 insertions(+) create mode 100644 include/linux/irqbypass.h create mode 100644 virt/lib/Kconfig create mode 100644 virt/lib/Makefile create mode 100644 virt/lib/irqbypass.c diff --git a/MAINTAINERS b/MAINTAINERS index a9ae6c1..10c8b2f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10963,6 +10963,13 @@ L: net...@vger.kernel.org S: Maintained F: drivers/net/ethernet/via/via-velocity.* +VIRT LIB +M: Alex Williamson +M: Paolo Bonzini +L: k...@vger.kernel.org +S: Supported +F: virt/lib/ + VIVID VIRTUAL VIDEO DRIVER M: Hans Verkuil L: linux-me...@vger.kernel.org diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h new file mode 100644 index 000..1551b5b --- /dev/null +++ b/include/linux/irqbypass.h @@ -0,0 +1,90 @@ +/* + * IRQ offload/bypass manager + * + * Copyright (C) 2015 Red Hat, Inc. + * Copyright (c) 2015 Linaro Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#ifndef IRQBYPASS_H +#define IRQBYPASS_H + +#include + +struct irq_bypass_consumer; + +/* + * Theory of operation + * + * The IRQ bypass manager is a simple set of lists and callbacks that allows + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ + * consumers (ex. virtualization hardware that allows IRQ bypass or offload) + * via a shared token (ex. eventfd_ctx). Producers and consumers register + * independently. When a token match is found, the optional @stop callback + * will be called for each participant. The pair will then be connected via + * the @add_* callbacks, and finally the optional @start callback will allow + * any final coordination. 
When either participant is unregistered, the + * process is repeated using the @del_* callbacks in place of the @add_* + * callbacks. Match tokens must be unique per producer/consumer, 1:N pairings + * are not supported. + */ + +/** + * struct irq_bypass_producer - IRQ bypass producer definition + * @node: IRQ bypass manager private list management + * @token: opaque token to match between producer and consumer + * @irq: Linux IRQ number for the producer device + * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional) + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer (optional) + * @stop: Perform any quiesce operations necessary prior to add/del (optional) + * @start: Perform any startup operations necessary after add/del (optional) + * + * The IRQ bypass producer structure represents an interrupt source for + * participation in possible host bypass, for instance an interrupt vector + * for a physical device assigned to a VM. + */ +struct irq_bypass_producer { + struct list_head node; + void *token; + int irq; + int (*add_consumer)(struct irq_bypass_producer *, + struct irq_bypass_consumer *); + void (*del_consumer)(struct irq_bypass_producer *, +struct irq_bypass_consumer *); + void (*stop)(struct irq_bypass_producer *); + void (*start)(struct irq_bypass_producer *); +}; + +/** + * struct irq_bypass_consumer - IRQ bypass consumer definition + * @
[PATCH v9 03/18] KVM: arm/arm64: select IRQ_BYPASS_MANAGER
From: Eric Auger

Select IRQ_BYPASS_MANAGER when CONFIG_KVM is set.
Also add compilation of virt/lib.

Signed-off-by: Eric Auger
Signed-off-by: Feng Wu
---
v3 -> v4:
- add compilation of virt/lib in arm/arm64 KVM

v2 -> v3:
- [Feng Wu] Correct a typo in 'arch/arm64/kvm/Kconfig'

v1 -> v2:
- also set IRQ_BYPASS_MANAGER for arm64

 arch/arm/kvm/Kconfig    | 2 ++
 arch/arm/kvm/Makefile   | 1 +
 arch/arm64/kvm/Kconfig  | 2 ++
 arch/arm64/kvm/Makefile | 1 +
 4 files changed, 6 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index bfb915d..3c565b9 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -3,6 +3,7 @@
 #

 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"

 menuconfig VIRTUALIZATION
 	bool "Virtualization"
@@ -31,6 +32,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
+	select IRQ_BYPASS_MANAGER
 	depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 	---help---
 	  Support hosting virtualized guest machines.
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index c5eef02c..a6a41dd 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -24,3 +24,4 @@ obj-y += $(KVM)/arm/vgic.o
 obj-y += $(KVM)/arm/vgic-v2.o
 obj-y += $(KVM)/arm/vgic-v2-emul.o
 obj-y += $(KVM)/arm/arch_timer.o
+obj-y += ../../../virt/lib/
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index bfffe8f..2509539 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -3,6 +3,7 @@
 #

 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"

 menuconfig VIRTUALIZATION
 	bool "Virtualization"
@@ -31,6 +32,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_EVENTFD
 	select HAVE_KVM_IRQFD
+	select IRQ_BYPASS_MANAGER
 	---help---
 	  Support hosting virtualized guest machines.

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index f90f4aa..55eec69 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -27,3 +27,4 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3-emul.o
 kvm-$(CONFIG_KVM_ARM_HOST) += vgic-v3-switch.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
+kvm-$(CONFIG_KVM_ARM_HOST) += ../../../virt/lib/
--
2.1.0
[PATCH v9 02/18] KVM: x86: select IRQ_BYPASS_MANAGER
Select IRQ_BYPASS_MANAGER for x86 when CONFIG_KVM is set

Signed-off-by: Feng Wu
---
 arch/x86/kvm/Kconfig  | 2 ++
 arch/x86/kvm/Makefile | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d8a1d56..c951d44 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
 	bool "Virtualization"
@@ -28,6 +29,7 @@ config KVM
 	select ANON_INODES
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQFD
+	select IRQ_BYPASS_MANAGER
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 67d215c..05cc2d7 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -6,6 +6,9 @@ CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
 
 KVM := ../../../virt/kvm
+LIB := ../../../virt/lib
+
+obj-$(CONFIG_IRQ_BYPASS_MANAGER) += $(LIB)/
 
 kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
 	$(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
-- 
2.1.0
[PATCH] iommu/arm-smmu: Ensure IAS is set correctly for AArch32-capable SMMUs
AArch32-capable SMMU implementations have a minimum IAS of 40 bits, so
ensure that is reflected in the stage-2 page table configuration.

Signed-off-by: Will Deacon
---
 drivers/iommu/arm-smmu-v3.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index dafaf59dc3b8..a24f359fa0d0 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -56,6 +56,7 @@
 #define IDR0_TTF_SHIFT			2
 #define IDR0_TTF_MASK			0x3
 #define IDR0_TTF_AARCH64		(2 << IDR0_TTF_SHIFT)
+#define IDR0_TTF_AARCH32_64		(3 << IDR0_TTF_SHIFT)
 #define IDR0_S1P			(1 << 1)
 #define IDR0_S2P			(1 << 0)
 
@@ -2460,7 +2461,13 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 	}
 
 	/* We only support the AArch64 table format at present */
-	if ((reg & IDR0_TTF_MASK << IDR0_TTF_SHIFT) < IDR0_TTF_AARCH64) {
+	switch (reg & IDR0_TTF_MASK << IDR0_TTF_SHIFT) {
+	case IDR0_TTF_AARCH32_64:
+		smmu->ias = 40;
+		/* Fallthrough */
+	case IDR0_TTF_AARCH64:
+		break;
+	default:
 		dev_err(smmu->dev, "AArch64 table format not supported!\n");
 		return -ENXIO;
 	}
@@ -2541,8 +2548,7 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
 		dev_warn(smmu->dev,
 			 "failed to set DMA mask for table walker\n");
 
-	if (!smmu->ias)
-		smmu->ias = smmu->oas;
+	smmu->ias = max(smmu->ias, smmu->oas);
 
 	dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n",
 		 smmu->ias, smmu->oas, smmu->features);
-- 
2.1.4
Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()
On Fri, Sep 18, 2015 at 12:04:26PM +0100, Robin Murphy wrote:
> Specifically, the problem case for that is when phys_addr_t is 64-bit but
> dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid
> a truncation warning when we make the DMA API calls, but we actually need
> the opposite in the comparison here - comparing the different types directly
> allows integer promotion to kick in appropriately so we don't lose the top
> half of the larger address. Otherwise, you'd never spot the difference
> between, say, your original page at 0x88c000 and a bounce-buffered copy
> that happened to end up mapped to 0xc000.

Hmm. Thinking about this, I think we ought to add to arch/arm/mm/Kconfig:

 config ARCH_PHYS_ADDR_T_64BIT
 	def_bool ARM_LPAE

 config ARCH_DMA_ADDR_T_64BIT
 	bool
+	select ARCH_PHYS_ADDR_T_64BIT

I seem to remember that you're quite right that dma_addr_t <= phys_addr_t
but dma_addr_t must never be bigger than phys_addr_t.
Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()
On 18/09/15 09:55, Yong Wu wrote:
> On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
[...]
>> the appropriate course of action. Further care (and ugliness) is also
>> necessary in the comparison to avoid truncation if phys_addr_t and
>> dma_addr_t differ in size.
[...]
>>  		/*
>>  		 * We depend on the IOMMU being able to work with any physical
>> -		 * address directly, so if the DMA layer suggests it can't by
>> -		 * giving us back some translation, that bodes very badly...
>> +		 * address directly, so if the DMA layer suggests otherwise by
>> +		 * translating or truncating them, that bodes very badly...
>>  		 */
>> -		if (dma != __arm_lpae_dma_addr(dev, pages))
>> +		if (dma != virt_to_phys(pages))
>
> Could I ask why not use __arm_lpae_dma_addr(pages) here? dma is dma_addr_t.

Specifically, the problem case for that is when phys_addr_t is 64-bit but
dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid
a truncation warning when we make the DMA API calls, but we actually need
the opposite in the comparison here - comparing the different types directly
allows integer promotion to kick in appropriately so we don't lose the top
half of the larger address. Otherwise, you'd never spot the difference
between, say, your original page at 0x88c000 and a bounce-buffered copy
that happened to end up mapped to 0xc000.

Robin.
Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()
On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
> In checking whether DMA addresses differ from physical addresses, using
> dma_to_phys() is actually the wrong thing to do, since it may hide any
> DMA offset, which is precisely one of the things we are checking for.
> Simply casting between the two address types, whilst ugly, is in fact
> the appropriate course of action. Further care (and ugliness) is also
> necessary in the comparison to avoid truncation if phys_addr_t and
> dma_addr_t differ in size.
>
> We can also reject any device with a fixed DMA offset up-front at page
> table creation, leaving the allocation-time check for the more subtle
> cases like bounce buffering due to an incorrect DMA mask.
>
> Furthermore, we can then fix the hackish KConfig dependency so that
> architectures without a dma_to_phys() implementation may still
> COMPILE_TEST (or even use!) the code. The true dependency is on the
> DMA API, so use the appropriate symbol for that.
>
> Signed-off-by: Robin Murphy
> ---
[...]
>
>  static bool selftest_running = false;
>
> -static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
> +static dma_addr_t __arm_lpae_dma_addr(void *pages)
>  {
> -	return phys_to_dma(dev, virt_to_phys(pages));
> +	return (dma_addr_t)virt_to_phys(pages);
>  }
>
>  static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
> @@ -223,10 +223,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>  			goto out_free;
>  		/*
>  		 * We depend on the IOMMU being able to work with any physical
> -		 * address directly, so if the DMA layer suggests it can't by
> -		 * giving us back some translation, that bodes very badly...
> +		 * address directly, so if the DMA layer suggests otherwise by
> +		 * translating or truncating them, that bodes very badly...
>  		 */
> -		if (dma != __arm_lpae_dma_addr(dev, pages))
> +		if (dma != virt_to_phys(pages))

Could I ask why not use __arm_lpae_dma_addr(pages) here? dma is dma_addr_t.

>  			goto out_unmap;
>  	}
>
> @@ -243,10 +243,8 @@ out_free:
>  static void __arm_lpae_free_pages(void *pages, size_t size,
>  				  struct io_pgtable_cfg *cfg)
>  {
> -	struct device *dev = cfg->iommu_dev;
> -
>  	if (!selftest_running)
> -		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
> +		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
>  				 size, DMA_TO_DEVICE);
>  	free_pages_exact(pages, size);
>  }
> @@ -254,12 +252,11 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
>  static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>  			       struct io_pgtable_cfg *cfg)
>  {
> -	struct device *dev = cfg->iommu_dev;
> -
>  	*ptep = pte;
>
>  	if (!selftest_running)
> -		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
> +		dma_sync_single_for_device(cfg->iommu_dev,
> +					   __arm_lpae_dma_addr(ptep),
>  					   sizeof(pte), DMA_TO_DEVICE);
>  }
>
> @@ -629,6 +626,11 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>
> +	if (cfg->iommu_dev->dma_pfn_offset) {
> +		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> +		return NULL;
> +	}
> +
>  	data = kmalloc(sizeof(*data), GFP_KERNEL);
>  	if (!data)
>  		return NULL;