Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Alex Williamson
On Fri, 2015-09-18 at 16:58 +0200, Paolo Bonzini wrote:
> 
> On 18/09/2015 16:29, Feng Wu wrote:
> > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> > With VT-d Posted-Interrupts enabled, external interrupts from
> > direct-assigned devices can be delivered to guests without VMM
> > intervention when guest is running in non-root mode.
> > 
> > You can find the VT-d Posted-Interrupts Spec. in the following URL:
> > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> 
> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
> 
> Signed-off-bys are missing in patch 1 and 4.  The patches exist
> elsewhere in the mailing list archives, so not a big deal.  Or just
> reply to them with the S-o-b line.
> 
> Alex, can you ack the series and review patch 12?

I sent an ack for 12 separately. I got a bit lost in 16 & 17, but for
all the others that don't already have some tag from me:

Reviewed-by: Alex Williamson 

> 
> Joerg, can you ack patch 18?
> 
> Paolo
> 
> > v9:
> > - Include the whole series:
> > [01/18]: irq bypasser manager
> > [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side 
> > forwarded irq
> > [07/18] - [18/18]: VT-d PI part
> > 
> > v8:
> > refer to the changelog in each patch
> > 
> > v7:
> > * Define two weak irq bypass callbacks:
> >   - kvm_arch_irq_bypass_start()
> >   - kvm_arch_irq_bypass_stop()
> > * Remove the x86 dummy implementation of the above two functions.
> > * Print some useful information instead of WARN_ON() when the
> >   irq bypass consumer unregistration fails.
> > * Fix an issue when calling pi_pre_block and pi_post_block.
> > 
> > v6:
> > * Rebase on 4.2.0-rc6
> > * Rebase on https://lkml.org/lkml/2015/8/6/526 and 
> > http://www.gossamer-threads.com/lists/linux/kernel/2235623
> > * Make the add_consumer and del_consumer callbacks static
> > * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
> > * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> > * Remove optional dummy callbacks for irq producer
> > 
> > v4:
> > * For lowest-priority interrupt, only support single-CPU destination
> > interrupts at the current stage, more common lowest priority support
> > will be added later.
> > * According to Marcelo's suggestion, when vCPU is blocked, we handle
> > the posted-interrupts in the HLT emulation path.
> > * Some small changes (coding style, typo, add some code comments)
> > 
> > v3:
> > * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
> >   preempted or blocked.
> > * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
> > * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
> > * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
> >   can be used to change back to remapping mode.
> > * Fix typo
> > 
> > v2:
> > * Use VFIO framework to enable this feature, the VFIO part of this series is
> >   base on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
> > * Rebase this patchset on 
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
> >   then revise some irq logic based on the new hierarchy irqdomain patches 
> > provided
> >   by Jiang Liu 
> > 
> > 
> > *** BLURB HERE ***
> > 
> > Alex Williamson (1):
> >   virt: IRQ bypass manager
> > 
> > Eric Auger (4):
> >   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
> >   KVM: create kvm_irqfd.h
> >   KVM: introduce kvm_arch functions for IRQ bypass
> >   KVM: eventfd: add irq bypass consumer management
> > 
> > Feng Wu (13):
> >   KVM: x86: select IRQ_BYPASS_MANAGER
> >   KVM: Extend struct pi_desc for VT-d Posted-Interrupts
> >   KVM: Add some helper functions for Posted-Interrupts
> >   KVM: Define a new interface kvm_intr_is_single_vcpu()
> >   KVM: Make struct kvm_irq_routing_table accessible
> >   KVM: make kvm_set_msi_irq() public
> >   vfio: Register/unregister irq_bypass_producer
> >   KVM: x86: Update IRTE for posted-interrupts
> >   KVM: Implement IRQ bypass consumer callbacks for x86
> >   KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> >   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
> > 
> >  Documentation/kernel-parameters.txt   |   1 +
> >  Documentation/virtual/kvm/locking.txt |  12 ++
> >  MAINTAINERS   |   7 +
> >  arch/arm/kvm/Kconfig  |   2 +
> >  arch/arm/kvm/Makefile |   1 +
> >  arch/arm64/kvm/Kconfig|   2 +
> >  arch/arm64/kvm/Makefile   |   1 +
> >  arch/x86/include/asm/kvm_host.h   |  24 +++
> >  arch/x86/kvm/Kconfig  |   3 +
> >  arch/x86/kvm/Makefile |   3 +
> >  arch/x86/kvm/irq_comm.c   |  32 ++-
> >  arch/x86/kvm/lapic.c  |  59 ++
> >  arch/x8

Re: [PATCH v9 12/18] vfio: Register/unregister irq_bypass_producer

2015-09-18 Thread Alex Williamson
On Fri, 2015-09-18 at 22:29 +0800, Feng Wu wrote:
> This patch adds the registration/unregistration of an
> irq_bypass_producer for MSI/MSIx on vfio pci devices.
> 
> Signed-off-by: Feng Wu 

One nit: Paolo, could you please fix the spelling of "registration" in the
dev_info? Otherwise:

Acked-by: Alex Williamson 


> ---
> v8:
> - Merge "[PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI 
> devices"
>   into this patch.
> 
> v6:
> - Make the add_consumer and del_consumer callbacks static
> - Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
> - Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> - Remove optional dummy callbacks for irq producer
> 
>  drivers/vfio/pci/Kconfig| 1 +
>  drivers/vfio/pci/vfio_pci_intrs.c   | 9 +
>  drivers/vfio/pci/vfio_pci_private.h | 2 ++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 579d83b..02912f1 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -2,6 +2,7 @@ config VFIO_PCI
>   tristate "VFIO support for PCI devices"
>   depends on VFIO && PCI && EVENTFD
>   select VFIO_VIRQFD
> + select IRQ_BYPASS_MANAGER
>   help
> Support for the PCI VFIO bus driver.  This is required to make
> use of PCI drivers using the VFIO framework.
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
> b/drivers/vfio/pci/vfio_pci_intrs.c
> index 1f577b4..c65299d 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -319,6 +319,7 @@ static int vfio_msi_set_vector_signal(struct 
> vfio_pci_device *vdev,
>  
>   if (vdev->ctx[vector].trigger) {
>   free_irq(irq, vdev->ctx[vector].trigger);
> + irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>   kfree(vdev->ctx[vector].name);
>   eventfd_ctx_put(vdev->ctx[vector].trigger);
>   vdev->ctx[vector].trigger = NULL;
> @@ -360,6 +361,14 @@ static int vfio_msi_set_vector_signal(struct 
> vfio_pci_device *vdev,
>   return ret;
>   }
>  
> + vdev->ctx[vector].producer.token = trigger;
> + vdev->ctx[vector].producer.irq = irq;
> + ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
> + if (unlikely(ret))
> + dev_info(&pdev->dev,
> + "irq bypass producer (token %p) registeration fails: %d\n",
> + vdev->ctx[vector].producer.token, ret);
> +
>   vdev->ctx[vector].trigger = trigger;
>  
>   return 0;
> diff --git a/drivers/vfio/pci/vfio_pci_private.h 
> b/drivers/vfio/pci/vfio_pci_private.h
> index ae0e1b4..0e7394f 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -13,6 +13,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #ifndef VFIO_PCI_PRIVATE_H
>  #define VFIO_PCI_PRIVATE_H
> @@ -29,6 +30,7 @@ struct vfio_pci_irq_ctx {
>   struct virqfd   *mask;
>   char*name;
>   boolmasked;
> + struct irq_bypass_producer  producer;
>  };
>  
>  struct vfio_pci_device {



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 03/13] KVM: Define a new interface kvm_intr_is_single_vcpu()

2015-09-18 Thread Paolo Bonzini


On 18/09/2015 18:16, Radim Krčmář wrote:
>>> Ok, I was wondering whether this was the correct interpretation.  Thanks!
>> 
>> Paolo, I don't think Radim clarified your concern, right? Since mda is 8-bit,
>> mda >> 16 is wrong; this is your concern, right?
> In case it was:  mda is u32 so the bitshift is defined by C.
> (xAPIC destinations in KVM's x2APIC mode are stored in lowest 8 bits of
>  mda, hence the cluster is always 0.)
> 
> Or am I still missing the point?

Yes, remembering that the cluster is always 0 solved my doubt.

Paolo

Re: [PATCH v8 03/13] KVM: Define a new interface kvm_intr_is_single_vcpu()

2015-09-18 Thread Radim Krčmář
2015-09-17 23:18+, Wu, Feng:
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>> On 17/09/2015 17:58, Radim Krčmář wrote:
>>> xAPIC address are only 8 bit long so they always get delivered to x2APIC
>>> cluster 0, where first 16 bits work like xAPIC flat logical mode.
>> 
>> Ok, I was wondering whether this was the correct interpretation.  Thanks!
> 
> Paolo, I don't think Radim clarified your concern, right? Since mda is 8-bit,
> mda >> 16 is wrong; this is your concern, right?

In case it was:  mda is u32 so the bitshift is defined by C.
(xAPIC destinations in KVM's x2APIC mode are stored in lowest 8 bits of
 mda, hence the cluster is always 0.)

Or am I still missing the point?
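
To illustrate the point about the shift, here is a minimal userspace sketch. It is not the KVM code: the helper names x2apic_cluster() and x2apic_logical_bits() are hypothetical, and the only assumptions are that mda is a 32-bit unsigned value, that the upper 16 bits hold the x2APIC cluster and that the lower 16 bits hold the logical ID bitmap.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the KVM implementation. */

/* Upper 16 bits of a 32-bit logical destination: the x2APIC cluster ID. */
static inline uint32_t x2apic_cluster(uint32_t mda)
{
	/* Well defined in C: the shift count is smaller than the type width. */
	return mda >> 16;
}

/* Lower 16 bits: bitmap of logical IDs within that cluster. */
static inline uint32_t x2apic_logical_bits(uint32_t mda)
{
	return mda & 0xffffu;
}
```

An xAPIC destination occupies only the lowest 8 bits, so x2apic_cluster() returns 0 for any such value and the low bits can be treated like a flat logical mask.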

Re: [PATCH v9 17/18] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-09-18 Thread Paolo Bonzini


On 18/09/2015 16:29, Feng Wu wrote:
> This patch updates the Posted-Interrupts Descriptor when vCPU
> is blocked.
> 
> pre-block:
> - Add the vCPU to the blocked per-CPU list
> - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> 
> post-block:
> - Remove the vCPU from the per-CPU list
> 
> Signed-off-by: Feng Wu 
> ---
> v9:
> - Add description for blocked_vcpu_on_cpu_lock in 
> Documentation/virtual/kvm/locking.txt
> - Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then
>   !irq_remapping_cap(IRQ_POSTING_CAP)
> 
> v8:
> - Rename 'pi_pre_block' to 'pre_block'
> - Rename 'pi_post_block' to 'post_block'
> - Change some comments
> - Only add the vCPU to the blocking list when the VM has assigned devices.
> 
>  Documentation/virtual/kvm/locking.txt |  12 +++
>  arch/x86/include/asm/kvm_host.h       |  13 +++
>  arch/x86/kvm/vmx.c                    | 153 ++
>  arch/x86/kvm/x86.c                    |  53 +---
>  include/linux/kvm_host.h              |   3 +
>  virt/kvm/kvm_main.c                   |   3 +
>  6 files changed, 227 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/locking.txt 
> b/Documentation/virtual/kvm/locking.txt
> index d68af4d..19f94a6 100644
> --- a/Documentation/virtual/kvm/locking.txt
> +++ b/Documentation/virtual/kvm/locking.txt
> @@ -166,3 +166,15 @@ Comment: The srcu read lock must be held while accessing 
> memslots (e.g.
>   MMIO/PIO address->device structure mapping (kvm->buses).
>   The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu
>   if it is needed by multiple functions.
> +
> +Name:blocked_vcpu_on_cpu_lock
> +Type:spinlock_t
> +Arch:x86
> +Protects:blocked_vcpu_on_cpu
> +Comment: This is a per-CPU lock and it is used for VT-d 
> posted-interrupts.
> + When VT-d posted-interrupts is supported and the VM has assigned
> + devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu
> + protected by blocked_vcpu_on_cpu_lock, when VT-d hardware issues
> + wakeup notification event since external interrupts from the
> + assigned devices happens, we will find the vCPU on the list to
> + wakeup.
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 0ddd353..304fbb5 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -552,6 +552,8 @@ struct kvm_vcpu_arch {
>*/
>   bool write_fault_to_shadow_pgtable;
>  
> + bool halted;
> +
>   /* set at EPT violation at this point */
>   unsigned long exit_qualification;
>  
> @@ -864,6 +866,17 @@ struct kvm_x86_ops {
>   /* pmu operations of sub-arch */
>   const struct kvm_pmu_ops *pmu_ops;
>  
> + /*
> +  * Architecture specific hooks for vCPU blocking due to
> +  * HLT instruction.
> +  * Returns for .pre_block():
> +  *- 0 means continue to block the vCPU.
> +  *- 1 means we cannot block the vCPU since some event
> +  *happens during this period, such as, 'ON' bit in
> +  *posted-interrupts descriptor is set.
> +  */
> + int (*pre_block)(struct kvm_vcpu *vcpu);
> + void (*post_block)(struct kvm_vcpu *vcpu);
>   int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
> uint32_t guest_irq, bool set);
>  };
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 902a67d..9968896 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -879,6 +879,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>  static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
>  static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
>  
> +/*
> + * We maintain a per-CPU linked-list of vCPU, so in wakeup_handler() we
> + * can find which vCPU should be woken up.
> + */
> +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
> +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
> +
>  static unsigned long *vmx_io_bitmap_a;
>  static unsigned long *vmx_io_bitmap_b;
>  static unsigned long *vmx_msr_bitmap_legacy;
> @@ -2985,6 +2992,8 @@ static int hardware_enable(void)
>   return -EBUSY;
>  
>   INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
> + INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
> + spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
>  
>   /*
>* Now we can enable the vmclear operation in kdump
> @@ -6121,6 +6130,25 @@ static void update_ple_window_actual_max(void)
>   ple_window_grow, INT_MIN);
>  }
>  
> +/*
> + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
> + */
> +static void wakeup_handler(void)
> +{
> + struct kvm_vcpu *vcpu;
> + int cpu = smp_processor_id();
> +
> + spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
> + list_for_each_entry(vcpu, &per_cpu(b
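
The blocking scheme described in this patch can be modelled in plain userspace C. This is a sketch under stated assumptions, not the patch's code: every name below (model_vcpu, model_pre_block, and so on) is hypothetical, a single global list stands in for one CPU's blocked_vcpu_on_cpu list, locking is omitted, and the handler body is a guess because the quoted diff is cut off above. Only the return convention of pre_block (0 means keep blocking, 1 means an event such as the 'ON' bit prevents blocking) comes from the quoted comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model with hypothetical names. The real code keeps one list
 * per CPU and protects it with blocked_vcpu_on_cpu_lock. */
struct model_vcpu {
	bool on;                  /* models the 'ON' bit of the PI descriptor */
	bool on_list;             /* currently on the blocked list? */
	bool woken;               /* set by the wakeup handler */
	struct model_vcpu *next;
};

static struct model_vcpu *blocked_list; /* stands in for one CPU's list */

/* Returns 0 to continue blocking, 1 if an interrupt is already pending. */
static int model_pre_block(struct model_vcpu *v)
{
	if (v->on)
		return 1;
	v->next = blocked_list;
	blocked_list = v;
	v->on_list = true;
	return 0;
}

/* Removes the vCPU from the blocked list once it stops blocking. */
static void model_post_block(struct model_vcpu *v)
{
	struct model_vcpu **pp;

	for (pp = &blocked_list; *pp; pp = &(*pp)->next) {
		if (*pp == v) {
			*pp = v->next;
			v->next = NULL;
			v->on_list = false;
			return;
		}
	}
}

/* Models the wakeup-vector handler: wake every blocked vCPU whose ON
 * bit is set and report how many were found. */
static int model_wakeup_handler(void)
{
	struct model_vcpu *v;
	int woken = 0;

	for (v = blocked_list; v; v = v->next) {
		if (v->on) {
			v->woken = true;
			woken++;
		}
	}
	return woken;
}
```

The model keeps the list walk linear, which matches the idea of searching only the vCPUs blocked on the CPU that received the wakeup notification.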

RE: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Wu, Feng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Friday, September 18, 2015 11:21 PM
> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
> mtosa...@redhat.com
> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including
> prerequisite series
> 
> 
> 
> On 18/09/2015 17:08, Wu, Feng wrote:
> >
> >
> >> -Original Message-
> >> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> >> Sent: Friday, September 18, 2015 10:59 PM
> >> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
> >> mtosa...@redhat.com
> >> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
> >> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> >> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - 
> >> including
> >> prerequisite series
> >>
> >>
> >>
> >> On 18/09/2015 16:29, Feng Wu wrote:
> >>> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> >>> With VT-d Posted-Interrupts enabled, external interrupts from
> >>> direct-assigned devices can be delivered to guests without VMM
> >>> intervention when guest is running in non-root mode.
> >>>
> >>> You can find the VT-d Posted-Interrupts Spec. in the following URL:
> >>> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> >>
> >> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
> >>
> >> Signed-off-bys are missing in patch 1 and 4.  The patches exist
> >> elsewhere in the mailing list archives, so not a big deal.  Or just
> >> reply to them with the S-o-b line.
> >>
> >
> > Thanks for your quick response, Paolo! I didn't change the code
> > in patches 1 and 4. Do I need to add an s-o-b? If needed, I can reply
> > to the patches.
> 
> Yes, the s-o-b just means that the code passed through your hands.

Done.
> 
> Note that I replied to patch 17, but no need to resend that one
> either---just mailing list discussion is enough.

Do you mean you replied to patch 17 just now? I can't find your reply
on the mailing list.

Thanks,
Feng

> 
> Paolo
> 
> > Thanks,
> > Feng
> >
> >> Alex, can you ack the series and review patch 12?
> >>
> >> Joerg, can you ack patch 18?
> >>
> >> Paolo
> >>
> >>> v9:
> >>> - Include the whole series:
> >>> [01/18]: irq bypasser manager
> >>> [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side
> >> forwarded irq
> >>> [07/18] - [18/18]: VT-d PI part
> >>>
> >>> v8:
> >>> refer to the changelog in each patch
> >>>
> >>> v7:
> >>> * Define two weak irq bypass callbacks:
> >>>   - kvm_arch_irq_bypass_start()
> >>>   - kvm_arch_irq_bypass_stop()
> >>> * Remove the x86 dummy implementation of the above two functions.
> >>> * Print some useful information instead of WARN_ON() when the
> >>>   irq bypass consumer unregistration fails.
> >>> * Fix an issue when calling pi_pre_block and pi_post_block.
> >>>
> >>> v6:
> >>> * Rebase on 4.2.0-rc6
> >>> * Rebase on https://lkml.org/lkml/2015/8/6/526 and
> >> http://www.gossamer-threads.com/lists/linux/kernel/2235623
> >>> * Make the add_consumer and del_consumer callbacks static
> >>> * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
> >>> * Use dev_info instead of WARN_ON() when irq_bypass_register_producer
> >> fails
> >>> * Remove optional dummy callbacks for irq producer
> >>>
> >>> v4:
> >>> * For lowest-priority interrupt, only support single-CPU destination
> >>> interrupts at the current stage, more common lowest priority support
> >>> will be added later.
> >>> * According to Marcelo's suggestion, when vCPU is blocked, we handle
> >>> the posted-interrupts in the HLT emulation path.
> >>> * Some small changes (coding style, typo, add some code comments)
> >>>
> >>> v3:
> >>> * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
> >>>   preempted or blocked.
> >>> * KVM_DEV_VFIO_DEVICE_POSTING_IRQ -->
> >> KVM_DEV_VFIO_DEVICE_POST_IRQ
> >>> * __KVM_HAVE_ARCH_KVM_VFIO_POSTING -->
> >> __KVM_HAVE_ARCH_KVM_VFIO_POST
> >>> * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
> >>>   can be used to change back to remapping mode.
> >>> * Fix typo
> >>>
> >>> v2:
> >>> * Use VFIO framework to enable this feature, the VFIO part of this series 
> >>> is
> >>>   base on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
> >>> * Rebase this patchset on
> >> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
> >>>   then revise some irq logic based on the new hierarchy irqdomain
> patches
> >> provided
> >>>   by Jiang Liu 
> >>>
> >>>
> >>> *** BLURB HERE ***
> >>>
> >>> Alex Williamson (1):
> >>>   virt: IRQ bypass manager
> >>>
> >>> Eric Auger (4):
> >>>   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
> >>>   KVM: create kvm_irqfd.h
> >>>   KVM: introduce kvm_arch functions for IRQ bypass
> >>>   KVM: eventfd: add irq bypas

RE: [PATCH v9 04/18] KVM: create kvm_irqfd.h

2015-09-18 Thread Wu, Feng
Signed-off-by: Feng Wu 

> -Original Message-
> From: iommu-boun...@lists.linux-foundation.org
> [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of Feng Wu
> Sent: Friday, September 18, 2015 10:30 PM
> To: pbonz...@redhat.com; alex.william...@redhat.com; j...@8bytes.org;
> mtosa...@redhat.com
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; eric.au...@linaro.org
> Subject: [PATCH v9 04/18] KVM: create kvm_irqfd.h
> 
> From: Eric Auger 
> 
> Move _irqfd_resampler and _irqfd struct declarations in a new
> public header: kvm_irqfd.h. They are respectively renamed into
> kvm_kernel_irqfd_resampler and kvm_kernel_irqfd. Those datatypes
> will be used by architecture specific code, in the context of
> IRQ bypass manager integration.
> 
> Signed-off-by: Eric Auger 
> ---
>  include/linux/kvm_irqfd.h | 69 ++
>  virt/kvm/eventfd.c        | 95 ---
>  2 files changed, 92 insertions(+), 72 deletions(-)
>  create mode 100644 include/linux/kvm_irqfd.h
> 
> diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
> new file mode 100644
> index 000..f926b39
> --- /dev/null
> +++ b/include/linux/kvm_irqfd.h
> @@ -0,0 +1,69 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * irqfd: Allows an fd to be used to inject an interrupt to the guest
> + * Credit goes to Avi Kivity for the original idea.
> + */
> +
> +#ifndef __LINUX_KVM_IRQFD_H
> +#define __LINUX_KVM_IRQFD_H
> +
> +#include 
> +#include 
> +
> +/*
> + * Resampling irqfds are a special variety of irqfds used to emulate
> + * level triggered interrupts.  The interrupt is asserted on eventfd
> + * trigger.  On acknowledgment through the irq ack notifier, the
> + * interrupt is de-asserted and userspace is notified through the
> + * resamplefd.  All resamplers on the same gsi are de-asserted
> + * together, so we don't need to track the state of each individual
> + * user.  We can also therefore share the same irq source ID.
> + */
> +struct kvm_kernel_irqfd_resampler {
> + struct kvm *kvm;
> + /*
> +  * List of resampling struct _irqfd objects sharing this gsi.
> +  * RCU list modified under kvm->irqfds.resampler_lock
> +  */
> + struct list_head list;
> + struct kvm_irq_ack_notifier notifier;
> + /*
> +  * Entry in list of kvm->irqfd.resampler_list.  Use for sharing
> +  * resamplers among irqfds on the same gsi.
> +  * Accessed and modified under kvm->irqfds.resampler_lock
> +  */
> + struct list_head link;
> +};
> +
> +struct kvm_kernel_irqfd {
> + /* Used for MSI fast-path */
> + struct kvm *kvm;
> + wait_queue_t wait;
> + /* Update side is protected by irqfds.lock */
> + struct kvm_kernel_irq_routing_entry irq_entry;
> + seqcount_t irq_entry_sc;
> + /* Used for level IRQ fast-path */
> + int gsi;
> + struct work_struct inject;
> + /* The resampler used by this irqfd (resampler-only) */
> + struct kvm_kernel_irqfd_resampler *resampler;
> + /* Eventfd notified on resample (resampler-only) */
> + struct eventfd_ctx *resamplefd;
> + /* Entry in list of irqfds for a resampler (resampler-only) */
> + struct list_head resampler_link;
> + /* Used for setup/shutdown */
> + struct eventfd_ctx *eventfd;
> + struct list_head list;
> + poll_table pt;
> + struct work_struct shutdown;
> +};
> +
> +#endif /* __LINUX_KVM_IRQFD_H */
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 9ff4193..647ffb8 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -23,6 +23,7 @@
> 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -39,68 +40,14 @@
>  #include 
> 
>  #ifdef CONFIG_HAVE_KVM_IRQFD
> -/*
> - * 
> - * irqfd: Allows an fd to be used to inject an interrupt to the guest
> - *
> - * Credit goes to Avi Kivity for the original idea.
> - * 
> - */
> -
> -/*
> - * Resampling irqfds are a special variety of irqfds used to emulate
> - * level triggered interrupts.  The interrupt is asserted on eventfd
> - * trigger.  On acknowledgement through the irq ack notifier, the
> - * interrupt is de-asserted and userspace is notified through the
> - * resamplefd.  All resamplers on the same gsi are de-asserted
> - * together, so we don't need to track the state of each individual
> -
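
The resampling behaviour described in the comment above (assert on eventfd trigger, de-assert on acknowledgement, then notify userspace through the resamplefd) can be sketched as a tiny state machine. This is a userspace illustration only: model_resampler and its functions are hypothetical names, and a counter stands in for signalling the resamplefd.

```c
#include <stdbool.h>

/* Hypothetical model of one level-triggered line behind a resampling irqfd. */
struct model_resampler {
	bool line_asserted;           /* current state of the emulated level IRQ */
	unsigned int resample_events; /* times userspace was notified to resample */
};

/* The eventfd fired: assert the interrupt line. */
static void model_eventfd_trigger(struct model_resampler *r)
{
	r->line_asserted = true;
}

/* The guest acknowledged the interrupt: de-assert the line and notify
 * userspace, which can re-trigger if the device still needs service. */
static void model_irq_ack(struct model_resampler *r)
{
	r->line_asserted = false;
	r->resample_events++;
}
```

Because the line is always dropped on acknowledgement, the model does not need to remember which user asserted it, which mirrors the comment's remark that per-user state tracking is unnecessary.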

RE: [PATCH v9 01/18] virt: IRQ bypass manager

2015-09-18 Thread Wu, Feng
Signed-off-by: Feng Wu 

> -Original Message-
> From: iommu-boun...@lists.linux-foundation.org
> [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of Feng Wu
> Sent: Friday, September 18, 2015 10:30 PM
> To: pbonz...@redhat.com; alex.william...@redhat.com; j...@8bytes.org;
> mtosa...@redhat.com
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; eric.au...@linaro.org
> Subject: [PATCH v9 01/18] virt: IRQ bypass manager
> 
> From: Alex Williamson 
> 
> When a physical I/O device is assigned to a virtual machine through
> facilities like VFIO and KVM, the interrupt for the device generally
> bounces through the host system before being injected into the VM.
> However, hardware technologies exist that often allow the host to be
> bypassed for some of these scenarios.  Intel Posted Interrupts allow
> the specified physical edge interrupts to be directly injected into a
> guest when delivered to a physical processor while the vCPU is
> running.  ARM IRQ Forwarding allows forwarded physical interrupts to
> be directly deactivated by the guest.
> 
> The IRQ bypass manager here is meant to provide the shim to connect
> interrupt producers, generally the host physical device driver, with
> interrupt consumers, generally the hypervisor, in order to configure
> these bypass mechanism.  To do this, we base the connection on a
> shared, opaque token.  For KVM-VFIO this is expected to be an
> eventfd_ctx since this is the connection we already use to connect an
> eventfd to an irqfd on the in-kernel path.  When a producer and
> consumer with matching tokens is found, callbacks via both registered
> participants allow the bypass facilities to be automatically enabled.
> 
> Signed-off-by: Alex Williamson 
> Reviewed-by: Eric Auger 
> Tested-by: Eric Auger 
> Tested-by: Feng Wu 
> ---
> v4: All producer callbacks are optional, as with Intel PI, it's
> possible for the producer to be blissfully unaware of the bypass.
> 
>  MAINTAINERS               |   7 ++
>  include/linux/irqbypass.h |  90
>  virt/lib/Kconfig          |   2 +
>  virt/lib/Makefile         |   1 +
>  virt/lib/irqbypass.c      | 257 ++
>  5 files changed, 357 insertions(+)
>  create mode 100644 include/linux/irqbypass.h
>  create mode 100644 virt/lib/Kconfig
>  create mode 100644 virt/lib/Makefile
>  create mode 100644 virt/lib/irqbypass.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a9ae6c1..10c8b2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -10963,6 +10963,13 @@ L:   net...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/net/ethernet/via/via-velocity.*
> 
> +VIRT LIB
> +M:   Alex Williamson 
> +M:   Paolo Bonzini 
> +L:   k...@vger.kernel.org
> +S:   Supported
> +F:   virt/lib/
> +
>  VIVID VIRTUAL VIDEO DRIVER
>  M:   Hans Verkuil 
>  L:   linux-me...@vger.kernel.org
> diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
> new file mode 100644
> index 000..1551b5b
> --- /dev/null
> +++ b/include/linux/irqbypass.h
> @@ -0,0 +1,90 @@
> +/*
> + * IRQ offload/bypass manager
> + *
> + * Copyright (C) 2015 Red Hat, Inc.
> + * Copyright (c) 2015 Linaro Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#ifndef IRQBYPASS_H
> +#define IRQBYPASS_H
> +
> +#include 
> +
> +struct irq_bypass_consumer;
> +
> +/*
> + * Theory of operation
> + *
> + * The IRQ bypass manager is a simple set of lists and callbacks that allows
> + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
> + * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
> + * via a shared token (ex. eventfd_ctx).  Producers and consumers register
> + * independently.  When a token match is found, the optional @stop callback
> + * will be called for each participant.  The pair will then be connected via
> + * the @add_* callbacks, and finally the optional @start callback will allow
> + * any final coordination.  When either participant is unregistered, the
> + * process is repeated using the @del_* callbacks in place of the @add_*
> + * callbacks.  Match tokens must be unique per producer/consumer, 1:N pairings
> + * are not supported.
> + */
> +
> +/**
> + * struct irq_bypass_producer - IRQ bypass producer definition
> + * @node: IRQ bypass manager private list management
> + * @token: opaque token to match between producer and consumer
> + * @irq: Linux IRQ number for the producer device
> + * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional)
> + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer (optional)
> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> + * @start: Perform any startup operations necessary after add/del (optional)
> + *
> + * The IRQ bypass producer

Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Paolo Bonzini


On 18/09/2015 17:08, Wu, Feng wrote:
> 
> 
>> -Original Message-
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>> Sent: Friday, September 18, 2015 10:59 PM
>> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
>> mtosa...@redhat.com
>> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
>> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
>> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including
>> prerequisite series
>>
>>
>>
>> On 18/09/2015 16:29, Feng Wu wrote:
>>> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
>>> With VT-d Posted-Interrupts enabled, external interrupts from
>>> direct-assigned devices can be delivered to guests without VMM
>>> intervention when guest is running in non-root mode.
>>>
>>> You can find the VT-d Posted-Interrupts Spec. in the following URL:
>>> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
>>
>> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
>>
>> Signed-off-bys are missing in patch 1 and 4.  The patches exist
>> elsewhere in the mailing list archives, so not a big deal.  Or just
>> reply to them with the S-o-b line.
>>
> 
> Thanks for your quick response, Paolo! I didn't change the code
> in patches 1 and 4. Do I need to add an s-o-b? If needed, I can reply
> to the patches.

Yes, the s-o-b just means that the code passed through your hands.

Note that I replied to patch 17, but no need to resend that one
either---just mailing list discussion is enough.

Paolo

> Thanks,
> Feng
> 
>> Alex, can you ack the series and review patch 12?
>>
>> Joerg, can you ack patch 18?
>>
>> Paolo
>>
>>> v9:
>>> - Include the whole series:
>>> [01/18]: irq bypasser manager
>>> [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side
>> forwarded irq
>>> [07/18] - [18/18]: VT-d PI part
>>>
>>> v8:
>>> refer to the changelog in each patch
>>>
>>> v7:
>>> * Define two weak irq bypass callbacks:
>>>   - kvm_arch_irq_bypass_start()
>>>   - kvm_arch_irq_bypass_stop()
>>> * Remove the x86 dummy implementation of the above two functions.
>>> * Print some useful information instead of WARN_ON() when the
>>>   irq bypass consumer unregistration fails.
>>> * Fix an issue when calling pi_pre_block and pi_post_block.
>>>
>>> v6:
>>> * Rebase on 4.2.0-rc6
>>> * Rebase on https://lkml.org/lkml/2015/8/6/526 and
>>>   http://www.gossamer-threads.com/lists/linux/kernel/2235623
>>> * Make the add_consumer and del_consumer callbacks static
>>> * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node'
>>> * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
>>> * Remove optional dummy callbacks for irq producer
>>>
>>> v4:
>>> * For lowest-priority interrupts, only single-CPU destination
>>> interrupts are supported at the current stage; more general
>>> lowest-priority support will be added later.
>>> * According to Marcelo's suggestion, when the vCPU is blocked, we handle
>>> the posted-interrupts in the HLT emulation path.
>>> * Some small changes (coding style, typo, add some code comments)
>>>
>>> v3:
>>> * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
>>>   preempted or blocked.
>>> * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
>>> * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
>>> * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
>>>   can be used to change back to remapping mode.
>>> * Fix typo
>>>
>>> v2:
>>> * Use VFIO framework to enable this feature, the VFIO part of this series is
>>>   based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
>>> * Rebase this patchset on
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
>>>   then revise some irq logic based on the new hierarchy irqdomain patches
>>>   provided by Jiang Liu
>>>
>>>
>>> *** BLURB HERE ***
>>>
>>> Alex Williamson (1):
>>>   virt: IRQ bypass manager
>>>
>>> Eric Auger (4):
>>>   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
>>>   KVM: create kvm_irqfd.h
>>>   KVM: introduce kvm_arch functions for IRQ bypass
>>>   KVM: eventfd: add irq bypass consumer management
>>>
>>> Feng Wu (13):
>>>   KVM: x86: select IRQ_BYPASS_MANAGER
>>>   KVM: Extend struct pi_desc for VT-d Posted-Interrupts
>>>   KVM: Add some helper functions for Posted-Interrupts
>>>   KVM: Define a new interface kvm_intr_is_single_vcpu()
>>>   KVM: Make struct kvm_irq_routing_table accessible
>>>   KVM: make kvm_set_msi_irq() public
>>>   vfio: Register/unregister irq_bypass_producer
>>>   KVM: x86: Update IRTE for posted-interrupts
>>>   KVM: Implement IRQ bypass consumer callbacks for x86
>>>   KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
>>>   KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
>>>   KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
>>>   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
>>>
>>

[PATCH] iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA

2015-09-18 Thread Will Deacon
Stage-2 TLBI by IPA takes a 48-bit address field, as opposed to the
64-bit field used by the VA-based invalidation commands.

This patch re-jigs the SMMUv3 command construction code so that the
address field is correctly masked.

Signed-off-by: Will Deacon 
---
 drivers/iommu/arm-smmu-v3.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index a24f359fa0d0..286e890e7d64 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -343,7 +343,8 @@
 #define CMDQ_TLBI_0_VMID_SHIFT 32
 #define CMDQ_TLBI_0_ASID_SHIFT 48
 #define CMDQ_TLBI_1_LEAF   (1UL << 0)
-#define CMDQ_TLBI_1_ADDR_MASK  ~0xfffUL
+#define CMDQ_TLBI_1_VA_MASK    ~0xfffUL
+#define CMDQ_TLBI_1_IPA_MASK   0xfffffffff000UL
 
 #define CMDQ_PRI_0_SSID_SHIFT  12
 #define CMDQ_PRI_0_SSID_MASK   0xfUL
@@ -771,11 +772,13 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 		break;
 	case CMDQ_OP_TLBI_NH_VA:
 		cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT;
-		/* Fallthrough */
+		cmd[1] |= ent->tlbi.leaf ? CMDQ_TLBI_1_LEAF : 0;
+		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
+		break;
 	case CMDQ_OP_TLBI_S2_IPA:
 		cmd[0] |= (u64)ent->tlbi.vmid << CMDQ_TLBI_0_VMID_SHIFT;
 		cmd[1] |= ent->tlbi.leaf ? CMDQ_TLBI_1_LEAF : 0;
-		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_ADDR_MASK;
+		cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
 		break;
 	case CMDQ_OP_TLBI_NH_ASID:
 		cmd[0] |= (u64)ent->tlbi.asid << CMDQ_TLBI_0_ASID_SHIFT;
-- 
2.1.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
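
For readers following the CMD_TLBI_S2_IPA patch above, here is a minimal standalone sketch of why the two commands need different masks. The TOY_* names are hypothetical, and the bit ranges are assumptions chosen for illustration: the VA form keeps everything above the 4K page offset, while the IPA form is assumed to keep only bits [47:12] of a 48-bit address field.

```c
#include <stdint.h>

/* Hypothetical names; the bit ranges are assumptions for illustration. */
#define TOY_TLBI_VA_MASK   (~UINT64_C(0xfff))            /* bits [63:12] */
#define TOY_TLBI_IPA_MASK  UINT64_C(0x0000fffffffff000)  /* bits [47:12] */

/* Address field for a VA-based invalidation: only the page offset is dropped. */
static uint64_t toy_tlbi_va_field(uint64_t addr)
{
	return addr & TOY_TLBI_VA_MASK;
}

/* Address field for a stage-2 IPA invalidation: bits above 47 are dropped too. */
static uint64_t toy_tlbi_ipa_field(uint64_t addr)
{
	return addr & TOY_TLBI_IPA_MASK;
}
```

With a shared mask, any set bits above bit 47 in an IPA would leak into the command word; splitting the masks keeps each command's address field within its own width.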


RE: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Wu, Feng


> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Friday, September 18, 2015 10:59 PM
> To: Wu, Feng; alex.william...@redhat.com; j...@8bytes.org;
> mtosa...@redhat.com
> Cc: eric.au...@linaro.org; k...@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including
> prerequisite series
> 
> 
> 
> On 18/09/2015 16:29, Feng Wu wrote:
> > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> > With VT-d Posted-Interrupts enabled, external interrupts from
> > direct-assigned devices can be delivered to guests without VMM
> > intervention when guest is running in non-root mode.
> >
> > You can find the VT-d Posted-Interrupts Spec. at the following URL:
> > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> 
> Thanks.  I will squash patches 2 and 14 together, and drop patch 3.
> 
> Signed-off-bys are missing in patch 1 and 4.  The patches exist
> elsewhere in the mailing list archives, so not a big deal.  Or just
> reply to them with the S-o-b line.
> 

Thanks for your quick response, Paolo! I didn't change the code
in patches 1 and 4. Do I need to add my s-o-b? If so, I can reply
to the patches.

Thanks,
Feng

> Alex, can you ack the series and review patch 12?
> 
> Joerg, can you ack patch 18?
> 
> Paolo
> 
> > v9:
> > - Include the whole series:
> > [01/18]: irq bypasser manager
> > [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side forwarded irq
> > [07/18] - [18/18]: VT-d PI part
> >
> > v8:
> > refer to the changelog in each patch
> >
> > v7:
> > * Define two weak irq bypass callbacks:
> >   - kvm_arch_irq_bypass_start()
> >   - kvm_arch_irq_bypass_stop()
> > * Remove the x86 dummy implementation of the above two functions.
> > * Print some useful information instead of WARN_ON() when the
> >   irq bypass consumer unregistration fails.
> > * Fix an issue when calling pi_pre_block and pi_post_block.
> >
> > v6:
> > * Rebase on 4.2.0-rc6
> > * Rebase on https://lkml.org/lkml/2015/8/6/526 and
> >   http://www.gossamer-threads.com/lists/linux/kernel/2235623
> > * Make the add_consumer and del_consumer callbacks static
> > * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node'
> > * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> > * Remove optional dummy callbacks for irq producer
> >
> > v4:
> > * For lowest-priority interrupts, only single-CPU destination
> > interrupts are supported at the current stage; more general
> > lowest-priority support will be added later.
> > * According to Marcelo's suggestion, when the vCPU is blocked, we handle
> > the posted-interrupts in the HLT emulation path.
> > * Some small changes (coding style, typo, add some code comments)
> >
> > v3:
> > * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
> >   preempted or blocked.
> > * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
> > * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
> > * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
> >   can be used to change back to remapping mode.
> > * Fix typo
> >
> > v2:
> > * Use VFIO framework to enable this feature, the VFIO part of this series is
> >   based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
> > * Rebase this patchset on
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
> >   then revise some irq logic based on the new hierarchy irqdomain patches
> >   provided by Jiang Liu
> >
> >
> > *** BLURB HERE ***
> >
> > Alex Williamson (1):
> >   virt: IRQ bypass manager
> >
> > Eric Auger (4):
> >   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
> >   KVM: create kvm_irqfd.h
> >   KVM: introduce kvm_arch functions for IRQ bypass
> >   KVM: eventfd: add irq bypass consumer management
> >
> > Feng Wu (13):
> >   KVM: x86: select IRQ_BYPASS_MANAGER
> >   KVM: Extend struct pi_desc for VT-d Posted-Interrupts
> >   KVM: Add some helper functions for Posted-Interrupts
> >   KVM: Define a new interface kvm_intr_is_single_vcpu()
> >   KVM: Make struct kvm_irq_routing_table accessible
> >   KVM: make kvm_set_msi_irq() public
> >   vfio: Register/unregister irq_bypass_producer
> >   KVM: x86: Update IRTE for posted-interrupts
> >   KVM: Implement IRQ bypass consumer callbacks for x86
> >   KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
> >   KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> >   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
> >
> >  Documentation/kernel-parameters.txt   |   1 +
> >  Documentation/virtual/kvm/locking.txt |  12 ++
> >  MAINTAINERS   |   7 +
> >  arch/arm/kvm/Kconfig  |   2 +
> >  arch/arm/kvm/Makefile |   1 +
> >  arch/arm64/kv

Re: [PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Paolo Bonzini


On 18/09/2015 16:29, Feng Wu wrote:
> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> With VT-d Posted-Interrupts enabled, external interrupts from
> direct-assigned devices can be delivered to guests without VMM
> intervention when guest is running in non-root mode.
> 
> You can find the VT-d Posted-Interrupts Spec. at the following URL:
> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

Thanks.  I will squash patches 2 and 14 together, and drop patch 3.

Signed-off-bys are missing in patch 1 and 4.  The patches exist
elsewhere in the mailing list archives, so not a big deal.  Or just
reply to them with the S-o-b line.

Alex, can you ack the series and review patch 12?

Joerg, can you ack patch 18?

Paolo

> v9:
> - Include the whole series:
> [01/18]: irq bypasser manager
> [02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side forwarded irq
> [07/18] - [18/18]: VT-d PI part
> 
> v8:
> refer to the changelog in each patch
> 
> v7:
> * Define two weak irq bypass callbacks:
>   - kvm_arch_irq_bypass_start()
>   - kvm_arch_irq_bypass_stop()
> * Remove the x86 dummy implementation of the above two functions.
> * Print some useful information instead of WARN_ON() when the
>   irq bypass consumer unregistration fails.
> * Fix an issue when calling pi_pre_block and pi_post_block.
> 
> v6:
> * Rebase on 4.2.0-rc6
> * Rebase on https://lkml.org/lkml/2015/8/6/526 and 
> http://www.gossamer-threads.com/lists/linux/kernel/2235623
> * Make the add_consumer and del_consumer callbacks static
> * Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node'
> * Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
> * Remove optional dummy callbacks for irq producer
> 
> v4:
> * For lowest-priority interrupts, only single-CPU destination
> interrupts are supported at the current stage; more general
> lowest-priority support will be added later.
> * According to Marcelo's suggestion, when the vCPU is blocked, we handle
> the posted-interrupts in the HLT emulation path.
> * Some small changes (coding style, typo, add some code comments)
> 
> v3:
> * Adjust the Posted-interrupts Descriptor updating logic when vCPU is
>   preempted or blocked.
> * KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
> * __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
> * Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
>   can be used to change back to remapping mode.
> * Fix typo
> 
> v2:
> * Use VFIO framework to enable this feature, the VFIO part of this series is
>   based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
> * Rebase this patchset on
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
>   then revise some irq logic based on the new hierarchy irqdomain patches
>   provided by Jiang Liu
> 
> 
> *** BLURB HERE ***
> 
> Alex Williamson (1):
>   virt: IRQ bypass manager
> 
> Eric Auger (4):
>   KVM: arm/arm64: select IRQ_BYPASS_MANAGER
>   KVM: create kvm_irqfd.h
>   KVM: introduce kvm_arch functions for IRQ bypass
>   KVM: eventfd: add irq bypass consumer management
> 
> Feng Wu (13):
>   KVM: x86: select IRQ_BYPASS_MANAGER
>   KVM: Extend struct pi_desc for VT-d Posted-Interrupts
>   KVM: Add some helper functions for Posted-Interrupts
>   KVM: Define a new interface kvm_intr_is_single_vcpu()
>   KVM: Make struct kvm_irq_routing_table accessible
>   KVM: make kvm_set_msi_irq() public
>   vfio: Register/unregister irq_bypass_producer
>   KVM: x86: Update IRTE for posted-interrupts
>   KVM: Implement IRQ bypass consumer callbacks for x86
>   KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
>   KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
>   KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
>   iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
> 
>  Documentation/kernel-parameters.txt   |   1 +
>  Documentation/virtual/kvm/locking.txt |  12 ++
>  MAINTAINERS   |   7 +
>  arch/arm/kvm/Kconfig  |   2 +
>  arch/arm/kvm/Makefile |   1 +
>  arch/arm64/kvm/Kconfig|   2 +
>  arch/arm64/kvm/Makefile   |   1 +
>  arch/x86/include/asm/kvm_host.h   |  24 +++
>  arch/x86/kvm/Kconfig  |   3 +
>  arch/x86/kvm/Makefile |   3 +
>  arch/x86/kvm/irq_comm.c   |  32 ++-
>  arch/x86/kvm/lapic.c  |  59 ++
>  arch/x86/kvm/lapic.h  |   2 +
>  arch/x86/kvm/trace.h  |  33 
>  arch/x86/kvm/vmx.c| 361 
> +-
>  arch/x86/kvm/x86.c| 108 +-
>  drivers/iommu/irq_remapping.c |  12 +-
>  drivers/vfio/pci/Kconfig  |   1 +
>  drivers/vfio/pci/vfio_pci_intrs.c |   9 +
>  drivers/vfio/pci/vfio_pci_private.h   |   2 +
>  include/linux

[PATCH v9 09/18] KVM: Define a new interface kvm_intr_is_single_vcpu()

2015-09-18 Thread Feng Wu
This patch defines a new interface, kvm_intr_is_single_vcpu(),
which returns whether an interrupt targets a single vCPU.

It is used by VT-d PI, since for now we only support single-CPU
interrupts. For lowest-priority interrupts, if the user configures
the interrupt via /proc/irq or uses irqbalance to make it single-CPU,
we can use PI to deliver it. Full lowest-priority support will be
added later.

Signed-off-by: Feng Wu 
---
v9:
- Move kvm_intr_is_single_vcpu_fast() to lapic.c
- Remove incorrect WARN_ON_ONCE()

v8:
- Some optimizations in kvm_intr_is_single_vcpu().
- Expose kvm_intr_is_single_vcpu() so we can use it in vmx code.
- Add kvm_intr_is_single_vcpu_fast() as the fast path to find
  the target vCPU for the single-destination interrupt

 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/irq_comm.c | 27 +++
 arch/x86/kvm/lapic.c| 59 +
 arch/x86/kvm/lapic.h|  2 ++
 4 files changed, 91 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 49ec903..af11bca 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1204,4 +1204,7 @@ int __x86_set_memory_region(struct kvm *kvm,
 int x86_set_memory_region(struct kvm *kvm,
  const struct kvm_userspace_memory_region *mem);
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+struct kvm_vcpu **dest_vcpu);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 9efff9e..f86a0da 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -297,6 +297,33 @@ out:
return r;
 }
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			     struct kvm_vcpu **dest_vcpu)
+{
+	int i, r = 0;
+	struct kvm_vcpu *vcpu;
+
+	if (kvm_intr_is_single_vcpu_fast(kvm, irq, dest_vcpu))
+		return true;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_apic_present(vcpu))
+			continue;
+
+		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
+					irq->dest_id, irq->dest_mode))
+			continue;
+
+		if (++r == 2)
+			return false;
+
+		*dest_vcpu = vcpu;
+	}
+
+	return r == 1;
+}
+EXPORT_SYMBOL_GPL(kvm_intr_is_single_vcpu);
+
 #define IOAPIC_ROUTING_ENTRY(irq) \
{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,  \
  .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } }
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2a5ca97..3c8fc71 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -764,6 +764,65 @@ out:
return ret;
 }
 
+bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
+				  struct kvm_vcpu **dest_vcpu)
+{
+	struct kvm_apic_map *map;
+	bool ret = false;
+	struct kvm_lapic *dst = NULL;
+
+	if (irq->shorthand)
+		return false;
+
+	rcu_read_lock();
+	map = rcu_dereference(kvm->arch.apic_map);
+
+	if (!map)
+		goto out;
+
+	if (irq->dest_mode == APIC_DEST_PHYSICAL) {
+		if (irq->dest_id == 0xFF)
+			goto out;
+
+		if (irq->dest_id >= ARRAY_SIZE(map->phys_map))
+			goto out;
+
+		dst = map->phys_map[irq->dest_id];
+		if (dst && kvm_apic_present(dst->vcpu))
+			*dest_vcpu = dst->vcpu;
+		else
+			goto out;
+	} else {
+		u16 cid;
+		unsigned long bitmap = 1;
+		int i, r = 0;
+
+		if (!kvm_apic_logical_map_valid(map))
+			goto out;
+
+		apic_logical_id(map, irq->dest_id, &cid, (u16 *)&bitmap);
+
+		if (cid >= ARRAY_SIZE(map->logical_map))
+			goto out;
+
+		for_each_set_bit(i, &bitmap, 16) {
+			dst = map->logical_map[cid][i];
+			if (++r == 2)
+				goto out;
+		}
+
+		if (dst && kvm_apic_present(dst->vcpu))
+			*dest_vcpu = dst->vcpu;
+		else
+			goto out;
+	}
+
+	ret = true;
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
 /*
  * Add a pending IRQ into lapic.
  * Return 1 if successfully added and 0 if discarded.
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 7195274..032fe2d 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -169,4 +169,6 @@ bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
 void wait_lapic_expire(struct kvm_vcpu *vcpu);
 
+bool kvm_intr_is_single_vcp
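
The slow path of kvm_intr_is_single_vcpu() above boils down to "count the matching vCPUs and give up on the second match". A minimal userspace sketch of that counting pattern follows. Everything prefixed toy_ or TOY_ is hypothetical, the destination matching is reduced to an APIC ID comparison plus an assumed broadcast ID, and the sketch returns an index rather than filling in a vCPU pointer.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_BROADCAST_ID 0xff	/* assumed "all vCPUs" destination */

struct toy_vcpu {
	int  apic_id;
	bool apic_present;
};

/* Hypothetical three-vCPU guest; vCPU 2 has no usable APIC. */
static struct toy_vcpu toy_vcpus[] = {
	{ 0, true }, { 1, true }, { 2, false },
};

/* Simplified stand-in for destination matching (physical mode only). */
static bool toy_match_dest(const struct toy_vcpu *v, int dest_id)
{
	return dest_id == TOY_BROADCAST_ID || v->apic_id == dest_id;
}

/*
 * Returns the index of the only matching vCPU, or -1 when zero or
 * more than one vCPU matches.
 */
static int toy_single_dest(int dest_id)
{
	size_t n = sizeof(toy_vcpus) / sizeof(toy_vcpus[0]);
	int matches = 0, found = -1;

	for (size_t i = 0; i < n; i++) {
		if (!toy_vcpus[i].apic_present)
			continue;
		if (!toy_match_dest(&toy_vcpus[i], dest_id))
			continue;
		if (++matches == 2)
			return -1;	/* second match: not single-destination */
		found = (int)i;
	}
	return matches == 1 ? found : -1;
}
```

Bailing out on the second match keeps the scan short for broadcast and multicast destinations, which can never be posted to a single vCPU anyway.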

[PATCH v9 17/18] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-09-18 Thread Feng Wu
This patch updates the Posted-Interrupts Descriptor when vCPU
is blocked.

pre-block:
- Add the vCPU to the blocked per-CPU list
- Set 'NV' to POSTED_INTR_WAKEUP_VECTOR

post-block:
- Remove the vCPU from the per-CPU list

Signed-off-by: Feng Wu 
---
v9:
- Add description for blocked_vcpu_on_cpu_lock in
  Documentation/virtual/kvm/locking.txt
- Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then
  !irq_remapping_cap(IRQ_POSTING_CAP)

v8:
- Rename 'pi_pre_block' to 'pre_block'
- Rename 'pi_post_block' to 'post_block'
- Change some comments
- Only add the vCPU to the blocking list when the VM has assigned devices.

 Documentation/virtual/kvm/locking.txt |  12 +++
 arch/x86/include/asm/kvm_host.h   |  13 +++
 arch/x86/kvm/vmx.c| 153 ++
 arch/x86/kvm/x86.c|  53 +---
 include/linux/kvm_host.h  |   3 +
 virt/kvm/kvm_main.c   |   3 +
 6 files changed, 227 insertions(+), 10 deletions(-)

diff --git a/Documentation/virtual/kvm/locking.txt b/Documentation/virtual/kvm/locking.txt
index d68af4d..19f94a6 100644
--- a/Documentation/virtual/kvm/locking.txt
+++ b/Documentation/virtual/kvm/locking.txt
@@ -166,3 +166,15 @@ Comment:   The srcu read lock must be held while accessing memslots (e.g.
MMIO/PIO address->device structure mapping (kvm->buses).
The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu
if it is needed by multiple functions.
+
+Name:  blocked_vcpu_on_cpu_lock
+Type:  spinlock_t
+Arch:  x86
+Protects:  blocked_vcpu_on_cpu
+Comment:   This is a per-CPU lock and it is used for VT-d posted-interrupts.
+   When VT-d posted-interrupts is supported and the VM has assigned
+   devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu
+   protected by blocked_vcpu_on_cpu_lock. When VT-d hardware issues
+   a wakeup notification event because external interrupts from the
+   assigned devices have arrived, we find the vCPU on the list to
+   wake up.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0ddd353..304fbb5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -552,6 +552,8 @@ struct kvm_vcpu_arch {
 */
bool write_fault_to_shadow_pgtable;
 
+   bool halted;
+
/* set at EPT violation at this point */
unsigned long exit_qualification;
 
@@ -864,6 +866,17 @@ struct kvm_x86_ops {
/* pmu operations of sub-arch */
const struct kvm_pmu_ops *pmu_ops;
 
+   /*
+* Architecture specific hooks for vCPU blocking due to
+* HLT instruction.
+* Returns for .pre_block():
+*- 0 means continue to block the vCPU.
+*- 1 means we cannot block the vCPU since some event
+*happens during this period, such as, 'ON' bit in
+*posted-interrupts descriptor is set.
+*/
+   int (*pre_block)(struct kvm_vcpu *vcpu);
+   void (*post_block)(struct kvm_vcpu *vcpu);
int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
  uint32_t guest_irq, bool set);
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 902a67d..9968896 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -879,6 +879,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
 static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
 static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
 
+/*
+ * We maintain a per-CPU linked-list of vCPUs, so in wakeup_handler() we
+ * can find which vCPU should be woken up.
+ */
+static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
+static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
+
 static unsigned long *vmx_io_bitmap_a;
 static unsigned long *vmx_io_bitmap_b;
 static unsigned long *vmx_msr_bitmap_legacy;
@@ -2985,6 +2992,8 @@ static int hardware_enable(void)
return -EBUSY;
 
INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
+   INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
+   spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
 
/*
 * Now we can enable the vmclear operation in kdump
@@ -6121,6 +6130,25 @@ static void update_ple_window_actual_max(void)
ple_window_grow, INT_MIN);
 }
 
+/*
+ * Handler for POSTED_INTR_WAKEUP_VECTOR.
+ */
+static void wakeup_handler(void)
+{
+   struct kvm_vcpu *vcpu;
+   int cpu = smp_processor_id();
+
+   spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
+   list_for_each_entry(vcpu, &per_cpu(blocked_vcpu_on_cpu, cpu),
+   blocked_vcpu_list) {
+   struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+   if (pi_test_on(pi_desc) == 1)
+   kvm_vcpu_kick(vcpu)
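
To make the pre-block / wakeup / post-block flow described in this patch easier to follow, here is a small userspace sketch under stated assumptions. All toy_ names are hypothetical, a C11 atomic_flag stands in for the per-CPU spinlock, a boolean stands in for the 'ON' bit, and the early return in toy_pre_block() is a simplification of the real ordering. It is not the code from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vcpu {
	bool pi_on;		/* stands in for the descriptor's 'ON' bit */
	bool kicked;
	struct toy_vcpu *next;	/* link in the per-CPU blocked list */
};

struct toy_cpu {
	atomic_flag lock;	/* stands in for blocked_vcpu_on_cpu_lock */
	struct toy_vcpu *blocked;
};

static void toy_lock(struct toy_cpu *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void toy_unlock(struct toy_cpu *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Returns 0 to go on blocking, 1 if an interrupt is already posted. */
static int toy_pre_block(struct toy_cpu *c, struct toy_vcpu *v)
{
	if (v->pi_on)
		return 1;
	toy_lock(c);
	v->next = c->blocked;
	c->blocked = v;
	toy_unlock(c);
	return 0;
}

/* Unlink the vCPU from the blocked list once it runs again. */
static void toy_post_block(struct toy_cpu *c, struct toy_vcpu *v)
{
	struct toy_vcpu **pp;

	toy_lock(c);
	for (pp = &c->blocked; *pp; pp = &(*pp)->next) {
		if (*pp == v) {
			*pp = v->next;
			break;
		}
	}
	toy_unlock(c);
}

/* Kick every blocked vCPU whose 'ON' bit is set; returns how many. */
static int toy_wakeup_handler(struct toy_cpu *c)
{
	int kicked = 0;

	toy_lock(c);
	for (struct toy_vcpu *v = c->blocked; v; v = v->next) {
		if (v->pi_on) {
			v->kicked = true;
			kicked++;
		}
	}
	toy_unlock(c);
	return kicked;
}

/* One block/wakeup cycle; returns the number of kicks, or -1 on surprise. */
static int toy_blocking_demo(void)
{
	struct toy_cpu cpu = { ATOMIC_FLAG_INIT, NULL };
	struct toy_vcpu a = { false, false, NULL };
	struct toy_vcpu b = { true, false, NULL };
	int ra = toy_pre_block(&cpu, &a);	/* a goes onto the list */
	int rb = toy_pre_block(&cpu, &b);	/* b must not block */
	int kicked;

	if (ra != 0 || rb != 1)
		return -1;

	a.pi_on = true;				/* device posts an interrupt */
	kicked = toy_wakeup_handler(&cpu);

	toy_post_block(&cpu, &a);
	return cpu.blocked == NULL ? kicked : -1;
}
```

The list is per-CPU because the wakeup notification arrives on the CPU where the vCPU blocked, so the handler only has to scan the vCPUs parked there.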

[PATCH v9 15/18] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'

2015-09-18 Thread Feng Wu
This patch adds an arch-specific hook 'arch_update' in
'struct kvm_kernel_irqfd'. On the Intel side, it is used to
update the IRTE when VT-d posted-interrupts is used.

Signed-off-by: Feng Wu 
---
v9:
- Use 'if' instead of "? :" in kvm_arch_update_irqfd_routing()
- coding style

v8:
- Remove callback .arch_update()
- Remove kvm_arch_irqfd_init()
- Call kvm_arch_update_irqfd_routing() instead.

 arch/x86/kvm/x86.c   |  9 +
 include/linux/kvm_host.h |  2 ++
 virt/kvm/eventfd.c   | 20 +++-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 79dac02..58688aa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8293,6 +8293,15 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
   " fails: %d\n", irqfd->consumer.token, ret);
 }
 
+int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
+  uint32_t guest_irq, bool set)
+{
+   if (!kvm_x86_ops->update_pi_irte)
+   return -EINVAL;
+
+   return kvm_x86_ops->update_pi_irte(kvm, host_irq, guest_irq, set);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5f183fb..feba1fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1174,6 +1174,8 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *,
   struct irq_bypass_producer *);
 void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *);
 void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *);
+int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq,
+ uint32_t guest_irq, bool set);
 #endif /* CONFIG_HAVE_KVM_IRQ_BYPASS */
 #endif
 
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index c0a56a1..94306a3 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -266,6 +266,13 @@ void __attribute__((weak)) kvm_arch_irq_bypass_start(
struct irq_bypass_consumer *cons)
 {
 }
+
+int  __attribute__((weak)) kvm_arch_update_irqfd_routing(
+   struct kvm *kvm, unsigned int host_irq,
+   uint32_t guest_irq, bool set)
+{
+   return 0;
+}
 #endif
 
 static int
@@ -582,13 +589,24 @@ kvm_irqfd_release(struct kvm *kvm)
  */
 void kvm_irq_routing_update(struct kvm *kvm)
 {
+   int ret;
struct kvm_kernel_irqfd *irqfd;
 
spin_lock_irq(&kvm->irqfds.lock);
 
-   list_for_each_entry(irqfd, &kvm->irqfds.items, list)
+   list_for_each_entry(irqfd, &kvm->irqfds.items, list) {
irqfd_update(kvm, irqfd);
 
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+   if (irqfd->producer) {
+   ret = kvm_arch_update_irqfd_routing(
+   irqfd->kvm, irqfd->producer->irq,
+   irqfd->gsi, 1);
+   WARN_ON(ret);
+   }
+#endif
+   }
+
spin_unlock_irq(&kvm->irqfds.lock);
 }
 
-- 
2.1.0
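
The x86 kvm_arch_update_irqfd_routing() in this patch follows a common pattern: forward to an optional callback in an ops table, and report -EINVAL when the backend did not provide one. A minimal standalone sketch of that pattern is below. The toy_ names are hypothetical and the "backend" is a stub that always succeeds.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table with one optional callback. */
struct toy_ops {
	int (*update_pi_irte)(unsigned int host_irq, uint32_t guest_irq,
			      bool set);
};

/* Forward to the backend if it implements the hook, else -EINVAL. */
static int toy_update_irqfd_routing(const struct toy_ops *ops,
				    unsigned int host_irq,
				    uint32_t guest_irq, bool set)
{
	if (!ops->update_pi_irte)
		return -EINVAL;

	return ops->update_pi_irte(host_irq, guest_irq, set);
}

/* Stub backend: pretends the IRTE update always succeeds. */
static int toy_backend_update(unsigned int host_irq, uint32_t guest_irq,
			      bool set)
{
	(void)host_irq;
	(void)guest_irq;
	(void)set;
	return 0;
}

static const struct toy_ops toy_no_pi = { NULL };
static const struct toy_ops toy_with_pi = { toy_backend_update };
```

Note that in the patch the generic weak default returns 0 while the x86 version returns -EINVAL for a missing callback, so the WARN_ON(ret) in kvm_irq_routing_update() only stays quiet on x86 when the backend provides the hook and it succeeds.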



[PATCH v9 12/18] vfio: Register/unregister irq_bypass_producer

2015-09-18 Thread Feng Wu
This patch adds the registration/unregistration of an
irq_bypass_producer for MSI/MSIx on vfio pci devices.

Signed-off-by: Feng Wu 
---
v8:
- Merge "[PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices"
  into this patch.

v6:
- Make the add_consumer and del_consumer callbacks static
- Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node)'
- Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
- Remove optional dummy callbacks for irq producer

 drivers/vfio/pci/Kconfig| 1 +
 drivers/vfio/pci/vfio_pci_intrs.c   | 9 +
 drivers/vfio/pci/vfio_pci_private.h | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 579d83b..02912f1 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PCI
tristate "VFIO support for PCI devices"
depends on VFIO && PCI && EVENTFD
select VFIO_VIRQFD
+   select IRQ_BYPASS_MANAGER
help
  Support for the PCI VFIO bus driver.  This is required to make
  use of PCI drivers using the VFIO framework.
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1f577b4..c65299d 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -319,6 +319,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 
if (vdev->ctx[vector].trigger) {
free_irq(irq, vdev->ctx[vector].trigger);
+   irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
kfree(vdev->ctx[vector].name);
eventfd_ctx_put(vdev->ctx[vector].trigger);
vdev->ctx[vector].trigger = NULL;
@@ -360,6 +361,14 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
return ret;
}
 
+   vdev->ctx[vector].producer.token = trigger;
+   vdev->ctx[vector].producer.irq = irq;
+   ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
+   if (unlikely(ret))
+   dev_info(&pdev->dev,
+   "irq bypass producer (token %p) registration fails: %d\n",
+   vdev->ctx[vector].producer.token, ret);
+
vdev->ctx[vector].trigger = trigger;
 
return 0;
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index ae0e1b4..0e7394f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -29,6 +30,7 @@ struct vfio_pci_irq_ctx {
struct virqfd   *mask;
char*name;
boolmasked;
+   struct irq_bypass_producer  producer;
 };
 
 struct vfio_pci_device {
-- 
2.1.0
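
The vfio change above fills in a producer's token and irq before registering it. As a rough illustration of why the token matters, the sketch below assumes that a producer and a consumer are paired when their tokens are identical, with the trigger's address serving as the token. The toy_ names, the structures, and the irq number are all hypothetical and much simpler than the real bypass manager.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_producer {
	void *token;	/* e.g. the eventfd context used as trigger */
	int irq;	/* host irq */
};

struct toy_consumer {
	void *token;
	int host_irq;	/* filled in on connect, -1 when unconnected */
};

/* Pair the two ends only when their tokens are identical. */
static bool toy_connect(const struct toy_producer *p, struct toy_consumer *c)
{
	if (!p->token || p->token != c->token)
		return false;

	c->host_irq = p->irq;
	return true;
}

/* Returns the host irq seen by the consumer, or -1 if nothing paired. */
static int toy_pairing_demo(bool same_token)
{
	int trigger_a = 0, trigger_b = 0;	/* addresses serve as tokens */
	struct toy_producer prod = { &trigger_a, 42 };
	struct toy_consumer cons = { NULL, -1 };

	cons.token = same_token ? (void *)&trigger_a : (void *)&trigger_b;
	toy_connect(&prod, &cons);
	return cons.host_irq;
}
```

Using a shared object's address as the token lets the two sides find each other without either one knowing about the other's data structures.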



[PATCH v9 13/18] KVM: x86: Update IRTE for posted-interrupts

2015-09-18 Thread Feng Wu
This patch adds the routine to update IRTE for posted-interrupts
when guest changes the interrupt configuration.

Signed-off-by: Feng Wu 
---
v9:
- Check !kvm_arch_has_assigned_device(kvm) first then
  !irq_remapping_cap(IRQ_POSTING_CAP)

v8:
- Move 'kvm_arch_update_pi_irte' to vmx.c as a callback
- Only update the PI irte when VM has assigned devices
- Add a trace point for VT-d posted-interrupts when we update
  or disable it for a specific irq.

 arch/x86/include/asm/kvm_host.h |  3 ++
 arch/x86/kvm/trace.h| 33 
 arch/x86/kvm/vmx.c  | 83 +
 arch/x86/kvm/x86.c  |  2 +
 4 files changed, 121 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index daa6126..8c44286 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -862,6 +862,9 @@ struct kvm_x86_ops {
   gfn_t offset, unsigned long mask);
/* pmu operations of sub-arch */
const struct kvm_pmu_ops *pmu_ops;
+
+   int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
+ uint32_t guest_irq, bool set);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 4eae7c3..539a9e4 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -974,6 +974,39 @@ TRACE_EVENT(kvm_enter_smm,
  __entry->smbase)
 );
 
+/*
+ * Tracepoint for VT-d posted-interrupts.
+ */
+TRACE_EVENT(kvm_pi_irte_update,
+   TP_PROTO(unsigned int vcpu_id, unsigned int gsi,
+unsigned int gvec, u64 pi_desc_addr, bool set),
+   TP_ARGS(vcpu_id, gsi, gvec, pi_desc_addr, set),
+
+   TP_STRUCT__entry(
+   __field(unsigned int,   vcpu_id )
+   __field(unsigned int,   gsi )
+   __field(unsigned int,   gvec)
+   __field(u64,pi_desc_addr)
+   __field(bool,   set )
+   ),
+
+   TP_fast_assign(
+   __entry->vcpu_id= vcpu_id;
+   __entry->gsi= gsi;
+   __entry->gvec   = gvec;
+   __entry->pi_desc_addr   = pi_desc_addr;
+   __entry->set= set;
+   ),
+
+   TP_printk("VT-d PI is %s for this irq, vcpu %u, gsi: 0x%x, "
+ "gvec: 0x%x, pi_desc_addr: 0x%llx",
+ __entry->set ? "enabled and being updated" : "disabled",
+ __entry->vcpu_id,
+ __entry->gsi,
+ __entry->gvec,
+ __entry->pi_desc_addr)
+);
+
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 316f9bf..11bda72 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "trace.h"
 #include "pmu.h"
@@ -605,6 +606,11 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
return container_of(vcpu, struct vcpu_vmx, vcpu);
 }
 
+struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
+{
+   return &(to_vmx(vcpu)->pi_desc);
+}
+
 #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
 #define FIELD(number, name)[number] = VMCS12_OFFSET(name)
 #define FIELD64(number, name)  [number] = VMCS12_OFFSET(name), \
@@ -10344,6 +10350,81 @@ static void vmx_enable_log_dirty_pt_masked(struct kvm *kvm,
kvm_mmu_clear_dirty_pt_masked(kvm, memslot, offset, mask);
 }
 
+/*
+ * vmx_update_pi_irte - set IRTE for Posted-Interrupts
+ *
+ * @kvm: kvm
+ * @host_irq: host irq of the interrupt
+ * @guest_irq: gsi of the interrupt
+ * @set: set or unset PI
+ * returns 0 on success, < 0 on failure
+ */
+int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
+  uint32_t guest_irq, bool set)
+{
+   struct kvm_kernel_irq_routing_entry *e;
+   struct kvm_irq_routing_table *irq_rt;
+   struct kvm_lapic_irq irq;
+   struct kvm_vcpu *vcpu;
+   struct vcpu_data vcpu_info;
+   int idx, ret = -EINVAL;
+
+   if (!kvm_arch_has_assigned_device(kvm) ||
+   !irq_remapping_cap(IRQ_POSTING_CAP))
+   return 0;
+
+   idx = srcu_read_lock(&kvm->irq_srcu);
+   irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
+   BUG_ON(guest_irq >= irq_rt->nr_rt_entries);
+
+   hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
+   if (e->type != KVM_IRQ_ROUTING_MSI)
+   continue;
+   /*
+* VT-d PI cannot support posting multicast/broadcast
+* interrupts to a vCPU, so we still use interrupt remapping
+* for these kinds of interrupts.
+*
+* For lowest-priority interrupts, we only support
+* those with sing

[PATCH v9 14/18] KVM: Implement IRQ bypass consumer callbacks for x86

2015-09-18 Thread Feng Wu
Implement the following callbacks for x86:

- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer
- kvm_arch_irq_bypass_stop: dummy callback
- kvm_arch_irq_bypass_start: dummy callback

and set CONFIG_HAVE_KVM_IRQ_BYPASS for x86.

Signed-off-by: Feng Wu 
---
v8:
- Move the weak irq bypass stop and irq bypass start to this patch.
- Call kvm_x86_ops->update_pi_irte() instead of kvm_arch_update_pi_irte().

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/Kconfig|  1 +
 arch/x86/kvm/x86.c  | 44 +
 virt/kvm/eventfd.c  | 12 +++
 4 files changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8c44286..0ddd353 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index c951d44..b90776f 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -30,6 +30,7 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQFD
select IRQ_BYPASS_MANAGER
+   select HAVE_KVM_IRQ_BYPASS
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_EVENTFD
select KVM_APIC_ARCHITECTURE
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9dcd501..79dac02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -50,6 +50,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 #define CREATE_TRACE_POINTS
@@ -8249,6 +8251,48 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
 
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   if (kvm_x86_ops->update_pi_irte) {
+   irqfd->producer = prod;
+   return kvm_x86_ops->update_pi_irte(irqfd->kvm,
+   prod->irq, irqfd->gsi, 1);
+   }
+
+   return -EINVAL;
+}
+
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+   int ret;
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   if (!kvm_x86_ops->update_pi_irte) {
+   WARN_ON(irqfd->producer != NULL);
+   return;
+   }
+
+   WARN_ON(irqfd->producer != prod);
+   irqfd->producer = NULL;
+
+   /*
+* When the producer of a consumer is unregistered, we change
+* back to remapped mode, so we can re-use the current
+* implementation when the irq is masked/disabled or the consumer
+* side (KVM in this case) doesn't want to receive the interrupts.
+   */
+   ret = kvm_x86_ops->update_pi_irte(irqfd->kvm, prod->irq, irqfd->gsi, 0);
+   if (ret)
+   printk(KERN_INFO "irq bypass consumer (token %p) unregistration"
+  " fails: %d\n", irqfd->consumer.token, ret);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index d7a230f..c0a56a1 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -256,6 +256,18 @@ static void irqfd_update(struct kvm *kvm, struct kvm_kernel_irqfd *irqfd)
write_seqcount_end(&irqfd->irq_entry_sc);
 }
 
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+void __attribute__((weak)) kvm_arch_irq_bypass_stop(
+   struct irq_bypass_consumer *cons)
+{
+}
+
+void __attribute__((weak)) kvm_arch_irq_bypass_start(
+   struct irq_bypass_consumer *cons)
+{
+}
+#endif
+
 static int
 kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 {
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
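
A note on the add_producer callback in 14/18: it recovers the enclosing irqfd from the embedded consumer with container_of() and then calls an optional arch hook, failing with -EINVAL when the hook is absent. The following is only a minimal userspace sketch of that pattern, not the kernel code. The struct and function names (irqfd, arch_ops, add_producer, record_update) are simplified stand-ins invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified stand-ins for the structures in the patch. */
struct consumer { void *token; };
struct producer { int irq; };

struct irqfd {
	int gsi;
	struct consumer consumer;	/* embedded, as in kvm_kernel_irqfd */
	struct producer *producer;
};

struct arch_ops {
	/* Optional hook; NULL means the arch cannot post interrupts. */
	int (*update_pi_irte)(unsigned int host_irq, int guest_irq, int set);
};

/* Demo hook that only records its arguments. */
int last_host_irq, last_gsi, last_set;

int record_update(unsigned int host_irq, int guest_irq, int set)
{
	last_host_irq = (int)host_irq;
	last_gsi = guest_irq;
	last_set = set;
	return 0;
}

/* Mirrors the shape of the add_producer callback: map cons -> irqfd. */
int add_producer(const struct arch_ops *ops, struct consumer *cons,
		 struct producer *prod)
{
	struct irqfd *irqfd = container_of(cons, struct irqfd, consumer);

	assert(ops && prod);
	if (!ops->update_pi_irte)
		return -EINVAL;

	irqfd->producer = prod;
	return ops->update_pi_irte(prod->irq, irqfd->gsi, 1);
}
```

The callback only ever receives a pointer to the embedded consumer, so the offset arithmetic is what gives it access to the gsi and the producer slot without a separate lookup table.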


[PATCH v9 16/18] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted

2015-09-18 Thread Feng Wu
This patch updates the Posted-Interrupts Descriptor when the vCPU
is preempted.

sched out:
- Set 'SN' to suppress future non-urgent interrupts posted for
the vCPU.

sched in:
- Clear 'SN'
- Change NDST if vCPU is scheduled to a different CPU
- Set 'NV' to POSTED_INTR_VECTOR

Signed-off-by: Feng Wu 
---
v9:
- Check !kvm_arch_has_assigned_device(vcpu->kvm) first, then
  !irq_remapping_cap(IRQ_POSTING_CAP)

v8:
- Add two wrapper functions, vmx_vcpu_pi_load() and vmx_vcpu_pi_put().
- Only handle VT-d PI related logic when the VM has assigned devices.

 arch/x86/kvm/vmx.c | 79 ++
 1 file changed, 79 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 11bda72..902a67d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1943,6 +1943,52 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
preempt_enable();
 }
 
+static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
+{
+   struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+   struct pi_desc old, new;
+   unsigned int dest;
+
+   if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
+   !irq_remapping_cap(IRQ_POSTING_CAP))
+   return;
+
+   do {
+   old.control = new.control = pi_desc->control;
+
+   /*
+* If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there
+* are two possible cases:
+* 1. After running 'pre_block', context switch
+*happened. For this case, 'sn' was set in
+*vmx_vcpu_put(), so we need to clear it here.
+* 2. After running 'pre_block', we were blocked,
+*and woken up by some other guy. For this case,
+*we don't need to do anything, 'pi_post_block'
+*will do everything for us. However, we cannot
+*check whether it is case #1 or case #2 here
+*(maybe, not needed), so we also clear sn here,
+*I think it is not a big deal.
+*/
+   if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) {
+   if (vcpu->cpu != cpu) {
+   dest = cpu_physical_id(cpu);
+
+   if (x2apic_enabled())
+   new.ndst = dest;
+   else
+   new.ndst = (dest << 8) & 0xFF00;
+   }
+
+   /* set 'NV' to 'notification vector' */
+   new.nv = POSTED_INTR_VECTOR;
+   }
+
+   /* Allow posting non-urgent interrupts */
+   new.sn = 0;
+   } while (cmpxchg(&pi_desc->control, old.control,
+   new.control) != old.control);
+}
 /*
  * Switches to specified vcpu, until a matching vcpu_put(), but assumes
  * vcpu mutex is already taken.
@@ -1993,10 +2039,27 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
vmx->loaded_vmcs->cpu = cpu;
}
+
+   vmx_vcpu_pi_load(vcpu, cpu);
+}
+
+static void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
+{
+   struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+   if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
+   !irq_remapping_cap(IRQ_POSTING_CAP))
+   return;
+
+   /* Set SN when the vCPU is preempted */
+   if (vcpu->preempted)
+   pi_set_sn(pi_desc);
 }
 
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
+   vmx_vcpu_pi_put(vcpu);
+
__vmx_load_host_state(to_vmx(vcpu));
if (!vmm_exclusive) {
__loaded_vmcs_clear(to_vmx(vcpu)->loaded_vmcs);
@@ -4426,6 +4489,22 @@ static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_SMP
if (vcpu->mode == IN_GUEST_MODE) {
+   struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+   /*
+* Currently, we don't support urgent interrupts;
+* all interrupts are recognized as non-urgent
+* interrupts, so we cannot post interrupts when
+* 'SN' is set.
+*
+* If the vcpu is in guest mode, it means it is
+* running instead of being scheduled out and
+* waiting in the run queue, and that's the only
+* case when 'SN' is set currently, so warn if
+* 'SN' is set.
+*/
+   WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
+
apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
POSTED_INTR_VECTOR);
return true;
-- 
2.1.0
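
A note on the sched-in path in 16/18: the descriptor's control word is updated with a compare-and-exchange retry loop, so that a concurrent hardware or CPU update of ON is never lost. The following is a minimal userspace sketch of that loop using C11 atomics rather than the kernel's cmpxchg(). The bit positions follow the layout from 07/18. The function name pi_load, the PI_* macros and the two DEMO_* vector numbers are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions inside the 64-bit control word (bits 256..319). */
#define PI_ON		(1ULL << 0)
#define PI_SN		(1ULL << 1)
#define PI_NV_SHIFT	16
#define PI_NV_MASK	(0xFFULL << PI_NV_SHIFT)
#define PI_NDST_SHIFT	32
#define PI_NDST_MASK	(0xFFFFFFFFULL << PI_NDST_SHIFT)

/* Hypothetical vector numbers, chosen only for this sketch. */
#define DEMO_NOTIFY_VECTOR	0xF2
#define DEMO_WAKEUP_VECTOR	0xF1

/* xAPIC mode keeps the APIC ID in bits 15:8; x2APIC uses it as-is. */
static uint32_t encode_ndst(uint32_t apic_id, bool x2apic)
{
	return x2apic ? apic_id : (apic_id << 8) & 0xFF00;
}

/* Sched-in: clear SN and, unless a wakeup is pending, retarget NDST/NV. */
void pi_load(_Atomic uint64_t *control, bool migrated,
	     uint32_t apic_id, bool x2apic)
{
	uint64_t old, new;

	assert(control);
	old = atomic_load(control);
	do {
		new = old;
		if (((old & PI_NV_MASK) >> PI_NV_SHIFT) != DEMO_WAKEUP_VECTOR) {
			if (migrated) {
				new &= ~PI_NDST_MASK;
				new |= (uint64_t)encode_ndst(apic_id, x2apic)
					<< PI_NDST_SHIFT;
			}
			new &= ~PI_NV_MASK;
			new |= (uint64_t)DEMO_NOTIFY_VECTOR << PI_NV_SHIFT;
		}
		new &= ~PI_SN;	/* allow posting non-urgent interrupts again */
		/* On failure 'old' is refreshed and the update is recomputed. */
	} while (!atomic_compare_exchange_weak(control, &old, new));
}
```

Recomputing the new value from a fresh snapshot on every retry is the point of the loop: any bit that changed between the read and the exchange (ON in particular) is carried into the value that finally gets stored.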



[PATCH v9 18/18] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts

2015-09-18 Thread Feng Wu
Enable VT-d Posted-Interrupts and add a command line
parameter for it.

Signed-off-by: Feng Wu 
Reviewed-by: Paolo Bonzini 
---
 Documentation/kernel-parameters.txt |  1 +
 drivers/iommu/irq_remapping.c   | 12 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f045..52aca36 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1547,6 +1547,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
nosid   disable Source ID checking
no_x2apic_optout
BIOS x2APIC opt-out request will be ignored
+   nopost  disable Interrupt Posting
 
iomem=  Disable strict checking of access to MMIO memory
strict  regions from userspace.
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 2d99930..d8c3997 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -22,7 +22,7 @@ int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;
 
-int disable_irq_post = 1;
+int disable_irq_post = 0;
 
 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
@@ -58,14 +58,18 @@ static __init int setup_irqremap(char *str)
return -EINVAL;
 
while (*str) {
-   if (!strncmp(str, "on", 2))
+   if (!strncmp(str, "on", 2)) {
disable_irq_remap = 0;
-   else if (!strncmp(str, "off", 3))
+   disable_irq_post = 0;
+   } else if (!strncmp(str, "off", 3)) {
disable_irq_remap = 1;
-   else if (!strncmp(str, "nosid", 5))
+   disable_irq_post = 1;
+   } else if (!strncmp(str, "nosid", 5))
disable_sourceid_checking = 1;
else if (!strncmp(str, "no_x2apic_optout", 16))
no_x2apic_optout = 1;
+   else if (!strncmp(str, "nopost", 6))
+   disable_irq_post = 1;
 
str += strcspn(str, ",");
while (*str == ',')
-- 
2.1.0
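
A note on the option parsing in 18/18: setup_irqremap() walks a comma-separated string, matches each token by prefix with strncmp(), and skips ahead with strcspn(). The following is a minimal userspace sketch of the same loop, so the effect of "off" and "nopost" on the two flags can be tried in isolation. The struct remap_opts and the function name parse_irqremap are assumptions made for this sketch; the kernel uses file-scope variables instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the flags the kernel keeps as globals. */
struct remap_opts {
	int disable_irq_remap;
	int disable_irq_post;
	int disable_sourceid_checking;
	int no_x2apic_optout;
};

/* Parse e.g. "on,nopost"; returns 0 on success, -1 on a NULL string. */
int parse_irqremap(const char *str, struct remap_opts *o)
{
	assert(o);
	if (!str)
		return -1;

	while (*str) {
		if (!strncmp(str, "on", 2)) {
			o->disable_irq_remap = 0;
			o->disable_irq_post = 0;
		} else if (!strncmp(str, "off", 3)) {
			/* Posting depends on remapping, so both go off. */
			o->disable_irq_remap = 1;
			o->disable_irq_post = 1;
		} else if (!strncmp(str, "nosid", 5)) {
			o->disable_sourceid_checking = 1;
		} else if (!strncmp(str, "no_x2apic_optout", 16)) {
			o->no_x2apic_optout = 1;
		} else if (!strncmp(str, "nopost", 6)) {
			o->disable_irq_post = 1;
		}

		/* Advance to the next comma-separated token. */
		str += strcspn(str, ",");
		while (*str == ',')
			str++;
	}
	return 0;
}
```

Tokens are applied left to right, so in this sketch "on,nopost" leaves remapping enabled with posting disabled, while "nopost,on" would re-enable posting.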



[PATCH v9 04/18] KVM: create kvm_irqfd.h

2015-09-18 Thread Feng Wu
From: Eric Auger 

Move the _irqfd_resampler and _irqfd struct declarations into a new
public header, kvm_irqfd.h. They are renamed to
kvm_kernel_irqfd_resampler and kvm_kernel_irqfd respectively. These
datatypes will be used by architecture-specific code, in the context of
IRQ bypass manager integration.

Signed-off-by: Eric Auger 
---
 include/linux/kvm_irqfd.h | 69 ++
 virt/kvm/eventfd.c| 95 ---
 2 files changed, 92 insertions(+), 72 deletions(-)
 create mode 100644 include/linux/kvm_irqfd.h

diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
new file mode 100644
index 000..f926b39
--- /dev/null
+++ b/include/linux/kvm_irqfd.h
@@ -0,0 +1,69 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * irqfd: Allows an fd to be used to inject an interrupt to the guest
+ * Credit goes to Avi Kivity for the original idea.
+ */
+
+#ifndef __LINUX_KVM_IRQFD_H
+#define __LINUX_KVM_IRQFD_H
+
+#include 
+#include 
+
+/*
+ * Resampling irqfds are a special variety of irqfds used to emulate
+ * level triggered interrupts.  The interrupt is asserted on eventfd
+ * trigger.  On acknowledgment through the irq ack notifier, the
+ * interrupt is de-asserted and userspace is notified through the
+ * resamplefd.  All resamplers on the same gsi are de-asserted
+ * together, so we don't need to track the state of each individual
+ * user.  We can also therefore share the same irq source ID.
+ */
+struct kvm_kernel_irqfd_resampler {
+   struct kvm *kvm;
+   /*
+* List of resampling struct _irqfd objects sharing this gsi.
+* RCU list modified under kvm->irqfds.resampler_lock
+*/
+   struct list_head list;
+   struct kvm_irq_ack_notifier notifier;
+   /*
+* Entry in list of kvm->irqfd.resampler_list.  Use for sharing
+* resamplers among irqfds on the same gsi.
+* Accessed and modified under kvm->irqfds.resampler_lock
+*/
+   struct list_head link;
+};
+
+struct kvm_kernel_irqfd {
+   /* Used for MSI fast-path */
+   struct kvm *kvm;
+   wait_queue_t wait;
+   /* Update side is protected by irqfds.lock */
+   struct kvm_kernel_irq_routing_entry irq_entry;
+   seqcount_t irq_entry_sc;
+   /* Used for level IRQ fast-path */
+   int gsi;
+   struct work_struct inject;
+   /* The resampler used by this irqfd (resampler-only) */
+   struct kvm_kernel_irqfd_resampler *resampler;
+   /* Eventfd notified on resample (resampler-only) */
+   struct eventfd_ctx *resamplefd;
+   /* Entry in list of irqfds for a resampler (resampler-only) */
+   struct list_head resampler_link;
+   /* Used for setup/shutdown */
+   struct eventfd_ctx *eventfd;
+   struct list_head list;
+   poll_table pt;
+   struct work_struct shutdown;
+};
+
+#endif /* __LINUX_KVM_IRQFD_H */
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 9ff4193..647ffb8 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -39,68 +40,14 @@
 #include 
 
 #ifdef CONFIG_HAVE_KVM_IRQFD
-/*
- * 
- * irqfd: Allows an fd to be used to inject an interrupt to the guest
- *
- * Credit goes to Avi Kivity for the original idea.
- * 
- */
-
-/*
- * Resampling irqfds are a special variety of irqfds used to emulate
- * level triggered interrupts.  The interrupt is asserted on eventfd
- * trigger.  On acknowledgement through the irq ack notifier, the
- * interrupt is de-asserted and userspace is notified through the
- * resamplefd.  All resamplers on the same gsi are de-asserted
- * together, so we don't need to track the state of each individual
- * user.  We can also therefore share the same irq source ID.
- */
-struct _irqfd_resampler {
-   struct kvm *kvm;
-   /*
-* List of resampling struct _irqfd objects sharing this gsi.
-* RCU list modified under kvm->irqfds.resampler_lock
-*/
-   struct list_head list;
-   struct kvm_irq_ack_notifier notifier;
-   /*
-* Entry in list of kvm->irqfd.resampler_list.  Use for sharing
-* resamplers among irqfds on the same gsi.
-* Accessed and modified under kvm->irqfds.resampler_lock
-*/
-   struct list_head link;
-};
-
-struct _irqfd {
-   /* Used for MSI fast-path */
-   s

[PATCH v9 06/18] KVM: eventfd: add irq bypass consumer management

2015-09-18 Thread Feng Wu
From: Eric Auger 

This patch adds the registration/unregistration of an
irq_bypass_consumer on irqfd assignment/deassignment.

Signed-off-by: Eric Auger 
Signed-off-by: Feng Wu 
---
v4 -> v5:
- due to removal of static inline stubs, add
  #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
  around consumer registration/unregistration
- add pr_info when registration fails

v2 -> v3 (Feng Wu):
- Use kvm_arch_irq_bypass_start
- Remove kvm_arch_irq_bypass_update
- Add member 'struct irq_bypass_producer *producer' in
  'struct kvm_kernel_irqfd', it is needed by posted interrupt.
- Remove 'irq_bypass_unregister_consumer' in kvm_irqfd_deassign()

v1 -> v2:
- populate of kvm and gsi removed
- unregister the consumer on irqfd_shutdown

 include/linux/kvm_irqfd.h |  2 ++
 virt/kvm/eventfd.c| 15 +++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
index f926b39..0c1de05 100644
--- a/include/linux/kvm_irqfd.h
+++ b/include/linux/kvm_irqfd.h
@@ -64,6 +64,8 @@ struct kvm_kernel_irqfd {
struct list_head list;
poll_table pt;
struct work_struct shutdown;
+   struct irq_bypass_consumer consumer;
+   struct irq_bypass_producer *producer;
 };
 
 #endif /* __LINUX_KVM_IRQFD_H */
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 647ffb8..d7a230f 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -140,6 +141,9 @@ irqfd_shutdown(struct work_struct *work)
/*
 * It is now safe to release the object's resources
 */
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+   irq_bypass_unregister_consumer(&irqfd->consumer);
+#endif
eventfd_ctx_put(irqfd->eventfd);
kfree(irqfd);
 }
@@ -379,6 +383,17 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 * we might race against the POLLHUP
 */
fdput(f);
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+   irqfd->consumer.token = (void *)irqfd->eventfd;
+   irqfd->consumer.add_producer = kvm_arch_irq_bypass_add_producer;
+   irqfd->consumer.del_producer = kvm_arch_irq_bypass_del_producer;
+   irqfd->consumer.stop = kvm_arch_irq_bypass_stop;
+   irqfd->consumer.start = kvm_arch_irq_bypass_start;
+   ret = irq_bypass_register_consumer(&irqfd->consumer);
+   if (ret)
+   pr_info("irq bypass consumer (token %p) registration fails: %d\n",
+   irqfd->consumer.token, ret);
+#endif
 
return 0;
 
-- 
2.1.0



[PATCH v9 11/18] KVM: make kvm_set_msi_irq() public

2015-09-18 Thread Feng Wu
Make kvm_set_msi_irq() public so that it can be used outside irq_comm.c.

Signed-off-by: Feng Wu 
Reviewed-by: Paolo Bonzini 
---
v8:
- Export kvm_set_msi_irq() so we can use it in vmx code

 arch/x86/include/asm/kvm_host.h | 4 
 arch/x86/kvm/irq_comm.c | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index af11bca..daa6126 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -175,6 +175,8 @@ enum {
  */
 #define KVM_APIC_PV_EOI_PENDING1
 
+struct kvm_kernel_irq_routing_entry;
+
 /*
  * We don't want allocation failures within the mmu code, so we preallocate
  * enough memory for a single page fault in a cache.
@@ -1207,4 +1209,6 @@ int x86_set_memory_region(struct kvm *kvm,
 bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 struct kvm_vcpu **dest_vcpu);
 
+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+struct kvm_lapic_irq *irq);
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index f86a0da..4f6fa67 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -91,8 +91,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
return r;
 }
 
-static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
-  struct kvm_lapic_irq *irq)
+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+struct kvm_lapic_irq *irq)
 {
trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);
 
@@ -108,6 +108,7 @@ static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
irq->level = 1;
irq->shorthand = 0;
 }
+EXPORT_SYMBOL_GPL(kvm_set_msi_irq);
 
 int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
struct kvm *kvm, int irq_source_id, int level, bool line_status)
-- 
2.1.0



[PATCH v9 08/18] KVM: Add some helper functions for Posted-Interrupts

2015-09-18 Thread Feng Wu
This patch adds some helper functions to manipulate the
Posted-Interrupts Descriptor.

Signed-off-by: Feng Wu 
Reviewed-by: Paolo Bonzini 
---
 arch/x86/kvm/vmx.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 271dd70..316f9bf 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -443,6 +443,8 @@ struct nested_vmx {
 };
 
 #define POSTED_INTR_ON  0
+#define POSTED_INTR_SN  1
+
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
u32 pir[8]; /* Posted interrupt requested */
@@ -483,6 +485,30 @@ static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }
 
+static void pi_clear_sn(struct pi_desc *pi_desc)
+{
+   return clear_bit(POSTED_INTR_SN,
+   (unsigned long *)&pi_desc->control);
+}
+
+static void pi_set_sn(struct pi_desc *pi_desc)
+{
+   return set_bit(POSTED_INTR_SN,
+   (unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_on(struct pi_desc *pi_desc)
+{
+   return test_bit(POSTED_INTR_ON,
+   (unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_sn(struct pi_desc *pi_desc)
+{
+   return test_bit(POSTED_INTR_SN,
+   (unsigned long *)&pi_desc->control);
+}
+
 struct vcpu_vmx {
struct kvm_vcpu   vcpu;
unsigned long host_rsp;
-- 
2.1.0



[PATCH v9 10/18] KVM: Make struct kvm_irq_routing_table accessible

2015-09-18 Thread Feng Wu
Move struct kvm_irq_routing_table from irqchip.c to kvm_host.h,
so we can use it outside of irqchip.c.

Signed-off-by: Feng Wu 
Reviewed-by: Paolo Bonzini 
---
 include/linux/kvm_host.h | 14 ++
 virt/kvm/irqchip.c   | 10 --
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5ac8d21..5f183fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -328,6 +328,20 @@ struct kvm_kernel_irq_routing_entry {
struct hlist_node link;
 };
 
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+
+struct kvm_irq_routing_table {
+   int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
+   u32 nr_rt_entries;
+   /*
+* Array indexed by gsi. Each entry contains list of irq chips
+* the gsi is connected to.
+*/
+   struct hlist_head map[0];
+};
+
+#endif
+
 #ifndef KVM_PRIVATE_MEM_SLOTS
 #define KVM_PRIVATE_MEM_SLOTS 0
 #endif
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 21c1424..2cf45d3 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -31,16 +31,6 @@
 #include 
 #include "irq.h"
 
-struct kvm_irq_routing_table {
-   int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
-   u32 nr_rt_entries;
-   /*
-* Array indexed by gsi. Each entry contains list of irq chips
-* the gsi is connected to.
-*/
-   struct hlist_head map[0];
-};
-
 int kvm_irq_map_gsi(struct kvm *kvm,
struct kvm_kernel_irq_routing_entry *entries, int gsi)
 {
-- 
2.1.0



[PATCH v9 05/18] KVM: introduce kvm_arch functions for IRQ bypass

2015-09-18 Thread Feng Wu
From: Eric Auger 

This patch introduces
- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer
- kvm_arch_irq_bypass_stop
- kvm_arch_irq_bypass_start

They make it possible to specialize the KVM IRQ bypass consumer
when CONFIG_HAVE_KVM_IRQ_BYPASS is set.

Signed-off-by: Eric Auger 
Signed-off-by: Feng Wu 
---
v4 -> v5:
- remove static inline stub functions

v2 -> v3 (Feng Wu):
- use 'kvm_arch_irq_bypass_start' instead of 'kvm_arch_irq_bypass_resume'
- Remove 'kvm_arch_irq_bypass_update', which does not need to be
  an irqbypass callback per Alex's comments.
- Make kvm_arch_irq_bypass_add_producer return 'int'

v1 -> v2:
- use CONFIG_KVM_HAVE_IRQ_BYPASS instead of CONFIG_IRQ_BYPASS_MANAGER
- rename all functions according to Paolo's proposal
- add kvm_arch_irq_bypass_update according to Feng's need

 include/linux/kvm_host.h | 10 ++
 virt/kvm/Kconfig |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 05e99b8..5ac8d21 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -1151,5 +1152,14 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
 {
 }
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
+
+#ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *);
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *);
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *);
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *);
+#endif /* CONFIG_HAVE_KVM_IRQ_BYPASS */
 #endif
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index e2c876d..9f8014d 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -47,3 +47,6 @@ config KVM_GENERIC_DIRTYLOG_READ_PROTECT
 config KVM_COMPAT
def_bool y
depends on COMPAT && !S390
+
+config HAVE_KVM_IRQ_BYPASS
+   bool
-- 
2.1.0



[PATCH v9 07/18] KVM: Extend struct pi_desc for VT-d Posted-Interrupts

2015-09-18 Thread Feng Wu
Extend struct pi_desc for VT-d Posted-Interrupts.

Signed-off-by: Feng Wu 
---
 arch/x86/kvm/vmx.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 83b7b5c..271dd70 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -446,8 +446,24 @@ struct nested_vmx {
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
u32 pir[8]; /* Posted interrupt requested */
-   u32 control;/* bit 0 of control is outstanding notification bit */
-   u32 rsvd[7];
+   union {
+   struct {
+   /* bit 256 - Outstanding Notification */
+   u16 on  : 1,
+   /* bit 257 - Suppress Notification */
+   sn  : 1,
+   /* bit 271:258 - Reserved */
+   rsvd_1  : 14;
+   /* bit 279:272 - Notification Vector */
+   u8  nv;
+   /* bit 287:280 - Reserved */
+   u8  rsvd_2;
+   /* bit 319:288 - Notification Destination */
+   u32 ndst;
+   };
+   u64 control;
+   };
+   u32 rsvd[6];
 } __aligned(64);
 
 static bool pi_test_and_set_on(struct pi_desc *pi_desc)
-- 
2.1.0
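
A note on the descriptor layout in 07/18: the bit numbers in the comments can be checked against byte offsets at compile time. The following is a minimal userspace sketch that assumes a little-endian target and GCC/Clang bit-field allocation (lowest bit first), which is what x86 builds use. The names pi_desc_sketch and pack_control are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the extended descriptor; assumes GCC/Clang on x86. */
struct pi_desc_sketch {
	uint32_t pir[8];			/* bits 255:0 */
	union {
		struct {
			uint16_t on : 1,	/* bit 256 */
				 sn : 1,	/* bit 257 */
				 rsvd_1 : 14;	/* bits 271:258 */
			uint8_t nv;		/* bits 279:272 */
			uint8_t rsvd_2;		/* bits 287:280 */
			uint32_t ndst;		/* bits 319:288 */
		};
		uint64_t control;
	};
	uint32_t rsvd[6];
} __attribute__((aligned(64)));

/* Bit N of the descriptor lives in byte N / 8. */
static_assert(sizeof(struct pi_desc_sketch) == 64, "one cache line");
static_assert(offsetof(struct pi_desc_sketch, control) == 256 / 8, "control");
static_assert(offsetof(struct pi_desc_sketch, nv) == 272 / 8, "nv");
static_assert(offsetof(struct pi_desc_sketch, ndst) == 288 / 8, "ndst");

/* Build a control word through the named fields and read it back whole. */
uint64_t pack_control(unsigned int on, unsigned int sn,
		      uint8_t nv, uint32_t ndst)
{
	struct pi_desc_sketch d = {0};

	d.on = on & 1;
	d.sn = sn & 1;
	d.nv = nv;
	d.ndst = ndst;
	return d.control;
}
```

Having both the named fields and the 64-bit control view is what lets later patches read one field (for example nv) directly while still updating the whole word with a single cmpxchg.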



[PATCH v9 00/18] Add VT-d Posted-Interrupts support - including prerequisite series

2015-09-18 Thread Feng Wu
VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

v9:
- Include the whole series:
[01/18]: irq bypass manager
[02/18] - [06/18]: Common non-architecture part for VT-d PI and ARM side forwarded irq
[07/18] - [18/18]: VT-d PI part

v8:
refer to the changelog in each patch

v7:
* Define two weak irq bypass callbacks:
  - kvm_arch_irq_bypass_start()
  - kvm_arch_irq_bypass_stop()
* Remove the x86 dummy implementation of the above two functions.
* Print some useful information instead of WARN_ON() when the
  irq bypass consumer unregistration fails.
* Fix an issue when calling pi_pre_block and pi_post_block.

v6:
* Rebase on 4.2.0-rc6
* Rebase on https://lkml.org/lkml/2015/8/6/526 and http://www.gossamer-threads.com/lists/linux/kernel/2235623
* Make the add_consumer and del_consumer callbacks static
* Remove pointless INIT_LIST_HEAD to 'vdev->ctx[vector].producer.node'
* Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
* Remove optional dummy callbacks for irq producer

v4:
* For lowest-priority interrupts, only support single-CPU destination
interrupts at the current stage; more general lowest-priority support
will be added later.
* According to Marcelo's suggestion, when the vCPU is blocked, we handle
the posted-interrupts in the HLT emulation path.
* Some small changes (coding style, typo, add some code comments)

v3:
* Adjust the Posted-interrupts Descriptor updating logic when vCPU is
  preempted or blocked.
* KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
* __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
* Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
  can be used to change back to remapping mode.
* Fix typo

v2:
* Use VFIO framework to enable this feature; the VFIO part of this series is
  based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
* Rebase this patchset on git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
  then revise some irq logic based on the new hierarchy irqdomain patches
  provided by Jiang Liu



Alex Williamson (1):
  virt: IRQ bypass manager

Eric Auger (4):
  KVM: arm/arm64: select IRQ_BYPASS_MANAGER
  KVM: create kvm_irqfd.h
  KVM: introduce kvm_arch functions for IRQ bypass
  KVM: eventfd: add irq bypass consumer management

Feng Wu (13):
  KVM: x86: select IRQ_BYPASS_MANAGER
  KVM: Extend struct pi_desc for VT-d Posted-Interrupts
  KVM: Add some helper functions for Posted-Interrupts
  KVM: Define a new interface kvm_intr_is_single_vcpu()
  KVM: Make struct kvm_irq_routing_table accessible
  KVM: make kvm_set_msi_irq() public
  vfio: Register/unregister irq_bypass_producer
  KVM: x86: Update IRTE for posted-interrupts
  KVM: Implement IRQ bypass consumer callbacks for x86
  KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
  KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
  KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
  iommu/vt-d: Add a command line parameter for VT-d posted-interrupts

 Documentation/kernel-parameters.txt   |   1 +
 Documentation/virtual/kvm/locking.txt |  12 ++
 MAINTAINERS   |   7 +
 arch/arm/kvm/Kconfig  |   2 +
 arch/arm/kvm/Makefile |   1 +
 arch/arm64/kvm/Kconfig|   2 +
 arch/arm64/kvm/Makefile   |   1 +
 arch/x86/include/asm/kvm_host.h   |  24 +++
 arch/x86/kvm/Kconfig  |   3 +
 arch/x86/kvm/Makefile |   3 +
 arch/x86/kvm/irq_comm.c   |  32 ++-
 arch/x86/kvm/lapic.c  |  59 ++
 arch/x86/kvm/lapic.h  |   2 +
 arch/x86/kvm/trace.h  |  33 
 arch/x86/kvm/vmx.c| 361 +-
 arch/x86/kvm/x86.c| 108 +-
 drivers/iommu/irq_remapping.c |  12 +-
 drivers/vfio/pci/Kconfig  |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c |   9 +
 drivers/vfio/pci/vfio_pci_private.h   |   2 +
 include/linux/irqbypass.h |  90 +
 include/linux/kvm_host.h  |  29 +++
 include/linux/kvm_irqfd.h |  71 +++
 virt/kvm/Kconfig  |   3 +
 virt/kvm/eventfd.c| 142 +++--
 virt/kvm/irqchip.c|  10 -
 virt/kvm/kvm_main.c   |   3 +
 virt/lib/Kconfig  |   2 +
 virt/lib/Makefile |   1 +
 virt/lib/irqbypass.c  | 257 
 30 files changed, 1182 insertions(+), 101 deletions(-)
 create mode 100644 i

[PATCH v9 01/18] virt: IRQ bypass manager

2015-09-18 Thread Feng Wu
From: Alex Williamson 

When a physical I/O device is assigned to a virtual machine through
facilities like VFIO and KVM, the interrupt for the device generally
bounces through the host system before being injected into the VM.
However, hardware technologies exist that often allow the host to be
bypassed for some of these scenarios.  Intel Posted Interrupts allow
the specified physical edge interrupts to be directly injected into a
guest when delivered to a physical processor while the vCPU is
running.  ARM IRQ Forwarding allows forwarded physical interrupts to
be directly deactivated by the guest.

The IRQ bypass manager here is meant to provide the shim to connect
interrupt producers, generally the host physical device driver, with
interrupt consumers, generally the hypervisor, in order to configure
these bypass mechanisms.  To do this, we base the connection on a
shared, opaque token.  For KVM-VFIO this is expected to be an
eventfd_ctx since this is the connection we already use to connect an
eventfd to an irqfd on the in-kernel path.  When a producer and
consumer with matching tokens are found, callbacks via both registered
participants allow the bypass facilities to be automatically enabled.

Signed-off-by: Alex Williamson 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
Tested-by: Feng Wu 
---
v4: All producer callbacks are optional; as with Intel PI, it's
possible for the producer to be blissfully unaware of the bypass.
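
[Editor's sketch] The token matching described above can be modelled in a
few lines of user-space C.  This is only a toy under stated assumptions,
not the code in virt/lib/irqbypass.c: every toy_* name, the fixed-size
tables, the -1 error return and the trace string are hypothetical and
exist purely for illustration.  It shows two things taken from the
description: a producer and a consumer are connected only when their
tokens match, and the connection runs stop, then add, then start.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8     /* hypothetical table size */

/* One participant (producer or consumer) in the toy model. */
struct toy_party {
	void *token;      /* opaque match token, e.g. an eventfd_ctx pointer */
	int connected;    /* set once a matching peer has been found */
};

static struct toy_party *toy_producers[TOY_MAX];
static struct toy_party *toy_consumers[TOY_MAX];
static char toy_trace[64];    /* records callback order for illustration */

static void toy_note(const char *s)
{
	strncat(toy_trace, s, sizeof(toy_trace) - strlen(toy_trace) - 1);
}

/* Connect a matched pair: quiesce, wire up, then restart. */
static void toy_connect(struct toy_party *a, struct toy_party *b)
{
	toy_note("stop;");    /* stands in for the optional stop callbacks */
	toy_note("add;");     /* stands in for the add_* callbacks */
	toy_note("start;");   /* stands in for the optional start callbacks */
	a->connected = b->connected = 1;
}

/* Returns 0 on success, -1 if the token is a duplicate or the table is full. */
static int toy_register(struct toy_party **mine, struct toy_party **theirs,
			struct toy_party *self)
{
	int i, slot = -1;

	for (i = 0; i < TOY_MAX; i++) {
		if (mine[i] && mine[i]->token == self->token)
			return -1;    /* tokens must be unique per side */
		if (!mine[i] && slot < 0)
			slot = i;
	}
	if (slot < 0)
		return -1;
	mine[slot] = self;

	/* Tokens are unique per side, so at most one peer can match. */
	for (i = 0; i < TOY_MAX; i++)
		if (theirs[i] && theirs[i]->token == self->token)
			toy_connect(self, theirs[i]);
	return 0;
}

static int toy_register_producer(struct toy_party *p)
{
	return toy_register(toy_producers, toy_consumers, p);
}

static int toy_register_consumer(struct toy_party *c)
{
	return toy_register(toy_consumers, toy_producers, c);
}
```

Registering a producer and later a consumer carrying the same token
pointer marks both as connected and leaves "stop;add;start;" in
toy_trace; a consumer with a different token stays unconnected, and a
second producer reusing an already registered token is rejected.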

 MAINTAINERS   |   7 ++
 include/linux/irqbypass.h |  90 
 virt/lib/Kconfig  |   2 +
 virt/lib/Makefile |   1 +
 virt/lib/irqbypass.c  | 257 ++
 5 files changed, 357 insertions(+)
 create mode 100644 include/linux/irqbypass.h
 create mode 100644 virt/lib/Kconfig
 create mode 100644 virt/lib/Makefile
 create mode 100644 virt/lib/irqbypass.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a9ae6c1..10c8b2f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10963,6 +10963,13 @@ L: net...@vger.kernel.org
 S: Maintained
 F: drivers/net/ethernet/via/via-velocity.*
 
+VIRT LIB
+M: Alex Williamson 
+M: Paolo Bonzini 
+L: k...@vger.kernel.org
+S: Supported
+F: virt/lib/
+
 VIVID VIRTUAL VIDEO DRIVER
 M: Hans Verkuil 
 L: linux-me...@vger.kernel.org
diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
new file mode 100644
index 0000000..1551b5b
--- /dev/null
+++ b/include/linux/irqbypass.h
@@ -0,0 +1,90 @@
+/*
+ * IRQ offload/bypass manager
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ * Copyright (c) 2015 Linaro Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef IRQBYPASS_H
+#define IRQBYPASS_H
+
+#include <linux/list.h>
+
+struct irq_bypass_consumer;
+
+/*
+ * Theory of operation
+ *
+ * The IRQ bypass manager is a simple set of lists and callbacks that allows
+ * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
+ * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
+ * via a shared token (ex. eventfd_ctx).  Producers and consumers register
+ * independently.  When a token match is found, the optional @stop callback
+ * will be called for each participant.  The pair will then be connected via
+ * the @add_* callbacks, and finally the optional @start callback will allow
+ * any final coordination.  When either participant is unregistered, the
+ * process is repeated using the @del_* callbacks in place of the @add_*
+ * callbacks.  Match tokens must be unique per producer/consumer, 1:N pairings
+ * are not supported.
+ */
+
+/**
+ * struct irq_bypass_producer - IRQ bypass producer definition
+ * @node: IRQ bypass manager private list management
+ * @token: opaque token to match between producer and consumer
+ * @irq: Linux IRQ number for the producer device
+ * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional)
+ * @del_consumer: Disconnect the IRQ producer from an IRQ consumer (optional)
+ * @stop: Perform any quiesce operations necessary prior to add/del (optional)
+ * @start: Perform any startup operations necessary after add/del (optional)
+ *
+ * The IRQ bypass producer structure represents an interrupt source for
+ * participation in possible host bypass, for instance an interrupt vector
+ * for a physical device assigned to a VM.
+ */
+struct irq_bypass_producer {
+   struct list_head node;
+   void *token;
+   int irq;
+   int (*add_consumer)(struct irq_bypass_producer *,
+   struct irq_bypass_consumer *);
+   void (*del_consumer)(struct irq_bypass_producer *,
+struct irq_bypass_consumer *);
+   void (*stop)(struct irq_bypass_producer *);
+   void (*start)(struct irq_bypass_producer *);
+};
+
+/**
+ * struct irq_bypass_consumer - IRQ bypass consumer definition
+ * @

[PATCH v9 03/18] KVM: arm/arm64: select IRQ_BYPASS_MANAGER

2015-09-18 Thread Feng Wu
From: Eric Auger 

Select IRQ_BYPASS_MANAGER when CONFIG_KVM is set.
Also add compilation of virt/lib.

Signed-off-by: Eric Auger 
Signed-off-by: Feng Wu 
---
v3 -> v4:
- add compilation of virt/lib in arm/arm64 KVM

v2 -> v3:
- [Feng Wu] Correct a typo in 'arch/arm64/kvm/Kconfig'

v1 -> v2:
- also set IRQ_BYPASS_MANAGER for arm64

 arch/arm/kvm/Kconfig| 2 ++
 arch/arm/kvm/Makefile   | 1 +
 arch/arm64/kvm/Kconfig  | 2 ++
 arch/arm64/kvm/Makefile | 1 +
 4 files changed, 6 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index bfb915d..3c565b9 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
bool "Virtualization"
@@ -31,6 +32,7 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
+   select IRQ_BYPASS_MANAGER
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
  Support hosting virtualized guest machines.
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index c5eef02c..a6a41dd 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -24,3 +24,4 @@ obj-y += $(KVM)/arm/vgic.o
 obj-y += $(KVM)/arm/vgic-v2.o
 obj-y += $(KVM)/arm/vgic-v2-emul.o
 obj-y += $(KVM)/arm/arch_timer.o
+obj-y += ../../../virt/lib/
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index bfffe8f..2509539 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
bool "Virtualization"
@@ -31,6 +32,7 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
+   select IRQ_BYPASS_MANAGER
---help---
  Support hosting virtualized guest machines.
 
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index f90f4aa..55eec69 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -27,3 +27,4 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3-emul.o
 kvm-$(CONFIG_KVM_ARM_HOST) += vgic-v3-switch.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
+kvm-$(CONFIG_KVM_ARM_HOST) += ../../../virt/lib/
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v9 02/18] KVM: x86: select IRQ_BYPASS_MANAGER

2015-09-18 Thread Feng Wu
Select IRQ_BYPASS_MANAGER for x86 when CONFIG_KVM is set.

Signed-off-by: Feng Wu 
---
 arch/x86/kvm/Kconfig  | 2 ++
 arch/x86/kvm/Makefile | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d8a1d56..c951d44 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
bool "Virtualization"
@@ -28,6 +29,7 @@ config KVM
select ANON_INODES
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQFD
+   select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_EVENTFD
select KVM_APIC_ARCHITECTURE
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 67d215c..05cc2d7 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -6,6 +6,9 @@ CFLAGS_svm.o := -I.
 CFLAGS_vmx.o := -I.
 
 KVM := ../../../virt/kvm
+LIB := ../../../virt/lib
+
+obj-$(CONFIG_IRQ_BYPASS_MANAGER)   += $(LIB)/
 
 kvm-y  += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
$(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
-- 
2.1.0



[PATCH] iommu/arm-smmu: Ensure IAS is set correctly for AArch32-capable SMMUs

2015-09-18 Thread Will Deacon
AArch32-capable SMMU implementations have a minimum IAS of 40 bits, so
ensure that is reflected in the stage-2 page table configuration.

Signed-off-by: Will Deacon 
---
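
[Editor's sketch] The decision this patch makes can be read as a small
pure function.  The version below is a stand-alone illustration, not the
driver code: toy_compute_ias(), its parameters, and the convention of
returning 0 for an unsupported table format are assumptions, and it folds
the two hunks of the patch into one place.  The IDR0_TTF_* values are
copied from the diff.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdint.h>

#define IDR0_TTF_SHIFT       2
#define IDR0_TTF_MASK        0x3
#define IDR0_TTF_AARCH64     (2 << IDR0_TTF_SHIFT)
#define IDR0_TTF_AARCH32_64  (3 << IDR0_TTF_SHIFT)

/*
 * Hypothetical helper: given an IDR0 register value and the output
 * address size in bits, return the input address size to use, or 0 if
 * the AArch64 table format is not supported.
 */
static unsigned long toy_compute_ias(uint32_t idr0, unsigned long oas)
{
	unsigned long ias = 0;

	switch (idr0 & (IDR0_TTF_MASK << IDR0_TTF_SHIFT)) {
	case IDR0_TTF_AARCH32_64:
		ias = 40;     /* AArch32-capable: minimum IAS of 40 bits */
		/* fall through */
	case IDR0_TTF_AARCH64:
		break;
	default:
		return 0;     /* the patch returns -ENXIO at this point */
	}

	/* Mirrors "smmu->ias = max(smmu->ias, smmu->oas)" from the patch. */
	return ias > oas ? ias : oas;
}
```

With these assumptions, an AArch32-capable implementation reporting a
32-bit OAS ends up with a 40-bit IAS, for example
assert(toy_compute_ias(IDR0_TTF_AARCH32_64, 32) == 40), while an
AArch64-only implementation simply inherits its OAS.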
 drivers/iommu/arm-smmu-v3.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index dafaf59dc3b8..a24f359fa0d0 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -56,6 +56,7 @@
 #define IDR0_TTF_SHIFT 2
 #define IDR0_TTF_MASK  0x3
 #define IDR0_TTF_AARCH64   (2 << IDR0_TTF_SHIFT)
+#define IDR0_TTF_AARCH32_64(3 << IDR0_TTF_SHIFT)
 #define IDR0_S1P   (1 << 1)
 #define IDR0_S2P   (1 << 0)
 
@@ -2460,7 +2461,13 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
}
 
/* We only support the AArch64 table format at present */
-   if ((reg & IDR0_TTF_MASK << IDR0_TTF_SHIFT) < IDR0_TTF_AARCH64) {
+   switch (reg & IDR0_TTF_MASK << IDR0_TTF_SHIFT) {
+   case IDR0_TTF_AARCH32_64:
+   smmu->ias = 40;
+   /* Fallthrough */
+   case IDR0_TTF_AARCH64:
+   break;
+   default:
dev_err(smmu->dev, "AArch64 table format not supported!\n");
return -ENXIO;
}
@@ -2541,8 +2548,7 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
dev_warn(smmu->dev,
 "failed to set DMA mask for table walker\n");
 
-   if (!smmu->ias)
-   smmu->ias = smmu->oas;
+   smmu->ias = max(smmu->ias, smmu->oas);
 
dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n",
 smmu->ias, smmu->oas, smmu->features);
-- 
2.1.4



Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()

2015-09-18 Thread Russell King - ARM Linux
On Fri, Sep 18, 2015 at 12:04:26PM +0100, Robin Murphy wrote:
> Specifically, the problem case for that is when phys_addr_t is 64-bit but
> dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid
> a truncation warning when we make the DMA API calls, but we actually need
> the opposite in the comparison here - comparing the different types directly
> allows integer promotion to kick in appropriately so we don't lose the top
> half of the larger address. Otherwise, you'd never spot the difference
> between, say, your original page at 0x88c000 and a bounce-buffered copy
> that happened to end up mapped to 0xc000.

Hmm.  Thinking about this, I think we ought to add to arch/arm/mm/Kconfig:

 config ARCH_PHYS_ADDR_T_64BIT
def_bool ARM_LPAE
 
 config ARCH_DMA_ADDR_T_64BIT
bool
+   select ARCH_PHYS_ADDR_T_64BIT

I seem to remember that you're quite right: dma_addr_t <= phys_addr_t is
fine, but dma_addr_t must never be bigger than phys_addr_t.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()

2015-09-18 Thread Robin Murphy

On 18/09/15 09:55, Yong Wu wrote:
> On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
> [...]
> > the appropriate course of action. Further care (and ugliness) is also
> > necessary in the comparison to avoid truncation if phys_addr_t and
> > dma_addr_t differ in size.
> [...]
> > /*
> >  * We depend on the IOMMU being able to work with any physical
> > -* address directly, so if the DMA layer suggests it can't by
> > -* giving us back some translation, that bodes very badly...
> > +* address directly, so if the DMA layer suggests otherwise by
> > +* translating or truncating them, that bodes very badly...
> >  */
> > -   if (dma != __arm_lpae_dma_addr(dev, pages))
> > +   if (dma != virt_to_phys(pages))
>
> Could I ask why not use __arm_lpae_dma_addr(pages) here?
> dma is dma_addr_t.


Specifically, the problem case for that is when phys_addr_t is 64-bit 
but dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary 
to avoid a truncation warning when we make the DMA API calls, but we 
actually need the opposite in the comparison here - comparing the 
different types directly allows integer promotion to kick in 
appropriately so we don't lose the top half of the larger address. 
Otherwise, you'd never spot the difference between, say, your original 
page at 0x88c000 and a bounce-buffered copy that happened to end up 
mapped to 0xc000.


Robin.
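
[Editor's sketch] The difference between casting and letting integer
promotion do the work can be shown in stand-alone C.  The typedefs below
are assumptions that mimic the problem configuration (64-bit physical
addresses, 32-bit DMA addresses), the function names are hypothetical,
and the addresses suggested in the usage note are made-up values chosen
for illustration, not the ones quoted in the message above.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_phys_addr_t;   /* assumption: 64-bit phys_addr_t */
typedef uint32_t toy_dma_addr_t;    /* assumption: 32-bit dma_addr_t */

/*
 * Narrowing the physical address first throws away its top half, so a
 * bounce-buffered mapping can look identical to the original page.
 */
static bool toy_match_after_cast(toy_dma_addr_t dma, toy_phys_addr_t phys)
{
	return dma == (toy_dma_addr_t)phys;
}

/*
 * Comparing the two types directly promotes the 32-bit value to 64 bits,
 * so the top half of the physical address still takes part in the test.
 */
static bool toy_match_with_promotion(toy_dma_addr_t dma, toy_phys_addr_t phys)
{
	return dma == phys;
}
```

For a page at physical address 0x1c0000000 whose DMA mapping came back
as 0xc0000000, toy_match_after_cast() reports a match while
toy_match_with_promotion() correctly reports a mismatch; for a page
below 4 GiB that was mapped 1:1, both agree.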



Re: [PATCH v2] iommu/io-pgtable-arm: Don't use dma_to_phys()

2015-09-18 Thread Yong Wu
On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
> In checking whether DMA addresses differ from physical addresses, using
> dma_to_phys() is actually the wrong thing to do, since it may hide any
> DMA offset, which is precisely one of the things we are checking for.
> Simply casting between the two address types, whilst ugly, is in fact
> the appropriate course of action. Further care (and ugliness) is also
> necessary in the comparison to avoid truncation if phys_addr_t and
> dma_addr_t differ in size.
> 
> We can also reject any device with a fixed DMA offset up-front at page
> table creation, leaving the allocation-time check for the more subtle
> cases like bounce buffering due to an incorrect DMA mask.
> 
> Furthermore, we can then fix the hackish KConfig dependency so that
> architectures without a dma_to_phys() implementation may still
> COMPILE_TEST (or even use!) the code. The true dependency is on the
> DMA API, so use the appropriate symbol for that.
> 
> Signed-off-by: Robin Murphy 
> ---
[...]
>  
>  static bool selftest_running = false;
>  
> -static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
> +static dma_addr_t __arm_lpae_dma_addr(void *pages)
>  {
> - return phys_to_dma(dev, virt_to_phys(pages));
> + return (dma_addr_t)virt_to_phys(pages);
>  }
>  
>  static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
> @@ -223,10 +223,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>   goto out_free;
>   /*
>* We depend on the IOMMU being able to work with any physical
> -  * address directly, so if the DMA layer suggests it can't by
> -  * giving us back some translation, that bodes very badly...
> +  * address directly, so if the DMA layer suggests otherwise by
> +  * translating or truncating them, that bodes very badly...
>*/
> - if (dma != __arm_lpae_dma_addr(dev, pages))
> + if (dma != virt_to_phys(pages))

Could I ask why not use __arm_lpae_dma_addr(pages) here?
dma is dma_addr_t.

>   goto out_unmap;
>   }
>  
> @@ -243,10 +243,8 @@ out_free:
>  static void __arm_lpae_free_pages(void *pages, size_t size,
> struct io_pgtable_cfg *cfg)
>  {
> - struct device *dev = cfg->iommu_dev;
> -
>   if (!selftest_running)
> - dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
> + dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
>size, DMA_TO_DEVICE);
>   free_pages_exact(pages, size);
>  }
> @@ -254,12 +252,11 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
>  static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>  struct io_pgtable_cfg *cfg)
>  {
> - struct device *dev = cfg->iommu_dev;
> -
>   *ptep = pte;
>  
>   if (!selftest_running)
> - dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
> + dma_sync_single_for_device(cfg->iommu_dev,
> +__arm_lpae_dma_addr(ptep),
>  sizeof(pte), DMA_TO_DEVICE);
>  }
>  
> @@ -629,6 +626,11 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>   if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>   return NULL;
>  
> + if (cfg->iommu_dev->dma_pfn_offset) {
> + dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> + return NULL;
> + }
> +
>   data = kmalloc(sizeof(*data), GFP_KERNEL);
>   if (!data)
>   return NULL;

