Re: Regression when booting 5.13 as dom0 on arm64 (WAS: Re: [linux-linus test] 161829: regressions - FAIL)

2021-05-10 Thread Julien Grall

Hi Christoph,

On 10/05/2021 09:40, Christoph Hellwig wrote:

On Sat, May 08, 2021 at 12:32:37AM +0100, Julien Grall wrote:

The pointer being dereferenced seems to suggest that the swiotlb hasn't been
allocated. From what I can tell, this may be because swiotlb_force is set
to SWIOTLB_NO_FORCE, yet we will still enable the swiotlb when running on
top of Xen.

I am not entirely sure what would be the correct fix. Any opinions?
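
For context, a minimal sketch of the decision being discussed, assuming the
5.13-rc1 arm64 code paths (my reading, not a verified trace):

	/* arch/arm64/mm/init.c, mem_init(), before the fix below: */
	if (swiotlb_force == SWIOTLB_FORCE ||
	    max_pfn > PFN_DOWN(arm64_dma_phys_limit))
		swiotlb_init(1);			/* bounce buffer allocated */
	else
		swiotlb_force = SWIOTLB_NO_FORCE;	/* nothing allocated */

The swiotlb-xen code still expects the bounce buffer to exist when running
as dom0, so the first use ends up dereferencing a pool that was never
allocated.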


Can you try something like the patch below (not even compile tested, but
the intent should be obvious)?


diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 16a2b2b1c54d..7671bc153fb1 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -44,6 +44,8 @@
  #include 
  #include 
  
+#include 

+
  /*
   * We need to be able to catch inadvertent references to memstart_addr
   * that occur (potentially in generic code) before arm64_memblock_init()
@@ -482,7 +484,7 @@ void __init mem_init(void)
if (swiotlb_force == SWIOTLB_FORCE ||
max_pfn > PFN_DOWN(arm64_dma_phys_limit))
swiotlb_init(1);
-   else
+   else if (!IS_ENABLED(CONFIG_XEN) || !xen_swiotlb_detect())
swiotlb_force = SWIOTLB_NO_FORCE;
  
  	set_max_mapnr(max_pfn - PHYS_PFN_OFFSET);


I have applied the patch on top of 5.13-rc1 and can confirm I am able to 
boot dom0. Are you going to submit the patch?


Thank you for your help!

Best regards,

--
Julien Grall


Regression when booting 5.13 as dom0 on arm64 (WAS: Re: [linux-linus test] 161829: regressions - FAIL)

2021-05-07 Thread Julien Grall
amd64-xl-qcow2   fail
  test-armhf-armhf-libvirt-raw pass
  test-amd64-i386-xl-raw   fail
  test-amd64-amd64-xl-rtds fail
  test-armhf-armhf-xl-rtds pass
  test-arm64-arm64-xl-seattle  fail
  test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow pass
  test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  fail
  test-amd64-amd64-xl-shadow   pass
  test-amd64-i386-xl-shadow    fail
  test-arm64-arm64-xl-thunderx fail
  test-amd64-amd64-libvirt-vhd fail
  test-armhf-armhf-xl-vhd  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1625261 lines long.)



--
Julien Grall


Re: "Convert the AMD iommu driver to the dma-iommu api" is buggy

2019-10-16 Thread Julien Grall

Hi Robin,

On 16/10/2019 17:09, Robin Murphy wrote:

On 16/10/2019 17:03, Joerg Roedel wrote:

On Wed, Oct 16, 2019 at 11:53:33AM -0400, Qian Cai wrote:

On Wed, 2019-10-16 at 17:31 +0200, Joerg Roedel wrote:



The x86 one might just be a mistake.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ad05484d0c80..63c4b894751d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2542,7 +2542,7 @@ static int amd_iommu_map(struct iommu_domain *dom,
unsigned long iova,
 if (iommu_prot & IOMMU_WRITE)
 prot |= IOMMU_PROT_IW;
-   ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);
+   ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);


Yeah, that is a bug, I spotted that too.


@@ -1185,7 +1185,7 @@ static struct iommu_dma_msi_page
*iommu_dma_get_msi_page(struct device *dev,
 if (!iova)
 goto out_free_page;
-   if (iommu_map(domain, iova, msi_addr, size, prot))
+   if (iommu_map_atomic(domain, iova, msi_addr, size, prot))
 goto out_free_iova;


Not so sure this is a bug; this code is only about setting up MSIs on
ARM. It probably doesn't need to be atomic.
Right, since the iommu_dma_prepare_msi() operation was broken out to happen at 
MSI domain setup time, I don't think the comment in there applies any more, so 
we can probably stop disabling irqs around the iommu_dma_get_msi_page() call.


Yes, I agree that it does not need to be atomic anymore.
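
In other words, a sketch of the conclusion (not a tested patch):
iommu_dma_get_msi_page() is now only reached via iommu_dma_prepare_msi(),
which runs at MSI domain setup time in preemptible context, so the existing
sleeping call can stay rather than being switched to the atomic variant:

	/* drivers/iommu/dma-iommu.c, iommu_dma_get_msi_page(), unchanged: */
	if (iommu_map(domain, iova, msi_addr, size, prot))
		goto out_free_iova;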

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 08/11] swiotlb-xen: use the same foreign page check everywhere

2019-08-19 Thread Julien Grall

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:

xen_dma_map_page uses a different and more complicated check for
foreign pages than the other three cache maintainance helpers.
Switch it to the simpler pfn_vali method a well.


NIT: s/pfn_vali/pfn_valid/



Signed-off-by: Christoph Hellwig 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 01/11] xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance

2019-08-19 Thread Julien Grall

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:

+static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
+dma_addr_t dev_addr, unsigned long offset, size_t size,
+enum dma_data_direction dir, unsigned long attrs)
+{
+   unsigned long page_pfn = page_to_xen_pfn(page);
+   unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
+   unsigned long compound_pages =
+   (1<

The Arm version has a comment here. Could we retain it?


+   if (local)
+   dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
+   else
+   __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, 
attrs);
+}
+


Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 04/11] xen/arm: remove xen_dma_ops

2019-08-19 Thread Julien Grall

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:

arm and arm64 can just use xen_swiotlb_dma_ops directly like x86, no
need for a pointer indirection.

Signed-off-by: Christoph Hellwig 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 02/11] xen/arm: use dev_is_dma_coherent

2019-08-19 Thread Julien Grall

Hi Christoph,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:

Use the dma-noncoherent dev_is_dma_coherent helper instead of the home
grown variant.


It took me a bit of time to understand that dev->archdata.dma_coherent 
and dev->dma_coherent will always contain the same value.


Would you mind mentioning it in the commit message?

Other than that:

Reviewed-by: Julien Grall 



Signed-off-by: Christoph Hellwig 


Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

2019-08-17 Thread Julien Grall

Hi Christoph,

On 8/17/19 7:50 AM, Christoph Hellwig wrote:

On Fri, Aug 16, 2019 at 11:40:43PM +0100, Julien Grall wrote:

I am not sure I agree with this rename. The implementation of the helpers
is very Arm-specific, as it assumes Dom0 is 1:1 mapped.

This was necessary due to the lack of IOMMU on Arm platforms back then.
But this is now a pain to get rid of it on newer platform...


So if you look at the final version of the header after the whole
series, what assumes a 1:1 mapping?  It all just is

if (pfn_valid())
local cache sync;
else
call into the arch code;


In the context of Xen on Arm, dev_addr is a host physical address. From
my understanding, pfn_valid() deals with a guest physical frame.


Therefore, by passing PFN_DOWN(dev_addr) as the argument, you assume that
the host and guest address spaces are the same.
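
A minimal illustration of the concern, reusing the pseudo-code above with
hypothetical helper names for the two branches:

	/*
	 * Under Xen on Arm, dev_addr is a machine (host physical) address,
	 * while pfn_valid() reasons about the guest's pseudo-physical frames.
	 * The dispatch below is therefore only correct if dom0 is 1:1 mapped.
	 */
	if (pfn_valid(PFN_DOWN(dev_addr)))
		local_cache_sync();		/* hypothetical: regular cache maintenance */
	else
		call_into_arch_code();		/* hypothetical: foreign page handling */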




are you concerned that the local cache sync might have to be split
up more for a non-1:1 map in that case?  We could just move
the xen_dma_* routines into the arch instead of __xen_dma, but it
really helps to have a common interface header.

Moving the xen_dma_* routines into the arch would be a good option,
although I would still consider a stub version for arches that do not
require specific DMA handling.


Cheers,

--
Julien Grall



Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: provide a single page-coherent.h header

2019-08-16 Thread Julien Grall

Hi,

On 8/16/19 2:00 PM, Christoph Hellwig wrote:

Merge the various page-coherent.h files into a single one that either
provides prototypes or stubs depending on the need for cache
maintenance.

For extra benefits also include  in the file
actually implementing the interfaces provided.

Signed-off-by: Christoph Hellwig 
---
  arch/arm/include/asm/xen/page-coherent.h   |  2 --
  arch/arm/xen/mm.c  |  1 +
  arch/arm64/include/asm/xen/page-coherent.h |  2 --
  arch/x86/include/asm/xen/page-coherent.h   | 22 --
  drivers/xen/swiotlb-xen.c  |  4 +---
  include/Kbuild |  2 +-
  include/xen/{arm => }/page-coherent.h  | 27 +++---


I am not sure I agree with this rename. The implementation of the 
helpers is very Arm-specific, as it assumes Dom0 is 1:1 mapped.


This was necessary due to the lack of IOMMU on Arm platforms back then.
But this is now a pain to get rid of it on newer platform...

Cheers,

--
Julien Grall


[PATCH v3 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-05-01 Thread Julien Grall
A recent change split iommu_dma_map_msi_msg() in two new functions. The
function was still implemented to avoid modifying all the callers at
once.

Now that all the callers have been reworked, iommu_dma_map_msi_msg() can
be removed.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Rework the commit message
---
 drivers/iommu/dma-iommu.c | 20 
 include/linux/dma-iommu.h |  5 -
 2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f847904098f7..13916fefeb27 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -935,23 +935,3 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
 }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(desc, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 0b781a98ee73..476e0c54de2d 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -84,7 +84,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t 
msi_addr);
 void iommu_dma_compose_msi_msg(struct msi_desc *desc,
   struct msi_msg *msg);
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
@@ -124,10 +123,6 @@ static inline void iommu_dma_compose_msi_msg(struct 
msi_desc *desc,
 {
 }
 
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
 {
 }
-- 
2.11.0



[PATCH v3 6/7] irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()

2019-05-01 Thread Julien Grall
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg() requires to be called
from a preemptible context.

A recent patch split iommu_dma_map_msi_msg in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 MSI driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the two MSI
mappings when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-mbi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..d50f6cdf043c 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int 
hwirq,
 static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
 
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, 
unsigned int virq,
 
hwirq = mbi->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_CLRSPI_NSR);
+   if (err)
+   return err;
+
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_SETSPI_NSR);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 #ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, &msg[1]);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), &msg[1]);
 }
 
 /* Platform-MSI specific irqchip */
-- 
2.11.0



[PATCH v3 5/7] irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()

2019-05-01 Thread Julien Grall
ls_scfg_msi_compose_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent patch split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The FreeScale SCFG MSI driver is reworked to avoid executing preemptible
code in non-preemptible context. This can be achieved by preparing the
MSI mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-ls-scfg-msi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-scfg-msi.c 
b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..669d29105772 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, 
struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
unsigned int nr_irqs,
void *args)
 {
+   msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
 
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+   if (err)
+   return err;
+
irq_domain_set_info(domain, virq, pos,
&ls_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
-- 
2.11.0



[PATCH v3 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-05-01 Thread Julien Grall
its_irq_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 ITS driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Fix typo in the commit message
- Check the return of iommu_dma_prepare_msi
- Add Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-its.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..9cddf336c09d 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, 
struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data   = its_get_event_id(d);
 
-   iommu_dma_map_msi_msg(d->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
 }
 
 static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,10 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v3 3/7] irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()

2019-05-01 Thread Julien Grall
gicv2m_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv2m driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v2m.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..4359f0583377 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, 
unsigned int hwirq,
 static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int 
virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
 
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 
hwirq = v2m->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   v2m->res.start + V2M_MSI_SETSPI_NS);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v3 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-05-01 Thread Julien Grall
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
the function is using spin_lock (spinlocks can sleep on RT).

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the part to lookup for the MSI page requires to be called in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split in two functions:
- iommu_dma_prepare_msi(): This function will prepare the mapping
in the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg(): This function will update the MSI
message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Update the comment to use kerneldoc format
- Fix typoes in the comments
- More use of msi_desc_set_iommu_cookie
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Rework the commit message to use imperative mood
- Use the MSI accessor to get/set the iommu cookie
- Don't use ternary on return
- Select CONFIG_IRQ_MSI_IOMMU
- Pass an msi_desc rather than the irq number
---
 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 46 +-
 include/linux/dma-iommu.h | 25 +
 3 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..eb1c8cd243f9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
bool
select IOMMU_API
select IOMMU_IOVA
+   select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
 
 config FSL_PAMU
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f847904098f7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,18 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
 }
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
 {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
 
-   if (!domain || !domain->iova_cookie)
-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
 
cookie = domain->iova_cookie;
 
@@ -911,7 +912,36 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
-   if (WARN_ON(!msi_page)) {
+   msi_desc_set_iommu_cookie(desc, msi_page);
+
+   if (!msi_page)
+   return -ENOMEM;
+   return 0;
+}
+
+void iommu_dma_compose_msi_msg(struct msi_desc *desc,
+  struct msi_msg *msg)
+{
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page;
+
+   msi_page = msi_desc_get_iommu_cookie(desc);
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +952,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   

[PATCH v3 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-05-01 Thread Julien Grall
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domain.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configuration, the field is protected
under a new config CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.

Signed-off-by: Julien Grall 
Reviewed-by: Robin Murphy 
Reviewed-by: Eric Auger 

---
Changes in v3:
- Add Robin's and Eric's reviewed-by

Changes in v2:
- Update the commit message to use imperative mood
- Protect the field with a new config that will be selected by
IOMMU_DMA later on
- Add a set of helpers to access the new field
---
 include/linux/msi.h | 26 ++
 kernel/irq/Kconfig  |  3 +++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..82a308c19222 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IRQ_MSI_IOMMU
+   const void  *iommu_cookie;
+#endif
 
union {
/* PCI MSI/X specific data */
@@ -119,6 +122,29 @@ struct msi_desc {
 #define for_each_msi_entry_safe(desc, tmp, dev)\
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
+#ifdef CONFIG_IRQ_MSI_IOMMU
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return desc->iommu_cookie;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+   desc->iommu_cookie = iommu_cookie;
+}
+#else
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return NULL;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+}
+#endif
+
 #ifdef CONFIG_PCI_MSI
 #define first_pci_msi_entry(pdev)  first_msi_entry(&(pdev)->dev)
 #define for_each_pci_msi_entry(desc, pdev) \
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 5f3e2baefca9..8fee06625c37 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -91,6 +91,9 @@ config GENERIC_MSI_IRQ_DOMAIN
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ
 
+config IRQ_MSI_IOMMU
+   bool
+
 config HANDLE_DOMAIN_IRQ
bool
 
-- 
2.11.0



[PATCH v3 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-05-01 Thread Julien Grall
Hi all,

On RT, the function iommu_dma_map_msi_msg expects to be called from preemptible
context. However, this is not always the case, resulting in a splat when
CONFIG_DEBUG_ATOMIC_SLEEP is enabled:

[   48.875777] BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [] 
_raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<>]   
(null)
[   48.875805] Preemption disabled at:
[   48.875805] [] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
5.0.3-rt1-7-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

Most of the patches have now been acked (missing a couple of ack from Joerg).

I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.
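
To summarise the resulting flow in an MSI irqchip driver, here is a condensed
sketch of the pattern the individual patches follow; the foo_* names and
doorbell_phys are placeholders, not real symbols:

	/* At interrupt allocation time -- preemptible, may sleep: */
	static int foo_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
					unsigned int nr_irqs, void *args)
	{
		msi_alloc_info_t *info = args;
		int err;

		err = iommu_dma_prepare_msi(info->desc, doorbell_phys);
		if (err)
			return err;
		/* ... allocate the interrupts as before ... */
		return 0;
	}

	/* At message composition time -- possibly non-preemptible: */
	static void foo_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
	{
		/* ... fill msg->address_hi/lo and msg->data as before ... */
		iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
	}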

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
  irqchip/gic-v3-its: Don't map the MSI page in
its_irq_compose_msi_msg()
  irqchip/ls-scfg-msi: Don't map the MSI page in
ls_scfg_msi_compose_msg()
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b,
s}i_msg()
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 48 +++
 drivers/irqchip/irq-gic-v2m.c |  8 ++-
 drivers/irqchip/irq-gic-v3-its.c  |  7 +-
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 ++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
 include/linux/dma-iommu.h | 24 ++--
 include/linux/msi.h   | 26 +
 kernel/irq/Kconfig|  3 +++
 9 files changed, 112 insertions(+), 27 deletions(-)

-- 
2.11.0



Re: [PATCH v2 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-05-01 Thread Julien Grall

On 30/04/2019 13:34, Auger Eric wrote:

Hi Julien,


Hi Eric,

Thank you for the review!



On 4/29/19 4:44 PM, Julien Grall wrote:

its_irq_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 ITS driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
maping when allocating the MSI interrupt.

mapping


Signed-off-by: Julien Grall 

---
 Changes in v2:
 - Rework the commit message to use imperative mood
---
  drivers/irqchip/irq-gic-v3-its.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..12ddbcfe1b1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, 
struct msi_msg *msg)
msg->address_hi  = upper_32_bits(addr);
msg->data= its_get_event_id(d);
  
-	iommu_dma_map_msi_msg(d->irq, msg);

+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
  }
  
  static int its_irq_set_irqchip_state(struct irq_data *d,

@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
  {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (err)
return err;
  
+	err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));

Test err as in gicv2m driver?


Hmmm yes. Marc, do you want me to respin the patch?
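
For reference, v3 of this patch (earlier in this archive) adds exactly that
check:

	err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
	if (err)
		return err;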

Cheers,

--
Julien Grall


[PATCH v2 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-04-29 Thread Julien Grall
A recent change split iommu_dma_map_msi_msg() in two new functions. The
function was still implemented to avoid modifying all the callers at
once.

Now that all the callers have been reworked, iommu_dma_map_msi_msg() can
be removed.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message
---
 drivers/iommu/dma-iommu.c | 20 
 include/linux/dma-iommu.h |  5 -
 2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2309f59cefa4..12f4464828a4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -936,23 +936,3 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
 }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(desc, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 3fc48fbd6f63..ddd217c197df 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -82,7 +82,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t 
msi_addr);
 void iommu_dma_compose_msi_msg(struct msi_desc *desc,
   struct msi_msg *msg);
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
@@ -122,10 +121,6 @@ static inline void iommu_dma_compose_msi_msg(struct 
msi_desc *desc,
 {
 }
 
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
 {
 }
-- 
2.11.0



[PATCH v2 5/7] irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()

2019-04-29 Thread Julien Grall
ls_scfg_msi_compose_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent patch split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The FreeScale SCFG MSI driver is reworked to avoid executing preemptible
code in non-preemptible context. This can be achieved by preparing the
MSI mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-ls-scfg-msi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-scfg-msi.c 
b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..669d29105772 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, 
struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
unsigned int nr_irqs,
void *args)
 {
+   msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
 
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+   if (err)
+   return err;
+
irq_domain_set_info(domain, virq, pos,
&ls_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
-- 
2.11.0



[PATCH v2 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()

2019-04-29 Thread Julien Grall
its_irq_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 ITS driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
maping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-its.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..12ddbcfe1b1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, 
struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data   = its_get_event_id(d);
 
-   iommu_dma_map_msi_msg(d->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(d), msg);
 }
 
 static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v2 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

2019-04-29 Thread Julien Grall
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as
the function is using spin_lock (spinlocks can sleep on RT).

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the part to lookup for the MSI page requires to be called in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split in two functions:
- iommu_dma_prepare_msi(): This function will prepare the mapping
in the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg(): This function will update the MSI
message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
- Use the MSI accessor to get/set the iommu cookie
- Don't use ternary on return
- Select CONFIG_IRQ_MSI_IOMMU
- Pass an msi_desc rather than the irq number
---
 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 47 ++-
 include/linux/dma-iommu.h | 23 +++
 3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816..eb1c8cd243f9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ config IOMMU_DMA
bool
select IOMMU_API
select IOMMU_IOVA
+   select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
 
 config FSL_PAMU
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..2309f59cefa4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,18 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
 }
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
 {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
 
-   if (!domain || !domain->iova_cookie)
-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
 
cookie = domain->iova_cookie;
 
@@ -911,7 +912,37 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
-   if (WARN_ON(!msi_page)) {
+   msi_desc_set_iommu_cookie(desc, msi_page);
+
+   if (!msi_page)
+   return -ENOMEM;
+   else
+   return 0;
+}
+
+void iommu_dma_compose_msi_msg(struct msi_desc *desc,
+  struct msi_msg *msg)
+{
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page;
+
+   msi_page = msi_desc_get_iommu_cookie(desc);
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +953,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   iommu_dma_compose_msi_msg(desc, msg);
}
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..3fc48fbd6f63 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,12 

[PATCH v2 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Julien Grall
Hi all,

On RT, the function iommu_dma_map_msi_msg expects to be called from preemptible
context. However, this is not always the case, resulting in a splat when
CONFIG_DEBUG_ATOMIC_SLEEP is enabled:

[   48.875777] BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [] 
_raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<>]   
(null)
[   48.875805] Preemption disabled at:
[   48.875805] [] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
5.0.3-rt1-7-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

This series is a first attempt to rework how MSI are mapped and composed
when an IOMMU is present.

I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
  irqchip/gic-v3-its: Don't map the MSI page in
its_irq_compose_msi_msg()
  irqchip/ls-scfg-msi: Don't map the MSI page in
ls_scfg_msi_compose_msg()
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b,
s}i_msg()
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/Kconfig |  1 +
 drivers/iommu/dma-iommu.c | 49 +++
 drivers/irqchip/irq-gic-v2m.c |  8 ++-
 drivers/irqchip/irq-gic-v3-its.c  |  5 +++-
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 ++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
 include/linux/dma-iommu.h | 22 --
 include/linux/msi.h   | 26 +
 kernel/irq/Kconfig|  3 +++
 9 files changed, 109 insertions(+), 27 deletions(-)

-- 
2.11.0



[PATCH v2 6/7] irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()

2019-04-29 Thread Julien Grall
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg() requires to be called
from a preemptible context.

A recent patch split iommu_dma_map_msi_msg in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv3 MSI driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the two MSI
mappings when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v3-mbi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..d50f6cdf043c 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int 
hwirq,
 static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
 
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, 
unsigned int virq,
 
hwirq = mbi->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_CLRSPI_NSR);
+   if (err)
+   return err;
+
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_SETSPI_NSR);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 #ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, &msg[1]);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), &msg[1]);
 }
 
 /* Platform-MSI specific irqchip */
-- 
2.11.0



[PATCH v2 3/7] irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()

2019-04-29 Thread Julien Grall
gicv2m_compose_msi_msg() may be called from non-preemptible context.
However, on RT, iommu_dma_map_msi_msg() requires to be called from a
preemptible context.

A recent change split iommu_dma_map_msi_msg() in two new functions:
one that should be called in preemptible context, the other does
not have any requirement.

The GICv2m driver is reworked to avoid executing preemptible code in
non-preemptible context. This can be achieved by preparing the MSI
mapping when allocating the MSI interrupt.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Rework the commit message to use imperative mood
---
 drivers/irqchip/irq-gic-v2m.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..4359f0583377 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(irq_data_get_msi_desc(data), msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, 
unsigned int hwirq,
 static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int 
virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
 
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 
hwirq = v2m->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   v2m->res.start + V2M_MSI_SETSPI_NS);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0



[PATCH v2 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-29 Thread Julien Grall
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domain.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configuration, the field is protected
under a new config CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Update the commit message to use imperative mood
- Protect the field with a new config that will be selected by
IOMMU_DMA later on
- Add a set of helpers to access the new field
---
 include/linux/msi.h | 26 ++
 kernel/irq/Kconfig  |  3 +++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..82a308c19222 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IRQ_MSI_IOMMU
+   const void  *iommu_cookie;
+#endif
 
union {
/* PCI MSI/X specific data */
@@ -119,6 +122,29 @@ struct msi_desc {
 #define for_each_msi_entry_safe(desc, tmp, dev)\
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
 
+#ifdef CONFIG_IRQ_MSI_IOMMU
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return desc->iommu_cookie;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+   desc->iommu_cookie = iommu_cookie;
+}
+#else
+static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
+{
+   return NULL;
+}
+
+static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc,
+const void *iommu_cookie)
+{
+}
+#endif
+
 #ifdef CONFIG_PCI_MSI
 #define first_pci_msi_entry(pdev)  first_msi_entry(&(pdev)->dev)
 #define for_each_pci_msi_entry(desc, pdev) \
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 5f3e2baefca9..8fee06625c37 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -91,6 +91,9 @@ config GENERIC_MSI_IRQ_DOMAIN
select IRQ_DOMAIN_HIERARCHY
select GENERIC_MSI_IRQ
 
+config IRQ_MSI_IOMMU
+   bool
+
 config HANDLE_DOMAIN_IRQ
bool
 
-- 
2.11.0



Re: [PATCH 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-29 Thread Julien Grall

Hi Marc,

On 23/04/2019 11:54, Marc Zyngier wrote:

On 18/04/2019 18:26, Julien Grall wrote:

On RT, the function iommu_dma_map_msi_msg may be called from
non-preemptible context. This will lead to a splat with
CONFIG_DEBUG_ATOMIC_SLEEP, as the function uses spin_lock
(which can sleep on RT).

The function iommu_dma_map_msi_msg is used to map the MSI page in the
IOMMU PT and update the MSI message with the IOVA.

Only the lookup of the MSI page needs to be done in preemptible
context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

This patch splits the function iommu_dma_map_msi_msg into two new
functions:
 - iommu_dma_prepare_msi: This function will prepare the mapping in
 the IOMMU and store the cookie in the structure msi_desc. This
 function should be called in preemptible context.
 - iommu_dma_compose_msi_msg: This function will update the MSI
 message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 
---
  drivers/iommu/dma-iommu.c | 43 ---
  include/linux/dma-iommu.h | 21 +
  2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f5c1f1685095 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,17 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
  }
  
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)


I quite like the idea of moving from having an irq to having an msi_desc
passed to the IOMMU layer...


  {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
-   struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
  
-	if (!domain || !domain->iova_cookie)

-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
  
  	cookie = domain->iova_cookie;
  
@@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

 * of an MSI from within an IPI handler.
 */
spin_lock_irqsave(&cookie->msi_lock, flags);
-   msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
+   desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
  
-	if (WARN_ON(!msi_page)) {

+   return (desc->iommu_cookie) ? 0 : -ENOMEM;
+}
+
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)


... but I'd like it even better if it was uniform. Can you please move
the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
and make both functions take a msi_desc?


Makes sense. I will modify iommu_dma_compose_msi_msg to take a msi_desc as a
parameter.
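
I.e. something along these lines (a sketch of the direction, the exact
signatures may still change in the next revision):

int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr);
void iommu_dma_compose_msi_msg(struct msi_desc *desc, struct msi_msg *msg);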

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-23 Thread Julien Grall

Hi,

On 4/23/19 8:08 AM, Christoph Hellwig wrote:

On Thu, Apr 18, 2019 at 06:26:06PM +0100, Julien Grall wrote:

+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
  {
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
unsigned long flags;
  
+	if (!domain || !domain->iova_cookie) {

+   desc->iommu_cookie = NULL;
+   return 0;
+   }
  
  	cookie = domain->iova_cookie;
  
@@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)

 * of an MSI from within an IPI handler.
 */
spin_lock_irqsave(&cookie->msi_lock, flags);
+   desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
  
+	return (desc->iommu_cookie) ? 0 : -ENOMEM;


No need for the braces.  Also I personally find a:

if (!desc->iommu_cookie)
return -ENOMEM;
return 0;

much more readable, but that might just be personal preference.


I am happy either way. I will use your suggestion in the next version.

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-23 Thread Julien Grall

On 4/23/19 11:23 AM, Marc Zyngier wrote:

Hi Julien,


Hi Marc,


On 18/04/2019 18:26, Julien Grall wrote:

When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up patch will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

This patch introduces a new field in msi_desc to store an IOMMU cookie
when CONFIG_IOMMU_DMA is selected.

Signed-off-by: Julien Grall 
---
  include/linux/msi.h | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..d7907feef1bb 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IOMMU_DMA
+   const void  *iommu_cookie;
+#endif
  
  	union {

/* PCI MSI/X specific data */



Given that this is the only member in this structure that is dependent
on a config option, you could also add a couple of accessors that would
do nothing when IOMMU_DMA is not selected (and use that in the DMA code).


I haven't seen any use of the helpers so far because the DMA code is 
also protected by IOMMU_DMA.


I can add the helpers in the next version if you see any use outside of 
the DMA code.


Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-23 Thread Julien Grall

Hi Thomas,

On 4/18/19 8:28 PM, Thomas Gleixner wrote:

On Thu, 18 Apr 2019, Julien Grall wrote:


When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up patch will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

This patch introduces a new field in msi_desc to store an IOMMU cookie
when CONFIG_IOMMU_DMA is selected.


# git grep 'This patch' Documentation/process/

Applied to the whole series.


Sorry for that. I will rework all the commit messages and resend the series.

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 6/7] irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg

2019-04-18 Thread Julien Grall
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg must be called
from a preemptible context.

A recent patch split the function iommu_dma_map_msi_msg into two functions:
one that should be called in preemptible context, and one that has
no such requirement.

This patch reworks the GICv3 MBI driver to avoid executing preemptible
code in non-preemptible context by preparing the MSI mappings when
allocating the MSI interrupt.

Signed-off-by: Julien Grall 
---
 drivers/irqchip/irq-gic-v3-mbi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-mbi.c b/drivers/irqchip/irq-gic-v3-mbi.c
index fbfa7ff6deb1..c812b80e3ce9 100644
--- a/drivers/irqchip/irq-gic-v3-mbi.c
+++ b/drivers/irqchip/irq-gic-v3-mbi.c
@@ -84,6 +84,7 @@ static void mbi_free_msi(struct mbi_range *mbi, unsigned int 
hwirq,
 static int mbi_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct mbi_range *mbi = NULL;
int hwirq, offset, i, err = 0;
 
@@ -104,6 +105,16 @@ static int mbi_irq_domain_alloc(struct irq_domain *domain, 
unsigned int virq,
 
hwirq = mbi->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_CLRSPI_NSR);
+   if (err)
+   return err;
+
+   err = iommu_dma_prepare_msi(info->desc,
+   mbi_phys_base + GICD_SETSPI_NSR);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = mbi_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
@@ -142,7 +153,7 @@ static void mbi_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[0].address_lo = lower_32_bits(mbi_phys_base + GICD_SETSPI_NSR);
msg[0].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(data->irq, msg);
 }
 
 #ifdef CONFIG_PCI_MSI
@@ -202,7 +213,7 @@ static void mbi_compose_mbi_msg(struct irq_data *data, 
struct msi_msg *msg)
msg[1].address_lo = lower_32_bits(mbi_phys_base + GICD_CLRSPI_NSR);
msg[1].data = data->parent_data->hwirq;
 
-   iommu_dma_map_msi_msg(data->irq, &msg[1]);
+   iommu_dma_compose_msi_msg(data->irq, &msg[1]);
 }
 
 /* Platform-MSI specific irqchip */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 7/7] iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

2019-04-18 Thread Julien Grall
A recent patch introduced two new functions to replace
iommu_dma_map_msi_msg() to avoid executing preemptible code in
non-preemptible context.

All the existing callers are now using the two new functions, so
iommu_dma_map_msi_msg() can be removed.

Signed-off-by: Julien Grall 
---
 drivers/iommu/dma-iommu.c | 20 
 include/linux/dma-iommu.h |  5 -
 2 files changed, 25 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f5c1f1685095..fdc8ded62e87 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -928,23 +928,3 @@ void iommu_dma_compose_msi_msg(int irq, struct msi_msg 
*msg)
msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
msg->address_lo += lower_32_bits(msi_page->iova);
 }
-
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-   struct msi_desc *desc = irq_get_msi_desc(irq);
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
-
-   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
-   /*
-* We're called from a void callback, so the best we can do is
-* 'fail' by filling the message with obviously bogus values.
-* Since we got this far due to an IOMMU being present, it's
-* not like the existing address would have worked anyway...
-*/
-   msg->address_hi = ~0U;
-   msg->address_lo = ~0U;
-   msg->data = ~0U;
-   } else {
-   iommu_dma_compose_msi_msg(irq, msg);
-   }
-}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 2f4b2c2cc859..4fe2b2fb19bf 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -81,7 +81,6 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t 
msi_addr);
 /* Update the MSI message if required. */
 void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg);
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
@@ -120,10 +119,6 @@ static inline void iommu_dma_compose_msi_msg(int irq, 
struct msi_msg *msg)
 {
 }
 
-static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
-{
-}
-
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
 {
 }
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/7] irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg

2019-04-18 Thread Julien Grall
The function gicv2m_compose_msi_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg must be called
from a preemptible context.

A recent patch split the function iommu_dma_map_msi_msg into two functions:
one that should be called in preemptible context, and one that has
no such requirement.

This patch reworks the gicv2m driver to avoid executing preemptible code
in non-preemptible context by preparing the MSI mapping when allocating
the MSI interrupt.

Signed-off-by: Julien Grall 
---
 drivers/irqchip/irq-gic-v2m.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index f5fe0100f9ff..e5372acd92c9 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -110,7 +110,7 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
msg->data -= v2m->spi_offset;
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(data->irq, msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
@@ -167,6 +167,7 @@ static void gicv2m_unalloc_msi(struct v2m_data *v2m, 
unsigned int hwirq,
 static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int 
virq,
   unsigned int nr_irqs, void *args)
 {
+   msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
int hwirq, offset, i, err = 0;
 
@@ -186,6 +187,11 @@ static int gicv2m_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 
hwirq = v2m->spi_start + offset;
 
+   err = iommu_dma_prepare_msi(info->desc,
+   v2m->res.start + V2M_MSI_SETSPI_NS);
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = gicv2m_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-18 Thread Julien Grall
On RT, the function iommu_dma_map_msi_msg may be called from
non-preemptible context. This will lead to a splat with
CONFIG_DEBUG_ATOMIC_SLEEP, as the function uses spin_lock
(which can sleep on RT).

The function iommu_dma_map_msi_msg is used to map the MSI page in the
IOMMU PT and update the MSI message with the IOVA.

Only the lookup of the MSI page needs to be done in preemptible
context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

This patch splits the function iommu_dma_map_msi_msg into two new
functions:
- iommu_dma_prepare_msi: This function will prepare the mapping in
the IOMMU and store the cookie in the structure msi_desc. This
function should be called in preemptible context.
- iommu_dma_compose_msi_msg: This function will update the MSI
message with the IOVA when the device is behind an IOMMU.

Signed-off-by: Julien Grall 
---
 drivers/iommu/dma-iommu.c | 43 ---
 include/linux/dma-iommu.h | 21 +
 2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 77aabe637a60..f5c1f1685095 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -888,17 +888,17 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return NULL;
 }
 
-void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
 {
-   struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+   struct device *dev = msi_desc_to_dev(desc);
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie;
-   struct iommu_dma_msi_page *msi_page;
-   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
unsigned long flags;
 
-   if (!domain || !domain->iova_cookie)
-   return;
+   if (!domain || !domain->iova_cookie) {
+   desc->iommu_cookie = NULL;
+   return 0;
+   }
 
cookie = domain->iova_cookie;
 
@@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
 * of an MSI from within an IPI handler.
 */
spin_lock_irqsave(&cookie->msi_lock, flags);
-   msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
+   desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
spin_unlock_irqrestore(&cookie->msi_lock, flags);
 
-   if (WARN_ON(!msi_page)) {
+   return (desc->iommu_cookie) ? 0 : -ENOMEM;
+}
+
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   struct device *dev = msi_desc_to_dev(desc);
+   const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   const struct iommu_dma_msi_page *msi_page = desc->iommu_cookie;
+
+   if (!domain || !domain->iova_cookie || WARN_ON(!msi_page))
+   return;
+
+   msg->address_hi = upper_32_bits(msi_page->iova);
+   msg->address_lo &= cookie_msi_granule(domain->iova_cookie) - 1;
+   msg->address_lo += lower_32_bits(msi_page->iova);
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+   struct msi_desc *desc = irq_get_msi_desc(irq);
+   phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+
+   if (WARN_ON(iommu_dma_prepare_msi(desc, msi_addr))) {
/*
 * We're called from a void callback, so the best we can do is
 * 'fail' by filling the message with obviously bogus values.
@@ -922,8 +945,6 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->address_lo = ~0U;
msg->data = ~0U;
} else {
-   msg->address_hi = upper_32_bits(msi_page->iova);
-   msg->address_lo &= cookie_msi_granule(cookie) - 1;
-   msg->address_lo += lower_32_bits(msi_page->iova);
+   iommu_dma_compose_msi_msg(irq, msg);
}
 }
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index e760dc5d1fa8..2f4b2c2cc859 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,12 +71,23 @@ void iommu_dma_unmap_resource(struct device *dev, 
dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 
 /* The DMA API isn't _quite_ the whole story, though... */
+/*
+ * Map the MSI page in the IOMMU device and store it in @desc
+ *
+ * Return 0 on success, otherwise an error if the preparation has failed.
+ */
+int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr);
+
+/* Update the MSI message if required. */
+void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg);

[PATCH 4/7] irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg

2019-04-18 Thread Julien Grall
The function its_irq_compose_msi_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg must be called
from a preemptible context.

A recent patch split the function iommu_dma_map_msi_msg into two functions:
one that should be called in preemptible context, and one that has
no such requirement.

This patch reworks the GICv3 ITS driver to avoid executing preemptible
code in non-preemptible context by preparing the MSI mapping when
allocating the MSI interrupt.

Signed-off-by: Julien Grall 
---
 drivers/irqchip/irq-gic-v3-its.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..1e8e01797d9b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1179,7 +1179,7 @@ static void its_irq_compose_msi_msg(struct irq_data *d, 
struct msi_msg *msg)
msg->address_hi = upper_32_bits(addr);
msg->data   = its_get_event_id(d);
 
-   iommu_dma_map_msi_msg(d->irq, msg);
+   iommu_dma_compose_msi_msg(d->irq, msg);
 }
 
 static int its_irq_set_irqchip_state(struct irq_data *d,
@@ -2566,6 +2566,7 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
 {
msi_alloc_info_t *info = args;
struct its_device *its_dev = info->scratchpad[0].ptr;
+   struct its_node *its = its_dev->its;
irq_hw_number_t hwirq;
int err;
int i;
@@ -2574,6 +2575,8 @@ static int its_irq_domain_alloc(struct irq_domain 
*domain, unsigned int virq,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, its->get_msi_base(its_dev));
+   if (err)
+   return err;
+
for (i = 0; i < nr_irqs; i++) {
err = its_irq_gic_domain_alloc(domain, virq + i, hwirq + i);
if (err)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 5/7] irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg

2019-04-18 Thread Julien Grall
The function ls_scfg_msi_compose_msg may be called from non-preemptible
context. However, on RT, iommu_dma_map_msi_msg must be called
from a preemptible context.

A recent patch split the function iommu_dma_map_msi_msg into two functions:
one that should be called in preemptible context, and one that has
no such requirement.

This patch reworks the Freescale SCFG MSI driver to avoid executing
preemptible code in non-preemptible context by preparing the MSI mapping
when allocating the MSI interrupt.

Signed-off-by: Julien Grall 
---
 drivers/irqchip/irq-ls-scfg-msi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-ls-scfg-msi.c 
b/drivers/irqchip/irq-ls-scfg-msi.c
index c671b3212010..8099c5b1fcb5 100644
--- a/drivers/irqchip/irq-ls-scfg-msi.c
+++ b/drivers/irqchip/irq-ls-scfg-msi.c
@@ -100,7 +100,7 @@ static void ls_scfg_msi_compose_msg(struct irq_data *data, 
struct msi_msg *msg)
msg->data |= cpumask_first(mask);
}
 
-   iommu_dma_map_msi_msg(data->irq, msg);
+   iommu_dma_compose_msi_msg(data->irq, msg);
 }
 
 static int ls_scfg_msi_set_affinity(struct irq_data *irq_data,
@@ -141,6 +141,7 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
unsigned int nr_irqs,
void *args)
 {
+   msi_alloc_info_t *info = args;
struct ls_scfg_msi *msi_data = domain->host_data;
int pos, err = 0;
 
@@ -157,6 +158,10 @@ static int ls_scfg_msi_domain_irq_alloc(struct irq_domain 
*domain,
if (err)
return err;
 
+   err = iommu_dma_prepare_msi(info->desc, msi_data->msiir_addr);
+   if (err)
+   return err;
+
irq_domain_set_info(domain, virq, pos,
&ls_scfg_msi_parent_chip, msi_data,
handle_simple_irq, NULL, NULL);
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

2019-04-18 Thread Julien Grall
Hi all,

On RT, the function iommu_dma_map_msi_msg expects to be called from preemptible
context. However, this is not always the case, resulting in a splat when
CONFIG_DEBUG_ATOMIC_SLEEP is enabled:

[   48.875777] BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [] 
_raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<>]   
(null)
[   48.875805] Preemption disabled at:
[   48.875805] [] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 
5.0.3-rt1-7-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno 
Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

This series is a first attempt to rework how MSIs are mapped and composed
when an IOMMU is present.

I was able to test the changes in GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg
  irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg
  irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/dma-iommu.c | 45 ---
 drivers/irqchip/irq-gic-v2m.c |  8 ++-
 drivers/irqchip/irq-gic-v3-its.c  |  5 -
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 +++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +-
 include/linux/dma-iommu.h | 20 +++--
 include/linux/msi.h   |  3 +++
 7 files changed, 74 insertions(+), 29 deletions(-)

-- 
2.11.0
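
For reference, the calling pattern the series moves the irqchip drivers to
looks roughly like the sketch below. This is a generic illustration rather
than a specific driver: DOORBELL_PHYS_ADDR is a placeholder for whatever
doorbell address the irqchip programs, and the callback names are made up.

static int example_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
                                    unsigned int nr_irqs, void *args)
{
        msi_alloc_info_t *info = args;

        /*
         * Preemptible context: the IOMMU mapping (lock plus possible IOVA
         * allocation) is done once here and cached in info->desc.
         */
        return iommu_dma_prepare_msi(info->desc, DOORBELL_PHYS_ADDR);
}

static void example_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
{
        msg->address_hi = upper_32_bits(DOORBELL_PHYS_ADDR);
        msg->address_lo = lower_32_bits(DOORBELL_PHYS_ADDR);
        msg->data = data->hwirq;

        /*
         * Possibly non-preemptible context: only rewrites the address with
         * the cached IOVA, preserving the offset within the MSI granule.
         */
        iommu_dma_compose_msi_msg(data->irq, msg);
}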

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/7] genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

2019-04-18 Thread Julien Grall
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation requires to be called from preemptible
context.

A follow-up patch will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

This patch introduces a new field in msi_desc to store an IOMMU cookie
when CONFIG_IOMMU_DMA is selected.

Signed-off-by: Julien Grall 
---
 include/linux/msi.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 7e9b81c3b50d..d7907feef1bb 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -77,6 +77,9 @@ struct msi_desc {
struct device   *dev;
struct msi_msg  msg;
struct irq_affinity_desc*affinity;
+#ifdef CONFIG_IOMMU_DMA
+   const void  *iommu_cookie;
+#endif
 
union {
/* PCI MSI/X specific data */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-08 Thread Julien Grall

Hi Christoph,

On 08/03/2019 15:23, Christoph Hellwig wrote:

On Tue, Mar 05, 2019 at 09:41:46AM +, Julien Grall wrote:

On Xen, dma_addr_t will always be 64-bit while the phys_addr_t will depend
on the MMU type. So we may have phys_addr_t smaller than dma_addr_t from
the kernel point of view.


How can dma_addr_t on arm have value > 32-bit when physical
addresses are 32-bit only?


Platforms where an IOMMU protects all DMA-capable devices are not yet
common. So we decided not to require an IOMMU for Dom0. Instead, its memory
is directly mapped (i.e. guest physical address == host physical address) and
will always be below 4GB to cater for the short page-table case.


In the common case, Dom0 also contains the PV backend drivers. Those drivers may 
directly use the guest buffer in the DMA request (so a copy is avoided). To 
avoid using a bounce buffer too much, xen-swiotlb will find the host physical 
address associated with the guest buffer and will use it to compute the DMA address.


While the Dom0 kernel may only deal with 32-bit physical addresses, the
hypervisor can still deal with up to 40-bit physical addresses. This means
guest memory can be allocated above the 4GB threshold, hence why dma_addr_t
is always 64-bit with CONFIG_XEN=y.
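
As a minimal standalone illustration of the mismatch (the typedefs below are
stand-ins for the configuration discussed here, not the kernel's own
definitions):

typedef unsigned long long dma_addr_t;  /* 64-bit, as forced by CONFIG_XEN=y */
typedef unsigned int phys_addr_t;       /* 32-bit without LPAE */

#define DMA_MAPPING_ERROR (~(dma_addr_t)0)

static phys_addr_t example_map_single(void)
{
        /*
         * Returning the 64-bit sentinel through a 32-bit phys_addr_t
         * truncates it, which is exactly the -Werror=overflow report
         * quoted earlier in this thread.
         */
        return DMA_MAPPING_ERROR;
}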


Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-05 Thread Julien Grall

Hi Arnd,

On 3/5/19 8:16 AM, Arnd Bergmann wrote:

On Tue, Mar 5, 2019 at 12:56 AM Robin Murphy  wrote:

On 2019-03-04 7:59 pm, Arnd Bergmann wrote:

This reverts commit b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR"), which
introduced an overflow warning in configurations that have a larger
dma_addr_t than phys_addr_t:

In file included from include/linux/dma-direct.h:5,
   from kernel/dma/swiotlb.c:23:
kernel/dma/swiotlb.c: In function 'swiotlb_tbl_map_single':
include/linux/dma-mapping.h:136:28: error: conversion from 'long long unsigned 
int' to 'phys_addr_t' {aka 'unsigned int'} changes value from 
'18446744073709551615' to '4294967295' [-Werror=overflow]
   #define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
  ^
kernel/dma/swiotlb.c:544:9: note: in expansion of macro 'DMA_MAPPING_ERROR'
return DMA_MAPPING_ERROR;

The configuration that caused this is on 32-bit ARM, where the DMA address
space depends on the enabled hardware platforms, while the physical
address space depends on the type of MMU chosen (classic vs LPAE).


Are these real platforms, or random configs? Realistically I don't see a
great deal of need to support DMA_ADDR_T_64BIT for non-LPAE.
Particularly in this case since AFAIK the only selector of SWIOTLB on
Arm is Xen, and that by definition is never going to be useful on
non-LPAE hardware.

...
On Mon, Mar 4, 2019 at 11:00 PM Konrad Rzeszutek Wilk
 wrote:

What about making the phys_addr_t and dma_addr_t have the same
width with some magic #ifdef hackery?


As far as I can tell, only randconfig builds see this problem, in
real systems phys_addr_t is normally the same as dma_addr_t,
and you could reasonably have a machine with a larger phys_addr_t
than dma_addr_t but wouldn't need to bother.


On Xen, dma_addr_t will always be 64-bit while the phys_addr_t will 
depend on the MMU type. So we may have phys_addr_t smaller than 
dma_addr_t from the kernel point of view.


Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH] Revert "swiotlb: remove SWIOTLB_MAP_ERROR"

2019-03-05 Thread Julien Grall

Hi Robin,

On 3/4/19 11:56 PM, Robin Murphy wrote:

On 2019-03-04 7:59 pm, Arnd Bergmann wrote:
This reverts commit b907e20508d0 ("swiotlb: remove 
SWIOTLB_MAP_ERROR"), which

introduced an overflow warning in configurations that have a larger
dma_addr_t than phys_addr_t:

In file included from include/linux/dma-direct.h:5,
  from kernel/dma/swiotlb.c:23:
kernel/dma/swiotlb.c: In function 'swiotlb_tbl_map_single':
include/linux/dma-mapping.h:136:28: error: conversion from 'long long 
unsigned int' to 'phys_addr_t' {aka 'unsigned int'} changes value from 
'18446744073709551615' to '4294967295' [-Werror=overflow]

  #define DMA_MAPPING_ERROR  (~(dma_addr_t)0)
 ^
kernel/dma/swiotlb.c:544:9: note: in expansion of macro 
'DMA_MAPPING_ERROR'

   return DMA_MAPPING_ERROR;

The configuration that caused this is on 32-bit ARM, where the DMA 
address

space depends on the enabled hardware platforms, while the physical
address space depends on the type of MMU chosen (classic vs LPAE).


Are these real platforms, or random configs? Realistically I don't see a 
great deal of need to support DMA_ADDR_T_64BIT for non-LPAE. 


This is selected by CONFIG_XEN no matter the type of MMU chosen (see 
more below).


Particularly in this case since AFAIK the only selector of SWIOTLB on 
Arm is Xen, and that by definition is never going to be useful on 
non-LPAE hardware.


While Xen itself requires LPAE, it is still possible to run a non-LPAE 
kernel in the guest. For instance, last time I checked, Debian was 
shipping only a non-LPAE kernel for Arm32.


On Arm, swiotlb is only used by the hardware domain (aka Dom0) to allow 
DMA in memory mapped from other guests. So the returned DMA address may
be 64-bit. Hence why we select DMA_ADDR_T_64BIT above.


Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Fail to boot Dom0 on Xen Arm64 after "dma-mapping: bypass indirect calls for dma-direct"

2019-01-17 Thread Julien Grall

(+ Leo Yan who reported a similar issues on xen-devel)

Hi Christoph,

On 16/01/2019 18:19, Christoph Hellwig wrote:

Does this fix your problem?


Thank you for the patch. This allows me to boot Dom0 on Juno r2.

My knowledge of DMA is quite limited, so I may have missed something.

Looking at the change for arm64, you will always call the dma-direct API. In
previous Linux versions, xen-swiotlb would call dev->archdata.dev_dma_ops (a copy
of dev->dma_ops before setting the Xen DMA ops) if not NULL. Does it mean we expect
dev->dma_ops to always be NULL and hence to use the dma-direct API?
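
In other words, my understanding of the new fast path is roughly the
following (simplified sketch; get_dma_ops() and dma_direct_alloc() are the
in-tree helpers, the surrounding function is made up):

static void *example_alloc(struct device *dev, size_t size, dma_addr_t *handle)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);

        /* After the rework, a NULL ops pointer means: use dma-direct. */
        if (!ops)
                return dma_direct_alloc(dev, size, handle, GFP_KERNEL, 0);

        return ops->alloc(dev, size, handle, GFP_KERNEL, 0);
}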


Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Fail to boot Dom0 on Xen Arm64 after "dma-mapping: bypass indirect calls for dma-direct"

2019-01-16 Thread Julien Grall
Hi,

I made an attempt to boot Linux 5.0-rc2 as Dom0 on Xen
Arm64 and got the following splat:

[4.074264] Unable to handle kernel NULL pointer dereference at virtual 
address 
[4.083074] Mem abort info:
[4.085870]   ESR = 0x9604
[4.089050]   Exception class = DABT (current EL), IL = 32 bits
[4.095065]   SET = 0, FnV = 0
[4.098138]   EA = 0, S1PTW = 0
[4.101405] Data abort info:
[4.104362]   ISV = 0, ISS = 0x0004
[4.108289]   CM = 0, WnR = 0
[4.111328] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
[4.118025] [] pgd=
[4.123058] Internal error: Oops: 9604 [#1] PREEMPT SMP
[4.128590] Modules linked in:
[4.131727] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 
5.0.0-rc2-00154-gc5dbed6dcf60 #1191
[4.139997] Hardware name: ARM Juno development board (r2) (DT)
[4.145995] pstate: 2005 (nzCv daif -PAN -UAO)
[4.150876] pc : xen_swiotlb_alloc_coherent+0x64/0x1e8
[4.156092] lr : dma_alloc_attrs+0xf4/0x110
[4.160359] sp : 1003b960
[4.163743] x29: 1003b960 x28: 112cfbb4
[4.169137] x27: 116ed000 x26: 1003ba90
[4.174533] x25: 10d6c350 x24: 0005
[4.179937] x23: 0002 x22: 0001f000
[4.185323] x21:  x20: 8008b904b0b0
[4.190727] x19: 114d4000 x18: 14033fff
[4.196113] x17:  x16: 
[4.201509] x15: 4000 x14: 110fc000
[4.206903] x13: 114dd000 x12: 0068
[4.212299] x11: 0001 x10: 
[4.217693] x9 :  x8 : 8008b9b05b00
[4.223089] x7 :  x6 : 
[4.228483] x5 : 106d4c38 x4 : 
[4.233879] x3 : 006000c0 x2 : 1003ba90
[4.239274] x1 : 0002 x0 : 
[4.244671] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[4.251456] Call trace:
[4.253985]  xen_swiotlb_alloc_coherent+0x64/0x1e8
[4.258857]  dma_alloc_attrs+0xf4/0x110
[4.262774]  dmam_alloc_attrs+0x64/0xb8
[4.266691]  sil24_port_start+0x6c/0x100
[4.270704]  ata_host_start.part.10+0x104/0x210
[4.275304]  ata_host_activate+0x64/0x148
[4.279392]  sil24_init_one+0x1ac/0x2b0
[4.283312]  pci_device_probe+0xdc/0x168
[4.287313]  really_probe+0x1f0/0x298
[4.291054]  driver_probe_device+0x58/0x100
[4.295316]  __driver_attach+0xdc/0xe0
[4.299145]  bus_for_each_dev+0x74/0xc8
[4.303061]  driver_attach+0x20/0x28
[4.306716]  bus_add_driver+0x1b0/0x220
[4.310641]  driver_register+0x60/0x110
[4.314549]  __pci_register_driver+0x58/0x68
[4.318902]  sil24_pci_driver_init+0x20/0x28
[4.323251]  do_one_initcall+0xb8/0x348
[4.327166]  kernel_init_freeable+0x3cc/0x474
[4.331606]  kernel_init+0x10/0x100
[4.335171]  ret_from_fork+0x10/0x1c
[4.338829] Code: f941fa80 aa1503e4 aa1a03e2 aa1703e1 (f945)
[4.345028] ---[ end trace 277757f27697a72b ]---

The bisector fingered the following commit:

commit 356da6d0cde3323236977fce54c1f9612a742036
Author: Christoph Hellwig 
Date:   Thu Dec 6 13:39:32 2018 -0800

dma-mapping: bypass indirect calls for dma-direct

Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.

Signed-off-by: Christoph Hellwig 
Acked-by: Jesper Dangaard Brouer 
Tested-by: Jesper Dangaard Brouer 
Tested-by: Tony Luck 

Discussing with Robin, it looks like the current wrappers for Xen
(see include/xen/arm/page-coherent.h) are not able to cope with direct
calls. Those wrappers are used by swiotlb to call the underlying DMA ops
of the device. They are unable to cope with NULL (aka direct-mapping), hence
the NULL dereference crash.

Do you have any suggestion how this should be fixed?
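
For context, the wrappers do roughly the following (heavily simplified, not
the exact code from include/xen/arm/page-coherent.h):

static void *xen_wrapper_alloc(struct device *dev, size_t size,
                               dma_addr_t *handle, gfp_t gfp)
{
        const struct dma_map_ops *ops = dev->archdata.dev_dma_ops;

        /*
         * With the dma-direct bypass, ops is NULL for direct-mapped
         * devices, so this dereference is the NULL pointer access in the
         * splat above. A fix would need an explicit fallback such as
         * "if (!ops) return dma_direct_alloc(...);".
         */
        return ops->alloc(dev, size, handle, gfp, 0);
}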

Best regards,

-- 
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC v5 03/17] iommu: introduce a reserved iova cookie

2016-03-07 Thread Julien Grall

Hi Eric,

On 01/03/16 18:27, Eric Auger wrote:

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0e3b009..7b2bb94 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1072,6 +1072,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct 
bus_type *bus,

domain->ops  = bus->iommu_ops;
domain->type = type;
+   mutex_init(&domain->reserved_mutex);


For consistency, the RB-tree reserved_binding_list should be initialized 
too:


domain->reserved_binding_list = RB_ROOT;
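
I.e. the end of __iommu_domain_alloc() would then read (sketch based on the
hunk quoted above):

domain->ops  = bus->iommu_ops;
domain->type = type;
mutex_init(&domain->reserved_mutex);
domain->reserved_binding_list = RB_ROOT;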

Cheers,

--
Julien Grall
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu