Re: [RFC PATCH v1 0/9] deferred_probe_timeout logic clean up

2022-05-30 Thread Sebastian Andrzej Siewior
On 2022-05-26 01:15:39 [-0700], Saravana Kannan wrote:
> Yoshihiro/Geert,
Hi Saravana,

> If you can test this patch series and confirm that the NFS root case
> works, I'd really appreciate that.

The two patches you sent earlier, plus this series, plus

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7ff7fbb006431..829d9b1f7403f 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1697,8 +1697,6 @@ static int fw_devlink_may_probe(struct device *dev, void *data)
  */
 void __init fw_devlink_unblock_may_probe(void)
 {
-   struct device_link *link, *ln;
-
if (!fw_devlink_flags || fw_devlink_is_permissive())
return;
 
and it compiles + boots without a delay.

Sebastian


Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration

2022-05-25 Thread Sebastian Andrzej Siewior
On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > to 0 avoids the delay.
> 
> In your case, I think it might be working as intended? Curious, what
> was the call stack in your case where it was blocked?

Why is there then a 10 second delay during boot? The backtrace is
|[ cut here ]
|WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
|Modules linked in:
|CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
|RIP: 0010:wait_for_device_probe+0x30/0x110
|Call Trace:
| 
| prepare_namespace+0x2b/0x160
| kernel_init_freeable+0x2b3/0x2dd
| kernel_init+0x11/0x110
| ret_from_fork+0x22/0x30
| 

Looking closer, it can't access init. This particular box boots the kernel
directly without an initramfs, so the kernel later mounts /dev/sda1 and
everything is good. That seems to be the reason…
My other machine with an initramfs does not show this problem.
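
For reference, since setting the timeout to 0 avoids the delay, a possible
workaround on such a setup (assuming the existing deferred_probe_timeout=
kernel parameter) is:

	# appended to the kernel command line in the bootloader entry
	root=/dev/sda1 deferred_probe_timeout=0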

> -Saravana

Sebastian

Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration

2022-05-24 Thread Sebastian Andrzej Siewior
On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
…
> Thanks for all the help. I think I know what's going on.

I, too, got here because my boot time was recently extended by 10 seconds
and I bisected it to the commit in question.

> If you revert the following commit, then you'll see that your device
> no longer hangs with my changes.
> 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> the deferred_probe_timeout fires

Removing probe_timeout_waitqueue (as suggested) or setting the timeout
to 0 avoids the delay.

> -Saravana

Sebastian

Re: [RFC PATCH v9 02/13] x86: always set IF before oopsing from page fault

2019-04-04 Thread Sebastian Andrzej Siewior
- stepping on the del button while browsing through CCs.
On 2019-04-04 09:47:27 [-0600], Tycho Andersen wrote:
> > Hmm.  do_exit() isn't really meant to be "try your best to leave the
> > system somewhat usable without returning" -- it's a function that,
> > other than in OOPSes, is called from a well-defined state.  So I think
> > rewind_stack_do_exit() is probably a better spot.  But we need to
> > rewind the stack and *then* turn on IRQs, since we otherwise risk
> > exploding quite badly.
> 
> Ok, sounds good. I guess we can include something like this patch in
> the next series.

The tracing infrastructure probably doesn't know that the interrupts are
back on. Also, if you were holding a spinlock then your preempt count
isn't 0, which means that might_sleep() will trigger a splat (in your
backtrace it was zero).
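
A minimal sketch of what I mean, illustrative only (not code from this
series):

	/* oops happens while a spinlock is held */
	spin_lock(&lock);	/* preempt count is now non-zero */
	/* ... oops, rewind the stack, then: */
	local_irq_enable();	/* interrupts are back on, but ... */
	might_sleep();		/* splats: preempt count is still > 0 */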

> Thanks,
> 
> Tycho
Sebastian


Re: [PATCH] iommu/amd: fix unused symbol iommu_table_lock

2018-05-07 Thread Sebastian Andrzej Siewior
On 2018-05-06 11:53:26 [+0200], Danilo Krummrich wrote:
> When CONFIG_IRQ_REMAP is not set iommu_table_lock is unused and hence
> a warning is generated.
> 
> Fixes: ea6166f4b83e9 ("iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lock")
> Signed-off-by: Danilo Krummrich 

https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=iommu/fixes&id=94c793accacdb0d33c1df66f3b324eec96d26e58

Sebastian


[PATCH 2/3] iommu/amd: Cleanup locking in __attach/detach_device()

2018-05-04 Thread Sebastian Andrzej Siewior
From: Anna-Maria Gleixner 

Since the introduction of the pd_bitmap_lock in commit 2bc001808904
("iommu/amd: Split domain id out of amd_iommu_devtable_lock"),
amd_iommu_devtable_lock is only taken around __detach_device() and
__attach_device() calls.

The lock is not protecting anything as all operations are domain specific
and protected by domain->lock in __detach_device() and __attach_device(),
so amd_iommu_devtable_lock has no real purpose anymore.

Lock domain->lock before calling into __detach_device() and
__attach_device() and simplify the implementation of those functions. Add
lockdep checks where appropriate.

Signed-off-by: Anna-Maria Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 70 +--
 1 file changed, 15 insertions(+), 55 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e484275a4c69..f66a5d0b7c62 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -81,7 +81,6 @@
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
-static DEFINE_SPINLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 
 /* List of all available dev_data structures */
@@ -1886,6 +1885,8 @@ static void do_attach(struct iommu_dev_data *dev_data,
u16 alias;
bool ats;
 
+   lockdep_assert_held(&domain->lock);
+
iommu = amd_iommu_rlookup_table[dev_data->devid];
alias = dev_data->alias;
ats   = dev_data->ats.enabled;
@@ -1906,11 +1907,13 @@ static void do_attach(struct iommu_dev_data *dev_data,
device_flush_dte(dev_data);
 }
 
-static void do_detach(struct iommu_dev_data *dev_data)
+static void __detach_device(struct iommu_dev_data *dev_data)
 {
struct amd_iommu *iommu;
u16 alias;
 
+   lockdep_assert_held(&dev_data->domain->lock);
+
iommu = amd_iommu_rlookup_table[dev_data->devid];
alias = dev_data->alias;
 
@@ -1936,32 +1939,13 @@ static void do_detach(struct iommu_dev_data *dev_data)
 static int __attach_device(struct iommu_dev_data *dev_data,
   struct protection_domain *domain)
 {
-   int ret;
-
-   /*
-* Must be called with IRQs disabled. Warn here to detect early
-* when its not.
-*/
-   WARN_ON(!irqs_disabled());
-
-   /* lock domain */
-   spin_lock(&domain->lock);
-
-   ret = -EBUSY;
if (dev_data->domain != NULL)
-   goto out_unlock;
+   return -EBUSY;
 
/* Attach alias group root */
do_attach(dev_data, domain);
 
-   ret = 0;
-
-out_unlock:
-
-   /* ready */
-   spin_unlock(&domain->lock);
-
-   return ret;
+   return 0;
 }
 
 
@@ -2088,9 +2072,10 @@ static int attach_device(struct device *dev,
}
 
 skip_ats_check:
-   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
+
+   spin_lock_irqsave(&domain->lock, flags);
ret = __attach_device(dev_data, domain);
-   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&domain->lock, flags);
 
/*
 * We might boot into a crash-kernel here. The crashed kernel
@@ -2103,29 +2088,7 @@ static int attach_device(struct device *dev,
 }
 
 /*
- * Removes a device from a protection domain (unlocked)
- */
-static void __detach_device(struct iommu_dev_data *dev_data)
-{
-   struct protection_domain *domain;
-
-   /*
-* Must be called with IRQs disabled. Warn here to detect early
-* when its not.
-*/
-   WARN_ON(!irqs_disabled());
-
-   domain = dev_data->domain;
-
-   spin_lock(&domain->lock);
-
-   do_detach(dev_data);
-
-   spin_unlock(&domain->lock);
-}
-
-/*
- * Removes a device from a protection domain (with devtable_lock held)
+ * Removes a device from a protection domain
  */
 static void detach_device(struct device *dev)
 {
@@ -2145,10 +2108,9 @@ static void detach_device(struct device *dev)
if (WARN_ON(!dev_data->domain))
return;
 
-   /* lock device table */
-   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&domain->lock, flags);
__detach_device(dev_data);
-   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&domain->lock, flags);
 
if (!dev_is_pci(dev))
return;
@@ -2785,16 +2747,14 @@ static void cleanup_domain(struct protection_domain *domain)
struct iommu_dev_data *entry;
unsigned long flags;
 
-   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
-
+   spin_lock_irqsave(&domain->lock, flags);
while (!list_empty(&domain->dev_list)) {
entry = list_first_entry(&domain->dev_list,
 

[PATCH 3/3] iommu/amd: Do not flush when device is busy

2018-05-04 Thread Sebastian Andrzej Siewior
From: Anna-Maria Gleixner 

When a device is already attached to a domain, there is no need to execute
domain_flush_tlb_pde(). Therefore move the check whether the domain is set
into attach_device().

Signed-off-by: Anna-Maria Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f66a5d0b7c62..a801678ae1b4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1878,8 +1878,11 @@ static void clear_dte_entry(u16 devid)
amd_iommu_apply_erratum_63(devid);
 }
 
-static void do_attach(struct iommu_dev_data *dev_data,
- struct protection_domain *domain)
+/*
+ * This function makes the device visible to the hardware
+ */
+static void __attach_device(struct iommu_dev_data *dev_data,
+   struct protection_domain *domain)
 {
struct amd_iommu *iommu;
u16 alias;
@@ -1932,23 +1935,6 @@ static void __detach_device(struct iommu_dev_data *dev_data)
device_flush_dte(dev_data);
 }
 
-/*
- * If a device is not yet associated with a domain, this function does
- * assigns it visible for the hardware
- */
-static int __attach_device(struct iommu_dev_data *dev_data,
-  struct protection_domain *domain)
-{
-   if (dev_data->domain != NULL)
-   return -EBUSY;
-
-   /* Attach alias group root */
-   do_attach(dev_data, domain);
-
-   return 0;
-}
-
-
 static void pdev_iommuv2_disable(struct pci_dev *pdev)
 {
pci_disable_ats(pdev);
@@ -2045,7 +2031,6 @@ static int attach_device(struct device *dev,
struct pci_dev *pdev;
struct iommu_dev_data *dev_data;
unsigned long flags;
-   int ret;
 
dev_data = get_dev_data(dev);
 
@@ -2073,8 +2058,11 @@ static int attach_device(struct device *dev,
 
 skip_ats_check:
 
+   if (dev_data->domain != NULL)
+   return -EBUSY;
+
spin_lock_irqsave(&domain->lock, flags);
-   ret = __attach_device(dev_data, domain);
+   __attach_device(dev_data, domain);
spin_unlock_irqrestore(&domain->lock, flags);
 
/*
@@ -2084,7 +2072,7 @@ static int attach_device(struct device *dev,
 */
domain_flush_tlb_pde(domain);
 
-   return ret;
+   return 0;
 }
 
 /*
-- 
2.17.0



[PATCH 1/3] iommu/amd: Prevent possible null pointer dereference and infinite loop

2018-05-04 Thread Sebastian Andrzej Siewior
From: Anna-Maria Gleixner 

The check for !dev_data->domain in __detach_device() emits a warning and
returns. The calling code in detach_device() dereferences dev_data->domain
afterwards unconditionally, so in case dev_data->domain is NULL the
warning will be immediately followed by a NULL pointer dereference.

The calling code in cleanup_domain() loops infinitely when !dev_data->domain,
because the check in __detach_device() returns immediately and dev_list is
not changed.

do_detach() duplicates this check without emitting a warning.

Move the check, together with the explanatory comment from the do_detach()
code, into the caller detach_device() and return immediately. Throw an
error when hitting the condition in cleanup_domain().

Signed-off-by: Anna-Maria Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8fb8c737fffe..e484275a4c69 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1911,15 +1911,6 @@ static void do_detach(struct iommu_dev_data *dev_data)
struct amd_iommu *iommu;
u16 alias;
 
-   /*
-* First check if the device is still attached. It might already
-* be detached from its domain because the generic
-* iommu_detach_group code detached it and we try again here in
-* our alias handling.
-*/
-   if (!dev_data->domain)
-   return;
-
iommu = amd_iommu_rlookup_table[dev_data->devid];
alias = dev_data->alias;
 
@@ -2124,9 +2115,6 @@ static void __detach_device(struct iommu_dev_data *dev_data)
 */
WARN_ON(!irqs_disabled());
 
-   if (WARN_ON(!dev_data->domain))
-   return;
-
domain = dev_data->domain;
 
spin_lock(&domain->lock);
@@ -2148,6 +2136,15 @@ static void detach_device(struct device *dev)
dev_data = get_dev_data(dev);
domain   = dev_data->domain;
 
+   /*
+* First check if the device is still attached. It might already
+* be detached from its domain because the generic
+* iommu_detach_group code detached it and we try again here in
+* our alias handling.
+*/
+   if (WARN_ON(!dev_data->domain))
+   return;
+
/* lock device table */
spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
__detach_device(dev_data);
@@ -2793,6 +2790,7 @@ static void cleanup_domain(struct protection_domain *domain)
while (!list_empty(&domain->dev_list)) {
entry = list_first_entry(&domain->dev_list,
 struct iommu_dev_data, list);
+   BUG_ON(!entry->domain);
__detach_device(entry);
}
 
-- 
2.17.0



iommu/amd: avoid irqs_disabled() check

2018-05-04 Thread Sebastian Andrzej Siewior
The primary motivation was to get rid of the "WARN_ON(!irqs_disabled());"
check. The first patch avoids a possible endless loop (if cleanup_domain()
is invoked with `entry->domain == NULL' then it loops forever).
The irqs_disabled() check has been replaced with a lockdep_assert_held()
check.
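
The shape of the conversion, as a sketch (see patch 2/3 for the real
change):

	/* before: indirect assertion about the calling context */
	WARN_ON(!irqs_disabled());

	/* after: assert the lock that actually provides the protection */
	lockdep_assert_held(&domain->lock);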

Sebastian




Re: [PATCH] iommu/amd: fix unused-variable warning

2018-04-20 Thread Sebastian Andrzej Siewior
On 2018-04-20 11:28:36 [+0200], Tobias Regnery wrote:
> The iommu_table_lock is only used by code inside an ifdef CONFIG_IRQ_REMAP
> block. This leads to the following warning with CONFIG_IRQ_REMAP=n:
> 
> amd_iommu.c:86:24: warning: 'iommu_table_lock' defined but not used [-Wunused-variable]
> 
> Guard the spinlock definition with the same ifdef.
> 
> Fixes: ea6166f4b83e9 ("iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lock")
> Signed-off-by: Tobias Regnery 

Thank you for spotting this. I am waiting for Jörg to pick this up
instead:
  http://lkml.kernel.org/r/20180404105713.633983-1-a...@arndb.de

Sebastian

Re: [PATCH] iommu: amd: hide unused iommu_table_lock

2018-04-13 Thread Sebastian Andrzej Siewior
On 2018-04-04 12:56:59 [+0200], Arnd Bergmann wrote:
> The newly introduced lock is only used when CONFIG_IRQ_REMAP is enabled:
> 
> drivers/iommu/amd_iommu.c:86:24: error: 'iommu_table_lock' defined but not used [-Werror=unused-variable]
>  static DEFINE_SPINLOCK(iommu_table_lock);
> 
> This moves the definition next to the user, within the #ifdef protected
> section of the file.
> 
> Fixes: ea6166f4b83e ("iommu/amd: Split irq_lookup_table out of the amd_iommu_devtable_lock")
> Signed-off-by: Arnd Bergmann 
Acked-by: Sebastian Andrzej Siewior 

Thank you Arnd.

Sebastian


[PATCH 08/10] iommu/amd: drop the lock while allocating new irq remap table

2018-03-22 Thread Sebastian Andrzej Siewior
The irq_remap_table is allocated while the iommu_table_lock is held with
interrupts disabled.
From looking at the call sites, all callers are in the early device
initialisation (apic_bsp_setup(), pci_enable_device(),
pci_enable_msi()), so it makes sense to drop the lock, which also enables
interrupts, and try to allocate that memory with GFP_KERNEL instead of
GFP_ATOMIC.

Since the iommu_table_lock is dropped during the allocation, we need to
recheck if the table exists after the lock has been reacquired. I *think*
that it is impossible for the "devid" entry to appear in irq_lookup_table
while the lock is dropped, since the same device can only be probed once.
However, I check for both cases, just to be sure.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 65 +--
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 11ea2d656be8..c493d345b3ef 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3617,6 +3617,30 @@ static struct irq_remap_table *get_irq_table(u16 devid)
return table;
 }
 
+static struct irq_remap_table *__alloc_irq_table(void)
+{
+   struct irq_remap_table *table;
+
+   table = kzalloc(sizeof(*table), GFP_KERNEL);
+   if (!table)
+   return NULL;
+
+   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_KERNEL);
+   if (!table->table) {
+   kfree(table);
+   return NULL;
+   }
+   raw_spin_lock_init(&table->lock);
+
+   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
+   memset(table->table, 0,
+  MAX_IRQS_PER_TABLE * sizeof(u32));
+   else
+   memset(table->table, 0,
+  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
+   return table;
+}
+
 static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
  struct irq_remap_table *table)
 {
@@ -3628,6 +3652,7 @@ static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
 static struct irq_remap_table *alloc_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
+   struct irq_remap_table *new_table = NULL;
struct amd_iommu *iommu;
unsigned long flags;
u16 alias;
@@ -3646,42 +3671,44 @@ static struct irq_remap_table *alloc_irq_table(u16 devid)
table = irq_lookup_table[alias];
if (table) {
set_remap_table_entry(iommu, devid, table);
-   goto out;
+   goto out_wait;
}
+   spin_unlock_irqrestore(&iommu_table_lock, flags);
 
/* Nothing there yet, allocate new irq remapping table */
-   table = kzalloc(sizeof(*table), GFP_ATOMIC);
-   if (!table)
+   new_table = __alloc_irq_table();
+   if (!new_table)
+   return NULL;
+
+   spin_lock_irqsave(&iommu_table_lock, flags);
+
+   table = irq_lookup_table[devid];
+   if (table)
goto out_unlock;
 
-   /* Initialize table spin-lock */
-   raw_spin_lock_init(&table->lock);
-
-   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
-   if (!table->table) {
-   kfree(table);
-   table = NULL;
-   goto out_unlock;
+   table = irq_lookup_table[alias];
+   if (table) {
+   set_remap_table_entry(iommu, devid, table);
+   goto out_wait;
}
 
-   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
-   memset(table->table, 0,
-  MAX_IRQS_PER_TABLE * sizeof(u32));
-   else
-   memset(table->table, 0,
-  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
-
+   table = new_table;
+   new_table = NULL;
 
set_remap_table_entry(iommu, devid, table);
if (devid != alias)
set_remap_table_entry(iommu, alias, table);
 
-out:
+out_wait:
iommu_completion_wait(iommu);
 
 out_unlock:
spin_unlock_irqrestore(&iommu_table_lock, flags);
 
+   if (new_table) {
+   kmem_cache_free(amd_iommu_irq_cache, new_table->table);
+   kfree(new_table);
+   }
return table;
 }
 
-- 
2.16.2



[PATCH 05/10] iommu/amd: remove the special case from alloc_irq_table()

2018-03-22 Thread Sebastian Andrzej Siewior
alloc_irq_table() has a special ioapic argument. If set, it will
pre-allocate / reserve the first 32 indexes. The argument is true only
once, and alloc_irq_table() would be a little simpler if we extracted the
special bits to the caller.
The caller of irq_remapping_alloc() is holding irq_domain_mutex, so the
initialization via iommu->irte_ops->set_allocated() should not race
against other users.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a87d2fee68db..a5982a61f181 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3617,7 +3617,7 @@ static struct irq_remap_table *get_irq_table(u16 devid)
return table;
 }
 
-static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
+static struct irq_remap_table *alloc_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
struct amd_iommu *iommu;
@@ -3651,10 +3651,6 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
/* Initialize table spin-lock */
raw_spin_lock_init(&table->lock);
 
-   if (ioapic)
-   /* Keep the first 32 indexes free for IOAPIC interrupts */
-   table->min_index = 32;
-
table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
if (!table->table) {
kfree(table);
@@ -3669,12 +3665,6 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
memset(table->table, 0,
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
-   if (ioapic) {
-   int i;
-
-   for (i = 0; i < 32; ++i)
-   iommu->irte_ops->set_allocated(table, i);
-   }
 
irq_lookup_table[devid] = table;
set_dte_irq_entry(devid, table);
@@ -3704,7 +3694,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
if (!iommu)
return -ENODEV;
 
-   table = alloc_irq_table(devid, false);
+   table = alloc_irq_table(devid);
if (!table)
return -ENODEV;
 
@@ -4130,10 +4120,26 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
return ret;
 
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
-   if (alloc_irq_table(devid, true))
+   struct irq_remap_table *table;
+   struct amd_iommu *iommu;
+
+   table = alloc_irq_table(devid);
+   if (table) {
+   if (!table->min_index) {
+   /*
+* Keep the first 32 indexes free for IOAPIC
+* interrupts.
+*/
+   table->min_index = 32;
+   iommu = amd_iommu_rlookup_table[devid];
+   for (i = 0; i < 32; ++i)
+   iommu->irte_ops->set_allocated(table, i);
+   }
+   WARN_ON(table->min_index != 32);
index = info->ioapic_pin;
-   else
+   } else {
ret = -ENOMEM;
+   }
} else {
bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI);
 
-- 
2.16.2



[PATCH 09/10] iommu/amd: make amd_iommu_devtable_lock a spin_lock

2018-03-22 Thread Sebastian Andrzej Siewior
Before commit 0bb6e243d7fb ("iommu/amd: Support IOMMU_DOMAIN_DMA type
allocation") amd_iommu_devtable_lock had a read_lock() user, but now
there are none. In fact, after the mentioned commit we had only
write_lock() users of the lock. Since there is no reason to keep it as a
writer lock, change its type to a spin_lock.
I *think* that we might even be able to remove the lock entirely because
all its current users seem to have their own protection.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c493d345b3ef..23e26a4b708e 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -80,7 +80,7 @@
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
-static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 static DEFINE_SPINLOCK(iommu_table_lock);
 
@@ -2097,9 +2097,9 @@ static int attach_device(struct device *dev,
}
 
 skip_ats_check:
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
ret = __attach_device(dev_data, domain);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
/*
 * We might boot into a crash-kernel here. The crashed kernel
@@ -2149,9 +2149,9 @@ static void detach_device(struct device *dev)
domain   = dev_data->domain;
 
/* lock device table */
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
__detach_device(dev_data);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
if (!dev_is_pci(dev))
return;
@@ -2814,7 +2814,7 @@ static void cleanup_domain(struct protection_domain *domain)
struct iommu_dev_data *entry;
unsigned long flags;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
 
while (!list_empty(&domain->dev_list)) {
entry = list_first_entry(&domain->dev_list,
@@ -2822,7 +2822,7 @@ static void cleanup_domain(struct protection_domain *domain)
__detach_device(entry);
}
 
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 }
 
 static void protection_domain_free(struct protection_domain *domain)
-- 
2.16.2



[PATCH 07/10] iommu/amd: factor out setting the remap table for a devid

2018-03-22 Thread Sebastian Andrzej Siewior
Setting the IRQ remap table for a specific devid (or its alias devid)
consists of three steps, which are repeated each time this is done.
Introduce a new helper function, move those steps there and use that
function instead. The compiler can still decide whether it is worth
inlining.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ea120c7b46c9..11ea2d656be8 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3617,6 +3617,14 @@ static struct irq_remap_table *get_irq_table(u16 devid)
return table;
 }
 
+static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
+ struct irq_remap_table *table)
+{
+   irq_lookup_table[devid] = table;
+   set_dte_irq_entry(devid, table);
+   iommu_flush_dte(iommu, devid);
+}
+
 static struct irq_remap_table *alloc_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
@@ -3637,9 +3645,7 @@ static struct irq_remap_table *alloc_irq_table(u16 devid)
alias = amd_iommu_alias_table[devid];
table = irq_lookup_table[alias];
if (table) {
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
+   set_remap_table_entry(iommu, devid, table);
goto out;
}
 
@@ -3666,14 +3672,9 @@ static struct irq_remap_table *alloc_irq_table(u16 devid)
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
 
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
-   if (devid != alias) {
-   irq_lookup_table[alias] = table;
-   set_dte_irq_entry(alias, table);
-   iommu_flush_dte(iommu, alias);
-   }
+   set_remap_table_entry(iommu, devid, table);
+   if (devid != alias)
+   set_remap_table_entry(iommu, alias, table);
 
 out:
iommu_completion_wait(iommu);
-- 
2.16.2



[PATCH 10/10] iommu/amd: return proper error code in irq_remapping_alloc()

2018-03-22 Thread Sebastian Andrzej Siewior
In the unlikely case when alloc_irq_table() is not able to return a
remap table, "ret" will be assigned an error code. Later, the code
checks `index' and, if it is negative (which it is, because it is
initialized with `-1'), the function properly aborts but returns `-1'
instead of the intended `-ENOMEM'.
In order to correct this, assign -ENOMEM to `index' instead.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 23e26a4b708e..6107e24bed8a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4124,7 +4124,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
struct amd_ir_data *data = NULL;
struct irq_cfg *cfg;
int i, ret, devid;
-   int index = -1;
+   int index;
 
if (!info)
return -EINVAL;
@@ -4166,7 +4166,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
WARN_ON(table->min_index != 32);
index = info->ioapic_pin;
} else {
-   ret = -ENOMEM;
+   index = -ENOMEM;
}
} else {
bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI);
-- 
2.16.2



[PATCH 06/10] iommu/amd: use `table' instead `irt' as variable name in amd_iommu_update_ga()

2018-03-22 Thread Sebastian Andrzej Siewior
The variable of type struct irq_remap_table is always named `table',
except in amd_iommu_update_ga() where it is called `irt'. Make it
consistent and name it `table' there as well.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a5982a61f181..ea120c7b46c9 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4405,7 +4405,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
 {
unsigned long flags;
struct amd_iommu *iommu;
-   struct irq_remap_table *irt;
+   struct irq_remap_table *table;
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
int devid = ir_data->irq_2_irte.devid;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
@@ -4419,11 +4419,11 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid);
-   if (!irt)
+   table = get_irq_table(devid);
+   if (!table)
return -ENODEV;
 
-   raw_spin_lock_irqsave(&irt->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
 
if (ref->lo.fields_vapic.guest_mode) {
if (cpu >= 0)
@@ -4432,7 +4432,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
barrier();
}
 
-   raw_spin_unlock_irqrestore(&irt->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
-- 
2.16.2



[PATCH 04/10] iommu/amd: split irq_lookup_table out of the amd_iommu_devtable_lock

2018-03-22 Thread Sebastian Andrzej Siewior
The function get_irq_table() reads/writes irq_lookup_table while holding
the amd_iommu_devtable_lock. It also modifies
amd_iommu_dev_table[].data[2].
set_dte_entry() uses amd_iommu_dev_table[].data[0|1] (under the
domain->lock), so that should be okay. The access to the iommu itself is
serialized with its own (the iommu's) lock.

So move get_irq_table() out from under the amd_iommu_devtable_lock and
protect irq_lookup_table with a separate iommu_table_lock.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fcfdce70707d..a87d2fee68db 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -82,6 +82,7 @@
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
+static DEFINE_SPINLOCK(iommu_table_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -3623,7 +3624,7 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
unsigned long flags;
u16 alias;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&iommu_table_lock, flags);
 
iommu = amd_iommu_rlookup_table[devid];
if (!iommu)
@@ -3688,7 +3689,7 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
iommu_completion_wait(iommu);
 
 out_unlock:
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&iommu_table_lock, flags);
 
return table;
 }
-- 
2.16.2



[PATCH 03/10] iommu/amd: split domain id out of amd_iommu_devtable_lock

2018-03-22 Thread Sebastian Andrzej Siewior
domain_id_alloc() and domain_id_free() are used for id management. Those
two functions share a bitmap (amd_iommu_pd_alloc_bitmap) and set/clear
bits based on id allocation. There is no need to share this with
amd_iommu_devtable_lock; id management can use its own lock for this
operation.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index d4c2b1a11924..fcfdce70707d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -81,6 +81,7 @@
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(pd_bitmap_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -1600,29 +1601,26 @@ static void del_domain_from_list(struct protection_domain *domain)
 
 static u16 domain_id_alloc(void)
 {
-   unsigned long flags;
int id;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
id = find_first_zero_bit(amd_iommu_pd_alloc_bitmap, MAX_DOMAIN_ID);
BUG_ON(id == 0);
if (id > 0 && id < MAX_DOMAIN_ID)
__set_bit(id, amd_iommu_pd_alloc_bitmap);
else
id = 0;
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 
return id;
 }
 
 static void domain_id_free(int id)
 {
-   unsigned long flags;
-
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
if (id > 0 && id < MAX_DOMAIN_ID)
__clear_bit(id, amd_iommu_pd_alloc_bitmap);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 }
 
 #define DEFINE_FREE_PT_FN(LVL, FN) \
-- 
2.16.2



[PATCH 02/10] iommu/amd: turn dev_data_list into a lock less list

2018-03-22 Thread Sebastian Andrzej Siewior
alloc_dev_data() adds new items to dev_data_list and search_dev_data()
searches for items in this list. Both protect the access to the list
with a spinlock.
There is no need to navigate back and forth within the list and there is
also no deleting of a specific item. This qualifies the list to become a
lock-less list and, as part of this, the spinlock can be removed.
With this change the ordering of the items within the list changes:
before the change, new items were added to the end of the list; now they
are added to the front. I don't think it matters, but I wanted to
mention it.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c   | 28 ++--
 drivers/iommu/amd_iommu_types.h |  2 +-
 2 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 121c54f1c768..d4c2b1a11924 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -83,8 +83,7 @@
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 
 /* List of all available dev_data structures */
-static LIST_HEAD(dev_data_list);
-static DEFINE_SPINLOCK(dev_data_list_lock);
+static LLIST_HEAD(dev_data_list);
 
 LIST_HEAD(ioapic_map);
 LIST_HEAD(hpet_map);
@@ -203,40 +202,33 @@ static struct dma_ops_domain* to_dma_ops_domain(struct protection_domain *domain)
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
 
dev_data = kzalloc(sizeof(*dev_data), GFP_KERNEL);
if (!dev_data)
return NULL;
 
dev_data->devid = devid;
-
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_add_tail(&dev_data->dev_data_list, &dev_data_list);
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
ratelimit_default_init(&dev_data->rs);
 
+   llist_add(&dev_data->dev_data_list, &dev_data_list);
return dev_data;
 }
 
 static struct iommu_dev_data *search_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
+   struct llist_node *node;
 
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_for_each_entry(dev_data, &dev_data_list, dev_data_list) {
+   if (llist_empty(&dev_data_list))
+   return NULL;
+
+   node = dev_data_list.first;
+   llist_for_each_entry(dev_data, node, dev_data_list) {
if (dev_data->devid == devid)
-   goto out_unlock;
+   return dev_data;
}
 
-   dev_data = NULL;
-
-out_unlock:
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
-   return dev_data;
+   return NULL;
 }
 
 static int __last_alias(struct pci_dev *pdev, u16 alias, void *data)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index da886b0095aa..1c9b080276c9 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -627,7 +627,7 @@ struct devid_map {
  */
 struct iommu_dev_data {
struct list_head list;/* For domain->dev_list */
-   struct list_head dev_data_list;   /* For global dev_data_list */
+   struct llist_node dev_data_list;  /* For global dev_data_list */
struct protection_domain *domain; /* Domain the device is bound to */
u16 devid;/* PCI Device ID */
u16 alias;/* Alias Device ID */
-- 
2.16.2



[PATCH 01/10] iommu/amd: take into account that alloc_dev_data() may return NULL

2018-03-22 Thread Sebastian Andrzej Siewior
find_dev_data() does not check whether the return value of
alloc_dev_data() is NULL. This used to be okay because the pointer was
simply returned as-is. Since commit df3f7a6e8e85 ("iommu/amd: Use
is_attach_deferred call-back") the pointer may be used within
find_dev_data(), so a NULL check is required.

Cc: Baoquan He 
Fixes: df3f7a6e8e85 ("iommu/amd: Use is_attach_deferred call-back")
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4cd19f93ca15..121c54f1c768 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -310,6 +310,8 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
 
if (dev_data == NULL) {
dev_data = alloc_dev_data(devid);
+   if (!dev_data)
+   return NULL;
 
if (translation_pre_enabled(iommu))
dev_data->defer_attach = true;
-- 
2.16.2



[PATCH 00/10 v3] iommu/amd: lock splitting & GFP_KERNEL allocation

2018-03-22 Thread Sebastian Andrzej Siewior
Hi,

this is the rebase of my last series on top of iommu/x86/amd. It takes
Scott's comments on my v2 into consideration.

It contains lock splitting and GFP_KERNEL allocation of the remap table.
Mostly cleanup.

The patches were boot tested on an AMD EPYC 7601.

Sebastian



Re: [PATCH 00/10 v2] iommu/amd: lock splitting & GFP_KERNEL allocation

2018-03-19 Thread Sebastian Andrzej Siewior
On 2018-03-17 16:43:39 [-0500], Scott Wood wrote:
> If that's worth the lock dropping then fine (though why does only one
> of the two allocations use GFP_KERNEL?), but it doesn't need to be a
That was a mistake, I planned to keep both as GFP_KERNEL.

> raw lock if the non-allocating users are separated.  Keeping them
> separate will also preserve the WARNs if we somehow end up in an atomic
> context with no table (versus relying on atomic sleep debugging that
> may or may not be enabled), and make the code easier to understand by
> being explicit about which functions can be used from RT-atomic
> context.

That separated part is okay, we could keep it. However, I am not sure
whether looking at irq_lookup_table[devid] without the lock is okay:
the pointer is assigned before the DTE entry update/iommu flush has
completed. This does not look "okay".
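
A sketch of the interleaving I am worried about (pseudo-code, not a claim
that it happens in practice):

	/* CPU A: allocation path, iommu_table_lock held */
	irq_lookup_table[devid] = table;	/* pointer published ...    */
	set_dte_irq_entry(devid, table);	/* ... before the DTE entry */
	iommu_flush_dte(iommu, devid);		/* ... is set up and flushed */

	/* CPU B: lock-free reader */
	table = irq_lookup_table[devid];	/* may observe the pointer
						   before the flush completed */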

> -Scott
> 
Sebastian


Re: [PATCH 00/10 v2] iommu/amd: lock splitting & GFP_KERNEL allocation

2018-03-17 Thread Sebastian Andrzej Siewior
On 2018-03-17 14:49:54 [-0500], Scott Wood wrote:
> On Fri, 2018-03-16 at 21:18 +0100, Sebastian Andrzej Siewior wrote:
> > The goal here is to make the memory allocation in get_irq_table() not
> > with disabled interrupts and having as little raw_spin_lock as
> > possible
> > while having them if the caller is also holding one (like desc->lock
> > during IRQ-affinity changes).
> > I reverted one patch one patch in the iommu while rebasing since it
> > make job easier.
> 
> If the goal is to have "as little raw_spin_lock as possible" -- and
> presumably also to avoid unnecessary complexity -- wouldn't it be
> better to leave my patch in, and drop patches 4 and 9?

9 gives me GFP_KERNEL instead of atomic, so no.
4 is needed I think, but I could double check on Monday.

> -Scott

Sebastian


[PATCH 08/10] iommu/amd: factor out setting the remap table for a devid

2018-03-16 Thread Sebastian Andrzej Siewior
Setting the IRQ remap table for a specific devid (or its alias devid)
consists of three steps, which are repeated each time this is done.
Introduce a new helper function, move those steps there and use that
function instead. The compiler can still decide whether it is worth
inlining.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 6ee8ef22ad51..7dd4c27a941d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3602,6 +3602,14 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
+static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
+ struct irq_remap_table *table)
+{
+   irq_lookup_table[devid] = table;
+   set_dte_irq_entry(devid, table);
+   iommu_flush_dte(iommu, devid);
+}
+
 static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
@@ -3622,9 +3630,7 @@ static struct irq_remap_table *get_irq_table(u16 devid)
alias = amd_iommu_alias_table[devid];
table = irq_lookup_table[alias];
if (table) {
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
+   set_remap_table_entry(iommu, devid, table);
goto out;
}
 
@@ -3651,14 +3657,9 @@ static struct irq_remap_table *get_irq_table(u16 devid)
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
 
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
-   if (devid != alias) {
-   irq_lookup_table[alias] = table;
-   set_dte_irq_entry(alias, table);
-   iommu_flush_dte(iommu, alias);
-   }
+   set_remap_table_entry(iommu, devid, table);
+   if (devid != alias)
+   set_remap_table_entry(iommu, alias, table);
 
 out:
iommu_completion_wait(iommu);
-- 
2.16.2



[PATCH 10/10] iommu/amd: make amd_iommu_devtable_lock a spin_lock

2018-03-16 Thread Sebastian Andrzej Siewior
Before commit 0bb6e243d7fb ("iommu/amd: Support IOMMU_DOMAIN_DMA type
allocation") amd_iommu_devtable_lock had a read_lock() user, but now
there are none. In fact, after the mentioned commit we had only
write_lock() users of the lock. Since there is no reason to keep it as a
writer lock, change its type to a spin_lock.
I *think* that we might even be able to remove the lock entirely because
all its current users seem to have their own protection.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 0a7ca5e95288..6ed59f1f6d33 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -80,7 +80,7 @@
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
-static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 static DEFINE_RAW_SPINLOCK(iommu_table_lock);
 
@@ -2097,9 +2097,9 @@ static int attach_device(struct device *dev,
}
 
 skip_ats_check:
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
ret = __attach_device(dev_data, domain);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
/*
 * We might boot into a crash-kernel here. The crashed kernel
@@ -2149,9 +2149,9 @@ static void detach_device(struct device *dev)
domain   = dev_data->domain;
 
/* lock device table */
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
__detach_device(dev_data);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
if (!dev_is_pci(dev))
return;
@@ -2814,7 +2814,7 @@ static void cleanup_domain(struct protection_domain *domain)
struct iommu_dev_data *entry;
unsigned long flags;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
 
while (!list_empty(&domain->dev_list)) {
entry = list_first_entry(&domain->dev_list,
@@ -2822,7 +2822,7 @@ static void cleanup_domain(struct protection_domain *domain)
__detach_device(entry);
}
 
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 }
 
 static void protection_domain_free(struct protection_domain *domain)
-- 
2.16.2



[PATCH 09/10] iommu/amd: drop the lock while allocating new irq remap table

2018-03-16 Thread Sebastian Andrzej Siewior
The irq_remap_table is allocated while the iommu_table_lock is held with
interrupts disabled. While this works, it makes RT scream very loudly.
From looking at the call sites, all callers are in the early device
initialisation (apic_bsp_setup(), pci_enable_device(),
pci_enable_msi()), so it makes sense to drop the lock, which also enables
interrupts, and try to allocate that memory with GFP_KERNEL instead of
GFP_ATOMIC.

Since the iommu_table_lock is dropped during the allocation, we need to
recheck if the table exists after the lock has been reacquired. I *think*
that it is impossible for the "devid" entry to appear in irq_lookup_table
while the lock is dropped, since the same device can only be probed once.
It is more likely that another device added an `alias' entry. However, I
check for both cases, just to be sure.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 65 +--
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 7dd4c27a941d..0a7ca5e95288 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3602,6 +3602,30 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
+static struct irq_remap_table *alloc_irq_table(void)
+{
+   struct irq_remap_table *table;
+
+   table = kzalloc(sizeof(*table), GFP_ATOMIC);
+   if (!table)
+   return NULL;
+
+   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_KERNEL);
+   if (!table->table) {
+   kfree(table);
+   return NULL;
+   }
+   raw_spin_lock_init(&table->lock);
+
+   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
+   memset(table->table, 0,
+  MAX_IRQS_PER_TABLE * sizeof(u32));
+   else
+   memset(table->table, 0,
+  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
+   return table;
+}
+
 static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
  struct irq_remap_table *table)
 {
@@ -3613,6 +3637,7 @@ static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
 static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
+   struct irq_remap_table *new_table = NULL;
struct amd_iommu *iommu;
unsigned long flags;
u16 alias;
@@ -3631,42 +3656,44 @@ static struct irq_remap_table *get_irq_table(u16 devid)
table = irq_lookup_table[alias];
if (table) {
set_remap_table_entry(iommu, devid, table);
-   goto out;
+   goto out_wait;
}
+   raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
/* Nothing there yet, allocate new irq remapping table */
-   table = kzalloc(sizeof(*table), GFP_ATOMIC);
-   if (!table)
+   new_table = alloc_irq_table();
+   if (!new_table)
+   return NULL;
+
+   raw_spin_lock_irqsave(&iommu_table_lock, flags);
+
+   table = irq_lookup_table[devid];
+   if (table)
goto out_unlock;
 
-   /* Initialize table spin-lock */
-   raw_spin_lock_init(&table->lock);
-
-   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
-   if (!table->table) {
-   kfree(table);
-   table = NULL;
-   goto out_unlock;
+   table = irq_lookup_table[alias];
+   if (table) {
+   set_remap_table_entry(iommu, devid, table);
+   goto out_wait;
}
 
-   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
-   memset(table->table, 0,
-  MAX_IRQS_PER_TABLE * sizeof(u32));
-   else
-   memset(table->table, 0,
-  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
-
+   table = new_table;
+   new_table = NULL;
 
set_remap_table_entry(iommu, devid, table);
if (devid != alias)
set_remap_table_entry(iommu, alias, table);
 
-out:
+out_wait:
iommu_completion_wait(iommu);
 
 out_unlock:
raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
+   if (new_table) {
+   kmem_cache_free(amd_iommu_irq_cache, new_table->table);
+   kfree(new_table);
+   }
return table;
 }
 
-- 
2.16.2



[PATCH 06/10] iommu/amd: remove the special case from get_irq_table()

2018-03-16 Thread Sebastian Andrzej Siewior
get_irq_table() has a special ioapic argument. If set, it will
pre-allocate / reserve the first 32 indexes. The argument is true only
once, and get_irq_table() would be a little simpler if we extracted the
special bits to the caller.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 42 --
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 30ad2a3fbe15..e1628ff5f6bd 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3602,7 +3602,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
-static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
+static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
struct amd_iommu *iommu;
@@ -3636,10 +3636,6 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
/* Initialize table spin-lock */
raw_spin_lock_init(&table->lock);
 
-   if (ioapic)
-   /* Keep the first 32 indexes free for IOAPIC interrupts */
-   table->min_index = 32;
-
table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
if (!table->table) {
kfree(table);
@@ -3654,12 +3650,6 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
memset(table->table, 0,
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
-   if (ioapic) {
-   int i;
-
-   for (i = 0; i < 32; ++i)
-   iommu->irte_ops->set_allocated(table, i);
-   }
 
irq_lookup_table[devid] = table;
set_dte_irq_entry(devid, table);
@@ -3689,7 +3679,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
if (!iommu)
return -ENODEV;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENODEV;
 
@@ -3740,7 +3730,7 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENOMEM;
 
@@ -3773,7 +3763,7 @@ static int modify_irte(u16 devid, int index, union irte *irte)
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENOMEM;
 
@@ -3797,7 +3787,7 @@ static void free_irte(u16 devid, int index)
if (iommu == NULL)
return;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return;
 
@@ -4115,10 +4105,26 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
return ret;
 
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
-   if (get_irq_table(devid, true))
+   struct irq_remap_table *table;
+   struct amd_iommu *iommu;
+
+   table = get_irq_table(devid);
+   if (table) {
+   if (!table->min_index) {
+   /*
+* Keep the first 32 indexes free for IOAPIC
+* interrupts.
+*/
+   table->min_index = 32;
+   iommu = amd_iommu_rlookup_table[devid];
+   for (i = 0; i < 32; ++i)
+   iommu->irte_ops->set_allocated(table, i);
+   }
index = info->ioapic_pin;
-   else
+   WARN_ON(table->min_index != 32);
+   } else {
ret = -ENOMEM;
+   }
} else {
bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI);
 
@@ -4398,7 +4404,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid, false);
+   irt = get_irq_table(devid);
if (!irt)
return -ENODEV;
 
-- 
2.16.2



[PATCH 00/10 v2] iommu/amd: lock splitting & GFP_KERNEL allocation

2018-03-16 Thread Sebastian Andrzej Siewior
The goal here is to avoid doing the memory allocation in get_irq_table()
with interrupts disabled, and to use as few raw_spin_locks as possible
while still having them if the caller is also holding one (like
desc->lock during IRQ-affinity changes).
I reverted one patch in the iommu tree while rebasing since it made the
job easier.
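
The shape of the resulting allocation path, as a sketch (function and lock
names as used in this series):

	raw_spin_lock_irqsave(&iommu_table_lock, flags);
	table = irq_lookup_table[devid];
	if (!table) {
		raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
		new_table = alloc_irq_table();	/* GFP_KERNEL, may sleep */
		raw_spin_lock_irqsave(&iommu_table_lock, flags);
		/* recheck: a table may have been installed meanwhile */
		if (!irq_lookup_table[devid])
			irq_lookup_table[devid] = new_table;
	}
	raw_spin_unlock_irqrestore(&iommu_table_lock, flags);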

The patches were boot tested on an AMD EPYC 7601.

Sebastian



[PATCH 07/10] iommu/amd: use `table' instead `irt' as variable name in amd_iommu_update_ga()

2018-03-16 Thread Sebastian Andrzej Siewior
The variable of type struct irq_remap_table is always named `table',
except in amd_iommu_update_ga() where it is called `irt'. Make it
consistent and name it `table' there as well.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e1628ff5f6bd..6ee8ef22ad51 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4390,7 +4390,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
 {
unsigned long flags;
struct amd_iommu *iommu;
-   struct irq_remap_table *irt;
+   struct irq_remap_table *table;
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
int devid = ir_data->irq_2_irte.devid;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
@@ -4404,11 +4404,11 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid);
-   if (!irt)
+   table = get_irq_table(devid);
+   if (!table)
return -ENODEV;
 
-   raw_spin_lock_irqsave(&irt->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
 
if (ref->lo.fields_vapic.guest_mode) {
if (cpu >= 0)
@@ -4417,7 +4417,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
barrier();
}
 
-   raw_spin_unlock_irqrestore(&irt->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
-- 
2.16.2



[PATCH 05/10] Revert "iommu/amd: Avoid locking get_irq_table() from atomic context"

2018-03-16 Thread Sebastian Andrzej Siewior
This reverts commit df42a04b15f1 ("iommu/amd: Avoid locking
get_irq_table() from atomic context").
Its goal is to avoid a warning/bug on RT. While I generally support that
goal, this change collides with a larger rework which accomplishes the
same goal differently.

Cc: Scott Wood 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 29 +++--
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 5191319d9f0a..30ad2a3fbe15 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3602,22 +3602,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
-static struct irq_remap_table *get_irq_table(u16 devid)
-{
-   struct irq_remap_table *table;
-
-   if (WARN_ONCE(!amd_iommu_rlookup_table[devid],
- "%s: no iommu for devid %x\n", __func__, devid))
-   return NULL;
-
-   table = irq_lookup_table[devid];
-   if (WARN_ONCE(!table, "%s: no table for devid %x\n", __func__, devid))
-   return NULL;
-
-   return table;
-}
-
-static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
+static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
 {
struct irq_remap_table *table = NULL;
struct amd_iommu *iommu;
@@ -3704,7 +3689,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
if (!iommu)
return -ENODEV;
 
-   table = alloc_irq_table(devid, false);
+   table = get_irq_table(devid, false);
if (!table)
return -ENODEV;
 
@@ -3755,7 +3740,7 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid);
+   table = get_irq_table(devid, false);
if (!table)
return -ENOMEM;
 
@@ -3788,7 +3773,7 @@ static int modify_irte(u16 devid, int index, union irte *irte)
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid);
+   table = get_irq_table(devid, false);
if (!table)
return -ENOMEM;
 
@@ -3812,7 +3797,7 @@ static void free_irte(u16 devid, int index)
if (iommu == NULL)
return;
 
-   table = get_irq_table(devid);
+   table = get_irq_table(devid, false);
if (!table)
return;
 
@@ -4130,7 +4115,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
return ret;
 
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
-   if (alloc_irq_table(devid, true))
+   if (get_irq_table(devid, true))
index = info->ioapic_pin;
else
ret = -ENOMEM;
@@ -4413,7 +4398,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid);
+   irt = get_irq_table(devid, false);
if (!irt)
return -ENODEV;
 
-- 
2.16.2



[PATCH 04/10] iommu/amd: split irq_lookup_table out of the amd_iommu_devtable_lock

2018-03-16 Thread Sebastian Andrzej Siewior
The function get_irq_table() reads/writes irq_lookup_table while holding
the amd_iommu_devtable_lock. It also modifies
amd_iommu_dev_table[].data[2].
set_dte_entry() uses amd_iommu_dev_table[].data[0|1] (under the
domain->lock), so that should be okay. The access to the iommu itself
is serialized with the iommu's own lock.

So move get_irq_table() out from under amd_iommu_devtable_lock and give
it its own lock. The new lock is a raw_spin_lock because
modify_irte_ga() is called while desc->lock is held (which is raw).
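
(Illustration only, not part of the patch: the PREEMPT_RT rule that
forces the raw type, with made-up lock names. On RT a spinlock_t is a
sleeping lock, so it must never be taken while a raw_spinlock_t is
held. Needs <linux/spinlock.h>.)

static DEFINE_RAW_SPINLOCK(outer_lock);	/* like desc->lock: spins on RT too */
static DEFINE_SPINLOCK(inner_lock);	/* sleeps on RT */

static void broken_nesting(void)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&outer_lock, flags);
	/*
	 * BUG on RT: spin_lock() may sleep, but we hold a raw lock
	 * with interrupts off. This is why a lock taken underneath
	 * desc->lock has to be a raw_spinlock_t itself.
	 */
	spin_lock(&inner_lock);
	spin_unlock(&inner_lock);
	raw_spin_unlock_irqrestore(&outer_lock, flags);
}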

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 692b2e3b9af1..5191319d9f0a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -82,6 +82,7 @@
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
+static DEFINE_RAW_SPINLOCK(iommu_table_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -3623,7 +3624,7 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
unsigned long flags;
u16 alias;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   raw_spin_lock_irqsave(&iommu_table_lock, flags);
 
iommu = amd_iommu_rlookup_table[devid];
if (!iommu)
@@ -3688,7 +3689,7 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, bool ioapic)
iommu_completion_wait(iommu);
 
 out_unlock:
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
return table;
 }
-- 
2.16.2



[PATCH 03/10] iommu/amd: split domain id out of amd_iommu_devtable_lock

2018-03-16 Thread Sebastian Andrzej Siewior
domain_id_alloc() and domain_id_free() are used for id management. Those
two functions share a bitmap (amd_iommu_pd_alloc_bitmap) and set/clear
bits based on id allocation. There is no need to share this with
amd_iommu_devtable_lock; it can use its own lock for this operation.
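
(Not part of the patch - a condensed sketch of the pattern for
reference, with made-up names. Two things worth noting: the new lock is
taken without _irqsave, which assumes these paths never need the lock
from interrupt context, and - if I read the init code right - bit 0 of
the bitmap is set during init so that id 0 is never handed out, which
is what makes the BUG_ON(id == 0) in the hunk below safe. Needs
<linux/spinlock.h> and <linux/bitmap.h>.)

#define EXAMPLE_MAX_ID	64

static DEFINE_SPINLOCK(example_bitmap_lock);
static DECLARE_BITMAP(example_bitmap, EXAMPLE_MAX_ID) = { 1UL };	/* id 0 reserved */

static int example_id_alloc(void)
{
	int id;

	spin_lock(&example_bitmap_lock);
	id = find_first_zero_bit(example_bitmap, EXAMPLE_MAX_ID);
	if (id < EXAMPLE_MAX_ID)
		__set_bit(id, example_bitmap);	/* non-atomic is fine under the lock */
	else
		id = 0;				/* bitmap full: 0 signals failure */
	spin_unlock(&example_bitmap_lock);

	return id;
}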

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 0e57ce2c258b..692b2e3b9af1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -81,6 +81,7 @@
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(pd_bitmap_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -1600,29 +1601,26 @@ static void del_domain_from_list(struct protection_domain *domain)
 
 static u16 domain_id_alloc(void)
 {
-   unsigned long flags;
int id;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
id = find_first_zero_bit(amd_iommu_pd_alloc_bitmap, MAX_DOMAIN_ID);
BUG_ON(id == 0);
if (id > 0 && id < MAX_DOMAIN_ID)
__set_bit(id, amd_iommu_pd_alloc_bitmap);
else
id = 0;
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 
return id;
 }
 
 static void domain_id_free(int id)
 {
-   unsigned long flags;
-
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
if (id > 0 && id < MAX_DOMAIN_ID)
__clear_bit(id, amd_iommu_pd_alloc_bitmap);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 }
 
 #define DEFINE_FREE_PT_FN(LVL, FN) \
-- 
2.16.2



[PATCH 01/10] iommu/amd: take into account that alloc_dev_data() may return NULL

2018-03-16 Thread Sebastian Andrzej Siewior
find_dev_data() does not check whether the return value of
alloc_dev_data() is NULL. This used to be okay because the pointer was
simply returned as-is. Since commit df3f7a6e8e85 ("iommu/amd: Use
is_attach_deferred call-back") the pointer may be dereferenced within
find_dev_data(), so a NULL check is required.

Cc: Baoquan He 
Fixes: df3f7a6e8e85 ("iommu/amd: Use is_attach_deferred call-back")
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 05f55903dbf6..b50dcb17c68f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -310,6 +310,8 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
 
if (dev_data == NULL) {
dev_data = alloc_dev_data(devid);
+   if (!dev_data)
+   return NULL;
 
if (translation_pre_enabled(iommu))
dev_data->defer_attach = true;
-- 
2.16.2



[PATCH 02/10] iommu/amd: turn dev_data_list into a lock less list

2018-03-16 Thread Sebastian Andrzej Siewior
alloc_dev_data() adds new items to dev_data_list and search_dev_data()
searches for items in this list. Both protect access to the list with a
spinlock.
There is no need to navigate back and forth within the list and there
is also no deletion of individual items. This qualifies the list to
become a lock-less list and, as part of this, the spinlock can be
removed.
With this change the ordering of the items within the list changes:
before the change new items were added to the end of the list, now they
are added to the front. I don't think it matters but wanted to mention
it.
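
(Not part of the patch: a minimal sketch of the llist pattern the list
is converted to, with a made-up item type. llist_add() pushes to the
front with a cmpxchg loop - hence the ordering change mentioned above -
and the lock-less traversal is safe here because items are only ever
added, never removed. Needs <linux/llist.h>.)

struct example_item {
	u16 id;
	struct llist_node node;
};

static LLIST_HEAD(example_list);

static void example_add(struct example_item *item)
{
	llist_add(&item->node, &example_list);	/* lock-less prepend */
}

static struct example_item *example_find(u16 id)
{
	struct example_item *item;

	llist_for_each_entry(item, example_list.first, node) {
		if (item->id == id)
			return item;
	}
	return NULL;
}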

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c   | 28 ++--
 drivers/iommu/amd_iommu_types.h |  2 +-
 2 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b50dcb17c68f..0e57ce2c258b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -83,8 +83,7 @@
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 
 /* List of all available dev_data structures */
-static LIST_HEAD(dev_data_list);
-static DEFINE_SPINLOCK(dev_data_list_lock);
+static LLIST_HEAD(dev_data_list);
 
 LIST_HEAD(ioapic_map);
 LIST_HEAD(hpet_map);
@@ -203,40 +202,33 @@ static struct dma_ops_domain* to_dma_ops_domain(struct protection_domain *domain
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
 
dev_data = kzalloc(sizeof(*dev_data), GFP_KERNEL);
if (!dev_data)
return NULL;
 
dev_data->devid = devid;
-
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_add_tail(&dev_data->dev_data_list, &dev_data_list);
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
ratelimit_default_init(&dev_data->rs);
 
+   llist_add(&dev_data->dev_data_list, &dev_data_list);
return dev_data;
 }
 
 static struct iommu_dev_data *search_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
+   struct llist_node *node;
 
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_for_each_entry(dev_data, &dev_data_list, dev_data_list) {
+   if (llist_empty(&dev_data_list))
+   return NULL;
+
+   node = dev_data_list.first;
+   llist_for_each_entry(dev_data, node, dev_data_list) {
if (dev_data->devid == devid)
-   goto out_unlock;
+   return dev_data;
}
 
-   dev_data = NULL;
-
-out_unlock:
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
-   return dev_data;
+   return NULL;
 }
 
 static int __last_alias(struct pci_dev *pdev, u16 alias, void *data)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index da886b0095aa..1c9b080276c9 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -627,7 +627,7 @@ struct devid_map {
  */
 struct iommu_dev_data {
struct list_head list;/* For domain->dev_list */
-   struct list_head dev_data_list;   /* For global dev_data_list */
+   struct llist_node dev_data_list;  /* For global dev_data_list */
struct protection_domain *domain; /* Domain the device is bound to */
u16 devid;/* PCI Device ID */
u16 alias;/* Alias Device ID */
-- 
2.16.2



Re: [PATCH 05/10] iommu/amd: remove the special case from get_irq_table()

2018-03-15 Thread Sebastian Andrzej Siewior
On 2018-03-15 16:19:17 [+0100], Joerg Roedel wrote:
> Okay, if the irq-layer does the needed locking, then we don't need
> another lock here. There is the modify_irte_ga() path for the
> virtualized irq routing into KVM guests, but there should be no KVM
> guests running when setting up the ioapic routing entries.

okay then. Let me rebase the queue on top of whatever you have now, and
if the box still boots I will post the patches again.

>   Joerg

Sebastian


Re: [PATCH 05/10] iommu/amd: remove the special case from get_irq_table()

2018-03-15 Thread Sebastian Andrzej Siewior
On 2018-03-15 14:07:23 [+0100], Joerg Roedel wrote:
> On Thu, Mar 15, 2018 at 02:01:53PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2018-03-15 13:53:42 [+0100], Joerg Roedel wrote:
> > > On Fri, Feb 23, 2018 at 11:27:31PM +0100, Sebastian Andrzej Siewior wrote:
> > > > @@ -4103,10 +4093,26 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
> > > > return ret;
> > > >  
> > > > if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
> > > > -   if (get_irq_table(devid, true))
> > > > +   struct irq_remap_table *table;
> > > > +   struct amd_iommu *iommu;
> > > > +
> > > > +   table = get_irq_table(devid);
> > > > +   if (table) {
> > > > +   if (!table->min_index) {
> > > > +   /*
> > > > +* Keep the first 32 indexes free for IOAPIC
> > > > +* interrupts.
> > > > +*/
> > > > +   table->min_index = 32;
> > > > +   iommu = amd_iommu_rlookup_table[devid];
> > > > +   for (i = 0; i < 32; ++i)
> > > > +   iommu->irte_ops->set_allocated(table, i);
> > > > +   }
> > > 
> > > I think this needs to be protected by the table->lock.
> > 
> > Which part and why? The !->min_index part plus extending (setting to 32)?
> 
> In particular the set_allocated part, when get_irq_table() returns the
> table is visible to other users as well. I have not checked the
> irq-layer locking to be sure that can happen, though.

->set_allocated() operates only on indexes 0…31 while others could be
used at the same time. However, 0…31 should not be accessed by any
other user before they are ready.

irq_remapping_alloc() is the ->alloc() callback invoked via
irq_domain_alloc_irqs_hierarchy(), and each caller has to hold the
irq_domain_mutex. So we should not see those in parallel.

Is it possible to have those entries accessed before the setup is
complete? My understanding is that this setup is performed once during
boot (especially the ioapic part) and not again.

From looking at that hunk, it should not hurt to add that lock; I just
wanted to check whether it is really needed.

Sebastian

Re: [PATCH 05/10] iommu/amd: remove the special case from get_irq_table()

2018-03-15 Thread Sebastian Andrzej Siewior
On 2018-03-15 13:53:42 [+0100], Joerg Roedel wrote:
> On Fri, Feb 23, 2018 at 11:27:31PM +0100, Sebastian Andrzej Siewior wrote:
> > @@ -4103,10 +4093,26 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
> > return ret;
> >  
> > if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
> > -   if (get_irq_table(devid, true))
> > +   struct irq_remap_table *table;
> > +   struct amd_iommu *iommu;
> > +
> > +   table = get_irq_table(devid);
> > +   if (table) {
> > +   if (!table->min_index) {
> > +   /*
> > +* Keep the first 32 indexes free for IOAPIC
> > +* interrupts.
> > +*/
> > +   table->min_index = 32;
> > +   iommu = amd_iommu_rlookup_table[devid];
> > +   for (i = 0; i < 32; ++i)
> > +   iommu->irte_ops->set_allocated(table, i);
> > +   }
> 
> I think this needs to be protected by the table->lock.

Which part and why? The !->min_index part plus extending (setting to 32)?

Sebastian


[PATCH 10/10] iommu/amd: make amd_iommu_devtable_lock a spin_lock

2018-02-23 Thread Sebastian Andrzej Siewior
Before commit 0bb6e243d7fb ("iommu/amd: Support IOMMU_DOMAIN_DMA type
allocation") amd_iommu_devtable_lock had a read_lock() user, but now
there are none. In fact, since the mentioned commit there are only
write_lock() users of the lock. As there is no reason to keep it as a
reader/writer lock, change its type to a spin_lock.
I *think* we might even be able to remove the lock entirely because all
its current users seem to have their own protection.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 04b7d263523f..cabb02cf6d42 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -80,7 +80,7 @@
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
-static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 static DEFINE_RAW_SPINLOCK(iommu_table_lock);
 
@@ -2096,9 +2096,9 @@ static int attach_device(struct device *dev,
}
 
 skip_ats_check:
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
ret = __attach_device(dev_data, domain);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
/*
 * We might boot into a crash-kernel here. The crashed kernel
@@ -2148,9 +2148,9 @@ static void detach_device(struct device *dev)
domain   = dev_data->domain;
 
/* lock device table */
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
__detach_device(dev_data);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 
if (!dev_is_pci(dev))
return;
@@ -2813,7 +2813,7 @@ static void cleanup_domain(struct protection_domain *domain)
struct iommu_dev_data *entry;
unsigned long flags;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock_irqsave(&amd_iommu_devtable_lock, flags);
 
while (!list_empty(&domain->dev_list)) {
entry = list_first_entry(&domain->dev_list,
@@ -2821,7 +2821,7 @@ static void cleanup_domain(struct protection_domain *domain)
__detach_device(entry);
}
 
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
 }
 
 static void protection_domain_free(struct protection_domain *domain)
-- 
2.16.1



[PATCH 05/10] iommu/amd: remove the special case from get_irq_table()

2018-02-23 Thread Sebastian Andrzej Siewior
get_irq_table() has a special ioapic argument. If set, it will
pre-allocate / reserve the first 32 indexes. The argument is true for
only one caller, and it would make get_irq_table() a little simpler if
we extracted the special bits into that caller.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 42 --
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 72487ac43eef..19de479fe21c 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3588,7 +3588,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
-static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
+static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
struct amd_iommu *iommu;
@@ -3622,10 +3622,6 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
/* Initialize table spin-lock */
spin_lock_init(&table->lock);
 
-   if (ioapic)
-   /* Keep the first 32 indexes free for IOAPIC interrupts */
-   table->min_index = 32;
-
table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
if (!table->table) {
kfree(table);
@@ -3640,12 +3636,6 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
memset(table->table, 0,
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
-   if (ioapic) {
-   int i;
-
-   for (i = 0; i < 32; ++i)
-   iommu->irte_ops->set_allocated(table, i);
-   }
 
irq_lookup_table[devid] = table;
set_dte_irq_entry(devid, table);
@@ -3675,7 +3665,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
if (!iommu)
return -ENODEV;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENODEV;
 
@@ -3726,7 +3716,7 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENOMEM;
 
@@ -3759,7 +3749,7 @@ static int modify_irte(u16 devid, int index, union irte *irte)
if (iommu == NULL)
return -EINVAL;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return -ENOMEM;
 
@@ -3783,7 +3773,7 @@ static void free_irte(u16 devid, int index)
if (iommu == NULL)
return;
 
-   table = get_irq_table(devid, false);
+   table = get_irq_table(devid);
if (!table)
return;
 
@@ -4103,10 +4093,26 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq,
return ret;
 
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
-   if (get_irq_table(devid, true))
+   struct irq_remap_table *table;
+   struct amd_iommu *iommu;
+
+   table = get_irq_table(devid);
+   if (table) {
+   if (!table->min_index) {
+   /*
+* Keep the first 32 indexes free for IOAPIC
+* interrupts.
+*/
+   table->min_index = 32;
+   iommu = amd_iommu_rlookup_table[devid];
+   for (i = 0; i < 32; ++i)
+   iommu->irte_ops->set_allocated(table, i);
+   }
index = info->ioapic_pin;
-   else
+   WARN_ON(table->min_index != 32);
+   } else {
ret = -ENOMEM;
+   }
} else {
bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI);
 
@@ -4386,7 +4392,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid, false);
+   irt = get_irq_table(devid);
if (!irt)
return -ENODEV;
 
-- 
2.16.1



[PATCH 09/10] iommu/amd: declare irq_remap_table's and amd_iommu's lock as a raw_spin_lock

2018-02-23 Thread Sebastian Andrzej Siewior
The irq affinity setting is performed while desc->lock is held. The
desc->lock is a raw_spin_lock acquired with interrupts disabled. The
call chain involves modify_irte_ga(), which needs to take the
irq_remap_table->lock in order to update the entry, and later
iommu->lock in order to update and flush the iommu.
The latter is also required during table allocation.

I currently don't see a feasible way of getting this done without
turning both locks raw, so here it is.
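
(For illustration, the chain described above spelled out as I read it;
this is why both locks end up raw:)

/*
 * irq_set_affinity() callers hold desc->lock, a raw_spinlock_t,
 * with interrupts disabled:
 *
 *   desc->lock (raw)
 *     amd_ir_set_affinity()
 *       modify_irte_ga()
 *         table->lock                -> must therefore be raw
 *         iommu_flush_irt() /
 *         iommu_completion_wait()
 *           iommu->lock              -> must therefore be raw
 */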

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c   | 30 +++---
 drivers/iommu/amd_iommu_init.c  |  2 +-
 drivers/iommu/amd_iommu_types.h |  4 ++--
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index de6cc41d6cd2..04b7d263523f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1052,9 +1052,9 @@ static int iommu_queue_command_sync(struct amd_iommu *iommu,
unsigned long flags;
int ret;
 
-   spin_lock_irqsave(&iommu->lock, flags);
+   raw_spin_lock_irqsave(&iommu->lock, flags);
ret = __iommu_queue_command_sync(iommu, cmd, sync);
-   spin_unlock_irqrestore(&iommu->lock, flags);
+   raw_spin_unlock_irqrestore(&iommu->lock, flags);
 
return ret;
 }
@@ -1080,7 +1080,7 @@ static int iommu_completion_wait(struct amd_iommu *iommu)
 
build_completion_wait(&cmd, (u64)&iommu->cmd_sem);
 
-   spin_lock_irqsave(&iommu->lock, flags);
+   raw_spin_lock_irqsave(&iommu->lock, flags);
 
iommu->cmd_sem = 0;
 
@@ -1091,7 +1091,7 @@ static int iommu_completion_wait(struct amd_iommu *iommu)
ret = wait_on_sem(&iommu->cmd_sem);
 
 out_unlock:
-   spin_unlock_irqrestore(&iommu->lock, flags);
+   raw_spin_unlock_irqrestore(&iommu->lock, flags);
 
return ret;
 }
@@ -3601,7 +3601,7 @@ static struct irq_remap_table *alloc_irq_table(void)
kfree(table);
return NULL;
}
-   spin_lock_init(&table->lock);
+   raw_spin_lock_init(&table->lock);
 
if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
memset(table->table, 0,
@@ -3700,7 +3700,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
if (align)
alignment = roundup_pow_of_two(count);
 
-   spin_lock_irqsave(&table->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
 
/* Scan table for free entries */
for (index = ALIGN(table->min_index, alignment), c = 0;
@@ -3727,7 +3727,7 @@ static int alloc_irq_index(u16 devid, int count, bool align)
index = -ENOSPC;
 
 out:
-   spin_unlock_irqrestore(&table->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
return index;
 }
@@ -3748,7 +3748,7 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
if (!table)
return -ENOMEM;
 
-   spin_lock_irqsave(&table->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
 
entry = (struct irte_ga *)table->table;
entry = &entry[index];
@@ -3759,7 +3759,7 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
if (data)
data->ref = entry;
 
-   spin_unlock_irqrestore(&table->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
@@ -3781,9 +3781,9 @@ static int modify_irte(u16 devid, int index, union irte *irte)
if (!table)
return -ENOMEM;
 
-   spin_lock_irqsave(&table->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
table->table[index] = irte->val;
-   spin_unlock_irqrestore(&table->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
@@ -3805,9 +3805,9 @@ static void free_irte(u16 devid, int index)
if (!table)
return;
 
-   spin_lock_irqsave(&table->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
iommu->irte_ops->clear_allocated(table, index);
-   spin_unlock_irqrestore(&table->lock, flags);
+   raw_spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
@@ -4424,7 +4424,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!table)
return -ENODEV;
 
-   spin_lock_irqsave(&table->lock, flags);
+   raw_spin_lock_irqsave(&table->lock, flags);
 
if (ref->lo.fields_vapic.guest_mode) {
if (cpu >= 0)

[PATCH 01/10] iommu/amd: take into account that alloc_dev_data() may return NULL

2018-02-23 Thread Sebastian Andrzej Siewior
find_dev_data() does not check whether the return value of
alloc_dev_data() is NULL. This used to be okay because the pointer was
simply returned as-is. Since commit df3f7a6e8e85 ("iommu/amd: Use
is_attach_deferred call-back") the pointer may be dereferenced within
find_dev_data(), so a NULL check is required.

Cc: Baoquan He 
Fixes: df3f7a6e8e85 ("iommu/amd: Use is_attach_deferred call-back")
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 74788fdeb773..8b591c192daf 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -310,6 +310,8 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
 
if (dev_data == NULL) {
dev_data = alloc_dev_data(devid);
+   if (!dev_data)
+   return NULL;
 
if (translation_pre_enabled(iommu))
dev_data->defer_attach = true;
-- 
2.16.1



[PATCH 06/10] iommu/amd: use `table' instead `irt' as variable name in amd_iommu_update_ga()

2018-02-23 Thread Sebastian Andrzej Siewior
The variable of type struct irq_remap_table is always named `table'
except in amd_iommu_update_ga() where it is called `irt'. Make it
consistent and name it also `table'.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 19de479fe21c..d3a05d794218 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4378,7 +4378,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
 {
unsigned long flags;
struct amd_iommu *iommu;
-   struct irq_remap_table *irt;
+   struct irq_remap_table *table;
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
int devid = ir_data->irq_2_irte.devid;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
@@ -4392,11 +4392,11 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
if (!iommu)
return -ENODEV;
 
-   irt = get_irq_table(devid);
-   if (!irt)
+   table = get_irq_table(devid);
+   if (!table)
return -ENODEV;
 
-   spin_lock_irqsave(&irt->lock, flags);
+   spin_lock_irqsave(&table->lock, flags);
 
if (ref->lo.fields_vapic.guest_mode) {
if (cpu >= 0)
@@ -4405,7 +4405,7 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *data)
barrier();
}
 
-   spin_unlock_irqrestore(&irt->lock, flags);
+   spin_unlock_irqrestore(&table->lock, flags);
 
iommu_flush_irt(iommu, devid);
iommu_completion_wait(iommu);
-- 
2.16.1



[PATCH 07/10] iommu/amd: factor out setting the remap table for a devid

2018-02-23 Thread Sebastian Andrzej Siewior
Setting the IRQ remap table for a specific devid (or its alias devid)
consists of three steps, and those three steps are repeated each time
this is done.
Introduce a new helper function, move the steps there and use that
function instead. The compiler can still decide whether it is worth
inlining.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index d3a05d794218..b763fcbd790d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3588,6 +3588,14 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
+static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
+ struct irq_remap_table *table)
+{
+   irq_lookup_table[devid] = table;
+   set_dte_irq_entry(devid, table);
+   iommu_flush_dte(iommu, devid);
+}
+
 static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
@@ -3608,9 +3616,7 @@ static struct irq_remap_table *get_irq_table(u16 devid)
alias = amd_iommu_alias_table[devid];
table = irq_lookup_table[alias];
if (table) {
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
+   set_remap_table_entry(iommu, devid, table);
goto out;
}
 
@@ -3637,14 +3643,9 @@ static struct irq_remap_table *get_irq_table(u16 devid)
   (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
 
-   irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
-   iommu_flush_dte(iommu, devid);
-   if (devid != alias) {
-   irq_lookup_table[alias] = table;
-   set_dte_irq_entry(alias, table);
-   iommu_flush_dte(iommu, alias);
-   }
+   set_remap_table_entry(iommu, devid, table);
+   if (devid != alias)
+   set_remap_table_entry(iommu, alias, table);
 
 out:
iommu_completion_wait(iommu);
-- 
2.16.1



[PATCH 08/10] iommu/amd: drop the lock while allocating new irq remap table

2018-02-23 Thread Sebastian Andrzej Siewior
The irq_remap_table is allocated while the iommu_table_lock is held
with interrupts disabled. While this works, it makes RT scream very
loudly.
From looking at the call sites, all callers are in the early device
initialisation (apic_bsp_setup(), pci_enable_device(),
pci_enable_msi()), so it makes sense to drop the lock - which also
enables interrupts - and to allocate that memory with GFP_KERNEL
instead of GFP_ATOMIC.

Since the iommu_table_lock is dropped during the allocation, we need to
recheck whether the table exists after the lock has been reacquired. I
*think* it is impossible for the "devid" entry to appear in
irq_lookup_table while the lock is dropped, since the same device can
only be probed once. It is more likely that another device added an
`alias' entry. However, I check for both cases, just to be sure.
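
(Generic sketch of the pattern, not the patch itself; all names here
are made up. Drop the lock around the sleeping allocation, then recheck
the lookup because another path may have installed an entry while we
slept. Needs <linux/spinlock.h> and <linux/slab.h>.)

#define EXAMPLE_ENTRIES	16

struct thing { int dummy; };

static DEFINE_RAW_SPINLOCK(example_lock);
static struct thing *example_table[EXAMPLE_ENTRIES];

static struct thing *get_or_alloc_thing(u16 id)
{
	struct thing *t, *new_t;
	unsigned long flags;

	raw_spin_lock_irqsave(&example_lock, flags);
	t = example_table[id];
	raw_spin_unlock_irqrestore(&example_lock, flags);
	if (t)
		return t;

	new_t = kzalloc(sizeof(*new_t), GFP_KERNEL);	/* may sleep now */
	if (!new_t)
		return NULL;

	raw_spin_lock_irqsave(&example_lock, flags);
	t = example_table[id];			/* recheck after reacquiring */
	if (!t) {
		example_table[id] = new_t;
		t = new_t;
		new_t = NULL;
	}
	raw_spin_unlock_irqrestore(&example_lock, flags);

	kfree(new_t);				/* lost the race: free our copy */
	return t;
}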

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 65 +--
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b763fcbd790d..de6cc41d6cd2 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3588,6 +3588,30 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
+static struct irq_remap_table *alloc_irq_table(void)
+{
+   struct irq_remap_table *table;
+
+   table = kzalloc(sizeof(*table), GFP_KERNEL);
+   if (!table)
+   return NULL;
+
+   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_KERNEL);
+   if (!table->table) {
+   kfree(table);
+   return NULL;
+   }
+   spin_lock_init(&table->lock);
+
+   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
+   memset(table->table, 0,
+  MAX_IRQS_PER_TABLE * sizeof(u32));
+   else
+   memset(table->table, 0,
+  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
+   return table;
+}
+
 static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
  struct irq_remap_table *table)
 {
@@ -3599,6 +3623,7 @@ static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
 static struct irq_remap_table *get_irq_table(u16 devid)
 {
struct irq_remap_table *table = NULL;
+   struct irq_remap_table *new_table = NULL;
struct amd_iommu *iommu;
unsigned long flags;
u16 alias;
@@ -3617,42 +3642,44 @@ static struct irq_remap_table *get_irq_table(u16 devid)
table = irq_lookup_table[alias];
if (table) {
set_remap_table_entry(iommu, devid, table);
-   goto out;
+   goto out_wait;
}
+   raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
/* Nothing there yet, allocate new irq remapping table */
-   table = kzalloc(sizeof(*table), GFP_ATOMIC);
-   if (!table)
+   new_table = alloc_irq_table();
+   if (!new_table)
+   return NULL;
+
+   raw_spin_lock_irqsave(&iommu_table_lock, flags);
+
+   table = irq_lookup_table[devid];
+   if (table)
goto out_unlock;
 
-   /* Initialize table spin-lock */
-   spin_lock_init(&table->lock);
-
-   table->table = kmem_cache_alloc(amd_iommu_irq_cache, GFP_ATOMIC);
-   if (!table->table) {
-   kfree(table);
-   table = NULL;
-   goto out_unlock;
+   table = irq_lookup_table[alias];
+   if (table) {
+   set_remap_table_entry(iommu, devid, table);
+   goto out_wait;
}
 
-   if (!AMD_IOMMU_GUEST_IR_GA(amd_iommu_guest_ir))
-   memset(table->table, 0,
-  MAX_IRQS_PER_TABLE * sizeof(u32));
-   else
-   memset(table->table, 0,
-  (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
-
+   table = new_table;
+   new_table = NULL;
 
set_remap_table_entry(iommu, devid, table);
if (devid != alias)
set_remap_table_entry(iommu, alias, table);
 
-out:
+out_wait:
iommu_completion_wait(iommu);
 
 out_unlock:
raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
+   if (new_table) {
+   kmem_cache_free(amd_iommu_irq_cache, new_table->table);
+   kfree(new_table);
+   }
return table;
 }
 
-- 
2.16.1



iommu/amd: lock splitting & GFP_KERNEL allocation

2018-02-23 Thread Sebastian Andrzej Siewior
Hi,

I have no idea why, but suddenly my A10 box complained loudly about
locking and memory allocations within the iommu code under RT. Looking
at the code, it has been like this for a long time, so the iommu must
have appeared on this box only recently (well, there was a BIOS upgrade
due to other issues, so that might have enabled it).

The goal here is to perform the memory allocation in get_irq_table()
without disabled interrupts and to use as few raw_spin_locks as
possible, while still using them where the caller is already holding
one (like desc->lock during IRQ-affinity changes).

The patches were boot tested on an AMD EPYC 7601.

Sebastian



[PATCH 04/10] iommu/amd: split irq_lookup_table out of the amd_iommu_devtable_lock

2018-02-23 Thread Sebastian Andrzej Siewior
The function get_irq_table() reads/writes irq_lookup_table while holding
the amd_iommu_devtable_lock. It also modifies
amd_iommu_dev_table[].data[2].
set_dte_entry() uses amd_iommu_dev_table[].data[0|1] (under the
domain->lock), so that should be okay. The access to the iommu itself
is serialized with the iommu's own lock.

So move get_irq_table() out from under amd_iommu_devtable_lock and give
it its own lock. The new lock is a raw_spin_lock because
modify_irte_ga() is called while desc->lock is held (which is raw).

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 958efe311057..72487ac43eef 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -82,6 +82,7 @@
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 static DEFINE_SPINLOCK(pd_bitmap_lock);
+static DEFINE_RAW_SPINLOCK(iommu_table_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -3594,7 +3595,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
unsigned long flags;
u16 alias;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   raw_spin_lock_irqsave(&iommu_table_lock, flags);
 
iommu = amd_iommu_rlookup_table[devid];
if (!iommu)
@@ -3659,7 +3660,7 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
iommu_completion_wait(iommu);
 
 out_unlock:
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   raw_spin_unlock_irqrestore(&iommu_table_lock, flags);
 
return table;
 }
-- 
2.16.1



[PATCH 03/10] iommu/amd: split domain id out of amd_iommu_devtable_lock

2018-02-23 Thread Sebastian Andrzej Siewior
domain_id_alloc() and domain_id_free() are used for id management. Those
two functions share a bitmap (amd_iommu_pd_alloc_bitmap) and set/clear
bits based on id allocation. There is no need to share this with
amd_iommu_devtable_lock; it can use its own lock for this operation.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 53aece41ddf2..958efe311057 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -81,6 +81,7 @@
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
+static DEFINE_SPINLOCK(pd_bitmap_lock);
 
 /* List of all available dev_data structures */
 static LLIST_HEAD(dev_data_list);
@@ -1599,29 +1600,26 @@ static void del_domain_from_list(struct protection_domain *domain)
 
 static u16 domain_id_alloc(void)
 {
-   unsigned long flags;
int id;
 
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
id = find_first_zero_bit(amd_iommu_pd_alloc_bitmap, MAX_DOMAIN_ID);
BUG_ON(id == 0);
if (id > 0 && id < MAX_DOMAIN_ID)
__set_bit(id, amd_iommu_pd_alloc_bitmap);
else
id = 0;
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 
return id;
 }
 
 static void domain_id_free(int id)
 {
-   unsigned long flags;
-
-   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&pd_bitmap_lock);
if (id > 0 && id < MAX_DOMAIN_ID)
__clear_bit(id, amd_iommu_pd_alloc_bitmap);
-   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+   spin_unlock(&pd_bitmap_lock);
 }
 
 #define DEFINE_FREE_PT_FN(LVL, FN) \
-- 
2.16.1



[PATCH 02/10] iommu/amd: turn dev_data_list into a lock less list

2018-02-23 Thread Sebastian Andrzej Siewior
alloc_dev_data() adds new items to dev_data_list and search_dev_data()
searches for items in this list. Both protect access to the list with a
spinlock.
There is no need to navigate back and forth within the list and there
is also no deletion of individual items. This qualifies the list to
become a lock-less list and, as part of this, the spinlock can be
removed.
With this change the ordering of the items within the list changes:
before the change new items were added to the end of the list, now they
are added to the front. I don't think it matters but wanted to mention
it.

Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c   | 28 ++--
 drivers/iommu/amd_iommu_types.h |  2 +-
 2 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8b591c192daf..53aece41ddf2 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -83,8 +83,7 @@
 static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 
 /* List of all available dev_data structures */
-static LIST_HEAD(dev_data_list);
-static DEFINE_SPINLOCK(dev_data_list_lock);
+static LLIST_HEAD(dev_data_list);
 
 LIST_HEAD(ioapic_map);
 LIST_HEAD(hpet_map);
@@ -203,40 +202,33 @@ static struct dma_ops_domain* to_dma_ops_domain(struct protection_domain *domain
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
 
dev_data = kzalloc(sizeof(*dev_data), GFP_KERNEL);
if (!dev_data)
return NULL;
 
dev_data->devid = devid;
-
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_add_tail(&dev_data->dev_data_list, &dev_data_list);
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
ratelimit_default_init(&dev_data->rs);
 
+   llist_add(&dev_data->dev_data_list, &dev_data_list);
return dev_data;
 }
 
 static struct iommu_dev_data *search_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
-   unsigned long flags;
+   struct llist_node *node;
 
-   spin_lock_irqsave(&dev_data_list_lock, flags);
-   list_for_each_entry(dev_data, &dev_data_list, dev_data_list) {
+   if (llist_empty(&dev_data_list))
+   return NULL;
+
+   node = dev_data_list.first;
+   llist_for_each_entry(dev_data, node, dev_data_list) {
if (dev_data->devid == devid)
-   goto out_unlock;
+   return dev_data;
}
 
-   dev_data = NULL;
-
-out_unlock:
-   spin_unlock_irqrestore(&dev_data_list_lock, flags);
-
-   return dev_data;
+   return NULL;
 }
 
 static int __last_alias(struct pci_dev *pdev, u16 alias, void *data)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 6a877ebd058b..62d6f42b57c9 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -627,7 +627,7 @@ struct devid_map {
  */
 struct iommu_dev_data {
struct list_head list;/* For domain->dev_list */
-   struct list_head dev_data_list;   /* For global dev_data_list */
+   struct llist_node dev_data_list;  /* For global dev_data_list */
struct protection_domain *domain; /* Domain the device is bound to */
u16 devid;/* PCI Device ID */
u16 alias;/* Alias Device ID */
-- 
2.16.1



Re: [PATCH v2] iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq

2017-11-02 Thread Sebastian Andrzej Siewior
On 2017-09-21 17:21:40 [+0200], Sebastian Andrzej Siewior wrote:
> get_cpu_ptr() disables preemption and returns the ->fq object of the
> current CPU. raw_cpu_ptr() does the same except that it does not
> disable preemption, which means the scheduler can move the task to
> another CPU after it obtained the per-CPU object.
> In this case this is not bad because the data structure itself is
> protected with a spin_lock. This change shouldn't matter in general;
> however, on RT it does because the sleeping lock can't be accessed
> with preemption disabled.

Did this make it into your tree, Jörg?

> Cc: Joerg Roedel 
> Cc: iommu@lists.linux-foundation.org
> Reported-by: vina...@gmail.com
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
> On 2017-09-19 11:41:19 [+0200], Joerg Roedel wrote:
> > Hi Sebastian,
> Hi Jörg,
> 
> > I moved the flushing to driver/iommu/iova.c to share it with the Intel
> > IOMMU and possibly other drivers too, so this patch does no longer apply
> > to v4.14-rc1. Can you update the patch to these changes?
> 
> Sure.
> 
> v1…v2: move the change from amd_iommu.c to iova.c
> 
>  drivers/iommu/iova.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 33edfa794ae9..b30900025c62 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -570,7 +570,7 @@ void queue_iova(struct iova_domain *iovad,
>   unsigned long pfn, unsigned long pages,
>   unsigned long data)
>  {
> - struct iova_fq *fq = get_cpu_ptr(iovad->fq);
> + struct iova_fq *fq = raw_cpu_ptr(iovad->fq);
>   unsigned long flags;
>   unsigned idx;
>  
> @@ -600,8 +600,6 @@ void queue_iova(struct iova_domain *iovad,
>   if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0)
>   mod_timer(&iovad->fq_timer,
> jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
> -
> - put_cpu_ptr(iovad->fq);
>  }
>  EXPORT_SYMBOL_GPL(queue_iova);
>  
> -- 
> 2.14.1
> 

Sebastian

[PATCH v2] iommu/iova: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->fq

2017-09-21 Thread Sebastian Andrzej Siewior
get_cpu_ptr() disables preemption and returns the ->fq object of the
current CPU. raw_cpu_ptr() does the same except that it does not
disable preemption, which means the scheduler can move the task to
another CPU after it obtained the per-CPU object.
In this case this is not bad because the data structure itself is
protected with a spin_lock. This change shouldn't matter in general;
however, on RT it does because the sleeping lock can't be accessed
with preemption disabled.
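
(For reference, the difference boils down to this - roughly how the
helpers are defined in percpu-defs.h:)

/*
 *   fq = get_cpu_ptr(iovad->fq);   == preempt_disable();
 *                                     this_cpu_ptr(iovad->fq);
 *   ...
 *   put_cpu_ptr(iovad->fq);        == preempt_enable();
 *
 *   fq = raw_cpu_ptr(iovad->fq);   just computes the pointer;
 *                                  preemption stays enabled, so we
 *                                  may be migrated afterwards -
 *                                  harmless here because ->fq is
 *                                  protected by its own spinlock.
 */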

Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Reported-by: vina...@gmail.com
Signed-off-by: Sebastian Andrzej Siewior 
---
On 2017-09-19 11:41:19 [+0200], Joerg Roedel wrote:
> Hi Sebastian,
Hi Jörg,

> I moved the flushing to driver/iommu/iova.c to share it with the Intel
> IOMMU and possibly other drivers too, so this patch does no longer apply
> to v4.14-rc1. Can you update the patch to these changes?

Sure.

v1…v2: move the change from amd_iommu.c to iova.c

 drivers/iommu/iova.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 33edfa794ae9..b30900025c62 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -570,7 +570,7 @@ void queue_iova(struct iova_domain *iovad,
unsigned long pfn, unsigned long pages,
unsigned long data)
 {
-   struct iova_fq *fq = get_cpu_ptr(iovad->fq);
+   struct iova_fq *fq = raw_cpu_ptr(iovad->fq);
unsigned long flags;
unsigned idx;
 
@@ -600,8 +600,6 @@ void queue_iova(struct iova_domain *iovad,
if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0)
mod_timer(&iovad->fq_timer,
  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
-
-   put_cpu_ptr(iovad->fq);
 }
 EXPORT_SYMBOL_GPL(queue_iova);
 
-- 
2.14.1


Re: [PATCH] iommu/amd: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->flush_queue

2017-09-21 Thread Sebastian Andrzej Siewior
On 2017-09-11 22:22:11 [-0400], Vinod Adhikary wrote:
> Dear all,
Hi,

> Thank you for the great community support and support from Sebastian to
> provide me this patch. I wanted to send this email to inform you and
> perhaps get some information on how I could keep myself updated on updates
> in regards to linux-rt. I apologize if I'm spamming you with this email.

There is a rt-users mailing list, see
https://wiki.linuxfoundation.org/realtime/communication/mailinglists

> I noticed `Bug: Scheduling while atomic` on ahci after building 4.9 rt and
> reported the issue on oftc channel. Sebastian quickly helped me get a patch
> and my `journalctl -f` was clean without any such messages. Yesterday I
> decided to build 4.11 without the patch but I saw a whole bunch of similar
> bug logs not just on ahci but on my nic card as well among a few others.
> There were a few instances where I had to hard shutdown my machine. I
> patched 4.11 with Sebastian's patch again and so far it's been okay for
> about 24 hours and going.

It sounds like the same issue. I am taking this for v4.11 RT, and v4.9
will pick it up at some point.

> My concern is, how I can provide more information to resolve this issue,
> perhaps in the next release. Please let me know how I can be of help.

Usually providing a backtrace like you did to the rt-users list is
enough.

> Thank you

Sebastian


[PATCH] iommu/amd: Use raw_cpu_ptr() instead of get_cpu_ptr() for ->flush_queue

2017-09-06 Thread Sebastian Andrzej Siewior
get_cpu_ptr() disables preemption and returns the ->flush_queue object
of the current CPU. raw_cpu_ptr() does the same except that it does
not disable preemption, which means the scheduler can move the task to
another CPU after it obtained the per-CPU object.
In this case this is not bad because the data structure itself is
protected with a spin_lock. This change shouldn't matter in general,
but on RT it does because the sleeping lock can't be accessed with
preemption disabled.

Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Reported-by: vina...@gmail.com
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/amd_iommu.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4ad7e5e31943..943efbc08128 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1911,7 +1911,7 @@ static void queue_add(struct dma_ops_domain *dom,
pages = __roundup_pow_of_two(pages);
address >>= PAGE_SHIFT;
 
-   queue = get_cpu_ptr(dom->flush_queue);
+   queue = raw_cpu_ptr(dom->flush_queue);
spin_lock_irqsave(&queue->lock, flags);
 
/*
@@ -1940,8 +1940,6 @@ static void queue_add(struct dma_ops_domain *dom,
 
if (atomic_cmpxchg(&dom->flush_timer_on, 0, 1) == 0)
mod_timer(&dom->flush_timer, jiffies + msecs_to_jiffies(10));
-
-   put_cpu_ptr(dom->flush_queue);
 }
 
 static void queue_flush_timeout(unsigned long data)
-- 
2.14.1



Re: [PATCH 2/3] iommu/iova: don't disable preempt around this_cpu_ptr()

2017-06-28 Thread Sebastian Andrzej Siewior
On 2017-06-28 11:22:05 [+0200], Joerg Roedel wrote:
> On Tue, Jun 27, 2017 at 06:16:47PM +0200, Sebastian Andrzej Siewior wrote:
> > Commit 583248e6620a ("iommu/iova: Disable preemption around use of
> > this_cpu_ptr()") disables preemption while accessing a per-CPU variable.
> > This does keep lockdep quiet. However I don't see the point why it is
> > bad if we get migrated after its access to another CPU.
> > __iova_rcache_insert() and __iova_rcache_get() immediately locks the
> > variable after obtaining it - before accessing its members.
> > _If_ we get migrated away after retrieving the address of cpu_rcache
> > before taking the lock then the *other* task on the same CPU will
> > retrieve the same address of cpu_rcache and will spin on the lock.
> > 
> > alloc_iova_fast() disables preemption while invoking
> > free_cpu_cached_iovas() on each CPU. The function itself uses
> > per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr()
> > does). It _could_ make sense to use get_online_cpus() instead but the we
> > have a hotplug notifier for CPU down (and none for up) so we are good.
> 
> Does that really matter? The spin_lock disables irqs and thus avoids
> preemption too. We also can't get rid of the irqsave lock here because
> these locks are taken in the dma-api path which is used from interrupt
> context.

It really does. The spin_lock() does disable preemption but that is not
the problem. The thing is that the preempt_disable() is superfluous and
it hurts Preempt-RT (and this is how I noticed it). Also the
get_cpu_ptr() is not required and was only added to keep lockdep quiet
(according to the history).
Everything else here can stay as-is; I am just asking for the removal
of the redundant preempt_disable() where it is not required.

>   Joerg

Sebastian


[PATCH 2/3] iommu/iova: don't disable preempt around this_cpu_ptr()

2017-06-27 Thread Sebastian Andrzej Siewior
Commit 583248e6620a ("iommu/iova: Disable preemption around use of
this_cpu_ptr()") disables preemption while accessing a per-CPU
variable. This does keep lockdep quiet. However I don't see why it
would be bad if we get migrated to another CPU after the access.
__iova_rcache_insert() and __iova_rcache_get() immediately lock the
variable after obtaining it - before accessing its members.
_If_ we get migrated away after retrieving the address of cpu_rcache
but before taking the lock, then the *other* task on the same CPU will
retrieve the same address of cpu_rcache and will spin on the lock.

alloc_iova_fast() disables preemption while invoking
free_cpu_cached_iovas() on each CPU. That function itself uses
per_cpu_ptr(), which does not trigger a warning (like this_cpu_ptr()
does). It _could_ make sense to use get_online_cpus() instead, but we
have a hotplug notifier for CPU down (and none for up) so we are good.
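
(Sketch of why the stale pointer is benign, with a stripped-down,
made-up version of the rcache structures; needs <linux/spinlock.h> and
<linux/percpu.h>:)

struct example_cpu_rcache {
	spinlock_t lock;
	/* ... magazines ... */
};

static void example_insert(struct example_cpu_rcache __percpu *caches)
{
	struct example_cpu_rcache *cr;
	unsigned long flags;

	cr = raw_cpu_ptr(caches);	/* may be migrated after this line */
	spin_lock_irqsave(&cr->lock, flags);
	/*
	 * Even if we moved to another CPU, cr still points to a valid
	 * per-CPU structure and the lock serializes us against the
	 * task now running on the original CPU. The worst case is
	 * brief cross-CPU contention on this lock, not a correctness
	 * problem.
	 */
	spin_unlock_irqrestore(&cr->lock, flags);
}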

Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Cc: Andrew Morton 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/iova.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 5c88ba70e4e0..f0ff0aa04081 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static bool iova_rcache_insert(struct iova_domain *iovad,
   unsigned long pfn,
@@ -398,10 +399,8 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 
/* Try replenishing IOVAs by flushing rcache. */
flushed_rcache = true;
-   preempt_disable();
for_each_online_cpu(cpu)
free_cpu_cached_iovas(cpu, iovad);
-   preempt_enable();
goto retry;
}
 
@@ -729,7 +728,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
bool can_insert = false;
unsigned long flags;
 
-   cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
+   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
spin_lock_irqsave(&cpu_rcache->lock, flags);
 
if (!iova_magazine_full(cpu_rcache->loaded)) {
@@ -759,7 +758,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
spin_unlock_irqrestore(&cpu_rcache->lock, flags);
-   put_cpu_ptr(rcache->cpu_rcaches);
 
if (mag_to_free) {
iova_magazine_free_pfns(mag_to_free, iovad);
@@ -793,7 +791,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
bool has_pfn = false;
unsigned long flags;
 
-   cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
+   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
spin_lock_irqsave(&cpu_rcache->lock, flags);
 
if (!iova_magazine_empty(cpu_rcache->loaded)) {
@@ -815,7 +813,6 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
spin_unlock_irqrestore(&cpu_rcache->lock, flags);
-   put_cpu_ptr(rcache->cpu_rcaches);
 
return iova_pfn;
 }
-- 
2.13.1



[PATCH 3/3] iommu/vt-d: don't disable preemption while accessing deferred_flush()

2017-06-27 Thread Sebastian Andrzej Siewior
get_cpu() disables preemption and returns the current CPU number. The
CPU number is only used once, while retrieving the address of the local
CPU's deferred_flush pointer.
We can instead use raw_cpu_ptr() and remain preemptible. The worst
thing that can happen is that flush_unmaps_timeout() is invoked
multiple times: once by task A after seeing HIGH_WATER_MARK, and then
by task B which saw HIGH_WATER_MARK on the same CPU as task A after
task A was preempted to another CPU. It is also likely that ->size went
from HIGH_WATER_MARK to 0 right after it was read, because another CPU
invoked flush_unmaps_timeout() for this CPU.
The access to flush_data is protected by a spinlock, so even if we get
migrated to another CPU or preempted - the data structure is protected.

While at it, I marked deferred_flush static since I can't find a
reference to it outside of this file.

Cc: David Woodhouse 
Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Cc: Andrew Morton 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/intel-iommu.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8500deda9175..1c7118d1525e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -481,7 +481,7 @@ struct deferred_flush_data {
struct deferred_flush_table *tables;
 };
 
-DEFINE_PER_CPU(struct deferred_flush_data, deferred_flush);
+static DEFINE_PER_CPU(struct deferred_flush_data, deferred_flush);
 
 /* bitmap for indexing intel_iommus */
 static int g_num_of_iommus;
@@ -3725,10 +3725,8 @@ static void add_unmap(struct dmar_domain *dom, unsigned long iova_pfn,
struct intel_iommu *iommu;
struct deferred_flush_entry *entry;
struct deferred_flush_data *flush_data;
-   unsigned int cpuid;
 
-   cpuid = get_cpu();
-   flush_data = per_cpu_ptr(&deferred_flush, cpuid);
+   flush_data = raw_cpu_ptr(&deferred_flush);
 
/* Flush all CPUs' entries to avoid deferring too much.  If
 * this becomes a bottleneck, can just flush us, and rely on
@@ -3761,8 +3759,6 @@ static void add_unmap(struct dmar_domain *dom, unsigned long iova_pfn,
}
flush_data->size++;
spin_unlock_irqrestore(&flush_data->lock, flags);
-
-   put_cpu();
 }
 
 static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
-- 
2.13.1



[PATCH 13/22] iommu/vt-d: Convert to hotplug state machine

2016-11-26 Thread Sebastian Andrzej Siewior
From: Anna-Maria Gleixner 

Install the callbacks via the state machine.
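
For reference, the general shape of such a conversion (a sketch with
made-up names; it uses the dynamic CPUHP_BP_PREPARE_DYN state that
later kernels provide, whereas this patch adds a fixed
CPUHP_IOMMU_INTEL_DEAD entry instead):

#include <linux/cpuhotplug.h>

static int demo_cpu_dead(unsigned int cpu)
{
	/* runs on a control CPU after @cpu went offline; replaces the
	 * old CPU_DEAD/CPU_DEAD_FROZEN notifier cases
	 */
	return 0;
}

static int __init demo_init(void)
{
	int ret;

	/* NULL startup callback: nothing to do when a CPU comes up */
	ret = cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "demo:dead",
				NULL, demo_cpu_dead);
	return ret < 0 ? ret : 0;
}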

Cc: David Woodhouse 
Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Anna-Maria Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/intel-iommu.c | 24 ++--
 include/linux/cpuhotplug.h  |  1 +
 2 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a4407eabf0e6..fd7962560e56 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4665,25 +4665,13 @@ static void free_all_cpu_cached_iovas(unsigned int cpu)
}
 }
 
-static int intel_iommu_cpu_notifier(struct notifier_block *nfb,
-   unsigned long action, void *v)
+static int intel_iommu_cpu_dead(unsigned int cpu)
 {
-   unsigned int cpu = (unsigned long)v;
-
-   switch (action) {
-   case CPU_DEAD:
-   case CPU_DEAD_FROZEN:
-   free_all_cpu_cached_iovas(cpu);
-   flush_unmaps_timeout(cpu);
-   break;
-   }
-   return NOTIFY_OK;
+   free_all_cpu_cached_iovas(cpu);
+   flush_unmaps_timeout(cpu);
+   return 0;
 }
 
-static struct notifier_block intel_iommu_cpu_nb = {
-   .notifier_call = intel_iommu_cpu_notifier,
-};
-
 static ssize_t intel_iommu_show_version(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -4832,8 +4820,8 @@ int __init intel_iommu_init(void)
bus_register_notifier(&pci_bus_type, &device_nb);
if (si_domain && !hw_pass_through)
register_memory_notifier(&intel_iommu_memory_nb);
-   register_hotcpu_notifier(&intel_iommu_cpu_nb);
-
+   cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
+ intel_iommu_cpu_dead);
intel_iommu_enabled = 1;
 
return 0;
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index c7d0d76ef0ee..853f8176594d 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -40,6 +40,7 @@ enum cpuhp_state {
CPUHP_PAGE_ALLOC_DEAD,
CPUHP_NET_DEV_DEAD,
CPUHP_PCI_XGENE_DEAD,
+   CPUHP_IOMMU_INTEL_DEAD,
CPUHP_WORKQUEUE_PREP,
CPUHP_POWER_NUMA_PREPARE,
CPUHP_HRTIMERS_PREPARE,
-- 
2.10.2



[PATCH 4/5] iommu/iova: don't disable preempt around this_cpu_ptr()

2016-10-21 Thread Sebastian Andrzej Siewior
Commit 583248e6620a ("iommu/iova: Disable preemption around use of
this_cpu_ptr()") disables preemption while accessing a per-CPU variable.
This does keep lockdep quiet. However, I don't see why it would be bad
if we get migrated to another CPU after the access.
__iova_rcache_insert() and __iova_rcache_get() immediately lock the
variable after obtaining it, before accessing its members.
_If_ we get migrated away after retrieving the address of cpu_rcache
before taking the lock, then the *other* task on the same CPU will
retrieve the same address of cpu_rcache and will spin on the lock.

alloc_iova_fast() disables preemption while invoking
free_cpu_cached_iovas() on each CPU. The function itself uses
per_cpu_ptr(), which does not trigger a warning (like this_cpu_ptr()
does) because it assumes the caller knows what they are doing: they
might access the data structure from a different CPU (which means they
need protection against concurrent access).
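
The difference in checking, sketched with a demo variable (assuming
CONFIG_DEBUG_PREEMPT is enabled):

#include <linux/percpu.h>
#include <linux/cpumask.h>

static DEFINE_PER_CPU(int, demo_count);

static void demo_access(void)
{
	int *p, cpu;

	p = this_cpu_ptr(&demo_count);	/* warns when preemptible */
	p = raw_cpu_ptr(&demo_count);	/* no check: caller serializes */

	for_each_online_cpu(cpu)	/* explicit CPU: no check either */
		p = per_cpu_ptr(&demo_count, cpu);
	(void)p;
}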

Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Cc: Chris Wilson 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/iova.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index e23001bfcfee..359d5d169ec0 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -22,6 +22,7 @@
 #include <linux/slab.h>
 #include <linux/smp.h>
 #include <linux/bitops.h>
+#include <linux/cpu.h>
 
 static bool iova_rcache_insert(struct iova_domain *iovad,
   unsigned long pfn,
@@ -420,10 +421,8 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 
/* Try replenishing IOVAs by flushing rcache. */
flushed_rcache = true;
-   preempt_disable();
for_each_online_cpu(cpu)
free_cpu_cached_iovas(cpu, iovad);
-   preempt_enable();
goto retry;
}
 
@@ -751,7 +750,7 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
bool can_insert = false;
unsigned long flags;
 
-   cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
+   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
spin_lock_irqsave(&cpu_rcache->lock, flags);
 
if (!iova_magazine_full(cpu_rcache->loaded)) {
@@ -781,7 +780,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
iova_magazine_push(cpu_rcache->loaded, iova_pfn);
 
spin_unlock_irqrestore(&cpu_rcache->lock, flags);
-   put_cpu_ptr(rcache->cpu_rcaches);
 
if (mag_to_free) {
iova_magazine_free_pfns(mag_to_free, iovad);
@@ -815,7 +813,7 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
bool has_pfn = false;
unsigned long flags;
 
-   cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);
+   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
spin_lock_irqsave(&cpu_rcache->lock, flags);
 
if (!iova_magazine_empty(cpu_rcache->loaded)) {
@@ -837,7 +835,6 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
iova_pfn = iova_magazine_pop(cpu_rcache->loaded, limit_pfn);
 
spin_unlock_irqrestore(&cpu_rcache->lock, flags);
-   put_cpu_ptr(rcache->cpu_rcaches);
 
return iova_pfn;
 }
-- 
2.9.3



[PATCH 5/5] iommu/vt-d: don't disable preemption while accessing deferred_flush()

2016-10-21 Thread Sebastian Andrzej Siewior
get_cpu() disables preemption and returns the current CPU number. The
CPU number is later used only once, while retrieving the address of the
local CPU's deferred_flush pointer.
We can instead use raw_cpu_ptr() and remain preemptible. The worst
thing that can happen is that flush_unmaps_timeout() is invoked multiple
times: once by taskA, which saw HIGH_WATER_MARK and was then preempted
or migrated to another CPU, and once by taskB, which saw HIGH_WATER_MARK
on the same CPU as taskA. It is also likely that ->size went from
HIGH_WATER_MARK to 0 right after it was read, because another CPU
invoked flush_unmaps_timeout() for this CPU.
The access to flush_data is protected by a spinlock, so even if we get
migrated to another CPU or preempted, the data structure stays
protected.

While at it, I marked deferred_flush static since I can't find a
reference to it outside of this file.

Cc: David Woodhouse 
Cc: Joerg Roedel 
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/iommu/intel-iommu.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a4407eabf0e6..0dad326a9483 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -479,7 +479,7 @@ struct deferred_flush_data {
struct deferred_flush_table *tables;
 };
 
-DEFINE_PER_CPU(struct deferred_flush_data, deferred_flush);
+static DEFINE_PER_CPU(struct deferred_flush_data, deferred_flush);
 
 /* bitmap for indexing intel_iommus */
 static int g_num_of_iommus;
@@ -3673,10 +3673,8 @@ static void add_unmap(struct dmar_domain *dom, unsigned long iova_pfn,
struct intel_iommu *iommu;
struct deferred_flush_entry *entry;
struct deferred_flush_data *flush_data;
-   unsigned int cpuid;
 
-   cpuid = get_cpu();
-   flush_data = per_cpu_ptr(&deferred_flush, cpuid);
+   flush_data = raw_cpu_ptr(&deferred_flush);
 
/* Flush all CPUs' entries to avoid deferring too much.  If
 * this becomes a bottleneck, can just flush us, and rely on
@@ -3709,8 +3707,6 @@ static void add_unmap(struct dmar_domain *dom, unsigned long iova_pfn,
}
flush_data->size++;
spin_unlock_irqrestore(&flush_data->lock, flags);
-
-   put_cpu();
 }
 
 static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
-- 
2.9.3
