Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-27 Thread Jan Vesely
On Mon, 2017-06-26 at 14:14 +0200, Joerg Roedel wrote:
> On Fri, Jun 23, 2017 at 10:20:47AM -0400, Jan Vesely wrote:
> > I was able to trigger "Completion-Wait loop timed out" messages in the
> > following situation:
> > Hung OpenCL task running on the dGPU.
> > The dGPU goes to sleep.
> > SIGTERM is sent to the hung task.
> > The system seems to recover OK after the dGPU is powered back on.
> 
> How does that 'dGPU goes to sleep' work? Do you put it to sleep manually
> via sysfs or something? Or is that something that amdgpu does on its
> own?

AMD folks should be able to provide more details. AFAIK, the driver
uses ACPI methods to power the device on and off. Driver routines wake
the device up before accessing it, and a timeout turns it back off
after a few seconds of inactivity.
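
For reference, a minimal sketch of the runtime-PM autosuspend pattern
described above (the my_dev_* names are hypothetical placeholders, not
actual amdgpu entry points):

#include <linux/pm_runtime.h>

static int my_dev_probe(struct device *dev)
{
	/* Power the device off automatically after 5 s of inactivity. */
	pm_runtime_set_autosuspend_delay(dev, 5000);
	pm_runtime_use_autosuspend(dev);
	pm_runtime_enable(dev);
	return 0;
}

static int my_dev_access(struct device *dev)
{
	int ret;

	/* Wake the device (blocks until the resume has completed). */
	ret = pm_runtime_get_sync(dev);
	if (ret < 0) {
		pm_runtime_put_noidle(dev);
		return ret;
	}

	/* ... MMIO / PCIe config space accesses go here ... */

	/* Re-arm the inactivity timeout and drop our reference. */
	pm_runtime_mark_last_busy(dev);
	pm_runtime_put_autosuspend(dev);
	return 0;
}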

> 
> It looks like the GPU just switches the ATS unit off when it goes to
> sleep and doesn't answer the invalidation anymore, which explains the
> completion-wait timeouts.

Both MMIO regs and PCIe config regs are turned off, so it would not
surprise me if the device ignored all PCIe requests while in the off
state. It should be possible to request a device wake-up before
invalidating the relevant IOMMU domain. I'll leave it to more
knowledgeable people to judge whether that's a good idea (we could also
postpone such invalidations until the device is woken by other means).
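
A rough sketch of the wake-before-invalidate idea (hypothetical only:
domain_for_each_dev() is an assumed iterator over the domain's attached
devices, not an existing AMD IOMMU API; domain_flush_tlb() and
domain_flush_complete() are the existing helpers quoted in the patches
below):

#include <linux/pm_runtime.h>

static void domain_flush_tlb_awake(struct protection_domain *domain)
{
	struct device *dev;

	/* Hold every attached device awake across the invalidation so
	 * that a runtime-suspended GPU can still answer it. */
	domain_for_each_dev(domain, dev)
		pm_runtime_get_sync(dev);

	domain_flush_tlb(domain);
	domain_flush_complete(domain);

	domain_for_each_dev(domain, dev)
		pm_runtime_put_autosuspend(dev);
}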


Jan

> 
> 
> 
>   Joerg
> 

-- 
Jan Vesely 


Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-23 Thread Jan Vesely
On Thu, 2017-06-22 at 23:57 +0200, Joerg Roedel wrote:
> On Thu, Jun 22, 2017 at 11:13:09AM -0400, Jan Vesely wrote:
> > It looks like I tested different patches.
> > linux-4.10.17 with both
> > "iommu/amd: Optimize iova queue flushing"
> 
> This patch isn't in my tree and will not go upstream.
> 
> > and
> > "iommu/amd: Disable previously enabled IOMMUs at boot"
> 
> This patch solves a different problem.
> 
> > (I haven't tested the series independently)
> > 
> > works OK. The machine booted successfully and I was able to test
> > clover-based OpenCL and simple OpenGL on both the iGPU (carrizo) and
> > the dGPU (iceland).
> 
> For a conclusive test please use what is in the iommu-tree, as this is
> what I plan to send upstream. You can use the 'next' branch of
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git

Tested commit c71bf5f133056aae71e8ae7ea66240574bd44f54.

The machine boots and runs OK, although it takes a few minutes to boot
(it looks USB-related).

OpenGL and OpenCL run OK on both GPUs.

I was able to trigger "Completion-Wait loop timed out" messages in the
following situation:
Hung OpenCL task running on the dGPU.
The dGPU goes to sleep.
SIGTERM is sent to the hung task.
The system seems to recover OK after the dGPU is powered back on.

dmesg:
[ 1628.049683] amdgpu: [powerplay] VI should always have 2 performance levels
[ 1628.845195] amdgpu :07:00.0: GPU pci config reset
[ 1667.270351] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.270437] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.270491] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.270505] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.270556] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.270607] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.270614] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.270664] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.270714] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.270721] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.270770] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.270846] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.270868] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.270922] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.270982] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.270992] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.271043] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.271096] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.271109] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.271164] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.271230] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.271245] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.271338] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.271394] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.271403] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.271458] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.271518] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.271533] amdgpu :07:00.0: couldn't schedule ib on ring 
[ 1667.271588] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
[ 1667.271644] [drm:amd_sched_main [amdgpu]] *ERROR* Failed to run job!
[ 1667.426742] AMD-Vi: Completion-Wait loop timed out
[ 1667.570025] AMD-Vi: Completion-Wait loop timed out
[ 1667.713326] AMD-Vi: Completion-Wait loop timed out
[ 1667.867561] AMD-Vi: Completion-Wait loop timed out
[ 1668.010886] AMD-Vi: Completion-Wait loop timed out
[ 1668.154207] AMD-Vi: Completion-Wait loop timed out
[ 1668.283193] AMD-Vi: Event logged [
[ 1668.283201] IOTLB_INV_TIMEOUT device=07:00.0 address=0x00040ce6e240]
[ 1668.430357] AMD-Vi: Completion-Wait loop timed out
[ 1668.581169] AMD-Vi: Completion-Wait loop timed out
[ 1668.718046] AMD-Vi: Completion-Wait loop timed out
[ 1668.854914] AMD-Vi: Completion-Wait loop timed out
[ 1668.991774] AMD-Vi: Completion-Wait loop timed out
[ 1669.128638] AMD-Vi: Completion-Wait loop timed out
[ 1669.272391] AMD-Vi: Completion-Wait loop timed out
[ 1669.285193] AMD-Vi: Event logged [
[ 1669.285200] IOTLB_INV_TIMEOUT device=07:00.0 address=0x00040ce6e2b0]
[ 1669.285756] [drm] PCIE GART of 3072M enabled (table at 0x0004).
[ 1669.288274] amdgpu: [powerplay] can't get the mac of 5
[ 1669.302600] [drm] ring test on 0 succeeded in 16 usecs
[ 1669.302987] [drm] ring test on 1 succeeded in 17 usecs
[ 1669.303037] [drm] ring test on 2 

Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-22 Thread Jan Vesely
On Thu, 2017-06-22 at 11:20 +0200, Joerg Roedel wrote:
> On Wed, Jun 21, 2017 at 05:09:31PM -0400, Jan Vesely wrote:
> > On Wed, 2017-06-21 at 12:01 -0500, Tom Lendacky wrote:
> > > On 6/21/2017 11:20 AM, Jan Vesely wrote:
> > > > Hi Arindam,
> > > > 
> > > > has this patch been replaced by Joerg's "[PATCH 0/7] iommu/amd:
> > > > Optimize iova queue flushing" series?
> > > 
> > > Yes, Joerg's patches replaced this patch.  He applied just the first two
> > > patches of this series.
> > 
> > Joerg's patches applied on top of 4.10.17 do not solve my issue (do I
> > need the first two patches of this series?). The machine still hangs on
> > boot with a flood of "Completion-Wait loop timed out" messages.
> > 
> > On the other hand, patch 3/3 v1 applied on top of 4.10.17 fixes the
> > problem and the machine boots successfully.
> 
> Interesting. I did some measurements on the IOTLB flush-rate with my
> network load-test. This test is designed to heavily exercise the IOMMU
> map/unmap path and thus cause many IOTLB invalidations too.

It looks like I tested different patches.
linux-4.10.17 with both
"iommu/amd: Optimize iova queue flushing"
and
"iommu/amd: Disable previously enabled IOMMUs at boot"
(I haven't tested the series independently)

works OK. The machine booted successfully and I was able to test
clover-based OpenCL and simple OpenGL on both the iGPU (carrizo) and
the dGPU (iceland).

thanks and sorry for the confusion,
Jan

> 
> Results are:
> 
>   Current upstream v4.12-rc6:  ~147000 flushes/s
>   With Tom's patches:            ~5900 flushes/s
>   With my patches:               ~1200 flushes/s
> 
> So while Tom's patches also get the flush-rate down significantly, it is
> even lower with my patches. This indicates that the problem is
> triggerable even with low flush rates.
> 
> But I have no idea why it still triggers with my patches, but not with
> Tom's. The approaches follow the same idea of only flushing domains that
> have map/unmap operations on them.
> 
> I really think we need the patch to blacklist ATS on these GPUs
> upstream.
> 
> Regards,
> 
>   Joerg
> 
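
Such a blacklist would take the form of a PCI quirk in
drivers/pci/quirks.c. A sketch only (assumes CONFIG_PCI_ATS; the device
ID below is an unverified placeholder for the affected dGPU):

#include <linux/pci.h>

/* Placeholder device ID for the affected GPU; not a confirmed value. */
#define BROKEN_ATS_GPU_DEV_ID 0x6900

static void quirk_no_ats(struct pci_dev *pdev)
{
	dev_info(&pdev->dev, "disabling ATS (unreliable on this device)\n");

	/* Clearing the cached ATS capability offset keeps
	 * pci_enable_ats() from ever enabling ATS for the device,
	 * so the IOMMU driver never issues ATS invalidations to it. */
	pdev->ats_cap = 0;
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, BROKEN_ATS_GPU_DEV_ID,
			quirk_no_ats);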



Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-21 Thread Jan Vesely
On Wed, 2017-06-21 at 12:01 -0500, Tom Lendacky wrote:
> On 6/21/2017 11:20 AM, Jan Vesely wrote:
> > Hi Arindam,
> > 
> > has this patch been replaced by Joerg's "[PATCH 0/7] iommu/amd:
> > Optimize iova queue flushing" series?
> 
> Yes, Joerg's patches replaced this patch.  He applied just the first two
> patches of this series.

Joerg's patches applied on top of 4.10.17 do not solve my issue (do I
need the first two patches of this series?). The machine still hangs on
boot with a flood of "Completion-Wait loop timed out" messages.

On the other hand, patch 3/3 v1 applied on top of 4.10.17 fixes the
problem and the machine boots successfully.

regards,
Jan


> 
> Thanks,
> Tom
> 
> > 
> > Jan
> > 
> > On Thu, 2017-06-08 at 22:33 +0200, Jan Vesely wrote:
> > > On Tue, 2017-06-06 at 10:02 +, Nath, Arindam wrote:
> > > > > -Original Message-
> > > > > From: Lendacky, Thomas
> > > > > Sent: Tuesday, June 06, 2017 1:23 AM
> > > > > To: iommu@lists.linux-foundation.org
> > > > > Cc: Nath, Arindam ; Joerg Roedel
> > > > > ; Duran, Leo ; Suthikulpanit,
> > > > > Suravee 
> > > > > Subject: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
> > > > > 
> > > > > After reducing the amount of MMIO performed by the IOMMU during
> > > > > operation, perf data shows that flushing the TLB for all
> > > > > protection domains during DMA unmapping is a performance issue.
> > > > > It is not necessary to flush the TLBs for all protection domains,
> > > > > only the protection domains associated with iova's on the flush
> > > > > queue.
> > > > > 
> > > > > Create a separate queue that tracks the protection domains
> > > > > associated with the iova's on the flush queue. This new queue
> > > > > optimizes the flushing of TLBs to the required protection domains.
> > > > > 
> > > > > Reviewed-by: Arindam Nath 
> > > > > Signed-off-by: Tom Lendacky 
> > > > > ---
> > > > > drivers/iommu/amd_iommu.c |   56
> > > > > -
> > > > > 1 file changed, 50 insertions(+), 6 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > > > > index 856103b..a5e77f0 100644
> > > > > --- a/drivers/iommu/amd_iommu.c
> > > > > +++ b/drivers/iommu/amd_iommu.c
> > > > > @@ -103,7 +103,18 @@ struct flush_queue {
> > > > >   struct flush_queue_entry *entries;
> > > > > };
> > > > > 
> > > > > +struct flush_pd_queue_entry {
> > > > > + struct protection_domain *pd;
> > > > > +};
> > > > > +
> > > > > +struct flush_pd_queue {
> > > > > + /* No lock needed, protected by flush_queue lock */
> > > > > + unsigned next;
> > > > > + struct flush_pd_queue_entry *entries;
> > > > > +};
> > > > > +
> > > > > static DEFINE_PER_CPU(struct flush_queue, flush_queue);
> > > > > +static DEFINE_PER_CPU(struct flush_pd_queue, flush_pd_queue);
> > > > > 
> > > > > static atomic_t queue_timer_on;
> > > > > static struct timer_list queue_timer;
> > > > > @@ -2227,16 +2238,20 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
> > > > >   *
> > > > >  *****************************************************************/
> > > > > 
> > > > > -static void __queue_flush(struct flush_queue *queue)
> > > > > +static void __queue_flush(struct flush_queue *queue,
> > > > > +   struct flush_pd_queue *pd_queue)
> > > > > {
> > > > > - struct protection_domain *domain;
> > > > >   unsigned long flags;
> > > > >   int idx;
> > > > > 
> > > > >   /* First flush TLB of all known domains */
> > > > >   spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> > > > > - list_for_each_entry(domain, &amd_iommu_pd_list, list)

Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-21 Thread Jan Vesely
Hi Arindam,

has this patch been replaced by Joerg's "[PATCH 0/7] iommu/amd:
Optimize iova queue flushing" series?

Jan

On Thu, 2017-06-08 at 22:33 +0200, Jan Vesely wrote:
> On Tue, 2017-06-06 at 10:02 +, Nath, Arindam wrote:
> > > -Original Message-
> > > From: Lendacky, Thomas
> > > Sent: Tuesday, June 06, 2017 1:23 AM
> > > To: iommu@lists.linux-foundation.org
> > > Cc: Nath, Arindam ; Joerg Roedel
> > > ; Duran, Leo ; Suthikulpanit,
> > > Suravee 
> > > Subject: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
> > > 
> > > After reducing the amount of MMIO performed by the IOMMU during
> > > operation, perf data shows that flushing the TLB for all protection
> > > domains during DMA unmapping is a performance issue. It is not
> > > necessary to flush the TLBs for all protection domains, only the
> > > protection domains associated with iova's on the flush queue.
> > > 
> > > Create a separate queue that tracks the protection domains associated with
> > > the iova's on the flush queue. This new queue optimizes the flushing of
> > > TLBs to the required protection domains.
> > > 
> > > Reviewed-by: Arindam Nath 
> > > Signed-off-by: Tom Lendacky 
> > > ---
> > > drivers/iommu/amd_iommu.c |   56
> > > -
> > > 1 file changed, 50 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > > index 856103b..a5e77f0 100644
> > > --- a/drivers/iommu/amd_iommu.c
> > > +++ b/drivers/iommu/amd_iommu.c
> > > @@ -103,7 +103,18 @@ struct flush_queue {
> > >   struct flush_queue_entry *entries;
> > > };
> > > 
> > > +struct flush_pd_queue_entry {
> > > + struct protection_domain *pd;
> > > +};
> > > +
> > > +struct flush_pd_queue {
> > > + /* No lock needed, protected by flush_queue lock */
> > > + unsigned next;
> > > + struct flush_pd_queue_entry *entries;
> > > +};
> > > +
> > > static DEFINE_PER_CPU(struct flush_queue, flush_queue);
> > > +static DEFINE_PER_CPU(struct flush_pd_queue, flush_pd_queue);
> > > 
> > > static atomic_t queue_timer_on;
> > > static struct timer_list queue_timer;
> > > @@ -2227,16 +2238,20 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
> > >  *
> > >  *****************************************************************/
> > > 
> > > -static void __queue_flush(struct flush_queue *queue)
> > > +static void __queue_flush(struct flush_queue *queue,
> > > +   struct flush_pd_queue *pd_queue)
> > > {
> > > - struct protection_domain *domain;
> > >   unsigned long flags;
> > >   int idx;
> > > 
> > >   /* First flush TLB of all known domains */
> > >   spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> > > - list_for_each_entry(domain, &amd_iommu_pd_list, list)
> > > - domain_flush_tlb(domain);
> > > + for (idx = 0; idx < pd_queue->next; ++idx) {
> > > + struct flush_pd_queue_entry *entry;
> > > +
> > > + entry = pd_queue->entries + idx;
> > > + domain_flush_tlb(entry->pd);
> > > + }
> > >   spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
> > > 
> > >   /* Wait until flushes have completed */
> > > @@ -2255,6 +2270,7 @@ static void __queue_flush(struct flush_queue *queue)
> > >   entry->dma_dom = NULL;
> > >   }
> > > 
> > > + pd_queue->next = 0;
> > >   queue->next = 0;
> > > }
> > > 
> > > @@ -2263,13 +2279,15 @@ static void queue_flush_all(void)
> > >   int cpu;
> > > 
> > >   for_each_possible_cpu(cpu) {
> > > + struct flush_pd_queue *pd_queue;
> > >   struct flush_queue *queue;
> > >   unsigned long flags;
> > > 
> > >   queue = per_cpu_ptr(&flush_queue, cpu);
> > > + pd_queue = per_cpu_ptr(&flush_pd_queue, cpu);
> > >   spin_lock_irqsave(&queue->lock, flags);
> > >   if (queue->next > 0)
> > > - __queue_flush(queue);
> > > + __queue_flush(queue, pd_queue);
> > >

Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-08 Thread Jan Vesely
a_dom,
> > address >>= PAGE_SHIFT;
> > 
> > queue = get_cpu_ptr(&flush_queue);
> > +   pd_queue = get_cpu_ptr(&flush_pd_queue);
> > spin_lock_irqsave(&queue->lock, flags);
> > 
> > if (queue->next == FLUSH_QUEUE_SIZE)
> > -   __queue_flush(queue);
> > +   __queue_flush(queue, pd_queue);
> > +
> > +   for (idx = 0; idx < pd_queue->next; ++idx) {
> > +   pd_entry = pd_queue->entries + idx;
> > +   if (pd_entry->pd == &dma_dom->domain)
> > +   break;
> > +   }
> > +   if (idx == pd_queue->next) {
> > +   /* New protection domain, add it to the list */
> > +   pd_entry = pd_queue->entries + pd_queue->next++;
> > +   pd_entry->pd = &dma_dom->domain;
> > +   }
> > 
> > idx   = queue->next++;
> > entry = queue->entries + idx;
> > @@ -2309,6 +2341,7 @@ static void queue_add(struct dma_ops_domain *dma_dom,
> > if (atomic_cmpxchg(&queue_timer_on, 0, 1) == 0)
> > mod_timer(&queue_timer, jiffies + msecs_to_jiffies(10));
> > 
> > +   put_cpu_ptr(&flush_pd_queue);
> > put_cpu_ptr(&flush_queue);
> > }
> > 
> > @@ -2810,6 +2843,8 @@ int __init amd_iommu_init_api(void)
> > return ret;
> > 
> > for_each_possible_cpu(cpu) {
> > +   struct flush_pd_queue *pd_queue =
> > per_cpu_ptr(&flush_pd_queue,
> > + cpu);
> > struct flush_queue *queue = per_cpu_ptr(&flush_queue,
> > cpu);
> > 
> > queue->entries = kzalloc(FLUSH_QUEUE_SIZE *
> > @@ -2819,6 +2854,12 @@ int __init amd_iommu_init_api(void)
> > goto out_put_iova;
> > 
> > spin_lock_init(&queue->lock);
> > +
> > +   pd_queue->entries = kzalloc(FLUSH_QUEUE_SIZE *
> > +   sizeof(*pd_queue->entries),
> > +   GFP_KERNEL);
> > +   if (!pd_queue->entries)
> > +   goto out_put_iova;
> > }
> > 
> > err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
> > @@ -2836,9 +2877,12 @@ int __init amd_iommu_init_api(void)
> > 
> > out_put_iova:
> > for_each_possible_cpu(cpu) {
> > +   struct flush_pd_queue *pd_queue =
> > per_cpu_ptr(&flush_pd_queue,
> > + cpu);
> > struct flush_queue *queue = per_cpu_ptr(&flush_queue,
> > cpu);
> > 
> > kfree(queue->entries);
> > +   kfree(pd_queue->entries);
> > }
> > 
> > return -ENOMEM;
> 
> Craig and Jan, can you please confirm whether this patch fixes the
> IOMMU timeout errors you encountered before? If it does, then this is
> a better implementation of the fix I provided a few weeks back.

I have only remote access to the machine, so I won't be able to test
until June 22nd.

Jan

> 
> Thanks,
> Arindam

-- 
Jan Vesely 


Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)

2017-05-19 Thread Jan Vesely
On Fri, 2017-05-19 at 15:32 +0530, arindam.n...@amd.com wrote:
> From: Arindam Nath 
> 
> Change History
> --------------
> 
> v2: changes suggested by Joerg
> - add flush flag to improve efficiency of flush operation
> 
> v1:
> - The idea behind flush queues is to defer the IOTLB flushing
>   for domains for which the mappings are no longer valid. We
>   add such domains in queue_add(), and when the queue size
>   reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
> 
>   Since we have already taken lock before __queue_flush()
>   is called, we need to make sure the IOTLB flushing is
>   performed as quickly as possible.
> 
>   In the current implementation, we perform IOTLB flushing
>   for all domains irrespective of which ones were actually
>   added in the flush queue initially. This can be quite
>   expensive especially for domains for which unmapping is
>   not required at this point of time.
> 
>   This patch makes use of domain information in
>   'struct flush_queue_entry' to make sure we only flush
> >   IOTLBs for domains that need it, skipping others.

Hi,

Just a note: the patch also fixes the "AMD-Vi: Completion-Wait loop timed
out" boot hang on my system (carrizo + iceland) [0,1,2]. Presumably the
old loop also tried to flush domains that included powered-off devices.

regards,
Jan

[0] https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/20
[1] https://bugs.freedesktop.org/show_bug.cgi?id=101029
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1448121

> 
> Suggested-by: Joerg Roedel 
> Signed-off-by: Arindam Nath 
> ---
>  drivers/iommu/amd_iommu.c   | 27 ---
>  drivers/iommu/amd_iommu_types.h |  2 ++
>  2 files changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 63cacf5..1edeebec 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2227,15 +2227,26 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
>  
>  static void __queue_flush(struct flush_queue *queue)
>  {
> - struct protection_domain *domain;
> - unsigned long flags;
>   int idx;
>  
> - /* First flush TLB of all known domains */
> - spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> - list_for_each_entry(domain, &amd_iommu_pd_list, list)
> - domain_flush_tlb(domain);
> - spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
> + /* First flush TLB of all domains which were added to flush queue */
> + for (idx = 0; idx < queue->next; ++idx) {
> + struct flush_queue_entry *entry;
> +
> + entry = queue->entries + idx;
> +
> + /*
> +  * There might be cases where multiple IOVA entries for the
> +  * same domain are queued in the flush queue. To avoid
> +  * flushing the same domain again, we check whether the
> +  * flag is set or not. This improves the efficiency of
> +  * flush operation.
> +  */
> + if (!entry->dma_dom->domain.already_flushed) {
> + entry->dma_dom->domain.already_flushed = true;
> + domain_flush_tlb(&entry->dma_dom->domain);
> + }
> + }
>  
>   /* Wait until flushes have completed */
>   domain_flush_complete(NULL);
> @@ -2289,6 +2300,8 @@ static void queue_add(struct dma_ops_domain *dma_dom,
>   pages = __roundup_pow_of_two(pages);
>   address >>= PAGE_SHIFT;
>  
> + dma_dom->domain.already_flushed = false;
> +
>   queue = get_cpu_ptr(&flush_queue);
>   spin_lock_irqsave(&queue->lock, flags);
>  
> diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> index 4de8f41..4f5519d 100644
> --- a/drivers/iommu/amd_iommu_types.h
> +++ b/drivers/iommu/amd_iommu_types.h
> @@ -454,6 +454,8 @@ struct protection_domain {
>   bool updated;   /* complete domain flush required */
>   unsigned dev_cnt;   /* devices assigned to this domain */
>   unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
> + bool already_flushed;   /* flag to avoid flushing the same domain again
> +in a single invocation of __queue_flush() */
>  };
>  
>  /*

-- 
Jan Vesely 


[PATCH 1/2] iommu/amd: Remove entry from the list before freeing it

2016-05-21 Thread Jan Vesely
From: Jan Vesely 

Signed-off-by: Jan Vesely 
---
 drivers/iommu/amd_iommu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 634f636..17c76f2 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3288,8 +3288,10 @@ static void amd_iommu_put_dm_regions(struct device *dev,
 {
struct iommu_dm_region *entry, *next;
 
-   list_for_each_entry_safe(entry, next, head, list)
+   list_for_each_entry_safe(entry, next, head, list) {
+   list_del(&entry->list);
kfree(entry);
+   }
 }
 
 static const struct iommu_ops amd_iommu_ops = {
-- 
2.5.5



[PATCH 2/2] iommu/amd: Destroy api_lock mutex when freeing domain

2016-05-21 Thread Jan Vesely
From: Jan Vesely 

Signed-off-by: Jan Vesely 
---
 drivers/iommu/amd_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 17c76f2..4ff5e40 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3016,6 +3016,7 @@ static void protection_domain_free(struct protection_domain *domain)
 
del_domain_from_list(domain);
 
+   mutex_destroy(&domain->api_lock);
if (domain->id)
domain_id_free(domain->id);
 
-- 
2.5.5
