Re: [PATCH v2] iommu/amd: Fix interrupt remapping when disable guest_mode

2017-06-28 Thread Suravee Suthikulpanit



On 6/28/17 19:44, Joerg Roedel wrote:

On Mon, Jun 26, 2017 at 04:28:04AM -0500, Suravee Suthikulpanit wrote:

Devices passed through to a VM guest can get updated IRQ affinity
information via irq_set_affinity() when not running in guest mode.
Currently, the AMD IOMMU driver in GA mode ignores the updated information
if the pass-through device is set up to use vAPIC, regardless of guest_mode.
This can result in invalid interrupt remapping.

Also, the guest_mode bit should be set and cleared only when
SVM updates posted-interrupt interrupt remapping information.

Signed-off-by: Suravee Suthikulpanit 
Cc: Joerg Roedel 


Nevermind, I added this Fixes line and applied the patch:

Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode
by default')



Thanks,

Suravee
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/vt-d: Don't free parent pagetable of the PTE we're adding

2017-06-28 Thread David Dillow via iommu
When adding a large scatterlist entry that covers more than the L3
superpage size (1GB) but has an alignment such that we must use L2
superpages (2MB), we give dma_pte_free_level() a range that causes it
to free the L3 pagetable we're about to populate. We fix this by telling
dma_pte_free_pagetable() about the pagetable level we're about to populate
to prevent freeing it.

For example, mapping a scatterlist with entry lengths 854MB and 1194MB
at IOVA 0x8000 would, when processing the 2MB-aligned second
entry, cause pfn_to_dma_pte() to create an L3 directory to hold L2
superpages for the mapping at IOVA 0xc000. We would previously
call dma_pte_free_pagetable(domain, 0xc, 0xf), which
would free the L3 directory pfn_to_dma_pte() just created for IO PFN
0xc. Telling dma_pte_free_pagetable() to retain the L3
directories while using L2 superpages avoids the erroneous free.

Signed-off-by: David Dillow 
---
 drivers/iommu/intel-iommu.c | 38 ++
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index fc2765ccdb57..3a8036f60367 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1137,8 +1137,9 @@ static void dma_pte_clear_range(struct dmar_domain *domain,
 }
 
 static void dma_pte_free_level(struct dmar_domain *domain, int level,
-  struct dma_pte *pte, unsigned long pfn,
-  unsigned long start_pfn, unsigned long last_pfn)
+  int retain_level, struct dma_pte *pte,
+  unsigned long pfn, unsigned long start_pfn,
+  unsigned long last_pfn)
 {
pfn = max(start_pfn, pfn);
	pte = &pte[pfn_level_offset(pfn, level)];
@@ -1153,12 +1154,17 @@ static void dma_pte_free_level(struct dmar_domain *domain, int level,
level_pfn = pfn & level_mask(level);
level_pte = phys_to_virt(dma_pte_addr(pte));
 
-   if (level > 2)
-   dma_pte_free_level(domain, level - 1, level_pte,
-  level_pfn, start_pfn, last_pfn);
+   if (level > 2) {
+   dma_pte_free_level(domain, level - 1, retain_level,
+  level_pte, level_pfn, start_pfn,
+  last_pfn);
+   }
 
-   /* If range covers entire pagetable, free it */
-   if (!(start_pfn > level_pfn ||
+   /*
+* Free the page table if we're below the level we want to
+* retain and the range covers the entire table.
+*/
+   if (level < retain_level && !(start_pfn > level_pfn ||
  last_pfn < level_pfn + level_size(level) - 1)) {
dma_clear_pte(pte);
domain_flush_cache(domain, pte, sizeof(*pte));
@@ -1169,10 +1175,14 @@ static void dma_pte_free_level(struct dmar_domain *domain, int level,
} while (!first_pte_in_page(++pte) && pfn <= last_pfn);
 }
 
-/* clear last level (leaf) ptes and free page table pages. */
+/*
+ * clear last level (leaf) ptes and free page table pages below the
+ * level we wish to keep intact.
+ */
 static void dma_pte_free_pagetable(struct dmar_domain *domain,
   unsigned long start_pfn,
-  unsigned long last_pfn)
+  unsigned long last_pfn,
+  int retain_level)
 {
BUG_ON(!domain_pfn_supported(domain, start_pfn));
BUG_ON(!domain_pfn_supported(domain, last_pfn));
@@ -1181,7 +1191,7 @@ static void dma_pte_free_pagetable(struct dmar_domain *domain,
dma_pte_clear_range(domain, start_pfn, last_pfn);
 
/* We don't need lock here; nobody else touches the iova range */
-   dma_pte_free_level(domain, agaw_to_level(domain->agaw),
+   dma_pte_free_level(domain, agaw_to_level(domain->agaw), retain_level,
   domain->pgd, 0, start_pfn, last_pfn);
 
/* free pgd */
@@ -2277,8 +2287,11 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
/*
 * Ensure that old small page tables are
 * removed to make room for superpage(s).
+* We're adding new large pages, so make sure
+* we don't remove their parent tables.
 */
-   dma_pte_free_pagetable(domain, iov_pfn, end_pfn);
+   dma_pte_free_pagetable(domain, iov_pfn, end_pfn,
+  largepage_lvl + 1);
 

Re: [PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-06-28 Thread Leizhen (ThunderTown)


On 2017/6/28 17:32, Will Deacon wrote:
> Hi Zhen Lei,
> 
> Nate (CC'd), Robin and I have been working on something very similar to
> this series, but this patch is different to what we had planned. More below.
> 
> On Mon, Jun 26, 2017 at 09:38:46PM +0800, Zhen Lei wrote:
>> All TLBI commands should be followed by a SYNC command to make sure
>> they have completely finished. So we can simply add the TLBI commands
>> to the queue and put off their execution until a SYNC or another
>> command arrives. To prevent the following SYNC command from waiting
>> too long because too many commands have been delayed, restrict the
>> maximum number of delayed commands.
>>
>> According to my tests, this gives the same performance data as
>> directly replacing writel with writel_relaxed in queue_inc_prod.
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 42 +-
>>  1 file changed, 37 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 291da5f..4481123 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -337,6 +337,7 @@
>>  /* Command queue */
>>  #define CMDQ_ENT_DWORDS 2
>>  #define CMDQ_MAX_SZ_SHIFT   8
>> +#define CMDQ_MAX_DELAYED32
>>  
>>  #define CMDQ_ERR_SHIFT  24
>>  #define CMDQ_ERR_MASK   0x7f
>> @@ -472,6 +473,7 @@ struct arm_smmu_cmdq_ent {
>>  };
>>  } cfgi;
>>  
>> +#define CMDQ_OP_TLBI_NH_ALL 0x10
>>  #define CMDQ_OP_TLBI_NH_ASID0x11
>>  #define CMDQ_OP_TLBI_NH_VA  0x12
>>  #define CMDQ_OP_TLBI_EL2_ALL0x20
>> @@ -499,6 +501,7 @@ struct arm_smmu_cmdq_ent {
>>  
>>  struct arm_smmu_queue {
>>  int irq; /* Wired interrupt */
>> +u32 nr_delay;
>>  
>>  __le64  *base;
>>  dma_addr_t  base_dma;
>> @@ -722,11 +725,16 @@ static int queue_sync_prod(struct arm_smmu_queue *q)
>>  return ret;
>>  }
>>  
>> -static void queue_inc_prod(struct arm_smmu_queue *q)
>> +static void queue_inc_swprod(struct arm_smmu_queue *q)
>>  {
>> -u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>> +u32 prod = q->prod + 1;
>>  
>>  q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>> +}
>> +
>> +static void queue_inc_prod(struct arm_smmu_queue *q)
>> +{
>> +queue_inc_swprod(q);
>>  writel(q->prod, q->prod_reg);
>>  }
>>  
>> @@ -761,13 +769,24 @@ static void queue_write(__le64 *dst, u64 *src, size_t 
>> n_dwords)
>>  *dst++ = cpu_to_le64(*src++);
>>  }
>>  
>> -static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
>> +static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent, int 
>> optimize)
>>  {
>>  if (queue_full(q))
>>  return -ENOSPC;
>>  
>>  queue_write(Q_ENT(q, q->prod), ent, q->ent_dwords);
>> -queue_inc_prod(q);
>> +
>> +/*
>> + * We don't want too many commands to be delayed, as this may make
>> + * the following sync command wait for a long time.
>> + */
>> +if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED)) {
>> +queue_inc_swprod(q);
>> +} else {
>> +queue_inc_prod(q);
>> +q->nr_delay = 0;
>> +}
>> +
> 
> So here, you're effectively putting invalidation commands into the command
> queue without updating PROD. Do you actually see a performance advantage
> from doing so? Another side of the argument would be that we should be
Yes, my SAS SSD performance test showed that it improves throughput by about
100-150K/s (the same as directly replacing writel with writel_relaxed). And
the average execution time of iommu_unmap (which is called by
iommu_dma_unmap_sg) dropped from 10us to 5us.

> moving PROD as soon as we can, so that the SMMU can process invalidation
> commands in the background and reduce the cost of the final SYNC operation
> when the high-level unmap operation is complete.
It may be that __iowmb() is more expensive than waiting for the TLBI to
complete. Besides the cost of __iowmb() itself, it is also protected by a
spinlock, so lock contention rises rapidly under stress. __iowmb() costs
300-500ns on average (sorry, I forget the exact value).

In addition, after applying this patchset, Robin's v2, and my earlier dma64
IOVA optimization patchset, our network performance test matched global
bypass. But SAS SSD throughput is still down by more than 20%. Maybe we
should keep focusing on map/unmap, because the average execution time of
IOVA alloc/free is only about 400ns.

By the way, patches 2-5 are more effective than this one; they improve
throughput by more than 350K/s. And with them, we got about a 100-150K/s
improvement from Robin's v2; otherwise I saw no effect from Robin's v2.
Sorry, I have not tested how this patch performs without

Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Thiago Padilha
Some more data from 3DMark benchmarks:

Time Spy(DirectX 12):
- Graphics test 1:
  - npt=0: 37.65 FPS
  - npt=1: 24.22 FPS (36% drop)
- Graphics test 2:
  - npt=0: 33.05 FPS
  - npt=1: 29.65 FPS (10% drop)
- CPU test:
  - npt=0: 17.35 FPS
  - npt=1: 12.03 FPS (31% drop)

Fire Strike(DirectX 11):
- Graphics test 1:
  - npt=0: 80.56 FPS
  - npt=1: 41.89 FPS (49% drop)
- Graphics test 2:
  - npt=0: 70.64 FPS
  - npt=1: 60.75 FPS (14% drop)
- Physics test:
  - npt=0: 50.14 FPS
  - npt=1: 5.78 FPS (89% drop)
- Combined test:
  - npt=0: 32.83 FPS
  - npt=1: 17.70 FPS (47% drop)

Sky Diver(DirectX 11):
- Graphics test 1:
  - npt=0: 248.81 FPS
  - npt=1: 173.63 FPS (31% drop)
- Graphics test 2:
  - npt=0: 250.49 FPS
  - npt=1: 124.84 FPS (51% drop)
- Physics test:
  - 8 threads:
- npt=0: 140.93 FPS
- npt=1: 119.08 FPS (15% drop)
  - 24 threads:
- npt=0: 110.22 FPS
- npt=1: 74.55 FPS (33% drop)
  - 48 threads:
- npt=0: 71.56 FPS
- npt=1: 45.93 FPS (36% drop)
  - 96 threads:
- npt=0: 41.04 FPS
- npt=1: 24.81 FPS (40% drop)
- Combined test:
  - npt=0: 75.65 FPS
  - npt=1: 50.45 FPS (33% drop)


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Nick Sarnie
On Wed, Jun 28, 2017 at 9:50 PM, Thiago Padilha  wrote:
> Some more data from 3DMark benchmarks:
>
> Time Spy(DirectX 12):
> - Graphics test 1:
>   - npt=0: 37.65 FPS
>   - npt=1: 24.22 FPS (36% drop)
> - Graphics test 2:
>   - npt=0: 33.05 FPS
>   - npt=1: 29.65 FPS (10% drop)
> - CPU test:
>   - npt=0: 17.35 FPS
>   - npt=1: 12.03 FPS (31% drop)
>
> Fire Strike(DirectX 11):
> - Graphics test 1:
>   - npt=0: 80.56 FPS
>   - npt=1: 41.89 FPS (49% drop)
> - Graphics test 2:
>   - npt=0: 70.64 FPS
>   - npt=1: 60.75 FPS (14% drop)
> - Physics test:
>   - npt=0: 50.14 FPS
>   - npt=1: 5.78 FPS (89% drop)
> - Combined test:
>   - npt=0: 32.83 FPS
>   - npt=1: 17.70 FPS (47% drop)
>
> Sky Diver(DirectX 11):
> - Graphics test 1:
>   - npt=0: 248.81 FPS
>   - npt=1: 173.63 FPS (31% drop)
> - Graphics test 2:
>   - npt=0: 250.49 FPS
>   - npt=1: 124.84 FPS (51% drop)
> - Physics test:
>   - 8 threads:
> - npt=0: 140.93 FPS
> - npt=1: 119.08 FPS (15% drop)
>   - 24 threads:
> - npt=0: 110.22 FPS
> - npt=1: 74.55 FPS (33% drop)
>   - 48 threads:
> - npt=0: 71.56 FPS
> - npt=1: 45.93 FPS (36% drop)
>   - 96 threads:
> - npt=0: 41.04 FPS
> - npt=1: 24.81 FPS (40% drop)
> - Combined test:
>   - npt=0: 75.65 FPS
>   - npt=1: 50.45 FPS (33% drop)

Hi Thiago,

Thanks for the data, I'm sure it will be useful. Sorry, I should have
noted that I've only tested a few games/benchmarks and never any
non-intensive loads.

Sarnex


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Thiago Padilha
On Wed, Jun 28, 2017 at 7:34 PM, Nick Sarnie  wrote:
> Hi Suravee,
>
> Thanks a lot for helping. Torcs does not appear graphically demanding
> on modern hardware, so this issue may not be easily noticeable. I was
> able to easily reproduce the problem using the Unigine Heaven
> benchmark, but I'm sure anything moderately graphically demanding
> would show a performance loss with NPT enabled. As an example, when I
> tested this with Fedora on my RX480, I got around 30-35 FPS with NPT
> on and around 55-60 with NPT off.
>
> Let me know if you need any more information or have any questions.
>
> (no problem John, thanks a lot for taking interest in this)
>
> Thanks again,
> Sarnex

Hi

I don't think the FPS drop is proportional to how graphically demanding the
workload is. On the contrary, at first sight it would seem that the less
demanding a workload is, the bigger the FPS impact it suffers, though as some
numbers I will show in a moment suggest, this is not always the case.

Unfortunately I haven't been able to find a pattern to what causes the most
impact in FPS except that the relative drop increases with higher FPS
values. Other
than that, it seems very specific to the workload/benchmark used.

Here's some data I've collected to help with the investigation. The system is a
Ryzen 1700 (no overclock, 3 GHz), GTX 1070, Windows 10 guest.

I've used Unigine Heaven and Passmark's PerformanceTest 9.0.

First Heaven benchmark with ultra settings on 1920x1080:

- DirectX 11:
  - npt=0: 87.0 fps
  - npt=1: 78.4 fps (10% drop)
- DirectX 9:
  - npt=0: 100.0 fps
  - npt=1: 66.4 fps (33% drop)
- OpenGL:
  - npt=0: 82.5 fps
  - npt=1: 35.2 fps (58% drop)

Heaven Benchmark again, this time with low settings on 1280x720:

- DirectX 11:
  - npt=0: 182.5 fps
  - npt=1: 140.1 fps (25% drop)
- DirectX 9:
  - npt=0: 169.2 fps
  - npt=1: 74.1 fps (56% drop)
- OpenGL:
  - npt=0: 202.8 fps
  - npt=1: 45.0 fps (78% drop)

PerformanceTest 9.0 3d benchmark:

- DirectX 9:
  - npt=0: 157 fps
  - npt=1: 13 fps (92% drop)
- DirectX 10:
  - npt=0: 220 fps
  - npt=1: 212 fps (4% drop)
- DirectX 11:
  - npt=0: 234 fps
  - npt=1: 140 fps (40% drop)
- DirectX 12:
  - npt=0: 88 fps (scored 35 because of the FPS penalty for not being
able to run at 4k)
  - npt=1: 4.5 fps (scored 1, 95% drop)
- GPU Compute:
  - Mandel:
- npt=0: ~= 2000 fps
- npt=1: ~= 2000 fps
  - Bitonic Sort:
- npt=0: ~= 153583696.0 elements/sec
- npt=1: ~= 106233376.0 elements/sec (31% drop)
  - QJulia4D:
- npt=0: ~= 1000 fps
- npt=1: ~= 1000 fps
  - OpenCL:
- npt=0: ~= 750 fps
- npt=1: ~= 220 fps

As you can see, in some cases there's only about a 5% drop (which could be
within the margin of error), while in others the drop is as high as 95%. Some
points of interest:

- Passmark DirectX 9 is not graphically demanding (it runs at 1024x768; the
  GTX 1070 doesn't break a sweat), yet it suffers a 92% drop in FPS.
- Unigine DirectX 11 on ultra is graphically demanding and suffers less than a
  10% drop in FPS.
- Passmark DirectX 12 is graphically demanding and suffers a 95% drop in FPS.
- The bitonic sort is not a graphical benchmark; it shows its results (average
  number of sorted elements/sec) in a console window, yet it suffers a 31% drop
  in performance.

I think it would take someone with experience in GPU programming, and with
knowledge of what each benchmark does, to find a pattern in these numbers.

Thiago


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Nick Sarnie
On Wed, Jun 28, 2017 at 4:33 PM, Paolo Bonzini  wrote:
>
>
> On 28/06/2017 21:52, Graham Neville wrote:
>> Although not related to graphics card performance, there is definitely
>> another issue with regards to running KVM nested L2 guests when npt=1.
>>
>> Thought I'd mention this in case it helps with identifying performance
>> issues with NPT.
>>
>> I'm unable to start any L2 guests with KVM acceleration (--enable-kvm).
>> As soon as it attempts to bring up the L2 guest the L1 host crashes, L0
>> host remains online. Nothing is printed in either L1 or L0's dmesg.
>>
>> My L0 is running Arch with 4.11.0-rc6, with qemu 2.8.0. I've tried
>> different L1 hosts (Ubuntu,Arch) and different kernels right to 4.12-rc5
>> kernel, along with different qemu versions.
>>
>> This used to work fine with my Intel i7-4770s setup.
>>
>> With npt=0, L2 guests can start but performance is dire.
>
> Nested AMD needs some care.  It's known, but time has been lacking...
>
> Paolo

Hi Suravee,

Thanks a lot for helping. Torcs does not appear graphically demanding
on modern hardware, so this issue may not be easily noticeable. I was
able to easily reproduce the problem using the Unigine Heaven
benchmark, but I'm sure anything moderately graphically demanding
would show a performance loss with NPT enabled. As an example, when I
tested this with Fedora on my RX480, I got around 30-35 FPS with NPT
on and around 55-60 with NPT off.

Let me know if you need any more information or have any questions.

(no problem John, thanks a lot for taking interest in this)

Thanks again,
Sarnex


RE: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-28 Thread Deucher, Alexander
> -Original Message-
> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Wednesday, June 28, 2017 4:37 AM
> To: Jan Vesely; Deucher, Alexander
> Cc: Lendacky, Thomas; Nath, Arindam; Craig Stein; iommu@lists.linux-
> foundation.org; Duran, Leo; Suthikulpanit, Suravee
> Subject: Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
> 
> [Adding Alex Deucher]
> 
> Hey Alex,
> 
> On Tue, Jun 27, 2017 at 12:24:35PM -0400, Jan Vesely wrote:
> > On Mon, 2017-06-26 at 14:14 +0200, Joerg Roedel wrote:
> 
> > > How does that 'dGPU goes to sleep' work? Do you put it to sleep manually
> > > via sysfs or something? Or is that something that amdgpu does on its
> > > own?
> >
> > AMD folks should be able to provide more details. afaik, the driver
> > uses ACPI methods to power on/off the device. Driver routines wake the
> > device up before accessing it and there is a timeout to turn it off
> > after few seconds of inactivity.
> >
> > >
> > > It looks like the GPU just switches the ATS unit off when it goes to
> > > sleep and doesn't answer the invalidation anymore, which explains the
> > > completion-wait timeouts.
> >
> > Both MMIO regs and PCIe config regs are turned off so it would not
> > surprise me if all PCIe requests were ignored by the device in off
> > state. it should be possible to request device wake up before
> > invalidating the relevant IOMMU domain. I'll leave to more
> > knowledgeable ppl to judge whether it's a good idea (we can also
> > postpone such invalidations until the device is woken by other means)
> 
> Can you maybe shed some light on how the sleep-mode of the GPUs works?
> Is it initiated by the GPU driver or from somewhere else? In the case
> discussed here it looks like the ATS unit of the GPU is switched off,
> causing IOTLB invalidation timeouts on the IOMMU side.
> 
> If that is the case we might need some sort of dma-api extension so that
> the GPU driver can tell the iommu driver that the device is going to be
> quiet.

I assume you are talking about Hybrid/PowerXpress laptops where the dGPU can be 
powered down dynamically?  That is done via the runtime PM subsystem in the 
kernel.  We register several callbacks with that, and then it takes care of the 
power-down auto timers and such.  The actual mechanism to power down the GPU 
varies from platform to platform (platform-specific ACPI methods on early 
systems, D3cold on newer ones).

Alex



Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Paolo Bonzini


On 28/06/2017 21:52, Graham Neville wrote:
> Although not related to graphics card performance, there is definitely
> another issue with regards to running KVM nested L2 guests when npt=1.
> 
> Thought I'd mention this in case it helps with identifying performance
> issues with NPT.
> 
> I'm unable to start any L2 guests with KVM acceleration (--enable-kvm).
> As soon as it attempts to bring up the L2 guest the L1 host crashes, L0
> host remains online. Nothing is printed in either L1 or L0's dmesg.
> 
> My L0 is running Arch with 4.11.0-rc6, with qemu 2.8.0. I've tried
> different L1 hosts (Ubuntu,Arch) and different kernels right to 4.12-rc5
> kernel, along with different qemu versions.
> 
> This used to work fine with my Intel i7-4770s setup.
> 
> With npt=0, L2 guests can start but performance is dire.

Nested AMD needs some care.  It's known, but time has been lacking...

Paolo


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Graham Neville
Although not related to graphics card performance, there is definitely
another issue with regards to running KVM nested L2 guests when npt=1.

Thought I'd mention this in case it helps with identifying performance
issues with NPT.

I'm unable to start any L2 guests with KVM acceleration (--enable-kvm). As
soon as it attempts to bring up the L2 guest the L1 host crashes, L0 host
remains online. Nothing is printed in either L1 or L0's dmesg.

My L0 is running Arch with 4.11.0-rc6, with qemu 2.8.0. I've tried
different L1 hosts (Ubuntu,Arch) and different kernels right to 4.12-rc5
kernel, along with different qemu versions.

This used to work fine with my Intel i7-4770s setup.

With npt=0, L2 guests can start but performance is dire.

On Wed, Jun 28, 2017 at 7:29 PM, Bridgman, John wrote:

>
>
> From: Alex Williamson 
> Sent: June 28, 2017 3:08 PM
> To: Suthikulpanit, Suravee
> Cc: Steven Walter; Nick Sarnie; Paolo Bonzini;
> iommu@lists.linux-foundation.org; Matthias Ehrenfeuchter;
> k...@vger.kernel.org; Bridgman, John
> Subject: Re: AMD Ryzen KVM/NPT/IOMMU issue
>
> On Thu, 29 Jun 2017 01:53:57 +0700
> Suravee Suthikulpanit  wrote:
>
> > On 6/29/17 00:26, Steven Walter wrote:
> > >> So, I'm trying to reproduce this issue on the Ryzen system w/ the
> following
> > >> setup:
> > >>
> > >>   * Host kernel v4.11 (with this patch  https://lkml.org/lkml/2017/6/
> 23/295)
> > >>
> > >>   * guest VM RHEL7.3
> > >>
> > >>   * guest graphic driver = radeon
> > >>
> > >>   * qemu-system-x86_64 --version
> > >> QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
> > >>
> > >>   * kvm-amd npt=1
> > >>
> > >>   * dGPU is 08:00.0 VGA compatible controller: Advanced Micro
> Devices, Inc.
> > >> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
> > >>
> > >>   * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host
> -bios
> > >> /usr/share/qemu/bios.bin -device
> > >> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,
> chassis=1,id=root.1
> > >> -drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net
> none
> > >> -vga none -nodefaults -device
> > >> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,
> bus=root.1,romfile=/sandbox/vm-images/vbios.rom
> > >> -usb -device usb-host,hostbus=3,hostport=1 -device
> > >> usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1
> -device
> > >> vfio-pci,host=:09:00.0
> > >>
> > >> With this setup, I am able to pass-through the dGPU and run the
> following
> > >> test:
> > >>   * Starting up the guest w/ full GNOME GUI on the attached monitor.
> > >>   * glxgears (running @ 60 FPS)
> > >>   * Playing 1080p HD video on Youtube
> > >>
> > >> I am not noticing issues here. What kind of test are you running in
> the
> > >> guest VM?
> > > Try running the open source game "torcs" inside the VM.  I think
> > > you'll find that there's a very noticeable performance different
> > > between npt=1 and npt=0
> >
> > Hm.. actually torcs seems to be running fine w/ npt=1 setup. Although I
> think my
> > driving skill is ~20% worse :(
>
> >A clarification on the issue, it's not that these games/benchmarks
> >don't work or aren't playable, it's that they run slower (as measured by
> >framerate) with npt=1 vs npt=0.  A virtualized guest on AMD hardware is
> >hindered either by lower graphics performance (npt=1) or higher CPU
> >virtualization overhead (npt=0) for high intensity games or graphics
> >workloads.  Intel's equivalent ept feature does not have this issue.  I
> >would encourage looking at the framerate for one mode vs the other
> >before drawing any conclusions on whether it's working well.  Thanks,
> >Alex
>
> One more data point - Nick did some testing with Xen enabling/disabling
> npt and
> found that (a) performance was not affected much whether npt was on or
> off, and
> (b) performance with npt on was pretty close to KVM performance with npt
> off.
>
> This suggests something specific to KVM that doesn't play well with npt,
> although
> I have no idea what that might be. I was going to talk to our folks to see
> if they had
> any suggestions re: ways to narrow down where the performance impact is
> coming
> from, but I ended up going off sick instead.
>
> Haven't gone through the whole thread here yet so apologies if this has
> already been
> mentioned (and Nick sorry for the delay).
>

Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Bridgman, John


From: Alex Williamson 
Sent: June 28, 2017 3:08 PM
To: Suthikulpanit, Suravee
Cc: Steven Walter; Nick Sarnie; Paolo Bonzini; 
iommu@lists.linux-foundation.org; Matthias Ehrenfeuchter; k...@vger.kernel.org; 
Bridgman, John
Subject: Re: AMD Ryzen KVM/NPT/IOMMU issue
    
On Thu, 29 Jun 2017 01:53:57 +0700
Suravee Suthikulpanit  wrote:

> On 6/29/17 00:26, Steven Walter wrote:
> >> So, I'm trying to reproduce this issue on the Ryzen system w/ the following
> >> setup:
> >>
> >>   * Host kernel v4.11 (with this patch  
> >>https://lkml.org/lkml/2017/6/23/295)
> >>
> >>   * guest VM RHEL7.3
> >>
> >>   * guest graphic driver = radeon
> >>
> >>   * qemu-system-x86_64 --version
> >> QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
> >>
> >>   * kvm-amd npt=1
> >>
> >>   * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
> >>
> >>   * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios
> >> /usr/share/qemu/bios.bin -device
> >> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
> >> -drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none
> >> -vga none -nodefaults -device
> >> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
> >> -usb -device usb-host,hostbus=3,hostport=1 -device
> >> usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1 -device
> >> vfio-pci,host=:09:00.0
> >>
> >> With this setup, I am able to pass-through the dGPU and run the following
> >> test:
> >>   * Starting up the guest w/ full GNOME GUI on the attached monitor.
> >>   * glxgears (running @ 60 FPS)
> >>   * Playing 1080p HD video on Youtube
> >>
> >> I am not noticing issues here. What kind of test are you running in the
> >> guest VM?  
> > Try running the open source game "torcs" inside the VM.  I think
> > you'll find that there's a very noticeable performance different
> > between npt=1 and npt=0  
> 
> Hm.. actually torcs seems to be running fine w/ npt=1 setup. Although I think 
> my
> driving skill is ~20% worse :(

>A clarification on the issue, it's not that these games/benchmarks
>don't work or aren't playable, it's that they run slower (as measured by
>framerate) with npt=1 vs npt=0.  A virtualized guest on AMD hardware is
>hindered either by lower graphics performance (npt=1) or higher CPU
>virtualization overhead (npt=0) for high intensity games or graphics
>workloads.  Intel's equivalent ept feature does not have this issue.  I
>would encourage looking at the framerate for one mode vs the other
>before drawing any conclusions on whether it's working well.  Thanks,
>Alex

One more data point - Nick did some testing with Xen enabling/disabling npt and 
found that (a) performance was not affected much whether npt was on or off, and
(b) performance with npt on was pretty close to KVM performance with npt off. 

This suggests something specific to KVM that doesn't play well with npt, 
although
I have no idea what that might be. I was going to talk to our folks to see if 
they had
any suggestions re: ways to narrow down where the performance impact is coming
from, but I ended up going off sick instead. 

Haven't gone through the whole thread here yet so apologies if this has already 
been
mentioned (and Nick sorry for the delay). 
  


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Bridgman, John


From: Alex Williamson 
Sent: June 28, 2017 3:08 PM
To: Suthikulpanit, Suravee
Cc: Steven Walter; Nick Sarnie; Paolo Bonzini; 
iommu@lists.linux-foundation.org; Matthias Ehrenfeuchter; k...@vger.kernel.org; 
Bridgman, John
Subject: Re: AMD Ryzen KVM/NPT/IOMMU issue

On Thu, 29 Jun 2017 01:53:57 +0700
Suravee Suthikulpanit  wrote:

> On 6/29/17 00:26, Steven Walter wrote:
> >> So, I'm trying to reproduce this issue on the Ryzen system w/ the following
> >> setup:
> >>
> >>   * Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)
> >>
> >>   * guest VM RHEL7.3
> >>
> >>   * guest graphic driver = radeon
> >>
> >>   * qemu-system-x86_64 --version
> >> QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
> >>
> >>   * kvm-amd npt=1
> >>
> >>   * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
> >>
> >>   * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios
> >> /usr/share/qemu/bios.bin -device
> >> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
> >> -drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none
> >> -vga none -nodefaults -device
> >> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
> >> -usb -device usb-host,hostbus=3,hostport=1 -device
> >> usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1 -device
> >> vfio-pci,host=:09:00.0
> >>
> >> With this setup, I am able to pass-through the dGPU and run the following
> >> test:
> >>   * Starting up the guest w/ full GNOME GUI on the attached monitor.
> >>   * glxgears (running @ 60 FPS)
> >>   * Playing 1080p HD video on Youtube
> >>
> >> I am not noticing issues here. What kind of test are you running in the
> >> guest VM?
> > Try running the open source game "torcs" inside the VM.  I think
> > you'll find that there's a very noticeable performance different
> > between npt=1 and npt=0
>
> Hm.. actually torcs seems to be running fine w/ npt=1 setup. Although I think 
> my
> driving skill is ~20% worse :(

>A clarification on the issue, it's not that these games/benchmarks
>don't work or aren't playable, it's that they run slower (as measured by
>framerate) with npt=1 vs npt=0.  A virtualized guest on AMD hardware is
>hindered either by lower graphics performance (npt=1) or higher CPU
>virtualization overhead (npt=0) for high intensity games or graphics
>workloads.  Intel's equivalent ept feature does not have this issue.  I
>would encourage looking at the framerate for one mode vs the other
>before drawing any conclusions on whether it's working well.  Thanks,
>Alex

One more data point - Nick did some testing with Xen enabling/disabling npt and
found that (a) performance was not affected much whether npt was on or off, and
(b) performance with npt on was pretty close to KVM performance with npt off.

This suggests something specific to KVM that doesn't play well with npt,
although I have no idea what that might be. I was going to talk to our folks
to see if they had any suggestions re: ways to narrow down where the
performance impact is coming from, but I ended up going off sick instead.

Haven't gone through the whole thread here yet, so apologies if this has
already been mentioned.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Alex Williamson
On Thu, 29 Jun 2017 01:53:57 +0700
Suravee Suthikulpanit  wrote:

> On 6/29/17 00:26, Steven Walter wrote:
> >> So, I'm trying to reproduce this issue on the Ryzen system w/ the following
> >> setup:
> >>
> >>   * Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)
> >>
> >>   * guest VM RHEL7.3
> >>
> >>   * guest graphic driver = radeon
> >>
> >>   * qemu-system-x86_64 --version
> >> QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
> >>
> >>   * kvm-amd npt=1
> >>
> >>   * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> >> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
> >>
> >>   * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios
> >> /usr/share/qemu/bios.bin -device
> >> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
> >> -drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none
> >> -vga none -nodefaults -device
> >> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
> >> -usb -device usb-host,hostbus=3,hostport=1 -device
> >> usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1 -device
> >> vfio-pci,host=:09:00.0
> >>
> >> With this setup, I am able to pass-through the dGPU and run the following
> >> test:
> >>   * Starting up the guest w/ full GNOME GUI on the attached monitor.
> >>   * glxgears (running @ 60 FPS)
> >>   * Playing 1080p HD video on Youtube
> >>
> >> I am not noticing issues here. What kind of test are you running in the
> >> guest VM?  
> > Try running the open source game "torcs" inside the VM.  I think
> > you'll find that there's a very noticeable performance difference
> > between npt=1 and npt=0  
> 
> Hm.. actually torcs seems to be running fine w/ npt=1 setup. Although I think 
> my 
> driving skill is ~20% worse :(

A clarification on the issue, it's not that these games/benchmarks
don't work or aren't playable, it's that they run slower (as measured by
framerate) with npt=1 vs npt=0.  A virtualized guest on AMD hardware is
hindered either by lower graphics performance (npt=1) or higher CPU
virtualization overhead (npt=0) for high intensity games or graphics
workloads.  Intel's equivalent ept feature does not have this issue.  I
would encourage looking at the framerate for one mode vs the other
before drawing any conclusions on whether it's working well.  Thanks,

Alex


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Suravee Suthikulpanit



On 6/29/17 00:26, Steven Walter wrote:

So, I'm trying to reproduce this issue on the Ryzen system w/ the following
setup:

  * Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)

  * guest VM RHEL7.3

  * guest graphic driver = radeon

  * qemu-system-x86_64 --version
QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)

  * kvm-amd npt=1

  * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
[AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)

  * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios
/usr/share/qemu/bios.bin -device
ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
-drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none
-vga none -nodefaults -device
vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
-usb -device usb-host,hostbus=3,hostport=1 -device
usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1 -device
vfio-pci,host=:09:00.0

With this setup, I am able to pass-through the dGPU and run the following
test:
  * Starting up the guest w/ full GNOME GUI on the attached monitor.
  * glxgears (running @ 60 FPS)
  * Playing 1080p HD video on Youtube

I am not noticing issues here. What kind of test are you running in the
guest VM?

Try running the open source game "torcs" inside the VM.  I think
you'll find that there's a very noticeable performance difference
between npt=1 and npt=0


Hm.. actually torcs seems to be running fine w/ npt=1 setup. Although I think my 
driving skill is ~20% worse :(


S


Re: [PATCH v3 2/2] acpi/iort: numa: Add numa node mapping for smmuv3 devices

2017-06-28 Thread Robert Richter
On 15.06.17 14:46:03, Lorenzo Pieralisi wrote:
> On Thu, Jun 08, 2017 at 10:14:19AM +0530, Ganapatrao Kulkarni wrote:
> > Add code to parse proximity domain in SMMUv3 IORT table to
> > set numa node mapping for smmuv3 devices.
> > 
> > Signed-off-by: Ganapatrao Kulkarni 
> > ---
> >  drivers/acpi/arm64/iort.c | 28 ++--
> >  1 file changed, 26 insertions(+), 2 deletions(-)
> 
> I am happy to take this but I want to know what we shall do with
> patch 1 and related ACPICA changes first.

The change is now in acpica:

 
https://github.com/acpica/acpica/commit/8cadc4fb500e2aa52241e367c87a0f95d9760c58

So we could guard the code with an #ifdef until that patch is pulled
in via acpica tree:

> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index bba2b59..e804386 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -882,6 +882,23 @@ static bool __init arm_smmu_v3_is_coherent(struct 
> > acpi_iort_node *node)
> > return smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE;
> >  }
> >  
> > +/*
> > + * set numa proximity domain for smmuv3 device
> > + */
> > +static void  __init arm_smmu_v3_set_proximity(struct acpi_iort_node *node,
> > +   struct device *dev)
> > +{

#ifdef ACPI_IORT_SMMU_V3_PXM_VALID

> > +   struct acpi_iort_smmu_v3 *smmu;
> > +
> > +   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
> > +   if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
> > +   set_dev_node(dev, acpi_map_pxm_to_node(smmu->pxm));
> > +   pr_info("SMMUV3[%llx] Mapped to Proximity domain %d\n",
> > +   smmu->base_address,
> > +   smmu->pxm);
> > +   }

#endif

> > +}
> > +

Could the patch be applied with this change?

Thanks,

-Robert

> >  static int __init arm_smmu_count_resources(struct acpi_iort_node *node)
> >  {
> > struct acpi_iort_smmu *smmu;
> > @@ -951,20 +968,24 @@ struct iort_iommu_config {
> > int (*iommu_count_resources)(struct acpi_iort_node *node);
> > void (*iommu_init_resources)(struct resource *res,
> >  struct acpi_iort_node *node);
> > +   void (*iommu_set_proximity)(struct acpi_iort_node *node,
> > +struct device *dev);
> >  };
> >  
> >  static const struct iort_iommu_config iort_arm_smmu_v3_cfg __initconst = {
> > .name = "arm-smmu-v3",
> > .iommu_is_coherent = arm_smmu_v3_is_coherent,
> > .iommu_count_resources = arm_smmu_v3_count_resources,
> > -   .iommu_init_resources = arm_smmu_v3_init_resources
> > +   .iommu_init_resources = arm_smmu_v3_init_resources,
> > +   .iommu_set_proximity = arm_smmu_v3_set_proximity
> >  };
> >  
> >  static const struct iort_iommu_config iort_arm_smmu_cfg __initconst = {
> > .name = "arm-smmu",
> > .iommu_is_coherent = arm_smmu_is_coherent,
> > .iommu_count_resources = arm_smmu_count_resources,
> > -   .iommu_init_resources = arm_smmu_init_resources
> > +   .iommu_init_resources = arm_smmu_init_resources,
> > +   .iommu_set_proximity = NULL
> >  };
> >  
> >  static __init
> > @@ -1002,6 +1023,9 @@ static int __init 
> > iort_add_smmu_platform_device(struct acpi_iort_node *node)
> > if (!pdev)
> > return -ENOMEM;
> >  
> > +   if (ops->iommu_set_proximity)
> > +   ops->iommu_set_proximity(node, &pdev->dev);
> > +
> > count = ops->iommu_count_resources(node);
> >  
> > r = kcalloc(count, sizeof(*r), GFP_KERNEL);
> > -- 
> > 1.8.1.4
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Alex Williamson
On Thu, 29 Jun 2017 00:23:20 +0700
Suravee Suthikulpanit  wrote:
> So, I'm trying to reproduce this issue on the Ryzen system w/ the following 
> setup:
> 
>* Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)
> 
>* guest VM RHEL7.3
> 
>* guest graphic driver = radeon
> 
>* qemu-system-x86_64 --version
>  QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
> 
>* kvm-amd npt=1
> 
>* dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
> 
>* qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios 
> /usr/share/qemu/bios.bin -device 
> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 
> -drive 
> file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none -vga none 
> -nodefaults -device 
> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
>  
> -usb -device usb-host,hostbus=3,hostport=1 -device 
> usb-host,hostbus=3,hostport=3 
> -device vfio-pci,host=:08:00.1 -device vfio-pci,host=:09:00.0
> 
> With this setup, I am able to pass-through the dGPU and run the following 
> test:
>* Starting up the guest w/ full GNOME GUI on the attached monitor.
>* glxgears (running @ 60 FPS)
>* Playing 1080p HD video on Youtube
> 
> I am not noticing issues here. What kind of test are you running in the guest 
> VM?


Hi Suravee,

Thanks for your help!  I think we'd be in real trouble if glxgears
reported something less than 60fps (perhaps that might be easier to
debug though).  I'd suggest trying one of the Unigine benchmarks, like
Heaven.  There should be a noticeable and consistent framerate
difference for npt=0/1.  Thanks,

Alex


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Steven Walter
On Wed, Jun 28, 2017 at 1:23 PM, Suravee Suthikulpanit
 wrote:
>
>
> On 6/25/17 12:55, Nick Sarnie wrote:
>>
>> On Fri, May 5, 2017 at 1:27 PM, Alex Williamson
>>  wrote:
>>>
>>> On Wed, 3 May 2017 12:28:35 -0400
>>> Nick Sarnie  wrote:
>>>
 On Wed, May 3, 2017 at 10:37 AM, Matthias Ehrenfeuchter
  wrote:
>
> Hi,
>
> There are a lot of messages/threads out there about bad performance
> when using AMD's Ryzen with KVM GPU passthrough. It all revolves
> around enabling/disabling npt: with npt enabled, overall VM
> performance is nice, but the GPU gives me about 20% of the
> performance it has with npt disabled (and a lot of drops to zero GPU
> usage, while CPU/disk/RAM are also doing nothing). But with npt
> disabled, overall VM performance is like being on a 486 with a
> floppy disk as the only storage. (E.g. it takes 2 seconds just to
> open the start menu while host and VM are idle, and neither CPU
> pinning, changing the CPU model, changing the storage device, nor
> using hugepages changed anything.)
>
> So everything I read points to a bug in the npt implementation?
> Is there anything I could do to get closer to whatever is causing
> this?
>
> Best Regards
>
> efeu


 I heard from Joerg that it might be related to a lower intercept rate
 being used when NPT is enabled, but we haven't been able to find a way
 to trace that to confirm.
>>>
>>>
>>> Joerg/Paolo, any ideas how we might debug this?  Anyone from AMD
>>> watching?  Thanks,
>>>
>>> Alex
>>
>>
>>
>> Hi all,
>>
>> A somewhat major update.
>>
>> I managed to install Xen with my GPU passthrough config and test the
>> performance with NPT enabled.
>>
>> There is no performance drop with NPT on Xen, it matches the GPU
>> performance of KVM with NPT disabled. The CPU performance is also
>> great.
>>
>> John Bridgman (cc'd) from AMD says he's going to ask around AMD about
>> this next week, but it would be even better if some AMD guys that read
>> this ML shared their ideas or took a look.
>>
>> Let me know if you need any more information.
>>
>> Thanks,
>> Sarnex
>>
>
> So, I'm trying to reproduce this issue on the Ryzen system w/ the following
> setup:
>
>   * Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)
>
>   * guest VM RHEL7.3
>
>   * guest graphic driver = radeon
>
>   * qemu-system-x86_64 --version
> QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)
>
>   * kvm-amd npt=1
>
>   * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)
>
>   * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios
> /usr/share/qemu/bios.bin -device
> ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1
> -drive file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none
> -vga none -nodefaults -device
> vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom
> -usb -device usb-host,hostbus=3,hostport=1 -device
> usb-host,hostbus=3,hostport=3 -device vfio-pci,host=:08:00.1 -device
> vfio-pci,host=:09:00.0
>
> With this setup, I am able to pass-through the dGPU and run the following
> test:
>   * Starting up the guest w/ full GNOME GUI on the attached monitor.
>   * glxgears (running @ 60 FPS)
>   * Playing 1080p HD video on Youtube
>
> I am not noticing issues here. What kind of test are you running in the
> guest VM?

Try running the open source game "torcs" inside the VM.  I think
you'll find that there's a very noticeable performance difference
between npt=1 and npt=0
-- 
-Steven Walter 


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-06-28 Thread Suravee Suthikulpanit



On 6/25/17 12:55, Nick Sarnie wrote:

On Fri, May 5, 2017 at 1:27 PM, Alex Williamson
 wrote:

On Wed, 3 May 2017 12:28:35 -0400
Nick Sarnie  wrote:


On Wed, May 3, 2017 at 10:37 AM, Matthias Ehrenfeuchter  wrote:

Hi,

There are a lot of messages/threads out there about bad performance when
using AMD's Ryzen with KVM GPU passthrough. It all revolves around
enabling/disabling npt: with npt enabled, overall VM performance is nice,
but the GPU gives me about 20% of the performance it has with npt disabled
(and a lot of drops to zero GPU usage, while CPU/disk/RAM are also doing
nothing). But with npt disabled, overall VM performance is like being on a
486 with a floppy disk as the only storage. (E.g. it takes 2 seconds just
to open the start menu while host and VM are idle, and neither CPU pinning,
changing the CPU model, changing the storage device, nor using hugepages
changed anything.)

So everything I read points to a bug in the npt implementation? Is there
anything I could do to get closer to whatever is causing this?

Best Regards

efeu


I heard from Joerg that it might be related to a lower intercept rate
being used when NPT is enabled, but we haven't been able to find a way
to trace that to confirm.


Joerg/Paolo, any ideas how we might debug this?  Anyone from AMD
watching?  Thanks,

Alex



Hi all,

A somewhat major update.

I managed to install Xen with my GPU passthrough config and test the
performance with NPT enabled.

There is no performance drop with NPT on Xen, it matches the GPU
performance of KVM with NPT disabled. The CPU performance is also
great.

John Bridgman (cc'd) from AMD says he's going to ask around AMD about
this next week, but it would be even better if some AMD guys that read
this ML shared their ideas or took a look.

Let me know if you need any more information.

Thanks,
Sarnex



So, I'm trying to reproduce this issue on the Ryzen system w/ the following 
setup:

  * Host kernel v4.11 (with this patch https://lkml.org/lkml/2017/6/23/295)

  * guest VM RHEL7.3

  * guest graphic driver = radeon

  * qemu-system-x86_64 --version
QEMU emulator version 2.9.50 (v2.9.0-1659-g577caa2-dirty)

  * kvm-amd npt=1

  * dGPU is 08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81)


  * qemu-system-x86_64 -smp 4 -enable-kvm -M q35 -m 4096 -cpu host -bios 
/usr/share/qemu/bios.bin -device 
ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 -drive 
file=/sandbox/vm-images/rhel7.3.qcow2,if=virtio,id=disk0 -net none -vga none 
-nodefaults -device 
vfio-pci,host=08:00.0,x-vga=on,addr=0.0,multifunction=on,bus=root.1,romfile=/sandbox/vm-images/vbios.rom 
-usb -device usb-host,hostbus=3,hostport=1 -device usb-host,hostbus=3,hostport=3 
-device vfio-pci,host=:08:00.1 -device vfio-pci,host=:09:00.0


With this setup, I am able to pass-through the dGPU and run the following test:
  * Starting up the guest w/ full GNOME GUI on the attached monitor.
  * glxgears (running @ 60 FPS)
  * Playing 1080p HD video on Youtube

I am not noticing issues here. What kind of test are you running in the guest 
VM?

Thanks,
Suravee


Re: [PATCH 0/8] io-pgtable lock removal

2017-06-28 Thread Ray Jui via iommu
Hi Will/Robin,

On 6/28/17 4:46 AM, Will Deacon wrote:
> Hi Ray,
> 
> Robin and I have been bashing our heads against the tlb_sync_pending flag
> this morning, and we reckon it could have something to do with your timeouts
> on MMU-500.
> 
> On Tue, Jun 27, 2017 at 09:43:19AM -0700, Ray Jui wrote:
 Also, in a few occasions, I observed the following message during the
 test, when multiple cores are involved:

 arm-smmu 6400.mmu: TLB sync timed out -- SMMU may be deadlocked
> 
> The tlb_sync_pending logic was written under the assumption of a global
> page-table lock, so it assumes that it only has to care about syncing
> flushes from the current CPU/context. That's not true anymore, and the
> current code can accidentally skip syncs and (what I think is happening in
> your case) allow concurrent syncs, which will potentially lead to timeouts
> if a CPU is unlucky enough to keep missing the Ack.
> 
> Please can you try the diff below and see if it fixes things for you?
> This applies on top of my for-joerg/arm-smmu/updates branch, but note
> that I've only shown it to the compiler. Not tested at all.
> 
> Will
> 

Thanks for looking into this. I'm a bit busy at work but will certainly
find time to test the diff below. I hopefully will get to it later this
week.

Thanks,

Ray

> --->8
> 
> diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
> index 127558d83667..cd8d7aaec161 100644
> --- a/drivers/iommu/io-pgtable.c
> +++ b/drivers/iommu/io-pgtable.c
> @@ -59,6 +59,7 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum 
> io_pgtable_fmt fmt,
>   iop->cookie = cookie;
>   iop->cfg = *cfg;
>  
> + atomic_set(&iop->tlb_sync_pending, 0);
>   return &iop->ops;
>  }
>  
> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> index 524263a7ae6f..b64580c9d03d 100644
> --- a/drivers/iommu/io-pgtable.h
> +++ b/drivers/iommu/io-pgtable.h
> @@ -1,5 +1,7 @@
>  #ifndef __IO_PGTABLE_H
>  #define __IO_PGTABLE_H
> +
> +#include 
>  #include 
>  
>  /*
> @@ -165,7 +167,7 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops);
>  struct io_pgtable {
>   enum io_pgtable_fmt fmt;
>   void *cookie;
> - bool tlb_sync_pending;
> + atomic_t tlb_sync_pending;
>   struct io_pgtable_cfg   cfg;
>   struct io_pgtable_ops   ops;
>  };
> @@ -175,22 +177,20 @@ struct io_pgtable {
>  static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
>  {
>   iop->cfg.tlb->tlb_flush_all(iop->cookie);
> - iop->tlb_sync_pending = true;
> + atomic_set_release(&iop->tlb_sync_pending, 1);
>  }
>  
>  static inline void io_pgtable_tlb_add_flush(struct io_pgtable *iop,
>   unsigned long iova, size_t size, size_t granule, bool leaf)
>  {
>   iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf, iop->cookie);
> - iop->tlb_sync_pending = true;
> + atomic_set_release(&iop->tlb_sync_pending, 1);
>  }
>  
>  static inline void io_pgtable_tlb_sync(struct io_pgtable *iop)
>  {
> - if (iop->tlb_sync_pending) {
> + if (atomic_xchg_relaxed(&iop->tlb_sync_pending, 0))
>   iop->cfg.tlb->tlb_sync(iop->cookie);
> - iop->tlb_sync_pending = false;
> - }
>  }
>  
>  /**
> 


Re: [PATCH 2/3] iommu/omap: Return ERR_PTR in device_group call-back

2017-06-28 Thread Suman Anna via iommu
On 06/28/2017 07:00 AM, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Make sure that the device_group callback returns an ERR_PTR
> instead of NULL.
> 
> Signed-off-by: Joerg Roedel 

Thanks for the patch,
Acked-by: Suman Anna 

regards
Suman

> ---
>  drivers/iommu/omap-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
> index 95dfca3..641e035 100644
> --- a/drivers/iommu/omap-iommu.c
> +++ b/drivers/iommu/omap-iommu.c
> @@ -1309,7 +1309,7 @@ static void omap_iommu_remove_device(struct device *dev)
>  static struct iommu_group *omap_iommu_device_group(struct device *dev)
>  {
>   struct omap_iommu_arch_data *arch_data = dev->archdata.iommu;
> - struct iommu_group *group = NULL;
> + struct iommu_group *group = ERR_PTR(-EINVAL);
>  
>   if (arch_data->iommu_dev)
>   group = arch_data->iommu_dev->group;
> 



Re: [PATCH 5/9] iommu: Introduce fault notifier API

2017-06-28 Thread Jacob Pan
On Wed, 28 Jun 2017 12:16:03 +0200
Joerg Roedel  wrote:

> On Tue, Jun 27, 2017 at 12:47:59PM -0700, Jacob Pan wrote:
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index d973555..07cfd92 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -48,6 +48,7 @@ struct iommu_group {
> > struct list_head devices;
> > struct mutex mutex;
> > struct blocking_notifier_head notifier;
> > +   struct blocking_notifier_head fault_notifier;  
> 
> Do you really need a notifier chain here? Will there ever be more than
> one callback registered to it?
> 
Yes, this notifier chain is shared by all devices under a group. The
event contains device info, which the notifier callbacks can filter on.
> > +struct iommu_fault_event {
> > +   struct device *dev;  
> 
> Putting a 'struct device *' member in a uapi struct looks
> fundamentally wrong.
> 
> 
My mistake; it was originally (in the RFC) not in uapi, but with the
consideration of using VFIO to expose it to user space I moved it to
uapi. You are right, though: it should be some other form of device
representation used by VFIO. The VFIO layer has to do the translation
and inject the event into the guest. In-kernel driver users can use
struct device to identify the faulting device.

> 
>   Joerg
> 

[Jacob Pan]


Re: [PATCH 3/9] iommu: Introduce iommu do invalidate API function

2017-06-28 Thread Jacob Pan
On Wed, 28 Jun 2017 12:08:23 +0200
Joerg Roedel  wrote:

> On Tue, Jun 27, 2017 at 12:47:57PM -0700, Jacob Pan wrote:
> > From: "Liu, Yi L" 
> > 
> > When a SVM capable device is assigned to a guest, the first level
> > page tables are owned by the guest and the guest PASID table
> > pointer is linked to the device context entry of the physical IOMMU.
> > 
> > Host IOMMU driver has no knowledge of caching structure updates
> > unless the guest invalidation activities are passed down to the
> > host. The primary usage is derived from emulated IOMMU in the
> > guest, where QEMU can trap invalidation activities before pass them
> > down the host/physical IOMMU. There are IOMMU architectural
> > specific actions need to be taken which requires the generic APIs
> > introduced in this patch to have opaque data in the
> > tlb_invalidate_info argument.  
> 
> Which "IOMMU architectural specific actions" are you thinking of?
> 
Construction of queued invalidation descriptors, which are then submitted
to the IOMMU QI interface.
> > +int iommu_invalidate(struct iommu_domain *domain,
> > +   struct device *dev, struct tlb_invalidate_info
> > *inv_info) +{
> > +   int ret = 0;
> > +
> > +   if (unlikely(!domain->ops->invalidate))
> > +   return -ENODEV;
> > +
> > +   ret = domain->ops->invalidate(domain, dev, inv_info);
> > +
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_invalidate);  
> 
> [...]
> 
> > +struct tlb_invalidate_info {
> > +   __u32   model;
> > +   __u32   length;
> > +   __u8opaque[];
> > +};  
> 
> This interface is awful. It requires the user of a generic API to
> know details about the implementation behind to do anything useful.
> 
> Please explain in more detail why this is needed. My feeling is that
> we can make this more generic with a small set of invalidation
> functions in the iommu-api.
> 
My thinking was that, via configuration control, mixed IOMMU models
between pIOMMU and vIOMMU are unlikely. We could then just pass
model-specific data through the layers of SW (QEMU, VFIO) for
performance reasons. We do have an earlier hybrid version that carries
both generic data and opaque raw data. Would the below work for all
IOMMU models?

https://www.spinics.net/lists/kvm/msg148798.html

struct tlb_invalidate_info
{
__u32   model;  /* Vendor number */
__u8    granularity;
#define DEVICE_SELECTIVE_INV    (1 << 0)
#define PAGE_SELECTIVE_INV  (1 << 1)
#define PASID_SELECTIVE_INV (1 << 2)
__u32 pasid;
__u64 addr;
__u64 size;

/* Since IOMMU format has already been validated for this table,
   the IOMMU driver knows that the following structure is in a
   format it knows */
__u8 opaque[];
};

> 
> 
>   Joerg
> 

[Jacob Pan]


Re: [PATCH 1/3] iommu: Return ERR_PTR() values from device_group call-backs

2017-06-28 Thread Gerald Schaefer
On Wed, 28 Jun 2017 14:00:56 +0200
Joerg Roedel  wrote:

> From: Joerg Roedel 
> 
> The generic device_group call-backs in iommu.c return NULL
> in case of error. Since they are getting ERR_PTR values from
> iommu_group_alloc(), just pass them up instead.
> 
> Reported-by: Gerald Schaefer 
> Signed-off-by: Joerg Roedel 
> ---

Looks good,
Reviewed-by: Gerald Schaefer 


>  drivers/iommu/iommu.c | 14 ++
>  1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index cf7ca7e..de09e1e 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -915,13 +915,7 @@ static int get_pci_alias_or_group(struct pci_dev *pdev, 
> u16 alias, void *opaque)
>   */
>  struct iommu_group *generic_device_group(struct device *dev)
>  {
> - struct iommu_group *group;
> -
> - group = iommu_group_alloc();
> - if (IS_ERR(group))
> - return NULL;
> -
> - return group;
> + return iommu_group_alloc();
>  }
> 
>  /*
> @@ -988,11 +982,7 @@ struct iommu_group *pci_device_group(struct device *dev)
>   return group;
> 
>   /* No shared group found, allocate new */
> - group = iommu_group_alloc();
> - if (IS_ERR(group))
> - return NULL;
> -
> - return group;
> + return iommu_group_alloc();
>  }
> 
>  /**



Re: [PATCH v2] iommu/amd: Fix interrupt remapping when disable guest_mode

2017-06-28 Thread Joerg Roedel
On Mon, Jun 26, 2017 at 04:28:04AM -0500, Suravee Suthikulpanit wrote:
> Pass-through devices to VM guest can get updated IRQ affinity
> information via irq_set_affinity() when not running in guest mode.
> Currently, AMD IOMMU driver in GA mode ignores the updated information
> if the pass-through device is setup to use vAPIC regardless of guest_mode.
> This could cause invalid interrupt remapping.
> 
> Also, the guest_mode bit should be set and cleared only when
> SVM updates posted-interrupt interrupt remapping information.
> 
> Signed-off-by: Suravee Suthikulpanit 
> Cc: Joerg Roedel 

Nevermind, I added this fixes line and applied the patch:

Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode
by default')


Re: [PATCH v2] iommu/vt-d: constify intel_dma_ops

2017-06-28 Thread Joerg Roedel
On Wed, Jun 28, 2017 at 04:39:32PM +0530, Arvind Yadav wrote:
> Most dma_map_ops structures are never modified. Constify these
> structures such that these can be write-protected.
> 
> Signed-off-by: Arvind Yadav 
> ---
> Changes in v2:
>   Added description.
> 
>  drivers/iommu/intel-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.



[PATCH 3/3] iommu: Warn once when device_group callback returns NULL

2017-06-28 Thread Joerg Roedel
From: Joerg Roedel 

This callback should never return NULL. Print a warning if
that happens so that we notice and can fix it.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index de09e1e..3f6ea16 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1010,6 +1010,9 @@ struct iommu_group *iommu_group_get_for_dev(struct device 
*dev)
if (ops && ops->device_group)
group = ops->device_group(dev);
 
+   if (WARN_ON_ONCE(group == NULL))
+   return ERR_PTR(-EINVAL);
+
if (IS_ERR(group))
return group;
 
-- 
2.7.4



[PATCH 2/3] iommu/omap: Return ERR_PTR in device_group call-back

2017-06-28 Thread Joerg Roedel
From: Joerg Roedel 

Make sure that the device_group callback returns an ERR_PTR
instead of NULL.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/omap-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 95dfca3..641e035 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1309,7 +1309,7 @@ static void omap_iommu_remove_device(struct device *dev)
 static struct iommu_group *omap_iommu_device_group(struct device *dev)
 {
struct omap_iommu_arch_data *arch_data = dev->archdata.iommu;
-   struct iommu_group *group = NULL;
+   struct iommu_group *group = ERR_PTR(-EINVAL);
 
if (arch_data->iommu_dev)
group = arch_data->iommu_dev->group;
-- 
2.7.4



[PATCH 1/3] iommu: Return ERR_PTR() values from device_group call-backs

2017-06-28 Thread Joerg Roedel
From: Joerg Roedel 

The generic device_group call-backs in iommu.c return NULL
in case of error. Since they are getting ERR_PTR values from
iommu_group_alloc(), just pass them up instead.

Reported-by: Gerald Schaefer 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cf7ca7e..de09e1e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -915,13 +915,7 @@ static int get_pci_alias_or_group(struct pci_dev *pdev, u16 alias, void *opaque)
  */
 struct iommu_group *generic_device_group(struct device *dev)
 {
-   struct iommu_group *group;
-
-   group = iommu_group_alloc();
-   if (IS_ERR(group))
-   return NULL;
-
-   return group;
+   return iommu_group_alloc();
 }
 
 /*
@@ -988,11 +982,7 @@ struct iommu_group *pci_device_group(struct device *dev)
return group;
 
/* No shared group found, allocate new */
-   group = iommu_group_alloc();
-   if (IS_ERR(group))
-   return NULL;
-
-   return group;
+   return iommu_group_alloc();
 }
 
 /**
-- 
2.7.4



Re: [PATCH 0/8] io-pgtable lock removal

2017-06-28 Thread Will Deacon
Hi Ray,

Robin and I have been bashing our heads against the tlb_sync_pending flag
this morning, and we reckon it could have something to do with your timeouts
on MMU-500.

On Tue, Jun 27, 2017 at 09:43:19AM -0700, Ray Jui wrote:
> >> Also, in a few occasions, I observed the following message during the
> >> test, when multiple cores are involved:
> >>
> >> arm-smmu 6400.mmu: TLB sync timed out -- SMMU may be deadlocked

The tlb_sync_pending logic was written under the assumption of a global
page-table lock, so it assumes that it only has to care about syncing
flushes from the current CPU/context. That's not true anymore, and the
current code can accidentally skip syncs and (what I think is happening in
your case) allow concurrent syncs, which will potentially lead to timeouts
if a CPU is unlucky enough to keep missing the Ack.

Please can you try the diff below and see if it fixes things for you?
This applies on top of my for-joerg/arm-smmu/updates branch, but note
that I've only shown it to the compiler. Not tested at all.

Will

--->8

diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 127558d83667..cd8d7aaec161 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -59,6 +59,7 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
iop->cookie = cookie;
	iop->cfg = *cfg;
 
+	atomic_set(&iop->tlb_sync_pending, 0);
	return &iop->ops;
 }
 
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 524263a7ae6f..b64580c9d03d 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -1,5 +1,7 @@
 #ifndef __IO_PGTABLE_H
 #define __IO_PGTABLE_H
+
+#include <linux/atomic.h>
 #include <linux/bitops.h>
 
 /*
@@ -165,7 +167,7 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops);
 struct io_pgtable {
enum io_pgtable_fmt fmt;
	void			*cookie;
-	bool			tlb_sync_pending;
+	atomic_t		tlb_sync_pending;
struct io_pgtable_cfg   cfg;
struct io_pgtable_ops   ops;
 };
@@ -175,22 +177,20 @@ struct io_pgtable {
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
iop->cfg.tlb->tlb_flush_all(iop->cookie);
-   iop->tlb_sync_pending = true;
+	atomic_set_release(&iop->tlb_sync_pending, 1);
 }
 
 static inline void io_pgtable_tlb_add_flush(struct io_pgtable *iop,
unsigned long iova, size_t size, size_t granule, bool leaf)
 {
iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf, iop->cookie);
-   iop->tlb_sync_pending = true;
+	atomic_set_release(&iop->tlb_sync_pending, 1);
 }
 
 static inline void io_pgtable_tlb_sync(struct io_pgtable *iop)
 {
-   if (iop->tlb_sync_pending) {
+	if (atomic_xchg_relaxed(&iop->tlb_sync_pending, 0))
iop->cfg.tlb->tlb_sync(iop->cookie);
-   iop->tlb_sync_pending = false;
-   }
 }
 
 /**


Re: [PATCH 2/3] iommu/iova: don't disable preempt around this_cpu_ptr()

2017-06-28 Thread Joerg Roedel
On Wed, Jun 28, 2017 at 11:31:55AM +0200, Sebastian Andrzej Siewior wrote:
> It really does. The spin_lock() does disable preemption but this is not
> the problem. The thing is that the preempt_disable() is superfluous and
> it hurts Preempt-RT (and this is how I noticed it). Also the
> get_cpu_ptr() is not required and was only added to keep lockdep quiet
> (according to the history).
> Everything else here can stay as-is, I am just asking for the removal of
> the redundant preempt_disable() where it is not required.

Okay, makes sense, I applied both patches.


Thanks,

Joerg



Re: [PATCH] iommu/vt-d: constify intel_dma_ops.

2017-06-28 Thread Joerg Roedel
On Wed, Jun 28, 2017 at 03:31:16PM +0530, Arvind Yadav wrote:
> Most dma_map_ops structures are never modified. Constify these
> structures such that these can be write-protected. This file size diff
> will show the difference between data and text segment.

I know what the diff shows, but it doesn't matter for this patch because
that's just an implementation detail of the compiler. The real reason for
making it 'const' is to write-protect them, and that should be clear in
the commit message.


Joerg



Re: [PATCH 3/9] iommu: Introduce iommu do invalidate API function

2017-06-28 Thread Joerg Roedel
On Tue, Jun 27, 2017 at 12:47:57PM -0700, Jacob Pan wrote:
> From: "Liu, Yi L" 
> 
> When a SVM capable device is assigned to a guest, the first level page
> tables are owned by the guest and the guest PASID table pointer is
> linked to the device context entry of the physical IOMMU.
> 
> Host IOMMU driver has no knowledge of caching structure updates unless
> the guest invalidation activities are passed down to the host. The
> primary usage is derived from emulated IOMMU in the guest, where QEMU
> can trap invalidation activities before passing them down to the
> host/physical IOMMU. There are IOMMU-architecture-specific actions
> that need to be taken, which requires the generic APIs introduced in
> this patch to carry opaque data in the tlb_invalidate_info argument.

Which "IOMMU architectural specific actions" are you thinking of?

> +int iommu_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)
> +{
> + int ret = 0;
> +
> + if (unlikely(!domain->ops->invalidate))
> + return -ENODEV;
> +
> + ret = domain->ops->invalidate(domain, dev, inv_info);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_invalidate);

[...]

> +struct tlb_invalidate_info {
> + __u32   model;
> + __u32   length;
> + __u8opaque[];
> +};

This interface is awful. It requires the user of a generic API to know
details about the implementation behind it to do anything useful.

Please explain in more detail why this is needed. My feeling is that we
can make this more generic with a small set of invalidation functions in
the iommu-api.



Joerg



Re: [PATCH] iommu/vt-d: constify intel_dma_ops.

2017-06-28 Thread Arvind Yadav

Hi,

Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This file size diff
will show the difference between data and text segment.

Thanks,
~arvind

On Wednesday 28 June 2017 02:23 PM, Joerg Roedel wrote:

On Tue, Jun 13, 2017 at 03:48:34PM +0530, Arvind Yadav wrote:

File size before:
   text    data     bss     dec     hex filename
  32765     758    1824   35347    8a13 drivers/iommu/intel-iommu.o

File size after adding 'const':
   text    data     bss     dec     hex filename
  32925     598    1824   35347    8a13 drivers/iommu/intel-iommu.o

What is the point of those commit-messages? They just show that the
struct moved from data to the text segment. The overall size doesn't
change much.



Joerg





Re: [PATCH 2/9] iommu/vt-d: add bind_pasid_table function

2017-06-28 Thread Joerg Roedel
On Tue, Jun 27, 2017 at 12:47:56PM -0700, Jacob Pan wrote:
> Add Intel VT-d ops to the generic iommu_bind_pasid_table API
> functions.
> 
> The primary use case is for direct assignment of SVM capable
> device. Originated from emulated IOMMU in the guest, the request goes
> through many layers (e.g. VFIO). Upon calling host IOMMU driver, caller
> passes guest PASID table pointer (GPA) and size.
> 
> Device context table entry is modified by Intel IOMMU specific
> bind_pasid_table function. This will turn on nesting mode and matching
> translation type.
> 
> The unbind operation restores default context mapping.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> ---
>  drivers/iommu/intel-iommu.c   | 117 
> ++
>  include/linux/dma_remapping.h |   1 +
>  2 files changed, 118 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8274ce3..ef05b59 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5430,6 +5430,119 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
>  
>   return iommu;
>  }
> +
> +static int intel_iommu_bind_pasid_table(struct iommu_domain *domain,
> + struct device *dev, struct pasid_table_info *pasidt_binfo)
> +{
> + struct intel_iommu *iommu;
> + struct context_entry *context;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + struct pci_dev *pdev;
> + u8 bus, devfn;
> + u16 did, *sid;
> + int ret = 0;
> + unsigned long flags;
> + u64 ctx_lo;
> +
> + if (pasidt_binfo == NULL || pasidt_binfo->model != IOMMU_MODEL_INTEL_VTD) {
> + pr_warn("%s: Invalid bind request!\n", __func__);
> + return -EINVAL;
> + }
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + sid = (u16 *)&pasidt_binfo->opaque;
> + /*
> +  * check SID, if it is not correct, return success to allow looping
> +  * through all devices within a group
> +  */
> + if (PCI_DEVID(bus, devfn) != *sid)
> + return 0;
> +
> + if (!dev || !dev_is_pci(dev))
> + return -ENODEV;
> +
> + pdev = to_pci_dev(dev);
> + if (!pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
> + return -EINVAL;
> +
> + info = dev->archdata.iommu;
> + if (!info || !info->pasid_supported ||
> + !pci_enable_pasid(pdev, info->pasid_supported & ~1)) {
> + pr_err("PCI %04x:%02x:%02x.%d: has no PASID support\n",
> +pci_domain_nr(pdev->bus), bus, PCI_SLOT(devfn),
> +PCI_FUNC(devfn));

You can either use dev_name() here to decode the device name, or just
use dev_err() instead.



Re: [PATCH 1/9] iommu: Introduce bind_pasid_table API function

2017-06-28 Thread Joerg Roedel
On Tue, Jun 27, 2017 at 12:47:55PM -0700, Jacob Pan wrote:
> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> + struct pasid_table_info *pasidt_binfo)
> +{
> + if (unlikely(!domain->ops->bind_pasid_table))
> + return -EINVAL;

I think its better to return -ENODEV here, like other iommu-api
functions do when a callback is NULL.

> +enum iommu_model {
> + IOMMU_MODEL_INTEL_VTD,
> + IOMMU_MODEL_ARM_SMMU,
> +};

AMD IOMMU also supports shared virtual memory.

Regards,

Joerg



Re: [PATCH v8 RESEND 27/38] iommu/amd: Allow the AMD IOMMU to work with memory encryption

2017-06-28 Thread Joerg Roedel
Hi Tom,

On Tue, Jun 27, 2017 at 10:12:30AM -0500, Tom Lendacky wrote:
> ---
>  drivers/iommu/amd_iommu.c   |   30 --
>  drivers/iommu/amd_iommu_init.c  |   34 --
>  drivers/iommu/amd_iommu_proto.h |   10 ++
>  drivers/iommu/amd_iommu_types.h |2 +-
>  4 files changed, 55 insertions(+), 21 deletions(-)

Looks like a straightforward change. Just one nit below.

> +static bool amd_iommu_supports_sme(void)
> +{
> + if (!sme_active() || (boot_cpu_data.x86 != 0x17))
> + return true;
> +
> + /* For Fam17h, a specific level of support is required */
> + if (boot_cpu_data.microcode >= 0x08001205)
> + return true;
> +
> + if ((boot_cpu_data.microcode >= 0x08001126) &&
> + (boot_cpu_data.microcode <= 0x080011ff))
> + return true;
> +
> + pr_notice("AMD-Vi: IOMMU not currently supported when SME is active\n");
> +
> + return false;
> +}

The name of the function is misleading. It checks whether the IOMMU can
be enabled when SME is active. But the name suggests that it checks
whether the iommu hardware supports SME.

How about renaming it to amd_iommu_sme_check()?

With that change the patch is:

Acked-by: Joerg Roedel 



Re: [PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

2017-06-28 Thread Will Deacon
Hi Zhen Lei,

Nate (CC'd), Robin and I have been working on something very similar to
this series, but this patch is different to what we had planned. More below.

On Mon, Jun 26, 2017 at 09:38:46PM +0800, Zhen Lei wrote:
> All TLBI commands should be followed by a SYNC command to make sure
> they have completely finished. So we can just add the TLBI commands
> to the queue and put off their execution until we meet a SYNC or
> other command. To prevent a subsequent SYNC command from waiting too
> long because too many commands have been delayed, restrict the
> maximum number of delayed commands.
> 
> According to my test, I got the same performance data as when I
> replaced writel with writel_relaxed in queue_inc_prod.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/iommu/arm-smmu-v3.c | 42 +-
>  1 file changed, 37 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 291da5f..4481123 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -337,6 +337,7 @@
>  /* Command queue */
>  #define CMDQ_ENT_DWORDS  2
>  #define CMDQ_MAX_SZ_SHIFT	8
> +#define CMDQ_MAX_DELAYED 32
>  
>  #define CMDQ_ERR_SHIFT   24
>  #define CMDQ_ERR_MASK	0x7f
> @@ -472,6 +473,7 @@ struct arm_smmu_cmdq_ent {
>   };
>   } cfgi;
>  
> + #define CMDQ_OP_TLBI_NH_ALL 0x10
> 	#define CMDQ_OP_TLBI_NH_ASID	0x11
> 	#define CMDQ_OP_TLBI_NH_VA	0x12
> 	#define CMDQ_OP_TLBI_EL2_ALL	0x20
> @@ -499,6 +501,7 @@ struct arm_smmu_cmdq_ent {
>  
>  struct arm_smmu_queue {
>   int irq; /* Wired interrupt */
> + u32 nr_delay;
>  
>   __le64  *base;
>   dma_addr_t  base_dma;
> @@ -722,11 +725,16 @@ static int queue_sync_prod(struct arm_smmu_queue *q)
>   return ret;
>  }
>  
> -static void queue_inc_prod(struct arm_smmu_queue *q)
> +static void queue_inc_swprod(struct arm_smmu_queue *q)
>  {
> - u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
> + u32 prod = q->prod + 1;
>  
>   q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> +}
> +
> +static void queue_inc_prod(struct arm_smmu_queue *q)
> +{
> + queue_inc_swprod(q);
>   writel(q->prod, q->prod_reg);
>  }
>  
> @@ -761,13 +769,24 @@ static void queue_write(__le64 *dst, u64 *src, size_t n_dwords)
>   *dst++ = cpu_to_le64(*src++);
>  }
>  
> -static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent)
> +static int queue_insert_raw(struct arm_smmu_queue *q, u64 *ent, int optimize)
>  {
>   if (queue_full(q))
>   return -ENOSPC;
>  
>   queue_write(Q_ENT(q, q->prod), ent, q->ent_dwords);
> - queue_inc_prod(q);
> +
> + /*
> +  * We don't want too many commands to be delayed; that may cause a
> +  * subsequent sync command to wait for a long time.
> +  */
> + if (optimize && (++q->nr_delay < CMDQ_MAX_DELAYED)) {
> + queue_inc_swprod(q);
> + } else {
> + queue_inc_prod(q);
> + q->nr_delay = 0;
> + }
> +

So here, you're effectively putting invalidation commands into the command
queue without updating PROD. Do you actually see a performance advantage
from doing so? Another side of the argument would be that we should be
moving PROD as soon as we can, so that the SMMU can process invalidation
commands in the background and reduce the cost of the final SYNC operation
when the high-level unmap operation is complete.

Will


Re: [PATCH 2/3] iommu/iova: don't disable preempt around this_cpu_ptr()

2017-06-28 Thread Sebastian Andrzej Siewior
On 2017-06-28 11:22:05 [+0200], Joerg Roedel wrote:
> On Tue, Jun 27, 2017 at 06:16:47PM +0200, Sebastian Andrzej Siewior wrote:
> > Commit 583248e6620a ("iommu/iova: Disable preemption around use of
> > this_cpu_ptr()") disables preemption while accessing a per-CPU variable.
> > This does keep lockdep quiet. However I don't see why it would be
> > bad if we get migrated to another CPU after accessing it.
> > __iova_rcache_insert() and __iova_rcache_get() immediately lock the
> > variable after obtaining it - before accessing its members.
> > _If_ we get migrated away after retrieving the address of cpu_rcache
> > before taking the lock then the *other* task on the same CPU will
> > retrieve the same address of cpu_rcache and will spin on the lock.
> > 
> > alloc_iova_fast() disables preemption while invoking
> > free_cpu_cached_iovas() on each CPU. The function itself uses
> > per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr()
> > does). It _could_ make sense to use get_online_cpus() instead, but we
> > have a hotplug notifier for CPU down (and none for up) so we are good.
> 
> Does that really matter? The spin_lock disables irqs and thus avoids
> preemption too. We also can't get rid of the irqsave lock here because
> these locks are taken in the dma-api path which is used from interrupt
> context.

It really does. The spin_lock() does disable preemption but this is not
the problem. The thing is that the preempt_disable() is superfluous and
it hurts Preempt-RT (and this is how I noticed it). Also the
get_cpu_ptr() is not required and was only added to keep lockdep quiet
(according to the history).
Everything else here can stay as-is, I am just asking for the removal of
the redundant preempt_disable() where it is not required.

>   Joerg

Sebastian


Re: [PATCH 2/3] iommu/iova: don't disable preempt around this_cpu_ptr()

2017-06-28 Thread Joerg Roedel
On Tue, Jun 27, 2017 at 06:16:47PM +0200, Sebastian Andrzej Siewior wrote:
> Commit 583248e6620a ("iommu/iova: Disable preemption around use of
> this_cpu_ptr()") disables preemption while accessing a per-CPU variable.
> This does keep lockdep quiet. However I don't see why it would be
> bad if we get migrated to another CPU after accessing it.
> __iova_rcache_insert() and __iova_rcache_get() immediately lock the
> variable after obtaining it - before accessing its members.
> _If_ we get migrated away after retrieving the address of cpu_rcache
> before taking the lock then the *other* task on the same CPU will
> retrieve the same address of cpu_rcache and will spin on the lock.
> 
> alloc_iova_fast() disables preemption while invoking
> free_cpu_cached_iovas() on each CPU. The function itself uses
> per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr()
> does). It _could_ make sense to use get_online_cpus() instead, but we
> have a hotplug notifier for CPU down (and none for up) so we are good.

Does that really matter? The spin_lock disables irqs and thus avoids
preemption too. We also can't get rid of the irqsave lock here because
these locks are taken in the dma-api path which is used from interrupt
context.



Joerg



Re: [PATCH v2] iommu/amd: Fix interrupt remapping when disable guest_mode

2017-06-28 Thread Joerg Roedel
Hi Suravee,

On Mon, Jun 26, 2017 at 04:28:04AM -0500, Suravee Suthikulpanit wrote:
> Devices passed through to a VM guest can get updated IRQ affinity
> information via irq_set_affinity() when not running in guest mode.
> Currently, the AMD IOMMU driver in GA mode ignores the updated
> information if the pass-through device is set up to use vAPIC,
> regardless of guest_mode.
> This could cause invalid interrupt remapping.
> 
> Also, the guest_mode bit should be set and cleared only when
> SVM updates posted-interrupt interrupt remapping information.
> 
> Signed-off-by: Suravee Suthikulpanit 
> Cc: Joerg Roedel 

Can you please provide a 'Fixes:' tag too? No need to resend the whole
patch, just reply with the tag and I'll add it when applying this patch.


Thanks,

Joerg



Re: [PATCH] iommu/vt-d: constify intel_dma_ops.

2017-06-28 Thread Joerg Roedel
On Tue, Jun 13, 2017 at 03:48:34PM +0530, Arvind Yadav wrote:
> File size before:
>    text    data     bss     dec     hex filename
>   32765     758    1824   35347    8a13 drivers/iommu/intel-iommu.o
> 
> File size after adding 'const':
>    text    data     bss     dec     hex filename
>   32925     598    1824   35347    8a13 drivers/iommu/intel-iommu.o

What is the point of those commit-messages? They just show that the
struct moved from data to the text segment. The overall size doesn't
change much.



Joerg



Re: [GIT PULL] iommu/arm-smmu: Updates for 4.13

2017-06-28 Thread Joerg Roedel
On Mon, Jun 26, 2017 at 11:42:42AM +0100, Will Deacon wrote:
> The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:
> 
>   Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
> for-joerg/arm-smmu/updates
> 
> for you to fetch changes up to f935448acf462c26142e8b04f1c8829b28d3b9d8:
> 
>   iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 
> (2017-06-23 17:58:04 +0100)

Pulled, thanks Will.


Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-28 Thread Joerg Roedel
[Adding Alex Deucher]

Hey Alex,

On Tue, Jun 27, 2017 at 12:24:35PM -0400, Jan Vesely wrote:
> On Mon, 2017-06-26 at 14:14 +0200, Joerg Roedel wrote:

> > How does that 'dGPU goes to sleep' work? Do you put it to sleep manually
> > via sysfs or something? Or is that something that amdgpu does on its
> > own?
> 
> AMD folks should be able to provide more details. afaik, the driver
> uses ACPI methods to power on/off the device. Driver routines wake the
> device up before accessing it and there is a timeout to turn it off
> after few seconds of inactivity.
> 
> > 
> > It looks like the GPU just switches the ATS unit off when it goes to
> > sleep and doesn't answer the invalidation anymore, which explains the
> > completion-wait timeouts.
> 
> Both MMIO regs and PCIe config regs are turned off, so it would not
> surprise me if all PCIe requests were ignored by the device in the off
> state. It should be possible to request a device wake-up before
> invalidating the relevant IOMMU domain. I'll leave it to more
> knowledgeable people to judge whether that's a good idea (we can also
> postpone such invalidations until the device is woken by other means).

Can you maybe shed some light on how the sleep mode of the GPUs works?
Is it initiated by the GPU driver or from somewhere else? In the case
discussed here it looks like the ATS unit of the GPU is switched off,
causing IOTLB invalidation timeouts on the IOMMU side.

If that is the case we might need some sort of dma-api extension so that
the GPU driver can tell the iommu driver that the device is going to be
quiet.


Thanks,

   Joerg
