Re: [PATCH] iommu/amd: Fix event counter availability check

2020-05-31 Thread Alexander Monakov
Hi,

Adding Shuah Khan to Cc: I've noticed you've seen this issue on Ryzen 2400GE;
can you have a look at the patch? Would be nice to know if it fixes the
problem for you too.

Thanks.
Alexander

On Fri, 29 May 2020, Alexander Monakov wrote:

> The driver performs an extra check if the IOMMU's capabilities advertise
> presence of performance counters: it verifies that counters are writable
> by writing a hard-coded value to a counter and testing that reading that
> counter gives back the same value.
> 
> Unfortunately it does so quite early, even before pci_enable_device is
> called for the IOMMU, i.e. when accessing its MMIO space is not
> guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test:
> the driver assumes the counters are not writable, and disables the
> functionality.
> 
> Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
> the issue. This is the earliest point in amd_iommu_init_pci where the
> call succeeds on my laptop.
> 
> Signed-off-by: Alexander Monakov 
> Cc: Joerg Roedel 
> Cc: Suravee Suthikulpanit 
> Cc: iommu@lists.linux-foundation.org
> ---
> 
> PS. I'm seeing another hiccup with IOMMU probing on my system:
> pci :00:00.2: can't derive routing for PCI INT A
> pci :00:00.2: PCI INT A: not connected
> 
> Hopefully I can figure it out, but I'd appreciate hints.
> 
>  drivers/iommu/amd_iommu_init.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 5b81fd16f5fa..1b7ec6b6a282 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
>  	if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
>  		amd_iommu_np_cache = true;
>  
> -	init_iommu_perf_ctr(iommu);
> -
>  	if (is_rd890_iommu(iommu->dev)) {
>  		int i, j;
>  
> @@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void)
>  
>  	init_device_table_dma();
>  
> -	for_each_iommu(iommu)
> +	for_each_iommu(iommu) {
>  		iommu_flush_all_caches(iommu);
> +		init_iommu_perf_ctr(iommu);
> +	}
>  
>  	if (!ret)
>  		print_iommu_info();
> 
> base-commit: 75caf310d16cc5e2f851c048cd597f5437013368
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
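
A minimal sketch in C of the write-and-read-back probe that the commit
message above describes, and of why it can fail when MMIO is not yet
reachable. Everything here is hypothetical and for illustration only:
struct fake_iommu, fake_read(), fake_write(), counters_writable() and the
probe value are not the kernel's names, and the model simply assumes that a
device which has not been enabled drops writes and returns all ones on
reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOMMU; not the kernel's struct amd_iommu. */
struct fake_iommu {
	bool     mmio_enabled;  /* models the effect of pci_enable_device() */
	uint64_t counter;       /* one performance counter register */
};

/* Assumption of this model: writes are dropped while MMIO is disabled. */
static void fake_write(struct fake_iommu *io, uint64_t val)
{
	if (io->mmio_enabled)
		io->counter = val;
}

/* Assumption of this model: reads return all ones while MMIO is disabled. */
static uint64_t fake_read(const struct fake_iommu *io)
{
	return io->mmio_enabled ? io->counter : UINT64_MAX;
}

/* Write a known pattern, read it back, then restore the saved value. */
static bool counters_writable(struct fake_iommu *io)
{
	const uint64_t probe = 0xabcdULL;  /* arbitrary test pattern */
	uint64_t saved;
	bool ok;

	assert(io != NULL);
	saved = fake_read(io);
	fake_write(io, probe);
	ok = (fake_read(io) == probe);
	fake_write(io, saved);
	return ok;
}
```

In this model the probe reports "not writable" for as long as mmio_enabled
is false, which mirrors the ordering problem: the same check succeeds once
it runs after the device has been enabled.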


[PATCH 2/2] iommu/vt-d: Set U/S bit in first level page table by default

2020-05-31 Thread Lu Baolu
When using first-level translation for IOVA, the U/S bit in the page table
is currently cleared, which means DMA requests with user privilege are
blocked. As a result, the following error messages might be observed when
passing a device through to user level:

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000
[fault reason 129] SM: U/S set 0 for first-level translation
with user privilege

Fix it by setting the U/S bit in the first-level page table, which makes
IOVA over first level compatible with the previous second-level translation.

Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
Reported-by: Xin Zeng 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 5 ++---
 include/linux/intel-iommu.h | 1 +
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 648a785e078a..d148712466b4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -921,7 +921,7 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
 			domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE);
 			pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE;
 			if (domain_use_first_level(domain))
-				pteval |= DMA_FL_PTE_XD;
+				pteval |= DMA_FL_PTE_XD | DMA_FL_PTE_US;
 			if (cmpxchg64(&pte->val, 0ULL, pteval))
 				/* Someone else set it while we were thinking; use theirs. */
 				free_pgtable_page(tmp_page);
@@ -1951,7 +1951,6 @@ static inline void
 context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid)
 {
 	context->hi |= pasid & ((1 << 20) - 1);
-	context->hi |= (1 << 20);
 }
 
 /*
@@ -2243,7 +2242,7 @@ static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 
 	attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
 	if (domain_use_first_level(domain))
-		attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD;
+		attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD | DMA_FL_PTE_US;
 
 	if (!sg) {
 		sg_res = nr_pages;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4100bd224f5c..3e8fa1c7a1e6 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -41,6 +41,7 @@
 #define DMA_PTE_SNP		BIT_ULL(11)
 
 #define DMA_FL_PTE_PRESENT BIT_ULL(0)
+#define DMA_FL_PTE_US  BIT_ULL(2)
 #define DMA_FL_PTE_XD  BIT_ULL(63)
 
 #define ADDR_WIDTH_5LEVEL  (57)
-- 
2.17.1
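
A small sketch in C of how the attribute bits in the patch above combine.
The three DMA_FL_PTE_* values are taken from the header hunk; BIT_ULL is
redefined locally so the snippet builds outside the kernel, and
fl_pte_attr() and fl_pte_allows_user() are hypothetical helpers for
illustration, not functions in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel macro of the same name. */
#define BIT_ULL(n)          (1ULL << (n))

/* Values as shown in the include/linux/intel-iommu.h hunk. */
#define DMA_FL_PTE_PRESENT  BIT_ULL(0)
#define DMA_FL_PTE_US       BIT_ULL(2)
#define DMA_FL_PTE_XD       BIT_ULL(63)

/* Hypothetical helper: build first-level attribute bits from prot bits. */
static uint64_t fl_pte_attr(uint64_t prot_bits, bool user_access)
{
	uint64_t attr;

	/* The caller's protection bits must not already carry U/S or XD. */
	assert((prot_bits & (DMA_FL_PTE_US | DMA_FL_PTE_XD)) == 0);

	attr = prot_bits | DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD;
	if (user_access)
		attr |= DMA_FL_PTE_US;  /* the bit this patch starts setting */
	return attr;
}

/* Hypothetical helper: would a user-privilege request pass the U/S check? */
static bool fl_pte_allows_user(uint64_t pte)
{
	return (pte & DMA_FL_PTE_US) != 0;
}
```

With user_access false the result has U/S clear, which corresponds to the
"U/S set 0" fault quoted in the commit message; with it true the entry
carries bits 0, 2 and 63.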



[PATCH 1/2] iommu/vt-d: Make Intel SVM code 64-bit only

2020-05-31 Thread Lu Baolu
Intel SVM currently works by setting the FLPTR field of the PASID entry to
the pgd_t of the processor page table. First-level translation only
supports 4- and 5-level paging structures, so it is infeasible for the
IOMMU to share a processor's page table when it is running in 32-bit mode.
Disable 32-bit support for now and claim support only when all the
missing pieces are ready in the future.

Fixes: 1c4f88b7f1f92 ("iommu/vt-d: Shared virtual address in scalable mode")
Suggested-by: Joerg Roedel 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index aca76383f201..e6e0259c0a1c 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -211,7 +211,7 @@ config INTEL_IOMMU_DEBUGFS
 
 config INTEL_IOMMU_SVM
 	bool "Support for Shared Virtual Memory with Intel IOMMU"
-	depends on INTEL_IOMMU && X86
+	depends on INTEL_IOMMU && X86_64
 	select PCI_PASID
 	select PCI_PRI
 	select MMU_NOTIFIER
-- 
2.17.1



[PATCH 0/2] iommu/vt-d: Two fixes for v5.8

2020-05-31 Thread Lu Baolu
Hi Joerg,

This encloses two fixes for v5.8.
- Make Intel SVM code 64-bit only
- Set U/S bit to make IOVA over first level compatible with 2nd level
  translations.

Best regards,
baolu

Lu Baolu (2):
  iommu/vt-d: Make Intel SVM code 64-bit only
  iommu/vt-d: Set U/S bit in first level page table by default

 drivers/iommu/Kconfig   | 2 +-
 drivers/iommu/intel-iommu.c | 5 ++---
 include/linux/intel-iommu.h | 1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.17.1



Re: [PATCH] iommu/amd: Fix event counter availability check

2020-05-31 Thread Paul Menzel

Dear Alexander,


Thank you very much for the patch.


On 31.05.20 at 09:22, Alexander Monakov wrote:

> Adding Shuah Khan to Cc: I've noticed you've seen this issue on Ryzen 2400GE;
> can you have a look at the patch? Would be nice to know if it fixes the
> problem for you too.

> On Fri, 29 May 2020, Alexander Monakov wrote:
>
>> The driver performs an extra check if the IOMMU's capabilities advertise
>> presence of performance counters: it verifies that counters are writable
>> by writing a hard-coded value to a counter and testing that reading that
>> counter gives back the same value.
>>
>> Unfortunately it does so quite early, even before pci_enable_device is
>> called for the IOMMU, i.e. when accessing its MMIO space is not
>> guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test:
>> the driver assumes the counters are not writable, and disables the
>> functionality.
>>
>> Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves
>> the issue. This is the earliest point in amd_iommu_init_pci where the
>> call succeeds on my laptop.
>>
>> Signed-off-by: Alexander Monakov 
>> Cc: Joerg Roedel 
>> Cc: Suravee Suthikulpanit 
>> Cc: iommu@lists.linux-foundation.org
>> ---
>>
>> PS. I'm seeing another hiccup with IOMMU probing on my system:
>> pci :00:00.2: can't derive routing for PCI INT A
>> pci :00:00.2: PCI INT A: not connected
>>
>> Hopefully I can figure it out, but I'd appreciate hints.


I guess it’s a firmware bug, but I contacted the linux-pci folks [1].


>>  drivers/iommu/amd_iommu_init.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 5b81fd16f5fa..1b7ec6b6a282 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
>>  	if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
>>  		amd_iommu_np_cache = true;
>>  
>> -	init_iommu_perf_ctr(iommu);
>> -
>>  	if (is_rd890_iommu(iommu->dev)) {
>>  		int i, j;
>>  
>> @@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void)
>>  
>>  	init_device_table_dma();
>>  
>> -	for_each_iommu(iommu)
>> +	for_each_iommu(iommu) {
>>  		iommu_flush_all_caches(iommu);
>> +		init_iommu_perf_ctr(iommu);
>> +	}
>>  
>>  	if (!ret)
>>  		print_iommu_info();
>>
>> base-commit: 75caf310d16cc5e2f851c048cd597f5437013368


Thank you very much for fixing this issue, which is almost two years old 
for me.


Tested-by: Paul Menzel 
MSI MSI MS-7A37/B350M MORTAR with AMD Ryzen 3 2200G
Link: https://lore.kernel.org/linux-iommu/20180727102710.ga6...@8bytes.org/


Kind regards,

Paul


[1]: 
https://lore.kernel.org/linux-pci/8579bd14-e369-1141-917b-204d20cff...@molgen.mpg.de/


DMAR errors on Wildcat Point-LP xHCI (Lenovo T450s)

2020-05-31 Thread Vincent Pelletier
Hello,

While trying to use a built-in USB device I rarely use (Sierra EM7345 LTE
modem), I ran into issues ranging from the modem flapping (getting removed
from the USB bus before LTE network registration is visible) to all
devices on the bus being "disconnected" (xhci giving up, in my
understanding, with only the root hubs listed by lsusb).
Checking the syslog, I found this:

May 30 17:35:46 localhost kernel: [  278.999480] DMAR: DRHD: handling
fault status reg 3
May 30 17:35:46 localhost kernel: [  278.999485] DMAR: [DMA Read]
Request device [00:16.7] PASID  fault addr 9cdff000 [fault
reason 02] Present bit in context entry is clear
May 30 17:35:46 localhost kernel: [  278.999488] DMAR: DRHD: handling
fault status reg 3
May 30 17:35:46 localhost kernel: [  278.999490] DMAR: [DMA Read]
Request device [00:16.7] PASID  fault addr 9cdff000 [fault
reason 02] Present bit in context entry is clear
May 30 17:35:46 localhost kernel: [  279.001076] DMAR: DRHD: handling
fault status reg 2
May 30 17:35:46 localhost kernel: [  279.001078] DMAR: [DMA Write]
Request device [00:16.7] PASID  fault addr 9cdff000 [fault
reason 02] Present bit in context entry is clear
May 30 17:35:46 localhost kernel: [  279.001120] DMAR: DRHD: handling
fault status reg 2
May 30 17:35:47 localhost kernel: [  280.738192] usb 2-4: USB
disconnect, device number 10
May 30 17:35:47 localhost kernel: [  280.738224] cdc_mbim 2-4:1.0: Tx
URB error: -19
May 30 17:35:47 localhost kernel: [  280.738303] cdc_mbim 2-4:1.0
wwan0: unregister 'cdc_mbim' usb-:00:14.0-4, CDC MBIM
May 30 17:35:47 localhost ModemManager[736]:   (net/wwan0):
released by device '/sys/devices/pci:00/:00:14.0/usb2/2-4'
May 30 17:35:47 localhost ModemManager[736]: [/dev/cdc-wdm0]
unexpected port hangup!
May 30 17:35:47 localhost ModemManager[736]: [/dev/cdc-wdm0] channel destroyed
May 30 17:35:47 localhost ModemManager[736]:   Connection to
mbim-proxy for /dev/cdc-wdm0 lost, reprobing
May 30 17:35:47 localhost ModemManager[736]:   [device
/sys/devices/pci:00/:00:14.0/usb2/2-4] creating modem with
plugin 'Sierra' and '2' ports
May 30 17:35:47 localhost ModemManager[736]:   Could not
recreate modem for device
'/sys/devices/pci:00/:00:14.0/usb2/2-4': Failed to find a net
port in the MBIM modem
May 30 17:35:47 localhost ModemManager[736]: 
(usbmisc/cdc-wdm0): released by device
'/sys/devices/pci:00/:00:14.0/usb2/2-4'
May 30 17:35:47 localhost ModemManager[736]:   (tty/ttyACM0):
released by device '/sys/devices/pci:00/:00:14.0/usb2/2-4'
May 30 17:35:48 localhost kernel: [  281.202173] usb 2-4: new
high-speed USB device number 11 using xhci_hcd
May 30 17:35:48 localhost kernel: [  281.224037] usb 2-4: New USB
device found, idVendor=8087, idProduct=0716, bcdDevice= 0.00
May 30 17:35:48 localhost kernel: [  281.224043] usb 2-4: New USB
device strings: Mfr=0, Product=0, SerialNumber=0
May 30 17:35:48 localhost kernel: [  281.225646] usb_serial_simple
2-4:1.0: flashloader converter detected
May 30 17:35:48 localhost kernel: [  281.225891] usb 2-4: flashloader
converter now attached to ttyUSB0

which makes me suspect a missing IOMMU mapping in ACPI for the xhci
controller. In this case, the xhci was able to recover and re-enumerate the
device fairly quickly.
Booting with "intel_iommu=off" makes the LTE modem work at least far
enough that it registers on the network (can send/receive SMS). I
have not tried data communication (no data plan on current SIM).
I have noticed for a while that this machine tends to lose
all USB devices more often than I enable the LTE modem, so it seems
the modem just makes this issue more likely and is not its direct
cause.

This is on 5.6.7 (Debian Sid 5.6.0-1-amd64, the version from which the
pasted logs are extracted), and reproduced with 5.6.14 (Debian Sid
5.6.0-2-amd64).
The USB issues have been happening for a long time, and I use this
modem rarely enough that I would not notice a new issue for several
kernel versions.
The modem usually worked "well enough" but always has had a bit of
flapping, sometimes working after one or two suspend/resume cycles,
and until now I did not feel the need to investigate more (I assumed a
less-than-optimal modem/modem driver).
This time it never ended up working after several suspend/resume
cycles and reboots. So I do not believe it is a localised regression,
but a bad situation getting just nudged over the edge.

$ lspci
00:00.0 Host bridge: Intel Corporation Broadwell-U Host Bridge -OPI (rev 09)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 5500 (rev 09)
00:03.0 Audio device: Intel Corporation Broadwell-U Audio Controller (rev 09)
00:14.0 USB controller: Intel Corporation Wildcat Point-LP USB xHCI
Controller (rev 03)
00:16.0 Communication controller: Intel Corporation Wildcat Point-LP
MEI Controller #1 (rev 03)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (3)
I218-V (rev 03)
00:1b.0 Audio device: Intel Corporation Wildcat Point

[PATCH] iommu: amd: Fix IO_PAGE_FAULT due to __unmap_single() size overflow

2020-05-31 Thread Suravee Suthikulpanit
Currently, an int is used to hold the size in unmap_sg().
With 2GB worth of pages (512k 4k pages), the byte count is
(1 << 19) << 12 = 1 << 31, which overflows a signed 32-bit integer
and ends up unmapping more pages than intended. Subsequently, this
results in IO_PAGE_FAULT.

Use size_t instead of int for npages so that the size passed to
__unmap_single() does not overflow.

Reported-by: Robert Lippert 
Signed-off-by: Suravee Suthikulpanit 
---
Note: This patch is intended for stable trees prior to 5.5, due to commit
be62dbf554c5 ("iommu/amd: Convert AMD iommu driver to the dma-iommu api"),
where the function unmap_sg() was removed.

 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 32de8e7bb8b4..7adc021932b8 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2670,7 +2670,7 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 	struct protection_domain *domain;
 	struct dma_ops_domain *dma_dom;
 	unsigned long startaddr;
-	int npages;
+	size_t npages;
 
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
-- 
2.17.1
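
The arithmetic in the commit message above can be checked with a short
sketch in C. PAGE_SHIFT is defined locally as 12 to match the 4k pages
mentioned there; bytes_for_pages() and fits_in_int() are hypothetical
helpers, not driver functions, and the result assumes the usual 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4 KiB pages, as in the commit message */

/* Hypothetical helper: byte count for npages, computed in 64 bits. */
static uint64_t bytes_for_pages(uint64_t npages)
{
	/* Guard against overflowing the 64-bit result itself. */
	assert(npages <= (UINT64_MAX >> PAGE_SHIFT));
	return npages << PAGE_SHIFT;
}

/* Hypothetical helper: can this byte count be held in a plain int? */
static bool fits_in_int(uint64_t bytes)
{
	return bytes <= (uint64_t)INT_MAX;
}
```

For 512k pages, bytes_for_pages(1ULL << 19) is 0x80000000, one more than
INT_MAX on a platform with 32-bit int, so fits_in_int() returns false; one
page fewer still fits. A wider type for npages keeps the shifted value
representable.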
