[PATCH v5 13/23] iommu: introduce device fault report API

2018-05-11 Thread Jacob Pan
Traditionally, device specific faults are detected and handled within their own device drivers. When IOMMU is enabled, faults such as DMA related transactions are detected by IOMMU. There is no generic reporting mechanism to report faults back to the in-kernel device driver or the guest OS in case

Re: [PATCH] iommu/arm-smmu-v3: Set GBPA to abort all transactions

2018-05-11 Thread Nate Watterson
Hi Mark, On 4/12/2018 7:56 AM, Marc Zyngier wrote: On 12/04/18 11:17, Robin Murphy wrote: On 11/04/18 17:54, Marc Zyngier wrote: Hi Sammer, On 11/04/18 16:58, Goel, Sameer wrote: On 3/28/2018 9:00 AM, Marc Zyngier wrote: On 2018-03-28 15:39, Timur Tabi wrote: From: Sameer Goel Set SMMU

[PATCH v5 21/23] iommu/vt-d: add intel iommu page response function

2018-05-11 Thread Jacob Pan
This patch adds page response support for Intel VT-d. Generic response data is taken from the IOMMU API then parsed into VT-d specific response descriptor format. Signed-off-by: Jacob Pan --- drivers/iommu/intel-iommu.c | 47 + include/linux/intel-iomm

[PATCH v5 22/23] trace/iommu: add sva trace events

2018-05-11 Thread Jacob Pan
Signed-off-by: Jacob Pan --- include/trace/events/iommu.h | 112 +++ 1 file changed, 112 insertions(+) diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h index 72b4582..e64eb29 100644 --- a/include/trace/events/iommu.h +++ b/include/t

[PATCH v5 23/23] iommu: use sva invalidate and device fault trace event

2018-05-11 Thread Jacob Pan
For performance and debugging purposes, these trace events help analyze device faults and passdown invalidations that interact with the IOMMU subsystem. E.g. IOMMU::00:0a.0 type=2 reason=0 addr=0x007ff000 pasid=1 group=1 last=0 prot=1 Signed-off-by: Jacob Pan --- drivers/iommu/iommu.c

[PATCH v5 19/23] iommu/intel-svm: replace dev ops with fault report API

2018-05-11 Thread Jacob Pan
With the introduction of generic IOMMU device fault reporting API, we can replace the private fault callback functions with standard function and event data. Signed-off-by: Jacob Pan --- drivers/iommu/intel-svm.c | 7 +-- include/linux/intel-svm.h | 20 +++- 2 files changed,

[PATCH v5 20/23] iommu/intel-svm: do not flush iotlb for viommu

2018-05-11 Thread Jacob Pan
vIOMMU passdown invalidation will be inclusive, PASID cache invalidation includes TLBs. See Intel VT-d Specification Ch 6.5.2.2 for details. Signed-off-by: Jacob Pan --- drivers/iommu/intel-svm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel-svm.c b/d

[PATCH v5 16/23] iommu/config: add build dependency for dmar

2018-05-11 Thread Jacob Pan
Intel VT-d interrupts come from both IRQ remapping and DMA remapping. In order to report non-recoverable faults back to the device driver, we need to have access to IOMMU fault reporting APIs. This patch adds a build dependency to DMAR code where fault IRQ handlers can selectively report faults. Signed-o

[PATCH v5 14/23] iommu: introduce page response function

2018-05-11 Thread Jacob Pan
IO page faults can be handled outside the IOMMU subsystem. For example, when nested translation is turned on and the guest owns the first level page tables, device page requests can be forwarded to the guest for handling faults. When the page response is returned by the guest, the IOMMU driver on the host needs to p

[PATCH v5 18/23] iommu/intel-svm: report device page request

2018-05-11 Thread Jacob Pan
If the source device of a page request has its PASID table pointer bound to a guest, the first level page tables are owned by the guest. In this case, we shall let the guest OS manage the page fault. This patch uses the IOMMU fault reporting API to send fault events, possibly via VFIO, to the guest OS.

[PATCH v5 15/23] iommu: handle page response timeout

2018-05-11 Thread Jacob Pan
When IO page faults are reported outside the IOMMU subsystem, the page request handler may fail for various reasons. E.g. a guest received page requests but did not have a chance to run for a long time. The unresponsive behavior could hold off limited resources on the pending device. There can be hardw

[PATCH v5 17/23] iommu/vt-d: report non-recoverable faults to device

2018-05-11 Thread Jacob Pan
Currently, the DMAR fault IRQ handler does nothing more than a rate-limited printk; no critical hardware handling needs to be done in IRQ context. For some use cases such as vIOMMU, it might be useful to report non-recoverable faults outside the host IOMMU subsystem. DMAR fault can come from both DMA and inter

[PATCH v5 08/23] iommu/vt-d: support flushing more translation cache types

2018-05-11 Thread Jacob Pan
When Shared Virtual Memory is exposed to a guest via vIOMMU, extended IOTLB invalidation may be passed down from outside IOMMU subsystems. This patch adds invalidation functions that can be used for additional translation cache types. Signed-off-by: Jacob Pan --- drivers/iommu/dmar.c| 44

[PATCH v5 11/23] driver core: add per device iommu param

2018-05-11 Thread Jacob Pan
DMA faults can be detected by IOMMU at device level. Adding a pointer to struct device allows IOMMU subsystem to report relevant faults back to the device driver for further handling. For direct assigned device (or user space drivers), guest OS holds responsibility to handle and respond per device

[PATCH v5 12/23] iommu: add a timeout parameter for prq response

2018-05-11 Thread Jacob Pan
When an IO page request is processed outside the IOMMU subsystem, the response can be delayed or lost. Add a tunable setup parameter such that the user can choose the timeout for IOMMU to track pending page requests. This timeout mechanism is a basic safety net which can be implemented in conjunction with cre

[PATCH v5 05/23] iommu: introduce iommu invalidate API function

2018-05-11 Thread Jacob Pan
From: "Liu, Yi L" When an SVM capable device is assigned to a guest, the first level page tables are owned by the guest and the guest PASID table pointer is linked to the device context entry of the physical IOMMU. Host IOMMU driver has no knowledge of caching structure updates unless the guest

[PATCH v5 04/23] iommu/vt-d: add bind_pasid_table function

2018-05-11 Thread Jacob Pan
Add Intel VT-d ops to the generic iommu_bind_pasid_table API functions. The primary use case is for direct assignment of SVM capable device. Originated from emulated IOMMU in the guest, the request goes through many layers (e.g. VFIO). Upon calling host IOMMU driver, caller passes guest PASID tabl

[PATCH v5 10/23] iommu: introduce device fault data

2018-05-11 Thread Jacob Pan
Device faults detected by IOMMU can be reported outside IOMMU subsystem for further processing. This patch intends to provide a generic device fault data such that device drivers can be communicated with IOMMU faults without model specific knowledge. The proposed format is the result of discussion

[PATCH v5 01/23] iommu: introduce bind_pasid_table API function

2018-05-11 Thread Jacob Pan
Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use in the guest: https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html As part of the proposed architecture, when an SVM capable PCI device is assigned to a guest, nested mode is turned on. Guest owns the first level

[PATCH v5 02/23] iommu/vt-d: move device_domain_info to header

2018-05-11 Thread Jacob Pan
Allow both intel-iommu.c and dmar.c to access device_domain_info. Prepare for additional per device arch data used in TLB flush function Signed-off-by: Jacob Pan --- drivers/iommu/intel-iommu.c | 18 -- include/linux/intel-iommu.h | 19 +++ 2 files changed, 19 ins

[PATCH v5 07/23] iommu/vt-d: fix dev iotlb pfsid use

2018-05-11 Thread Jacob Pan
PFSID should be used in the invalidation descriptor for flushing device IOTLBs on SRIOV VFs. Signed-off-by: Jacob Pan --- drivers/iommu/dmar.c| 6 +++--- drivers/iommu/intel-iommu.c | 16 +++- include/linux/intel-iommu.h | 5 ++--- 3 files changed, 20 insertions(+), 7 delet

[PATCH v5 03/23] iommu/vt-d: add a flag for pasid table bound status

2018-05-11 Thread Jacob Pan
Add a flag in device domain info to track whether a guest or user PASID table is bound to a device. Signed-off-by: Jacob Pan --- include/linux/intel-iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 304afae..ddc7d79 100

[PATCH v5 09/23] iommu/vt-d: add svm/sva invalidate function

2018-05-11 Thread Jacob Pan
When Shared Virtual Address (SVA) is enabled for a guest OS via vIOMMU, we need to provide invalidation support at IOMMU API and driver level. This patch adds Intel VT-d specific function to implement iommu passdown invalidate API for shared virtual address. The use case is for supporting caching

[PATCH v5 06/23] iommu/vt-d: add definitions for PFSID

2018-05-11 Thread Jacob Pan
When SRIOV VF device IOTLB is invalidated, we need to provide the PF source ID such that IOMMU hardware can gauge the depth of invalidation queue which is shared among VFs. This is needed when device invalidation throttle (DIT) capability is supported. This patch adds bit definitions for checking

[PATCH v5 00/23] IOMMU and VT-d driver support for Shared Virtual Address (SVA)

2018-05-11 Thread Jacob Pan
Shared virtual address (SVA), a.k.a. shared virtual memory (SVM) on Intel platforms, allows address space sharing between device DMA and applications. SVA can reduce programming complexity and enhance security. To enable SVA in the guest, i.e. shared guest application address space and physical devic

[PATCH v1 7/9] iommu/tegra: gart: Provide single domain and group for all devices

2018-05-11 Thread Dmitry Osipenko
On 11.05.2018 15:32, Robin Murphy wrote: > On 08/05/18 19:16, Dmitry Osipenko wrote: >> GART aperture is shared by all devices, hence there is a single IOMMU >> domain and group shared by these devices. Allocation of a group per >> device only wastes resources and allowance of having more than one

[PATCH v1 8/9] iommu: Introduce iotlb_sync_map callback

2018-05-11 Thread Dmitry Osipenko
On 11.05.2018 16:02, Robin Murphy wrote: > On 08/05/18 19:16, Dmitry Osipenko wrote: >> Introduce iotlb_sync_map() callback that is invoked in the end of >> iommu_map(). This new callback allows IOMMU drivers to avoid syncing >> on mapping of each contiguous chunk and sync only when whole mapping >

Re: [PATCH v1 6/9] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT

2018-05-11 Thread Dmitry Osipenko
Hi Robin, On 11.05.2018 14:34, Robin Murphy wrote: > Hi Dmitry, > > On 08/05/18 19:16, Dmitry Osipenko wrote: >> GART can't handle all devices, ignore devices that aren't related to GART. >> Device tree must explicitly assign GART IOMMU to the devices. >> >> Signed-off-by: Dmitry Osipenko >> ---

Re: [PATCH v6 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-05-11 Thread Gary R Hook
On 05/11/2018 10:22 AM, Robin Murphy wrote: Hi Gary, Just a few trivial nitpicks below, otherwise: Reviewed-by: Robin Murphy On 11/05/18 15:34, Gary R Hook wrote: Provide base enablement for using debugfs to expose internal data of an IOMMU driver. When called, create the /sys/kernel/debug/i

[PATCH v2 40/40] iommu/arm-smmu-v3: Add support for PCI PASID

2018-05-11 Thread Jean-Philippe Brucker
Enable PASID for PCI devices that support it. Unlike PRI, we can't enable PASID lazily in iommu_sva_device_init(), because it has to be enabled before ATS, and because we have to allocate substream tables early. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 54 ++

[PATCH v2 39/40] iommu/arm-smmu-v3: Add support for PRI

2018-05-11 Thread Jean-Philippe Brucker
For PCI devices that support it, enable the PRI capability and handle PRI Page Requests with the generic fault handler. It is enabled on demand by iommu_sva_device_init(). Signed-off-by: Jean-Philippe Brucker --- v1->v2: * Terminate the page request and disable PRI if no handler is registered *

[PATCH v2 33/40] iommu/arm-smmu-v3: Add stall support for platform devices

2018-05-11 Thread Jean-Philippe Brucker
The SMMU provides a Stall model for handling page faults in platform devices. It is similar to PCI PRI, but doesn't require devices to have their own translation cache. Instead, faulting transactions are parked and the OS is given a chance to fix the page tables and retry the transaction. Enable s

[PATCH v2 36/40] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops

2018-05-11 Thread Jean-Philippe Brucker
The core calls us when an mm is modified. Perform the required ATC invalidations. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c i

[PATCH v2 38/40] PCI: Make "PRG Response PASID Required" handling common

2018-05-11 Thread Jean-Philippe Brucker
The PASID ECN to the PCIe spec added a bit in the PRI status register that allows a Function to declare whether a PRG Response should contain the PASID prefix or not. Move the helper that accesses it from amd_iommu into the PCI subsystem, renaming it to be consistent with the current PCI Express s

[PATCH v2 35/40] iommu/arm-smmu-v3: Add support for PCI ATS

2018-05-11 Thread Jean-Philippe Brucker
PCIe devices can implement their own TLB, named Address Translation Cache (ATC). Enable Address Translation Service (ATS) for devices that support it and send them invalidation requests whenever we invalidate the IOTLBs. Range calculation - The invalidation packet itself is a

[PATCH v2 37/40] iommu/arm-smmu-v3: Disable tagged pointers

2018-05-11 Thread Jean-Philippe Brucker
The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the MMU mask out bits [63:56] of an address, allowing a userspace application to store data in its pointers. This option is incompatible with PCI ATS. If TBI is enabled in the SMMU and userspace triggers DMA transactions on tagge
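To illustrate the tagged-pointer scheme this patch description refers to, here is a minimal sketch of how a userspace program might store and strip a tag in bits [63:56]. The helper names are hypothetical and are not taken from the patch; the sketch only models the masking for userspace addresses (bit 55 clear), not the MMU or SMMU behaviour itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not identifiers from the patch. */
#define TBI_TAG_SHIFT 56
#define TBI_ADDR_MASK ((UINT64_C(1) << TBI_TAG_SHIFT) - 1)

/* Drop the top byte, as a TBI-enabled MMU would ignore it for a user address. */
static inline uint64_t tbi_strip_tag(uint64_t va)
{
    return va & TBI_ADDR_MASK;
}

/* Read back the byte stored in bits [63:56]. */
static inline uint8_t tbi_get_tag(uint64_t va)
{
    return (uint8_t)(va >> TBI_TAG_SHIFT);
}

/* Store a tag in the top byte of a userspace address (bit 55 must be 0). */
static inline uint64_t tbi_set_tag(uint64_t va, uint8_t tag)
{
    assert(((va >> 55) & 1) == 0);
    return tbi_strip_tag(va) | ((uint64_t)tag << TBI_TAG_SHIFT);
}
```

A tagged pointer produced by tbi_set_tag() is only usable where the translating agent ignores the top byte, which is why DMA on such an address is a concern once ATS is involved.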

[PATCH v2 34/40] ACPI/IORT: Check ATS capability in root complex nodes

2018-05-11 Thread Jean-Philippe Brucker
Root complex node in IORT has a bit telling whether it supports ATS or not. Store this bit in the IOMMU fwspec when setting up a device, so it can be accessed later by an IOMMU driver. Use the negative version (NO_ATS) at the moment because it's not clear if/how the bit needs to be integrated in o

[PATCH v2 32/40] iommu/arm-smmu-v3: Maintain a SID->device structure

2018-05-11 Thread Jean-Philippe Brucker
When handling faults from the event or PRI queue, we need to find the struct device associated to a SID. Add a rb_tree to keep track of SIDs. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 114 +++- 1 file changed, 113 insertions(+), 1 dele

[PATCH v2 31/40] iommu/arm-smmu-v3: Improve add_device error handling

2018-05-11 Thread Jean-Philippe Brucker
As add_device becomes more likely to fail when adding new features, let it clean up behind itself. The iommu_bus_init function does call remove_device on error, but other sites (e.g. of_iommu) do not. Don't free level-2 stream tables because we'd have to track if we allocated each of them or if th

[PATCH v2 29/40] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

2018-05-11 Thread Jean-Philippe Brucker
If the SMMU supports it and the kernel was built with HTTU support, enable hardware update of access and dirty flags. This is essential for shared page tables, to reduce the number of access faults on the fault queue. We can enable HTTU even if CPUs don't support it, because the kernel always chec

[PATCH v2 30/40] iommu/arm-smmu-v3: Register I/O Page Fault queue

2018-05-11 Thread Jean-Philippe Brucker
When using PRI or Stall, the PRI or event handler enqueues faults into the core fault queue. Register it based on the SMMU features. When the core stops using a PASID, it notifies the SMMU to flush all instances of this PASID from the PRI queue. Add a way to flush the PRI and event queue. PRI and

[PATCH v2 28/40] iommu/arm-smmu-v3: Implement mm operations

2018-05-11 Thread Jean-Philippe Brucker
Hook mm operations to support PASID and page table sharing with the SMMUv3: * mm_alloc allocates a context descriptor. * mm_free releases the context descriptor. * mm_attach checks device capabilities and writes the context descriptor. * mm_detach clears the context descriptor and sends required

[PATCH v2 27/40] iommu/arm-smmu-v3: Add SVA feature checking

2018-05-11 Thread Jean-Philippe Brucker
Aggregate all sanity-checks for sharing CPU page tables with the SMMU under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to check FEAT_ATS and FEAT_PRI. For platform SVM, they will most likely have to check FEAT_STALLS. Signed-off-by: Jean-Philippe Brucker --- v1->v2: Add 52-bit

[PATCH v2 26/40] iommu/arm-smmu-v3: Enable broadcast TLB maintenance

2018-05-11 Thread Jean-Philippe Brucker
The SMMUv3 can handle invalidation targeted at TLB entries with shared ASIDs. If the implementation supports broadcast TLB maintenance, enable it and keep track of it in a feature bit. The SMMU will then be affected by inner-shareable TLB invalidations from other agents. A major side-effect of thi

[PATCH v2 25/40] iommu/arm-smmu-v3: Add support for VHE

2018-05-11 Thread Jean-Philippe Brucker
ARMv8.1 extensions added Virtualization Host Extensions (VHE), which allow to run a host kernel at EL2. When using normal DMA, Device and CPU address spaces are dissociated, and do not need to implement the same capabilities, so VHE hasn't been used in the SMMU until now. With shared address space

[PATCH v2 23/40] iommu/arm-smmu-v3: Share process page tables

2018-05-11 Thread Jean-Philippe Brucker
With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR, MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split into two sets, shared and private. Shared ASIDs correspond to those obtained from the arch ASID allocator, and private ASIDs are used for "classic" map/unm

[PATCH v2 24/40] iommu/arm-smmu-v3: Seize private ASID

2018-05-11 Thread Jean-Philippe Brucker
The SMMU has a single ASID space, the union of shared and private ASID sets. This means that the context table module competes with the arch allocator for ASIDs. Shared ASIDs are those of Linux processes, allocated by the arch, and contribute in broadcast TLB maintenance. Private ASIDs are allocate

[PATCH v2 22/40] iommu/arm-smmu-v3: Add second level of context descriptor table

2018-05-11 Thread Jean-Philippe Brucker
The SMMU can support up to 20 bits of SSID. Add a second level of page tables to accommodate this. Devices that support more than 1024 SSIDs now have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context descriptors (64kB), allocated on demand. Signed-off-by: Jean-Philippe Brucker
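The sizes quoted above imply 8-byte L1 descriptors and 64-byte context descriptors. A minimal sketch of the resulting index arithmetic follows, assuming the 20-bit SSID is split into an upper 10-bit L1 index and a lower 10-bit leaf index; the constant and function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not the driver's identifiers. */
#define SSID_BITS       20u
#define L2_INDEX_BITS   10u                              /* 1024 descriptors per leaf */
#define L2_ENTRIES      (1u << L2_INDEX_BITS)
#define L1_ENTRIES      (1u << (SSID_BITS - L2_INDEX_BITS))
#define L1_DESC_BYTES   8u                               /* 1024 entries -> 8kB  */
#define CTX_DESC_BYTES  64u                              /* 1024 entries -> 64kB */

/* Index of the L1 entry that points at the leaf table for this SSID. */
static inline uint32_t ssid_l1_index(uint32_t ssid)
{
    assert(ssid < (1u << SSID_BITS));
    return ssid >> L2_INDEX_BITS;
}

/* Index of the context descriptor inside its leaf table. */
static inline uint32_t ssid_l2_index(uint32_t ssid)
{
    return ssid & (L2_ENTRIES - 1);
}

/* Size in bytes of the L1 table. */
static inline size_t l1_table_bytes(void)
{
    return (size_t)L1_ENTRIES * L1_DESC_BYTES;
}

/* Size in bytes of one leaf table of context descriptors. */
static inline size_t leaf_table_bytes(void)
{
    return (size_t)L2_ENTRIES * CTX_DESC_BYTES;
}
```

Because leaf tables are allocated on demand, a device that only uses a handful of SSIDs pays for the 8kB L1 table plus one 64kB leaf rather than a flat table covering all 2^20 entries.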

[PATCH v2 21/40] iommu/arm-smmu-v3: Add support for Substream IDs

2018-05-11 Thread Jean-Philippe Brucker
At the moment, the SMMUv3 driver offers only one stage-1 or stage-2 address space to each device. SMMUv3 allows to associate multiple address spaces per device. In addition to the Stream ID (SID), that identifies a device, we can now have Substream IDs (SSID) identifying an address space. In PCIe l

[PATCH v2 20/40] iommu/arm-smmu-v3: Move context descriptor code

2018-05-11 Thread Jean-Philippe Brucker
In preparation for substream ID support, move the context descriptor code into a separate module. At the moment it only manages context descriptor zero, which is used for non-PASID translations. One important behavior change is the ASID allocator, which is now global instead of per-SMMU. If we end

[PATCH v2 19/40] iommu: Add generic PASID table library

2018-05-11 Thread Jean-Philippe Brucker
Add a small API within the IOMMU subsystem to handle different formats of PASID tables. It uses the same principle as io-pgtable: * The IOMMU driver registers a PASID table with some invalidation callbacks. * The pasid-table lib allocates a set of tables of the right format, and returns an iom

[PATCH v2 17/40] iommu/arm-smmu-v3: Link domains and devices

2018-05-11 Thread Jean-Philippe Brucker
When removing a mapping from a domain, we need to send an invalidation to all devices that might have stored it in their Address Translation Cache (ATC). In addition when updating the context descriptor of a live domain, we'll need to send invalidations for all devices attached to it. Maintain a l

[PATCH v2 18/40] iommu/io-pgtable-arm: Factor out ARM LPAE register defines

2018-05-11 Thread Jean-Philippe Brucker
For SVA, we'll need to extract CPU page table information and mirror it in the substream setup. Move relevant defines to a common header. Fix TCR_SZ_MASK while we're at it. Signed-off-by: Jean-Philippe Brucker --- MAINTAINERS| 3 +- drivers/iommu/io-pgtable-arm.c | 49 +

[PATCH v2 16/40] arm64: mm: Pin down ASIDs for sharing mm with devices

2018-05-11 Thread Jean-Philippe Brucker
To enable address space sharing with the IOMMU, introduce mm_context_get() and mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Pinning is necessary because a device constantly needs a valid ASID, unlike tasks that only require one when running. Wit

[PATCH v2 13/40] vfio: Add support for Shared Virtual Addressing

2018-05-11 Thread Jean-Philippe Brucker
Add two new ioctls for VFIO containers. VFIO_IOMMU_BIND_PROCESS creates a bond between a container and a process address space, identified by a Process Address Space ID (PASID). Devices in the container append this PASID to DMA transactions in order to access the process' address space. The process

[PATCH v2 15/40] iommu/of: Add stall and pasid properties to iommu_fwspec

2018-05-11 Thread Jean-Philippe Brucker
Add stall and pasid properties to iommu_fwspec, and fill them when dma-can-stall and pasid-bits properties are present in the device tree. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/of_iommu.c | 12 include/linux/iommu.h| 2 ++ 2 files changed, 14 insertions(+) dif

[PATCH v2 12/40] mm: export symbol mmput_async

2018-05-11 Thread Jean-Philippe Brucker
In some cases releasing a mm bound to a device might invoke an exit handler, that takes a lock already held by the function calling mmput(). This is the case for VFIO, which needs to call mmput_async to avoid a deadlock. Other drivers using SVA might follow. Since they can be built as modules, expo

[PATCH v2 11/40] mm: export symbol find_get_task_by_vpid

2018-05-11 Thread Jean-Philippe Brucker
Userspace drivers implemented with VFIO might want to bind sub-processes to their devices. In a VFIO ioctl, they provide a pid that is used to find a task and its mm. Since VFIO can be built as a module, export the find_get_task_by_vpid symbol. Cc: a...@linux-foundation.org Signed-off-by: Jean-Phi

[PATCH v2 14/40] dt-bindings: document stall and PASID properties for IOMMU masters

2018-05-11 Thread Jean-Philippe Brucker
On ARM systems, some platform devices behind an IOMMU may support stall and PASID features. Stall is the ability to recover from page faults and PASID offers multiple process address spaces to the device. Together they allow to do paging with a device. Let the firmware tell us when a device support

[PATCH v2 08/40] iommu/iopf: Handle mm faults

2018-05-11 Thread Jean-Philippe Brucker
When a recoverable page fault is handled by the fault workqueue, find the associated mm and call handle_mm_fault. Signed-off-by: Jean-Philippe Brucker --- v1->v2: let IOMMU drivers deal with Stop PASID --- drivers/iommu/io-pgfault.c | 86 +- 1 file changed, 8

[PATCH v2 10/40] mm: export symbol mm_access

2018-05-11 Thread Jean-Philippe Brucker
Some devices can access process address spaces directly. When creating such a bond, to check that a process controlling the device is allowed to access the target address space, the device driver uses mm_access(). Since the drivers (in this case VFIO) can be built as a module, export the mm_access sy

[PATCH v2 07/40] iommu: Add a page fault handler

2018-05-11 Thread Jean-Philippe Brucker
Some systems allow devices to handle I/O Page Faults in the core mm. For example systems implementing the PCI PRI extension or Arm SMMU stall model. Infrastructure for reporting these recoverable page faults was recently added to the IOMMU core for SVA virtualisation. Add a page fault handler for h

[PATCH v2 09/40] iommu/sva: Register page fault handler

2018-05-11 Thread Jean-Philippe Brucker
Let users call iommu_sva_device_init() with the IOMMU_SVA_FEAT_IOPF flag, which enables the I/O Page Fault queue. The IOMMU driver checks if the device supports a form of page fault, in which case it adds the device to a fault queue. If the device doesn't support page faults, the IOMMU driver abort

[PATCH v2 06/40] iommu/sva: Search mm by PASID

2018-05-11 Thread Jean-Philippe Brucker
The fault handler will need to find an mm given its PASID. This is the reason we have an IDR for storing address spaces, so hook it up. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu-sva.c | 26 ++ include/linux/iommu.h | 6 ++ 2 files changed, 32 i

[PATCH v2 03/40] iommu/sva: Manage process address spaces

2018-05-11 Thread Jean-Philippe Brucker
Allocate IOMMU mm structures and bind them to devices. Four operations are added to IOMMU drivers: * mm_alloc(): to create an io_mm structure and perform architecture-specific operations required to grab the process (for instance on ARM, pin down the CPU ASID so that the process doesn't ge

[PATCH v2 04/40] iommu/sva: Add a mm_exit callback for device drivers

2018-05-11 Thread Jean-Philippe Brucker
When an mm exits, devices that were bound to it must stop performing DMA on its PASID. Let device drivers register a callback to be notified on mm exit. Add the callback to the sva_param structure attached to struct device. Signed-off-by: Jean-Philippe Brucker --- v1->v2: use iommu_sva_device_in

[PATCH v2 05/40] iommu/sva: Track mm changes with an MMU notifier

2018-05-11 Thread Jean-Philippe Brucker
When creating an io_mm structure, register an MMU notifier that informs us when the virtual address space changes and disappears. Add a new operation to the IOMMU driver, mm_invalidate, called when a range of addresses is unmapped to let the IOMMU driver send ATC invalidations. mm_invalidate canno

[PATCH v2 00/40] Shared Virtual Addressing for the IOMMU

2018-05-11 Thread Jean-Philippe Brucker
This is version 2 of the Shared Virtual Addressing (SVA) series, which adds process address space management (1-6) and I/O page faults (7-9) to the IOMMU core. It also includes two example users of the API: VFIO as device driver (10-13), and Arm SMMUv3 as IOMMU driver (14-40). The series is gettin

[PATCH v2 01/40] iommu: Introduce Shared Virtual Addressing API

2018-05-11 Thread Jean-Philippe Brucker
Shared Virtual Addressing (SVA) provides a way for device drivers to bind process address spaces to devices. This requires the IOMMU to support page table format and features compatible with the CPUs, and usually requires the system to support I/O Page Faults (IOPF) and Process Address Space ID (PA

[PATCH v2 02/40] iommu/sva: Bind process address spaces to devices

2018-05-11 Thread Jean-Philippe Brucker
Add bind() and unbind() operations to the IOMMU API. Bind() returns a PASID that drivers can program in hardware, to let their devices access an mm. This patch only adds skeletons for the device driver API, most of the implementation is still missing. IOMMU groups with more than one device aren't

Re: [PATCH 3/3] iommu: armsmmu: set iommu ops for rpmsg bus

2018-05-11 Thread Robin Murphy
On 07/05/18 20:28, Bjorn Andersson wrote: On Fri, Mar 2, 2018 at 8:59 AM, Robin Murphy wrote: On 02/03/18 14:55, srinivas.kandaga...@linaro.org wrote: From: Srinivas Kandagatla On Qualcomm SoCs, ADSP exposes many functions like audio and others. These services need iommu access to allocate

Re: [PATCH] iommu/arm-smmu-v3: Set GBPA to abort all transactions

2018-05-11 Thread Goel, Sameer
On 4/12/2018 5:56 AM, Marc Zyngier wrote: > On 12/04/18 11:17, Robin Murphy wrote: >> On 11/04/18 17:54, Marc Zyngier wrote: >>> Hi Sammer, >>> >>> On 11/04/18 16:58, Goel, Sameer wrote: On 3/28/2018 9:00 AM, Marc Zyngier wrote: > On 2018-03-28 15:39, Timur Tabi wrote: >> F

Re: [PATCH v6 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-05-11 Thread Robin Murphy
Hi Gary, Just a few trivial nitpicks below, otherwise: Reviewed-by: Robin Murphy On 11/05/18 15:34, Gary R Hook wrote: Provide base enablement for using debugfs to expose internal data of an IOMMU driver. When called, create the /sys/kernel/debug/iommu directory. Emit a strong warning at boo

[PATCH v6 2/2] iommu/amd: Add basic debugfs infrastructure for AMD IOMMU

2018-05-11 Thread Gary R Hook
Implement a skeleton framework for debugfs support in the AMD IOMMU. Signed-off-by: Gary R Hook --- drivers/iommu/Makefile|5 + drivers/iommu/amd_iommu_debugfs.c | 39 + drivers/iommu/amd_iommu_init.c|6 -- drivers/iommu/amd_

[PATCH v6 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-05-11 Thread Gary R Hook
Provide base enablement for using debugfs to expose internal data of an IOMMU driver. When called, create the /sys/kernel/debug/iommu directory. Emit a strong warning at boot time to indicate that this feature is enabled. This function is called from iommu_init, and creates the initial DebugFS di

[PATCH v6 0/2] Base enablement of IOMMU debugfs support

2018-05-11 Thread Gary R Hook
These patches create a top-level function, called at IOMMU initialization, to create a debugfs directory for the IOMMU. Under this directory drivers may create and populate-specific directories for their device internals. Patch 1: general IOMMU enablement Patch 2: basic AMD enablement to demonstra

[PATCH v3 0/3] arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)

2018-05-11 Thread Catalin Marinas
Hi, The previous version of this patch [1] didn't make it into 4.17 because of the (compile-time) conflicts with the generic dma-direct.h changes. I'm reposting it for 4.18 with some minor changes: - phys_to_dma()/dma_to_phys() now gained underscores to match the generic dma-direct.h implementa

[PATCH v3 3/3] arm64: Force swiotlb bounce buffering for non-coherent DMA with large CWG

2018-05-11 Thread Catalin Marinas
On systems with a Cache Writeback Granule (CTR_EL0.CWG) greater than ARCH_DMA_MINALIGN, DMA cache maintenance on sub-CWG ranges is not safe, leading to data corruption. If such configuration is detected, the kernel will force swiotlb bounce buffering for all non-coherent devices. Cc: Will Deacon

[PATCH v3 2/3] arm64: Increase ARCH_DMA_MINALIGN to 128

2018-05-11 Thread Catalin Marinas
This patch increases the ARCH_DMA_MINALIGN to 128 so that it covers the currently known Cache Writeback Granule (CTR_EL0.CWG) on arm64 and moves the fallback in cache_line_size() from L1_CACHE_BYTES to this constant. Cc: Will Deacon Cc: Robin Murphy Signed-off-by: Catalin Marinas --- arch/arm6
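As a rough illustration of the fallback described here and of the bounce-buffering condition from patch 3/3, the sketch below decodes the CWG field of a CTR_EL0 value. It assumes the architectural encoding (CWG in bits [27:24], holding log2 of the granule in 4-byte words, with 0 meaning "not provided"). The function names are hypothetical, and the raw register value is passed in as a plain integer rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTR_CWG_SHIFT      24
#define CTR_CWG_MASK       0xfu
#define ARCH_DMA_MINALIGN  128u   /* value proposed by this patch */

/* Extract the raw CWG field from a CTR_EL0 value. */
static inline uint32_t ctr_cwg_field(uint64_t ctr_el0)
{
    return (uint32_t)((ctr_el0 >> CTR_CWG_SHIFT) & CTR_CWG_MASK);
}

/* Granule in bytes; fall back to ARCH_DMA_MINALIGN when CWG reads as 0. */
static inline uint32_t writeback_granule_bytes(uint64_t ctr_el0)
{
    uint32_t cwg = ctr_cwg_field(ctr_el0);

    return cwg ? 4u << cwg : ARCH_DMA_MINALIGN;
}

/* Sub-CWG cache maintenance is unsafe once the granule exceeds the minimum alignment. */
static inline bool needs_swiotlb_bounce(uint64_t ctr_el0)
{
    return writeback_granule_bytes(ctr_el0) > ARCH_DMA_MINALIGN;
}
```

With ARCH_DMA_MINALIGN at 128, a CWG of 4 (64 bytes) or 5 (128 bytes) needs no bouncing, while a CWG of 6 (256 bytes) would.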

[PATCH v3 1/3] Revert "arm64: Increase the max granular size"

2018-05-11 Thread Catalin Marinas
This reverts commit 97303480753e48fb313dc0e15daaf11b0451cdb8. Commit 97303480753e ("arm64: Increase the max granular size") increased the cache line size to 128 to match Cavium ThunderX, apparently for some performance benefit which could not be confirmed. This change, however, has an impact on th

Re: [PATCH 04/20] arm-nommu: use generic dma_noncoherent_ops

2018-05-11 Thread John Garry
On 11/05/2018 08:59, Christoph Hellwig wrote: Switch to the generic noncoherent direct mapping implementation for the nommu dma map implementation. Signed-off-by: Christoph Hellwig --- arch/arc/Kconfig| 1 + arch/arm/Kconfig| 4 + arch/arm/mm/dma-mapping-nom

Re: [PATCH v1 8/9] iommu: Introduce iotlb_sync_map callback

2018-05-11 Thread Robin Murphy
On 08/05/18 19:16, Dmitry Osipenko wrote: Introduce iotlb_sync_map() callback that is invoked in the end of iommu_map(). This new callback allows IOMMU drivers to avoid syncing on mapping of each contiguous chunk and sync only when whole mapping is completed, optimizing performance of the mapping
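The idea of syncing once per mapping rather than once per chunk can be sketched with a toy model. Everything below is hypothetical: the struct, the function names and the counting callbacks are stand-ins for illustration, not the kernel's struct iommu_ops or iommu_map(), and error unwinding is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a driver's callbacks; not the kernel's struct iommu_ops. */
struct toy_iommu_ops {
    int (*map)(void *ctx, unsigned long iova, size_t size);
    void (*iotlb_sync_map)(void *ctx);   /* optional, may be NULL */
};

/* Map a range chunk by chunk, then sync once when the whole range is mapped. */
static int toy_iommu_map(const struct toy_iommu_ops *ops, void *ctx,
                         unsigned long iova, size_t size, size_t chunk)
{
    size_t done = 0;

    assert(chunk > 0);
    while (done < size) {
        size_t len = (size - done < chunk) ? size - done : chunk;
        int ret = ops->map(ctx, iova + done, len);

        if (ret)
            return ret;              /* a real implementation would unmap here */
        done += len;
    }
    if (ops->iotlb_sync_map)
        ops->iotlb_sync_map(ctx);    /* single sync for the whole mapping */
    return 0;
}

/* Counting callbacks used to observe how often each hook runs. */
struct toy_counters {
    unsigned int maps;
    unsigned int syncs;
};

static int toy_count_map(void *ctx, unsigned long iova, size_t size)
{
    struct toy_counters *c = ctx;

    (void)iova;
    (void)size;
    c->maps++;
    return 0;
}

static void toy_count_sync(void *ctx)
{
    struct toy_counters *c = ctx;

    c->syncs++;
}

static const struct toy_iommu_ops toy_counting_ops = {
    .map = toy_count_map,
    .iotlb_sync_map = toy_count_sync,
};
```

Mapping three 4096-byte chunks through toy_counting_ops calls the map hook three times and the sync hook once, which is the saving the patch description is after.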

Re: [PATCH 03/20] arc: use generic dma_noncoherent_ops

2018-05-11 Thread Alexey Brodkin
Hi Christoph, On Fri, 2018-05-11 at 09:59 +0200, Christoph Hellwig wrote: > Switch to the generic noncoherent direct mapping implementation. > > Signed-off-by: Christoph Hellwig > --- > arch/arc/Kconfig | 4 + > arch/arc/include/asm/Kbuild| 1 + > arch/arc/include/

Re: [PATCH v1 7/9] iommu/tegra: gart: Provide single domain and group for all devices

2018-05-11 Thread Robin Murphy
On 08/05/18 19:16, Dmitry Osipenko wrote: GART aperture is shared by all devices, hence there is a single IOMMU domain and group shared by these devices. Allocation of a group per device only wastes resources and allowance of having more than one domain is simply wrong because IOMMU mappings made

Re: [PATCH v1 6/9] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT

2018-05-11 Thread Robin Murphy
Hi Dmitry, On 08/05/18 19:16, Dmitry Osipenko wrote: GART can't handle all devices, ignore devices that aren't related to GART. Device tree must explicitly assign GART IOMMU to the devices. Signed-off-by: Dmitry Osipenko --- drivers/iommu/tegra-gart.c | 33 -

Re: [PATCH v1 7/9] iommu/tegra: gart: Provide single domain and group for all devices

2018-05-11 Thread Dmitry Osipenko
On 08.05.2018 21:16, Dmitry Osipenko wrote: > GART aperture is shared by all devices, hence there is a single IOMMU > domain and group shared by these devices. Allocation of a group per > device only wastes resources and allowance of having more than one domain > is simply wrong because IOMMU mappi

Re: [PATCH 04/20] arm-nommu: use generic dma_noncoherent_ops

2018-05-11 Thread Russell King - ARM Linux
On Fri, May 11, 2018 at 09:59:29AM +0200, Christoph Hellwig wrote: > Switch to the generic noncoherent direct mapping implementation for > the nommu dma map implementation. > > Signed-off-by: Christoph Hellwig > --- > arch/arc/Kconfig| 1 + > arch/arm/Kconfig|

[PATCH 19/20] sparc: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. This removes the previous sync_single_for_device implementation, which looks bogus given that no syncing is happening in the similar but more important map_single case. Signed-off-by: Christoph Hellwig --- arch/sparc/Kconfig

[PATCH 20/20] parisc: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Parisc previously had two different non-coherent dma ops implementations that differed only in the way coherent allocations were handled or not handled. The different behavior is now selected at runtime in the arch_dma_alloc and arc

[PATCH 18/20] xtensa: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Signed-off-by: Christoph Hellwig --- arch/xtensa/Kconfig | 3 + arch/xtensa/include/asm/Kbuild| 1 + arch/xtensa/include/asm/dma-mapping.h | 26 -- arch/xtensa/kernel/pci-dma.c | 130 +++-

[PATCH 16/20] mm: split arch/sh/mm/consistent.c

2018-05-11 Thread Christoph Hellwig
Half of the file just contains platform device memory setup code which is required for all builds, and half contains helpers for dma coherent allocation, which is only needed if CONFIG_DMA_NONCOHERENT is enabled. Signed-off-by: Christoph Hellwig --- arch/sh/kernel/Makefile | 2 +- arch/sh

[PATCH 17/20] sh: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Signed-off-by: Christoph Hellwig --- arch/sh/Kconfig | 3 +- arch/sh/include/asm/Kbuild| 1 + arch/sh/include/asm/dma-mapping.h | 26 --- arch/sh/kernel/Makefile | 2 +- arch/sh/kernel

[PATCH 15/20] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case

2018-05-11 Thread Christoph Hellwig
This is a slight change in behavior as we avoid the detour through the virtual mapping for the coherent allocator, but if this CPU really is coherent that should be the right thing to do. Signed-off-by: Christoph Hellwig --- arch/sh/Kconfig | 1 + arch/sh/include/asm/dma-mappin

[PATCH 14/20] sh: introduce a sh_cacheop_vaddr helper

2018-05-11 Thread Christoph Hellwig
And use it in the maple bus code to avoid a dma API dependency. Signed-off-by: Christoph Hellwig --- arch/sh/include/asm/cacheflush.h | 7 +++ arch/sh/mm/consistent.c | 5 + drivers/sh/maple/maple.c | 7 --- 3 files changed, 12 insertions(+), 7 deletions(-) diff --g

[PATCH 09/20] microblaze: remove the consistent_sync and consistent_sync_page

2018-05-11 Thread Christoph Hellwig
Both unused. Signed-off-by: Christoph Hellwig --- arch/microblaze/include/asm/pgtable.h | 3 -- arch/microblaze/mm/consistent.c | 45 --- 2 files changed, 48 deletions(-) diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h i

[PATCH 12/20] openrisc: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Fix sync_single_for_device to do the same cache coherency operations as the more tested map_single path, as both should transfer ownership to the device. Remove the sync_single_for_cpu implementation as no cache coherency operations

[PATCH 08/20] microblaze: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. This removes the direction-based optimizations in sync_{single,sg}_for_{cpu,device} which were marked untested and do not match the usually very well tested {un,}map_{single,sg} implementations. Signed-off-by: Christoph Hellwig

[PATCH 11/20] nios2: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Signed-off-by: Christoph Hellwig --- arch/nios2/Kconfig | 3 + arch/nios2/include/asm/Kbuild| 1 + arch/nios2/include/asm/dma-mapping.h | 20 arch/nios2/mm/dma-mapping.c | 139 +++---

[PATCH 13/20] sh: simplify get_arch_dma_ops

2018-05-11 Thread Christoph Hellwig
Remove the indirection through the dma_ops variable, and just return nommu_dma_ops directly from get_arch_dma_ops. Signed-off-by: Christoph Hellwig --- arch/sh/include/asm/dma-mapping.h | 5 ++--- arch/sh/kernel/dma-nommu.c| 8 +--- arch/sh/mm/consistent.c | 3 --- arch/

[PATCH 07/20] m68k: use generic dma_noncoherent_ops

2018-05-11 Thread Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Signed-off-by: Christoph Hellwig --- arch/m68k/Kconfig | 2 + arch/m68k/include/asm/Kbuild| 1 + arch/m68k/include/asm/dma-mapping.h | 12 - arch/m68k/kernel/dma.c | 68 -
