Re: [PATCH v4 0/7] Use 1st-level for IOVA translation

2019-12-20 Thread Lu Baolu

Hi again,

On 2019/12/20 19:50, Liu, Yi L wrote:

3) Per the VT-d spec, the FLPT has a canonical-address requirement on the
input addresses, so I'd suggest adding an enhancement for it.
Please refer to chapter 3.6 :-).

3.6 First-Level Translation
First-level translation restricts the input-address to a canonical address
(i.e., address bits 63:N have the same value as address bit [N-1], where N is
48 with 4-level paging and 57 with 5-level paging). Requests subject to
first-level translation by remapping hardware are subject to canonical
address checking as a pre-condition for first-level translation, and a
violation is treated as a translation-fault.


There seems to be a conflict at bit 63. It should be the same as bit [N-1]
according to the canonical address requirement, but it is also used as
the XD control. Any thoughts?
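The canonical rule quoted above can be sketched as a small check (illustrative only, not driver code; note that the rule in 3.6 constrains the request's input address, while the XD control is bit 63 of the paging-structure entry rather than of the address):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch of the canonical-address check from VT-d spec
 * section 3.6: bits 63:N of the input address must equal bit N-1,
 * where N is 48 for 4-level paging and 57 for 5-level paging.
 */
static bool fl_input_addr_canonical(uint64_t addr, int addr_width)
{
	/*
	 * Shift bit N-1 up to bit 63, then arithmetic-shift back down:
	 * this sign-extends bit N-1 into bits 63:N, which must reproduce
	 * the original address for it to be canonical.
	 */
	int shift = 64 - addr_width;

	return (uint64_t)(((int64_t)(addr << shift)) >> shift) == addr;
}
```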

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 0/7] Use 1st-level for IOVA translation

2019-12-20 Thread Lu Baolu

Hi Yi,

Thanks for the comments.

On 12/20/19 7:50 PM, Liu, Yi L wrote:

Hi Baolu,

In brief, this version looks pretty good to me. However, I still want
to run through the following checks to see if anything was missed. Hope
it helps.

1) Would IOVA over FLPT be on by default?
My opinion is that before we have gIOVA nested translation
done for passthrough devices, we should keep this feature off.


No worries.

IOVA over first level is a sub-feature of scalable mode. Currently,
scalable mode is off by default and we won't switch it on until all
features are done.



2) domain->agaw is currently calculated from the capabilities
related to the second-level page table. As we are moving
IOVA to FLPT, I'd suggest calculating domain->agaw from the
translation modes FLPT supports (e.g. 4-level and 5-level).


We merged first level and second level, hence domain->agaw has to be
selected to suit both. The only shortcoming of this is that it doesn't
support a 3-level-only second level in scalable mode, but I don't think
we will ever see such hardware.



3) Per the VT-d spec, the FLPT has a canonical-address requirement on the
input addresses, so I'd suggest adding an enhancement for it.
Please refer to chapter 3.6 :-).


Yes. Good catch! We should manipulate the page table entry according to
this requirement.



3.6 First-Level Translation
First-level translation restricts the input-address to a canonical address
(i.e., address bits 63:N have the same value as address bit [N-1], where N is
48 with 4-level paging and 57 with 5-level paging). Requests subject to
first-level translation by remapping hardware are subject to canonical
address checking as a pre-condition for first-level translation, and a
violation is treated as a translation-fault.

Regards,
Yi Liu


Best regards,
baolu


Re: [PATCH v4 4/7] iommu/vt-d: Setup pasid entries for iova over first level

2019-12-20 Thread Lu Baolu

Hi Yi,

On 12/20/19 7:44 PM, Liu, Yi L wrote:

From: Lu Baolu [mailto:baolu...@linux.intel.com]
Sent: Thursday, December 19, 2019 11:17 AM
To: Joerg Roedel ; David Woodhouse ;
Alex Williamson 
Subject: [PATCH v4 4/7] iommu/vt-d: Setup pasid entries for iova over first level

Intel VT-d in scalable mode supports two types of page tables for IOVA
translation: first level and second level. The IOMMU driver can choose
either for IOVA translation according to the use case. This sets up the
pasid entry if a domain is selected to use the first-level page table for
IOVA translation.

Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel-iommu.c | 48 +++--
  include/linux/intel-iommu.h | 16 -
  2 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index
2b5a47584baf..f0813997dea2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -571,6 +571,11 @@ static inline int domain_type_is_si(struct dmar_domain *domain)
 	return domain->flags & DOMAIN_FLAG_STATIC_IDENTITY;
 }
 
+static inline bool domain_use_first_level(struct dmar_domain *domain)
+{
+	return domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL;
+}
+
  static inline int domain_pfn_supported(struct dmar_domain *domain,
   unsigned long pfn)
  {
@@ -2288,6 +2293,8 @@ static int __domain_mapping(struct dmar_domain
*domain, unsigned long iov_pfn,
return -EINVAL;

prot &= DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP;
+   if (domain_use_first_level(domain))
+   prot |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD;

if (!sg) {
sg_res = nr_pages;
@@ -2515,6 +2522,36 @@ dmar_search_domain_by_dev_info(int segment, int bus,
int devfn)
return NULL;
  }

+static int domain_setup_first_level(struct intel_iommu *iommu,
+   struct dmar_domain *domain,
+   struct device *dev,
+   int pasid)
+{
+   int flags = PASID_FLAG_SUPERVISOR_MODE;


Hi Baolu,

Could you explain a bit why PASID_FLAG_SUPERVISOR_MODE is
required?



This flag indicates a PASID which can be used for access to kernel
addresses (static 1:1 only). Otherwise, DMA requests requesting the
supervisor privilege level will be blocked.
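As a rough illustration of the behavior described above (a simplified model, not the driver's actual logic): a request carrying supervisor privilege is only translated when the PASID entry was set up with supervisor mode enabled, which is what PASID_FLAG_SUPERVISOR_MODE requests.

```c
#include <stdbool.h>

/*
 * Simplified model of the check described above: a supervisor-privileged
 * DMA request is permitted only if the PASID entry grants supervisor
 * access (the supervisor-mode bit requested via PASID_FLAG_SUPERVISOR_MODE).
 */
static bool fl_request_permitted(bool req_supervisor_priv, bool pasid_supervisor)
{
	/* User-privileged requests are always eligible; supervisor-privileged
	 * ones require the PASID entry to grant supervisor access. */
	return !req_supervisor_priv || pasid_supervisor;
}
```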


Regards,
Yi Liu



Best regards,
baolu


Re: [Freedreno] [PATCH 5/5] drm/msm/a6xx: Add support for using system cache(LLC)

2019-12-20 Thread Jordan Crouse
On Fri, Dec 20, 2019 at 03:40:59PM +0530, smase...@codeaurora.org wrote:
> On 2019-12-20 01:28, Jordan Crouse wrote:
> >On Thu, Dec 19, 2019 at 06:44:46PM +0530, Sharat Masetty wrote:
> >>The last level system cache can be partitioned into 32 different
> >>slices, of which the GPU has two slices preallocated. One slice is used
> >>for caching GPU buffers and the other slice is used for caching the GPU
> >>SMMU pagetables. This patch talks to the core system cache driver to
> >>acquire the slice handles, configures the SCIDs for those slices, and
> >>activates and deactivates the slices upon GPU power collapse and
> >>restore.
> >>
> >>Some support from the IOMMU driver is also needed to make use of the
> >>system cache. IOMMU_QCOM_SYS_CACHE is a buffer protection flag which
> >>enables caching GPU data buffers in the system cache with memory
> >>attributes such as outer-cacheable, read-allocate, write-allocate. The
> >>GPU can then override a few cacheability parameters, which it uses to
> >>change write-allocate to write-no-allocate, as the GPU hardware does
> >>not benefit much from it.
> >>
> >>Similarly, DOMAIN_ATTR_QCOM_SYS_CACHE is a domain-level attribute used
> >>by the IOMMU driver to set the right attributes for caching the
> >>hardware pagetables in the system cache.
> >>
> >>Signed-off-by: Sharat Masetty 
> >>---
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 122
> >>+-
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   9 +++
> >> drivers/gpu/drm/msm/msm_iommu.c   |  13 
> >> drivers/gpu/drm/msm/msm_mmu.h |   3 +
> >> 4 files changed, 146 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>index faff6ff..0c7fdee 100644
> >>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>@@ -9,6 +9,7 @@
> >> #include "a6xx_gmu.xml.h"
> >>
> >> #include 
> >>+#include 
> >>
> >> #define GPU_PAS_ID 13
> >>
> >>@@ -781,6 +782,117 @@ static void
> >>a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu)
> >>gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >> }
> >>
> >>+#define A6XX_LLC_NUM_GPU_SCIDS 5
> >>+#define A6XX_GPU_LLC_SCID_NUM_BITS 5
> >
> >As I mention below, I'm not sure if we need these
> >
> >>+#define A6XX_GPU_LLC_SCID_MASK \
> >>+   ((1 << (A6XX_LLC_NUM_GPU_SCIDS * A6XX_GPU_LLC_SCID_NUM_BITS)) - 1)
> >>+
> >>+#define A6XX_GPUHTW_LLC_SCID_SHIFT 25
> >>+#define A6XX_GPUHTW_LLC_SCID_MASK \
> >>+   (((1 << A6XX_GPU_LLC_SCID_NUM_BITS) - 1) <<
> >>A6XX_GPUHTW_LLC_SCID_SHIFT)
> >>+
> >
> >Normally these go into the envytools regmap, but if we're going to do
> >these guys let's use the power of <linux/bitfield.h> for good.
> >
> >#define A6XX_GPU_LLC_SCID GENMASK(24, 0)
> >#define A6XX_GPUHTW_LLC_SCID GENMASK(29, 25)
> >
> >>+static inline void a6xx_gpu_cx_rmw(struct a6xx_llc *llc,
> >
> >Don't mark C functions as inline - let the compiler figure it out for you.
> >
> >>+   u32 reg, u32 mask, u32 or)
> >>+{
> >>+   msm_rmw(llc->mmio + (reg << 2), mask, or);
> >>+}
> >>+
> >>+static void a6xx_llc_deactivate(struct a6xx_llc *llc)
> >>+{
> >>+   llcc_slice_deactivate(llc->gpu_llc_slice);
> >>+   llcc_slice_deactivate(llc->gpuhtw_llc_slice);
> >>+}
> >>+
> >>+static void a6xx_llc_activate(struct a6xx_llc *llc)
> >>+{
> >>+   if (!llc->mmio)
> >>+   return;
> >>+
> >>+   /* Program the sub-cache ID for all GPU blocks */
> >>+   if (!llcc_slice_activate(llc->gpu_llc_slice))
> >>+   a6xx_gpu_cx_rmw(llc,
> >>+   REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
> >>+   A6XX_GPU_LLC_SCID_MASK,
> >>+   (llc->cntl1_regval &
> >>+A6XX_GPU_LLC_SCID_MASK));
> >
> >This is out of order with the comments below, but if we store the slice id
> >then
> >you could calculate regval here and not have to store it.
> >
> >>+
> >>+   /* Program the sub-cache ID for the GPU pagetables */
> >>+   if (!llcc_slice_activate(llc->gpuhtw_llc_slice))
> >
> >val |= FIELD_SET(A6XX_GPUHTW_LLC_SCID, htw_llc_sliceid);
> >
> >>+   a6xx_gpu_cx_rmw(llc,
> >>+   REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
> >>+   A6XX_GPUHTW_LLC_SCID_MASK,
> >>+   (llc->cntl1_regval &
> >>+A6XX_GPUHTW_LLC_SCID_MASK));
> >
> >And this could be FIELD_SET(A6XX_GPUHTW_LLC_SCID, sliceid);
> >
> >In theory you could just calculate the u32 and write it directly without a
> >rmw.
> >In fact, that might be preferable - if the slice activate failed, you
> >don't want
> >to run the risk that the scid for htw is still populated.
> >
> >>+
> >>+   /* Program cacheability overrides */
> >>+   a6xx_gpu_cx_rmw(llc, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF,
> >>+   llc->cntl0_regval);
> >
> >As below, this could easily be a constant.
> >
> >>+}
> >>+
> >>+st
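Jordan's GENMASK()/FIELD_PREP() suggestion above could look roughly like the sketch below. This is a userspace illustration with stand-in helper macros (in the kernel they come from <linux/bits.h> and <linux/bitfield.h>); the field positions follow the GENMASK values Jordan proposed, the function name is made up, and the computed u32 would then be written in one go instead of two read-modify-writes.

```c
#include <stdint.h>

/* Stand-ins for the kernel's GENMASK()/FIELD_PREP() helpers. */
#define GENMASK(h, l)		(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define FIELD_PREP(mask, val)	(((uint64_t)(val) << __builtin_ctzll(mask)) & (mask))

#define A6XX_GPU_LLC_SCID	GENMASK(24, 0)	/* 5 GPU SCID sub-fields x 5 bits */
#define A6XX_GPUHTW_LLC_SCID	GENMASK(29, 25)	/* pagetable-walker SCID */

/* Hypothetical helper: compute the whole CNTL1 value from the two slice IDs. */
static uint32_t a6xx_llc_cntl1(uint32_t gpu_scid, uint32_t htw_scid)
{
	/* Replicate the 5-bit GPU slice ID into all five GPU sub-fields. */
	uint64_t gpu_field = 0;
	int i;

	for (i = 0; i < 5; i++)
		gpu_field |= (uint64_t)gpu_scid << (i * 5);

	return (uint32_t)(FIELD_PREP(A6XX_GPU_LLC_SCID, gpu_field) |
			  FIELD_PREP(A6XX_GPUHTW_LLC_SCID, htw_scid));
}
```

Building the full value this way also sidesteps the failure mode Jordan mentions: if slice activation fails, nothing stale is left behind from a partial read-modify-write.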

Re: [RFC 00/13] virtio-iommu on non-devicetree platforms

2019-12-20 Thread Jacob Pan (Jun)
On Wed, 18 Dec 2019 12:20:44 +0100
Jean-Philippe Brucker  wrote:

> On Tue, Dec 03, 2019 at 07:01:36PM -0800, Jacob Pan (Jun) wrote:
> > Hi Jean,
> > 
> > Sorry for the delay, I was out last week. Comments inline below.
> > 
> > On Mon, 25 Nov 2019 19:02:47 +0100
> > Jean-Philippe Brucker  wrote:
> >   
> > > On Fri, Nov 22, 2019 at 04:01:02PM -0800, Jacob Pan (Jun) wrote:  
> > > > > (1) ACPI has one table per vendor (DMAR for Intel, IVRS for
> > > > > AMD and IORT for Arm). From my point of view IORT is easier to
> > > > > extend, since we just need to introduce a new node type. There
> > > > > are no dependencies to Arm in the Linux IORT driver, so it
> > > > > works well with CONFIG_X86.   
> > > > From my limited understanding, IORT and VIOT are meant to solve
> > > > device topology enumeration only? I am not sure how they can be
> > > > expanded to cover information beyond device topology, e.g. DMAR has
> > > > NUMA information and root port ATS; I guess those are not used today
> > > > in the guest but might be additions in the future.
> > > 
> > > The PCI root-complex node of IORT has an ATS attribute, which we
> > > can already use. However its scope is the root complex, not
> > > individual root ports like with DMAR.
> > > 
> > > I'm not very familiar with NUMA, but it looks like we just need to
> > > specify a proximity domain in relation to the SRAT table, for each
> > > viommu? The SMMUv3 node in IORT has a 4-bytes "proximity domain"
> > > field for this. We can add the same to the VIOT virtio-iommu nodes
> > > later, since the structures are extensible.
> > >   
> > I think the proximity domain is more per assigned device
> > than per vIOMMU. A vIOMMU in the guest can have assigned devices
> > belonging to different pIOMMUs and proximity domains. If the guest owns
> > the first-level page tables (gIOVA or SVA), we want to make sure page
> > tables are allocated from the closest proximity domain.
> > 
> > My understanding is that virtio-iommu supports both virtio devices and
> > assigned devices. We care less about the former in terms of
> > NUMA.
> > 
> > In ACPI, we have the _PXM method to retrieve a device's proximity
> > domain. I don't know if there is something equivalent or a generic way
> > to get _PXM information. I think the VMM also needs to make sure that
> > when an assigned device is used with a vIOMMU, some memory is allocated
> > from the device's proximity domain.
> >   
> > > But it might be better to keep the bare minimum information in
> > > the FW descriptor, and put the rest in the virtio-iommu. So yes
> > > topology enumeration is something the device cannot do itself
> > > (not fully that is, see (2)) but for the rest, virtio-iommu's
> > > PROBE request can provide details about each endpoint in relation
> > > to their physical IOMMU.
> > > 
> > > We could for example add a bit in a PROBE property saying that the
> > > whole path between the IOMMU and the endpoint supports ATS. For
> > > NUMA it might also be more interesting to have a finer
> > > granularity, since one viommu could be managing endpoints that
> > > are behind different physical IOMMUs. If in the future we want to
> > > allocate page tables close to the physical IOMMU for example, we
> > > might need to describe multiple NUMA nodes per viommu, using the
> > > PROBE request. 
> > Should we reinvent something for NUMA or use ACPI's SRAT, _PXM?   
> 
> Regardless whether we put it in the VIOT or in the virtio-iommu PROBE
> request, we necessarily need to reuse the node IDs defined by SRAT (or
> numa-node-id on devicetree, also a 32-bit value). A virtio-pci based
> virtio-iommu already has the _PXM of its closest bridge and wouldn't
> need anything more in the VIOT, while a virtio-mmio based
> virtio-iommu would need a proximity domain field in the VIOT. That
> could be added later since the table is extensible, but as you
> pointed out, that information might not be very useful.
> 
> > I am not sure how it is handled today in QEMU in terms of guest-host
> > NUMA proximity domain mapping.  
> 
> It looks like the user can specify this guest-host mapping on the
> command-line:
> 
>   -object memory-backend-ram,id=mem0,size=4G,host-nodes=3,policy=bind
>   -object memory-backend-ram,id=mem1,size=4G,host-nodes=4,policy=bind
>   -numa node,memdev=mem0,nodeid=numa0
>   -numa node,memdev=mem1,nodeid=numa1
>   -numa cpu,node-id=numa0,socket-id=0
>   -numa cpu,node-id=numa1,socket-id=1
> 
> numa0 and numa1 would get proximity domains 0 and 1, corresponding to
> host domains 3 and 4. It is also possible to specify the NUMA node of
> a PCI bus (via the PCI expander bridge), and therefore to assign a
> VFIO PCI device in the same proximity domain as its physical location.
> 
>   -device pxb,id=bridge1,bus=pci.0,numa_node=1 (simplified)
>   -device vfio-pci,host=03:01.0,bus=bridge1
> 
Thanks a lot for the thorough explanation. I will give that a try on
x86; I assume the ACPI tables are also built to match these command-line
options.
> Linux can

Re: [git pull] IOMMU Fixes for Linux v5.5-rc2

2019-12-20 Thread pr-tracker-bot
The pull request you sent on Fri, 20 Dec 2019 12:30:26 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.5-rc2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b371ddb94fae82b6565020639b7db31934043c65

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


Re: [PATCH v4 00/16] iommu: Permit modular builds of ARM SMMU[v3] drivers

2019-12-20 Thread Joerg Roedel
On Thu, Dec 19, 2019 at 12:03:36PM +, Will Deacon wrote:
> Ard Biesheuvel (1):
>   iommu/arm-smmu: Support SMMU module probing from the IORT
> 
> Greg Kroah-Hartman (1):
>   PCI/ATS: Restore EXPORT_SYMBOL_GPL() for pci_{enable,disable}_ats()
> 
> Will Deacon (14):
>   drivers/iommu: Export core IOMMU API symbols to permit modular drivers
>   iommu/of: Request ACS from the PCI core when configuring IOMMU linkage
>   PCI: Export pci_ats_disabled() as a GPL symbol to modules
>   drivers/iommu: Take a ref to the IOMMU driver prior to ->add_device()
>   iommu/of: Take a ref to the IOMMU driver during ->of_xlate()
>   drivers/iommu: Allow IOMMU bus ops to be unregistered
>   Revert "iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular"
>   Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"
>   iommu/arm-smmu: Prevent forced unbinding of Arm SMMU drivers
>   iommu/arm-smmu-v3: Unregister IOMMU and bus ops on device removal
>   iommu/arm-smmu-v3: Allow building as a module
>   iommu/arm-smmu: Unregister IOMMU and bus ops on device removal
>   iommu/arm-smmu: Allow building as a module
>   iommu/arm-smmu: Update my email address in MODULE_AUTHOR()
> 
>  drivers/acpi/arm64/iort.c   |   4 +-
>  drivers/iommu/Kconfig   |  16 -
>  drivers/iommu/Makefile  |   3 +-
>  drivers/iommu/arm-smmu-v3.c |  94 +-
>  drivers/iommu/arm-smmu.c| 128 +---
>  drivers/iommu/iommu-sysfs.c |   5 ++
>  drivers/iommu/iommu.c   |  32 -
>  drivers/iommu/of_iommu.c|  19 --
>  drivers/pci/ats.c   |   2 +
>  drivers/pci/pci.c   |   1 +
>  include/linux/iommu.h   |   4 +-
>  11 files changed, 223 insertions(+), 85 deletions(-)

Applied, thanks.


Re: [PATCH v4 03/16] PCI/ATS: Restore EXPORT_SYMBOL_GPL() for pci_{enable,disable}_ats()

2019-12-20 Thread Bjorn Helgaas
On Fri, Dec 20, 2019 at 09:43:03AM +0100, Joerg Roedel wrote:
> Hi Bjorn,
> 
> On Thu, Dec 19, 2019 at 12:03:39PM +, Will Deacon wrote:
> > From: Greg Kroah-Hartman 
> > 
> > Commit d355bb209783 ("PCI/ATS: Remove unnecessary EXPORT_SYMBOL_GPL()")
> > unexported a bunch of symbols from the PCI core since the only external
> > users were non-modular IOMMU drivers. Although most of those symbols
> > can remain private for now, 'pci_{enable,disable}_ats()' is required for
> > the ARM SMMUv3 driver to build as a module, otherwise we get a build
> > failure as follows:
> > 
> >   | ERROR: "pci_enable_ats" [drivers/iommu/arm-smmu-v3.ko] undefined!
> >   | ERROR: "pci_disable_ats" [drivers/iommu/arm-smmu-v3.ko] undefined!
> > 
> > Re-export these two functions so that the ARM SMMUv3 driver can be built
> > as a module.
> > 
> > Cc: Bjorn Helgaas 
> > Cc: Joerg Roedel 
> > Signed-off-by: Greg Kroah-Hartman 
> > [will: rewrote commit message]
> > Signed-off-by: Will Deacon 
> > ---
> >  drivers/pci/ats.c | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Are you fine with this change? I would apply this series to my tree
> then.

Yep, thanks!  You can add my

Acked-by: Bjorn Helgaas 



RE: [PATCH v4 0/7] Use 1st-level for IOVA translation

2019-12-20 Thread Liu, Yi L
Hi Baolu,

In brief, this version looks pretty good to me. However, I still want
to run through the following checks to see if anything was missed. Hope
it helps.

1) Would IOVA over FLPT be on by default?
My opinion is that before we have gIOVA nested translation
done for passthrough devices, we should keep this feature off.

2) domain->agaw is currently calculated from the capabilities
related to the second-level page table. As we are moving
IOVA to FLPT, I'd suggest calculating domain->agaw from the
translation modes FLPT supports (e.g. 4-level and 5-level).

3) Per the VT-d spec, the FLPT has a canonical-address requirement on the
input addresses, so I'd suggest adding an enhancement for it.
Please refer to chapter 3.6 :-).

3.6 First-Level Translation
First-level translation restricts the input-address to a canonical address
(i.e., address bits 63:N have the same value as address bit [N-1], where N is
48 with 4-level paging and 57 with 5-level paging). Requests subject to
first-level translation by remapping hardware are subject to canonical
address checking as a pre-condition for first-level translation, and a
violation is treated as a translation-fault.

Regards,
Yi Liu

> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Thursday, December 19, 2019 11:16 AM
> To: Joerg Roedel ; David Woodhouse ;
> Alex Williamson 
> Subject: [PATCH v4 0/7] Use 1st-level for IOVA translation
> 
> Intel VT-d in scalable mode supports two types of page tables
> for DMA translation: the first-level page table and the
> second-level page table. The first-level page table uses the same
> format as the CPU page table, while the second-level page table
> stays compatible with previous formats. The software is able
> to choose either of them for DMA remapping according to the use
> case.
> 
> This patchset aims to move IOVA (I/O Virtual Address) translation
> to 1st-level page table in scalable mode. This will simplify vIOMMU
> (IOMMU simulated by VM hypervisor) design by using the two-stage
> translation, a.k.a. nested mode translation.
> 
> As the Intel VT-d architecture offers caching mode, guest IOVA (GIOVA)
> support is currently implemented in a shadow-page manner. The device
> simulation software, like QEMU, has to figure out GIOVA->GPA mappings
> and write them to a shadow page table, which will be used by the
> physical IOMMU. Each time mappings are created or destroyed in the
> vIOMMU, the simulation software has to intervene. Hence, changes
> to GIOVA->GPA mappings can be shadowed to the host.
> 
> 
>  .---.
>  |  vIOMMU   |
>  |---| ..
>  |   |IOTLB flush trap |QEMU|
>  .---. (map/unmap) ||
>  |GIOVA->GPA |>|..  |
>  '---' || GIOVA->HPA |  |
>  |   | |''  |
>  '---' ||
>||
>''
> |
> <
> |
> v VFIO/IOMMU API
>   .---.
>   |  pIOMMU   |
>   |---|
>   |   |
>   .---.
>   |GIOVA->HPA |
>   '---'
>   |   |
>   '---'
> 
> In VT-d 3.0, scalable mode is introduced, which offers two-level
> translation page tables and a nested translation mode. With regard to
> GIOVA support, it can be simplified by 1) moving GIOVA support over
> the 1st-level page table to store GIOVA->GPA mappings in the vIOMMU,
> 2) binding the vIOMMU 1st-level page table to the pIOMMU, 3) using the
> pIOMMU second level for GPA->HPA translation, and 4) enabling nested
> (a.k.a. dual-stage) translation in the host. Compared with the current
> shadow GIOVA support, the new approach makes the vIOMMU design simpler
> and more efficient, as we only need to flush the pIOMMU IOTLB and
> possibly the device-IOTLB when an IOVA mapping in the vIOMMU is torn down.
> 
>  .---.
>  |  vIOMMU   |
>  |---| .---.
>  |   |IOTLB flush trap |   QEMU|
>  .---.(unmap)  |---|
>  |GIOVA->GPA |>|   |
>  '---' '---'
>  |   |   |
>  '---'   |
><--
>|  VFIO/IOMMU
>|  cache invalidation and
>| guest gpd bind interfaces
>v
>  .---.
>  |  pIOMMU   |
>  |---|
>  .---.
>  |GIOVA->GPA |<---First level
>  '---'
>  | GPA->HPA  |<---Second level
>  '---'
>  '---'
> 
> This patch applies the first level page table for IOVA translation

RE: [PATCH v4 4/7] iommu/vt-d: Setup pasid entries for iova over first level

2019-12-20 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Thursday, December 19, 2019 11:17 AM
> To: Joerg Roedel ; David Woodhouse ;
> Alex Williamson 
> Subject: [PATCH v4 4/7] iommu/vt-d: Setup pasid entries for iova over first level
> 
> Intel VT-d in scalable mode supports two types of page tables for IOVA
> translation: first level and second level. The IOMMU driver can choose
> either for IOVA translation according to the use case. This sets up the
> pasid entry if a domain is selected to use the first-level page table for
> IOVA translation.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-iommu.c | 48 +++--
>  include/linux/intel-iommu.h | 16 -
>  2 files changed, 56 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index
> 2b5a47584baf..f0813997dea2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -571,6 +571,11 @@ static inline int domain_type_is_si(struct dmar_domain *domain)
>  	return domain->flags & DOMAIN_FLAG_STATIC_IDENTITY;
>  }
> 
> +static inline bool domain_use_first_level(struct dmar_domain *domain)
> +{
> +	return domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL;
> +}
> +
>  static inline int domain_pfn_supported(struct dmar_domain *domain,
>  unsigned long pfn)
>  {
> @@ -2288,6 +2293,8 @@ static int __domain_mapping(struct dmar_domain
> *domain, unsigned long iov_pfn,
>   return -EINVAL;
> 
>   prot &= DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP;
> + if (domain_use_first_level(domain))
> + prot |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD;
> 
>   if (!sg) {
>   sg_res = nr_pages;
> @@ -2515,6 +2522,36 @@ dmar_search_domain_by_dev_info(int segment, int bus,
> int devfn)
>   return NULL;
>  }
> 
> +static int domain_setup_first_level(struct intel_iommu *iommu,
> + struct dmar_domain *domain,
> + struct device *dev,
> + int pasid)
> +{
> + int flags = PASID_FLAG_SUPERVISOR_MODE;

Hi Baolu,

Could you explain a bit why PASID_FLAG_SUPERVISOR_MODE is
required?

Regards,
Yi Liu



[git pull] IOMMU Fixes for Linux v5.5-rc2

2019-12-20 Thread Joerg Roedel
Hi Linus,

thanks for taking care of the KASAN-related IOMMU fix while I was busy
with other development work, and sorry for the inconvenience. Here are
the other IOMMU fixes that piled up during the last weeks:

The following changes since commit d1eef1c619749b2a57e514a3fa67d9a516ffa919:

  Linux 5.5-rc2 (2019-12-15 15:16:08 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v5.5-rc2

for you to fetch changes up to c18647900ec864d401ba09b3bbd5b34f331f8d26:

  iommu/dma: Relax locking in iommu_dma_prepare_msi() (2019-12-18 17:41:36 
+0100)


IOMMU Fixes for Linux v5.5-rc2

Including:

- Fix kmemleak warning in IOVA code

- Fix compile warnings on ARM32/64 in dma-iommu code due to
  dma_mask type mismatches

- Make ISA reserved regions relaxable, so that VFIO can assign
  devices which have such regions defined

- Fix mapping errors resulting in IO page-faults in the VT-d
  driver

- Make sure direct mappings for a domain are created after the
  default domain is updated

- Map ISA reserved regions in the VT-d driver with correct
  permissions

- Remove unneeded check for PSI capability in the IOTLB flush
  code of the VT-d driver

- Lockdep fix iommu_dma_prepare_msi()


Alex Williamson (1):
  iommu/vt-d: Set ISA bridge reserved region as relaxable

Jerry Snitselaar (2):
  iommu: set group default domain before creating direct mappings
  iommu/vt-d: Allocate reserved region for ISA with correct permission

Lu Baolu (2):
  iommu/vt-d: Fix dmar pte read access not set error
  iommu/vt-d: Remove incorrect PSI capability check

Robin Murphy (2):
  iommu/dma: Rationalise types for DMA masks
  iommu/dma: Relax locking in iommu_dma_prepare_msi()

Xiaotao Yin (1):
  iommu/iova: Init the struct iova to fix the possible memleak

 drivers/iommu/dma-iommu.c   | 23 +++
 drivers/iommu/intel-iommu.c | 12 ++--
 drivers/iommu/intel-svm.c   |  6 +-
 drivers/iommu/iommu.c   |  4 ++--
 drivers/iommu/iova.c|  2 +-
 5 files changed, 17 insertions(+), 30 deletions(-)

Please pull.

Thanks,

Joerg


Re: [RESEND,PATCH 01/13] dt-bindings: mediatek: Add bindings for MT6779

2019-12-20 Thread chao hao
On Mon, 2019-12-16 at 20:05 +0800, Yong Wu wrote:
> On Mon, 2019-11-04 at 19:52 +0800, Chao Hao wrote:
> > This patch adds description for MT6779 IOMMU.
> > 
> > MT6779 has two IOMMUs, MM_IOMMU and APU_IOMMU, which
> > use the ARM Short-Descriptor translation format.
> > 
> > The MT6779 IOMMU hardware diagram is shown below. It is only a brief
> > diagram of the IOMMU; it doesn't focus on the smi_larb part, so
> > I don't describe smi_larb in detail.
> > 
> >  EMI
> >   |
> >--
> >||
> > MM_IOMMUAPU_IOMMU
> >||
> >SMI_COMMOM--- APU_BUS
> >   |||
> > SMI_LARB(0~11)  SMI_LARB12(FAKE)SMI_LARB13(FAKE)
> >   |||
> >   ||   --
> >   ||   | |  |
> >Multimedia engine  CCU VPU   MDLA   EMDA
> > 
> > All the connections are hardware fixed, software can not adjust it.
> > 
> > From the diagram above, MM_IOMMU provides mapping for the multimedia
> > engines, but CCU is connected to smi_common directly; we can take it as
> > larb12. APU_IOMMU provides mapping for the APU engines; we can take them
> > as larb13. Larb12 and larb13 are fake larbs.
> > 
> > Signed-off-by: Chao Hao 
> > ---
> >  .../bindings/iommu/mediatek,iommu.txt |   2 +
> >  include/dt-bindings/memory/mt6779-larb-port.h | 217 ++
> >  2 files changed, 219 insertions(+)
> >  create mode 100644 include/dt-bindings/memory/mt6779-larb-port.h
> > 
> > diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt 
> > b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > index ce59a505f5a4..c1ccd8582eb2 100644
> > --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> > @@ -58,6 +58,7 @@ Required properties:
> >  - compatible : must be one of the following string:
> > "mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
> > "mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
> > +   "mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
> > "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
> >  generation one m4u HW.
> > "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
> > @@ -78,6 +79,7 @@ Required properties:
> > Specifies the mtk_m4u_id as defined in
> > dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
> > dt-binding/memory/mt2712-larb-port.h for mt2712,
> > +   dt-binding/memory/mt6779-larb-port.h for mt6779,
> > dt-binding/memory/mt8173-larb-port.h for mt8173, and
> > dt-binding/memory/mt8183-larb-port.h for mt8183.
> >  
> > diff --git a/include/dt-bindings/memory/mt6779-larb-port.h 
> > b/include/dt-bindings/memory/mt6779-larb-port.h
> > new file mode 100644
> > index ..8b7f2d2446ea
> > --- /dev/null
> > +++ b/include/dt-bindings/memory/mt6779-larb-port.h
> > @@ -0,0 +1,217 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2019 MediaTek Inc.
> > + * Author: Chao Hao 
> > + */
> > +
> > +#ifndef _DTS_IOMMU_PORT_MT6779_H_
> > +#define _DTS_IOMMU_PORT_MT6779_H_
> > +
> > +#define MTK_M4U_ID(larb, port)  (((larb) << 5) | (port))
> > +
> > +#define M4U_LARB0_ID0
> > +#define M4U_LARB1_ID1
> > +#define M4U_LARB2_ID2
> > +#define M4U_LARB3_ID3
> > +#define M4U_LARB4_ID4
> > +#define M4U_LARB5_ID5
> > +#define M4U_LARB6_ID6
> > +#define M4U_LARB7_ID7
> > +#define M4U_LARB8_ID8
> > +#define M4U_LARB9_ID9
> > +#define M4U_LARB10_ID   10
> > +#define M4U_LARB11_ID   11
> > +#define M4U_LARB12_ID   12
> > +#define M4U_LARB13_ID   13
> > +
> > +/* larb0 */
> > +#define M4U_PORT_DISP_POSTMASK0 MTK_M4U_ID(M4U_LARB0_ID, 0)
> > +#define M4U_PORT_DISP_OVL0_HDR  MTK_M4U_ID(M4U_LARB0_ID, 1)
> > +#define M4U_PORT_DISP_OVL1_HDR  MTK_M4U_ID(M4U_LARB0_ID, 2)
> > +#define M4U_PORT_DISP_OVL0  MTK_M4U_ID(M4U_LARB0_ID, 3)
> > +#define M4U_PORT_DISP_OVL1  MTK_M4U_ID(M4U_LARB0_ID, 4)
> > +#define M4U_PORT_DISP_PVRIC0MTK_M4U_ID(M4U_LARB0_ID, 5)
> > +#define M4U_PORT_DISP_RDMA0 MTK_M4U_ID(M4U_LARB0_ID, 6)
> > +#define M4U_PORT_DISP_WDMA0 MTK_M4U_ID(M4U_LARB0_ID, 7)
> > +#define M4U_PORT_DISP
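For reference, the MTK_M4U_ID() macro quoted above packs a larb index and a port index into a single ID. A small sketch of the encoding and its inverse (the decode helpers here are illustrative; the mtk_iommu driver defines its own):

```c
/*
 * Sketch of the ID encoding from mt6779-larb-port.h above: the larb
 * index occupies the bits from bit 5 upward and the port index the low
 * 5 bits. MTK_M4U_TO_LARB()/MTK_M4U_TO_PORT() are illustrative inverses.
 */
#define MTK_M4U_ID(larb, port)	(((larb) << 5) | (port))
#define MTK_M4U_TO_LARB(id)	((id) >> 5)
#define MTK_M4U_TO_PORT(id)	((id) & 0x1f)
```

So, for example, M4U_PORT_DISP_RDMA0 above is MTK_M4U_ID(M4U_LARB0_ID, 6), i.e. larb 0, port 6.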

Re: [PATCH 5/5] drm/msm/a6xx: Add support for using system cache(LLC)

2019-12-20 Thread smasetty

On 2019-12-20 01:28, Jordan Crouse wrote:

On Thu, Dec 19, 2019 at 06:44:46PM +0530, Sharat Masetty wrote:

The last level system cache can be partitioned into 32 different slices,
of which the GPU has two slices preallocated. One slice is used for
caching GPU buffers and the other slice is used for caching the GPU SMMU
pagetables. This patch talks to the core system cache driver to acquire
the slice handles, configures the SCIDs for those slices, and activates
and deactivates the slices upon GPU power collapse and restore.

Some support from the IOMMU driver is also needed to make use of the
system cache. IOMMU_QCOM_SYS_CACHE is a buffer protection flag which
enables caching GPU data buffers in the system cache with memory
attributes such as outer cacheable, read-allocate, write-allocate for
buffers. The GPU then has the ability to override a few cacheability
parameters, which it does to override write-allocate to
write-no-allocate, as the GPU hardware does not benefit much from it.

Similarly, DOMAIN_ATTR_QCOM_SYS_CACHE is another domain-level attribute
used by the IOMMU driver to set the right attributes to cache the
hardware pagetables into the system cache.

Signed-off-by: Sharat Masetty 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 122 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   9 +++
 drivers/gpu/drm/msm/msm_iommu.c       |  13 
 drivers/gpu/drm/msm/msm_mmu.h         |   3 +
 4 files changed, 146 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c

index faff6ff..0c7fdee 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -9,6 +9,7 @@
 #include "a6xx_gmu.xml.h"

 #include 
+#include 

 #define GPU_PAS_ID 13

@@ -781,6 +782,117 @@ static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu)

gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
 }

+#define A6XX_LLC_NUM_GPU_SCIDS 5
+#define A6XX_GPU_LLC_SCID_NUM_BITS 5


As I mention below, I'm not sure if we need these


+#define A6XX_GPU_LLC_SCID_MASK \
+   ((1 << (A6XX_LLC_NUM_GPU_SCIDS * A6XX_GPU_LLC_SCID_NUM_BITS)) - 1)
+
+#define A6XX_GPUHTW_LLC_SCID_SHIFT 25
+#define A6XX_GPUHTW_LLC_SCID_MASK \
+	(((1 << A6XX_GPU_LLC_SCID_NUM_BITS) - 1) << A6XX_GPUHTW_LLC_SCID_SHIFT)

+


Normally these go into the envytools regmap, but if we're going to do
these guys let's use the power of <linux/bitfield.h> for good.

#define A6XX_GPU_LLC_SCID GENMASK(24, 0)
#define A6XX_GPUHTW_LLC_SCID GENMASK(29, 25)


+static inline void a6xx_gpu_cx_rmw(struct a6xx_llc *llc,


Don't mark C functions as inline - let the compiler figure it out for you.



+   u32 reg, u32 mask, u32 or)
+{
+   msm_rmw(llc->mmio + (reg << 2), mask, or);
+}
+
+static void a6xx_llc_deactivate(struct a6xx_llc *llc)
+{
+   llcc_slice_deactivate(llc->gpu_llc_slice);
+   llcc_slice_deactivate(llc->gpuhtw_llc_slice);
+}
+
+static void a6xx_llc_activate(struct a6xx_llc *llc)
+{
+   if (!llc->mmio)
+   return;
+
+   /* Program the sub-cache ID for all GPU blocks */
+   if (!llcc_slice_activate(llc->gpu_llc_slice))
+   a6xx_gpu_cx_rmw(llc,
+   REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
+   A6XX_GPU_LLC_SCID_MASK,
+   (llc->cntl1_regval &
+A6XX_GPU_LLC_SCID_MASK));


This is out of order with the comments below, but if we store the slice
id then you could calculate regval here and not have to store it.


+
+   /* Program the sub-cache ID for the GPU pagetables */
+   if (!llcc_slice_activate(llc->gpuhtw_llc_slice))


val |= FIELD_PREP(A6XX_GPUHTW_LLC_SCID, htw_llc_sliceid);


+   a6xx_gpu_cx_rmw(llc,
+   REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
+   A6XX_GPUHTW_LLC_SCID_MASK,
+   (llc->cntl1_regval &
+A6XX_GPUHTW_LLC_SCID_MASK));


And this could be FIELD_PREP(A6XX_GPUHTW_LLC_SCID, sliceid);

In theory you could just calculate the u32 and write it directly without
a rmw. In fact, that might be preferable - if the slice activate failed,
you don't want to run the risk that the scid for htw is still populated.


+
+   /* Program cacheability overrides */
+   a6xx_gpu_cx_rmw(llc, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF,
+   llc->cntl0_regval);


As below, this could easily be a constant.


+}
+
+static void a6xx_llc_slices_destroy(struct a6xx_llc *llc)
+{
+   if (llc->mmio)
+   iounmap(llc->mmio);


msm_ioremap() returns a devm_-managed resource, so do not use iounmap()
to free it. Best to just leave it and let the gpu device handle it when
it goes boom.



+
+   llcc_slice_putd(llc->gpu_llc_slice);
+   llcc_slice_putd(llc->gpuhtw_llc_slice);
+}
+
+static int a6xx_llc_slices_init(struct platform_d

Re: [PATCH] iommu/vt-d: Don't reject nvme host due to scope mismatch

2019-12-20 Thread Jerry Snitselaar

On Fri Dec 20 19, jimyan wrote:

On a system with an Intel PCIe port configured as a nvme host device, iommu
initialization fails with

   DMAR: Device scope type does not match for :80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):



Isn't that a problem to be fixed in the DMAR table then?


but the device has a type 0 PCI header:
80:00.0 Class 0600: Device 8086:2020 (rev 06)
00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Add the class
0x600 ("PCI_CLASS_BRIDGE_HOST") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan 
---
drivers/iommu/dmar.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index eecd6a421667..9faf2f0e0237 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -244,6 +244,7 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
(scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
 (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
+ info->dev->class >> 8 != PCI_CLASS_BRIDGE_HOST &&
  info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
pr_warn("Device scope type does not match for %s\n",
pci_name(info->dev));
--
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu





Re: [PATCH v4 03/16] PCI/ATS: Restore EXPORT_SYMBOL_GPL() for pci_{enable,disable}_ats()

2019-12-20 Thread Joerg Roedel
Hi Bjorn,

On Thu, Dec 19, 2019 at 12:03:39PM +, Will Deacon wrote:
> From: Greg Kroah-Hartman 
> 
> Commit d355bb209783 ("PCI/ATS: Remove unnecessary EXPORT_SYMBOL_GPL()")
> unexported a bunch of symbols from the PCI core since the only external
> users were non-modular IOMMU drivers. Although most of those symbols
> can remain private for now, 'pci_{enable,disable_ats()' is required for
> the ARM SMMUv3 driver to build as a module, otherwise we get a build
> failure as follows:
> 
>   | ERROR: "pci_enable_ats" [drivers/iommu/arm-smmu-v3.ko] undefined!
>   | ERROR: "pci_disable_ats" [drivers/iommu/arm-smmu-v3.ko] undefined!
> 
> Re-export these two functions so that the ARM SMMUv3 driver can be built
> as a module.
> 
> Cc: Bjorn Helgaas 
> Cc: Joerg Roedel 
> Signed-off-by: Greg Kroah-Hartman 
> [will: rewrote commit message]
> Signed-off-by: Will Deacon 
> ---
>  drivers/pci/ats.c | 2 ++
>  1 file changed, 2 insertions(+)

Are you fine with this change? I would apply this series to my tree
then.

Regards,

Joerg
