Re: [PATCH v3 3/6] iommu: add ARM short descriptor page table allocator.

2015-07-31 Thread Yong Wu
On Mon, 2015-07-27 at 12:21 +0800, Yong Wu wrote:

> On Fri, 2015-07-24 at 17:53 +0100, Will Deacon wrote:
> > On Fri, Jul 24, 2015 at 06:24:26AM +0100, Yong Wu wrote:
> > > On Tue, 2015-07-21 at 18:11 +0100, Will Deacon wrote:
> > > > On Thu, Jul 16, 2015 at 10:04:32AM +0100, Yong Wu wrote:
> > > > > +/* level 2 pagetable */
> > > > > +#define ARM_SHORT_PTE_TYPE_LARGE   BIT(0)
> > > > > +#define ARM_SHORT_PTE_SMALL_XN BIT(0)
> > > > > +#define ARM_SHORT_PTE_TYPE_SMALL   BIT(1)
> > > > > +#define ARM_SHORT_PTE_B    BIT(2)
> > > > > +#define ARM_SHORT_PTE_C    BIT(3)
> > > > > +#define ARM_SHORT_PTE_SMALL_TEX0   BIT(6)
> > > > > +#define ARM_SHORT_PTE_IMPLE    BIT(9)
> > > >
> > > > This is AP[2] for small pages.
> > > 
> > > Sorry, in our page table bit 9 in both the PGD and the PTE is PA[32],
> > > which supports DRAM sizes over 4GB. I didn't notice that this differs
> > > from the PTE layout in the standard spec.
> > > And since I don't use AP[2] currently, I will just delete this line
> > > next time.
> > 
> > Is this related to the "special bit"? What would be good is a comment
> > next to the #define for the quirk describing *exactly* what differs in
> > your implementation. Without that, it's very difficult to know what is
> > intentional and what is actually broken.
> 
> I will add the comment alongside the #define.
> 
> > 
> > > > > +static arm_short_iopte
> > > > > +__arm_short_pte_prot(struct arm_short_io_pgtable *data, int prot, bool large)
> > > > > +{
> > > > > +   arm_short_iopte pteprot;
> > > > > +
> > > > > +   pteprot = ARM_SHORT_PTE_S | ARM_SHORT_PTE_nG;
> > > > > +   pteprot |= large ? ARM_SHORT_PTE_TYPE_LARGE :
> > > > > +   ARM_SHORT_PTE_TYPE_SMALL;
> > > > > +   if (prot & IOMMU_CACHE)
> > > > > +   pteprot |=  ARM_SHORT_PTE_B | ARM_SHORT_PTE_C;
> > > > > +   if (prot & IOMMU_WRITE)
> > > > > +   pteprot |= large ? ARM_SHORT_PTE_LARGE_TEX0 :
> > > > > +   ARM_SHORT_PTE_SMALL_TEX0;
> > > >
> > > > This doesn't make any sense. TEX[2:0] is all about memory attributes, not
> > > > permissions, so you're making the mapping write-back, write-allocate but
> > > > that's not what the IOMMU_* values are about.
> > > 
> > >  I will delete it.
> > 
> > Well, can you not control mapping permissions with the AP bits? The idea
> > of the IOMMU flags are:
> > 
> >   IOMMU_CACHE : Install a normal, cacheable mapping (you've got this right)
> >   IOMMU_READ : Allow read access for the device
> >   IOMMU_WRITE : Allow write access for the device
> >   IOMMU_NOEXEC : Disallow execute access for the device
> > 
> > so the caller to iommu_map passes in a bitmap of these, which you need to
> > encode in the page-table entry.
> 
> From the spec, AP[2] differentiates read/write from read-only.
> How about this?:
> //===
>   #define ARM_SHORT_PGD_FULL_ACCESS  (3 << 10)
>   #define ARM_SHORT_PGD_RDONLY   BIT(15)
> 
>   pgdprot |= ARM_SHORT_PGD_FULL_ACCESS;/* or another name? */
>   if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
>  pgdprot |= ARM_SHORT_PGD_RDONLY;
> //===
> The pte is handled the same way.
> 
> Sorry, our HW doesn't fully follow the standard spec; it doesn't
> implement the AP bits.



Hi Will,
About the AP bits, I may have to add a new quirk for it...

  Currently I add the AP bits in the pte like this:
 #define ARM_SHORT_PTE_RD_WR    (3 << 4)
 #define ARM_SHORT_PTE_RDONLY   BIT(9)

 pteprot |= ARM_SHORT_PTE_RD_WR;

 if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
 pteprot |= ARM_SHORT_PTE_RDONLY;

The problem is that bit 9 in both the level-1 and level-2 page tables of
our HW is used for PA[32], to support DRAM sizes over 4GB.

So I had to add a quirk to leave bit 9 clear in the RDONLY case.
(If bit 9 isn't cleared, the HW treats it as PA[32] and raises a
translation fault.)

Something like: IO_PGTABLE_QUIRK_SHORT_MTK?


> > 
> > Will
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/9] iommu/io-pgtable-arm: Allow appropriate DMA API use

2015-07-31 Thread Will Deacon
Hi Robin,

On Wed, Jul 29, 2015 at 07:46:04PM +0100, Robin Murphy wrote:
> Currently, users of the LPAE page table code are (ab)using dma_map_page()
> as a means to flush page table updates for non-coherent IOMMUs. Since
> from the CPU's point of view, creating IOMMU page tables *is* passing
> DMA buffers to a device (the IOMMU's page table walker), there's little
> reason not to use the DMA API correctly.
> 
> Allow IOMMU drivers to opt into DMA API operations for page table
> allocation and updates by providing their appropriate device pointer.
> The expectation is that an LPAE IOMMU should have a full view of system
> memory, so use streaming mappings to avoid unnecessary pressure on
> ZONE_DMA, and treat any DMA translation as a warning sign.
> 
> Signed-off-by: Robin Murphy 
> ---
> 
> Changes since v1[1]:
> - Make device pointer mandatory and use DMA API unconditionally
> - Remove flush_pgtable callback entirely
> - Style, consistency and typo fixes

I think this is looking good now, thanks. I'll add it to my ARM SMMU queue
for 4.3.

Will


Re: [PATCH] iommu/arm-smmu-v2: ThunderX(errata-23399) mis-extends 64bit registers

2015-07-31 Thread Will Deacon
On Thu, Jul 30, 2015 at 09:54:04PM +0100, Chalamarla, Tirumalesh wrote:
> does something like this look good?

That's the right sort of idea, but please send a proper patch that you've
actually tested. The diff below mixes up reg64 and reg.

Will

> +#ifdef CONFIG_64BIT
> +#define smmu_writeq(reg64, addr)   writeq_relaxed((reg64), (addr))
> +#else
> +#define smmu_writeq(reg64, addr)   \
> +   writel_relaxed(((reg64) >> 32), ((addr) + 4));  \
> +   writel_relaxed((reg64), (addr))
> +
> +
>  /* Configuration registers */
>  #define ARM_SMMU_GR0_sCR0  0x0
>  #define sCR0_CLIENTPD  (1 << 0)
> @@ -226,7 +234,7 @@
>  #define TTBCR2_SEP_SHIFT   15
>  #define TTBCR2_SEP_UPSTREAM(0x7 << TTBCR2_SEP_SHIFT)
> 
> -#define TTBRn_HI_ASID_SHIFT16
> +#define TTBRn_ASID_SHIFT   48
> 
>  #define FSR_MULTI  (1 << 31)
>  #define FSR_SS (1 << 30)
> @@ -719,6 +727,7 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>struct io_pgtable_cfg *pgtbl_cfg)
>  {
> u32 reg;
> +   u64 reg64;
> bool stage1;
> struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> struct arm_smmu_device *smmu = smmu_domain->smmu;
> @@ -762,22 +771,16 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
> 
> /* TTBRs */
> if (stage1) {
> -   reg = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
> -   reg = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0] >> 32;
> -   reg |= ARM_SMMU_CB_ASID(cfg) << TTBRn_HI_ASID_SHIFT;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_HI);
> -
> -   reg = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR1_LO);
> -   reg = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1] >> 32;
> -   reg |= ARM_SMMU_CB_ASID(cfg) << TTBRn_HI_ASID_SHIFT;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR1_HI);
> +   reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> +   reg64 |= ((u64)ARM_SMMU_CB_ASID(cfg)) << TTBRn_ASID_SHIFT;
> +   smmu_writeq(reg64, cb_base + ARM_SMMU_CB_TTBR0_LO);
> +
> +   reg64 = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
> +   reg64 |= ARM_SMMU_CB_ASID(cfg) << TTBRn_ASID_SHIFT;
> +   smmu_writeq(reg, cb_base + ARM_SMMU_CB_TTBR1_LO);
> } else {
> -   reg = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
> -   reg = pgtbl_cfg->arm_lpae_s2_cfg.vttbr >> 32;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0_HI);
> +   reg64 = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> +   smmu_writeq(reg, cb_base + ARM_SMMU_CB_TTBR0_LO);
> }
> 
> /* TTBCR */
> @@ -1236,10 +1239,8 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
> iommu_domain *domain,
> u32 reg = iova & ~0xfff;
> writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
> } else {
> -   u32 reg = iova & ~0xfff;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
> -   reg = ((u64)iova & ~0xfff) >> 32;
> -   writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI);
> +   u64 reg = iova & ~0xfff;
> +   smmu_writeq(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO);
> }
> 
> if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp,


Re: [PATCH v3 3/6] iommu: add ARM short descriptor page table allocator.

2015-07-31 Thread Will Deacon
On Fri, Jul 31, 2015 at 08:55:37AM +0100, Yong Wu wrote:
> About the AP bits, I may have to add a new quirk for it...
> 
>   Currently I add the AP bits in the pte like this:
> #define ARM_SHORT_PTE_RD_WR    (3 << 4)
> #define ARM_SHORT_PTE_RDONLY   BIT(9)
> 
> pteprot |= ARM_SHORT_PTE_RD_WR;
> 
> if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ))
>  pteprot |= ARM_SHORT_PTE_RDONLY;
> 
> The problem is that bit 9 in both the level-1 and level-2 page tables of
> our HW is used for PA[32], to support DRAM sizes over 4GB.

Aha, now *that's* a case of page-table abuse!

> So I had to add a quirk to leave bit 9 clear in the RDONLY case.
> (If bit 9 isn't cleared, the HW treats it as PA[32] and raises a
> translation fault.)
> 
> Something like: IO_PGTABLE_QUIRK_SHORT_MTK?

Given that you don't have XN either, maybe IO_PGTABLE_QUIRK_NO_PERMS?
When set, IOMMU_READ/WRITE/EXEC are ignored and the mapping will never
generate a permission fault.

Will


Re: [PATCH] iommu/arm-smmu-v2: ThunderX(errata-23399) mis-extends 64bit registers

2015-07-31 Thread Russell King - ARM Linux
On Thu, Jul 30, 2015 at 08:54:04PM +, Chalamarla, Tirumalesh wrote:
> does something like this look good?
> 
> +#ifdef CONFIG_64BIT
> +#define smmu_writeq(reg64, addr)   writeq_relaxed((reg64), (addr))
> +#else
> +#define smmu_writeq(reg64, addr)   \
> +   writel_relaxed(((reg64) >> 32), ((addr) + 4));  \
> +   writel_relaxed((reg64), (addr))

It's missing a #endif.

This also suffers from multiple argument evaluation, and it hides that
there are two expressions here - which makes future maintenance harder.

#define smmu_writeq(reg64, addr)\
do {\
u64 __val = (reg64);\
volatile void __iomem *__addr = (addr); \
writel_relaxed(__val >> 32, __addr + 4);\
writel_relaxed(__val, __addr);  \
} while (0)

is longer but is much preferred as it won't suffer side effects from
stuff like:

if (...)
smmu_writeq(val++, addr);

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


[git pull] IOMMU Fixes for Linux v4.2-rc4

2015-07-31 Thread Joerg Roedel
Hi Linus,

The following changes since commit cbfe8fa6cd672011c755c3cd85c9ffd4e2d10a6f:

  Linux 4.2-rc4 (2015-07-26 12:26:21 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v4.2-rc4

for you to fetch changes up to 1c1cc454aa694a89572689515fdaaf27b8c9f42a:

  iommu/amd: Allow non-ATS devices in IOMMUv2 domains (2015-07-31 15:15:41 
+0200)


IOMMU Fixes for Linux v4.2-rc4

These fixes are all for the AMD IOMMU driver:

* A regression with HSA caused by the conversion of the driver to
  default domains. The fixes make sure that an HSA device can
  still be attached to an IOMMUv2 domain and that these domains
  also allow non-IOMMUv2 capable devices.

* Fix iommu=pt mode which did not work because the dma_ops were
  set to nommu_ops, which breaks devices that can only do 32bit
  DMA.

* Fix an issue with non-PCI devices not working, because there
  are no dma_ops for them. This issue was discovered recently as
  new AMD x86 platforms have non-PCI devices too.


Joerg Roedel (6):
  iommu/amd: Use iommu_attach_group()
  iommu/amd: Use iommu core for passthrough mode
  iommu/amd: Allow non-IOMMUv2 devices in IOMMUv2 domains
  iommu/amd: Use swiotlb in passthrough mode
  iommu/amd: Set global dma_ops if swiotlb is disabled
  iommu/amd: Allow non-ATS devices in IOMMUv2 domains

 drivers/iommu/amd_iommu.c  | 98 +-
 drivers/iommu/amd_iommu_init.c | 10 +
 drivers/iommu/amd_iommu_v2.c   | 24 +--
 3 files changed, 51 insertions(+), 81 deletions(-)

Please pull.

Thanks,

Joerg




[PATCH v5 0/3] arm64: IOMMU-backed DMA mapping

2015-07-31 Thread Robin Murphy
Hi all,

Here's an update following Catalin's feedback on v4[1].

Changes this round:
- Rebased onto linux-next
  - IOVA alignment fix applied already
  - iommu_iova_cache_init() is now iova_cache_get()
- Tidied up iommu_dma_alloc()
  - Simplified pgprot handling
  - Removed redundant memset
  - Skip coherent page-flushing in a simpler way
- Spotted a bug in iommu_dma_init_domain() where the checks for
  reinitialising an existing domain were backwards.

If it is going to be down to me to tackle all the driver fixes and
conversion of arch/arm dma_ops, I'd still much rather have this
code merged first as a stable base to work with (and un-block arm64
in the meantime). Have we decided yet whether this should go via the
IOMMU tree or the arm64 tree?

Thanks,
Robin.

[1]:http://thread.gmane.org/gmane.linux.kernel.iommu/10181

Robin Murphy (3):
  iommu: Implement common IOMMU ops for DMA mapping
  arm64: Add IOMMU dma_ops
  arm64: Hook up IOMMU dma_ops

 arch/arm64/Kconfig   |   1 +
 arch/arm64/include/asm/dma-mapping.h |  15 +-
 arch/arm64/mm/dma-mapping.c  | 449 +
 drivers/iommu/Kconfig|   7 +
 drivers/iommu/Makefile   |   1 +
 drivers/iommu/dma-iommu.c| 534 +++
 include/linux/dma-iommu.h|  84 ++
 include/linux/iommu.h|   1 +
 8 files changed, 1084 insertions(+), 8 deletions(-)
 create mode 100644 drivers/iommu/dma-iommu.c
 create mode 100644 include/linux/dma-iommu.h

-- 
1.9.1



[PATCH v5 1/3] iommu: Implement common IOMMU ops for DMA mapping

2015-07-31 Thread Robin Murphy
Taking inspiration from the existing arch/arm code, break out some
generic functions to interface the DMA-API to the IOMMU-API. This will
do the bulk of the heavy lifting for IOMMU-backed dma-mapping.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/Kconfig |   7 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/dma-iommu.c | 534 ++
 include/linux/dma-iommu.h |  84 
 include/linux/iommu.h |   1 +
 5 files changed, 627 insertions(+)
 create mode 100644 drivers/iommu/dma-iommu.c
 create mode 100644 include/linux/dma-iommu.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 8a1bc38..4996dc3 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -48,6 +48,13 @@ config OF_IOMMU
def_bool y
depends on OF && IOMMU_API
 
+# IOMMU-agnostic DMA-mapping layer
+config IOMMU_DMA
+   bool
+   depends on NEED_SG_DMA_LENGTH
+   select IOMMU_API
+   select IOMMU_IOVA
+
 config FSL_PAMU
bool "Freescale IOMMU support"
depends on PPC32
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index dc6f511..45efa2a 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,6 +1,7 @@
 obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
+obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
 obj-$(CONFIG_IOMMU_IOVA) += iova.o
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
new file mode 100644
index 000..f34fd46
--- /dev/null
+++ b/drivers/iommu/dma-iommu.c
@@ -0,0 +1,534 @@
+/*
+ * A fairly generic DMA-API to IOMMU-API glue layer.
+ *
+ * Copyright (C) 2014-2015 ARM Ltd.
+ *
+ * based in part on arch/arm/mm/dma-mapping.c:
+ * Copyright (C) 2000-2004 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+int iommu_dma_init(void)
+{
+   return iova_cache_get();
+}
+
+/**
+ * iommu_get_dma_cookie - Acquire DMA-API resources for a domain
+ * @domain: IOMMU domain to prepare for DMA-API usage
+ *
+ * IOMMU drivers should normally call this from their domain_alloc
+ * callback when domain->type == IOMMU_DOMAIN_DMA.
+ */
+int iommu_get_dma_cookie(struct iommu_domain *domain)
+{
+   struct iova_domain *iovad;
+
+   if (domain->dma_api_cookie)
+   return -EEXIST;
+
+   iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
+   domain->dma_api_cookie = iovad;
+
+   return iovad ? 0 : -ENOMEM;
+}
+EXPORT_SYMBOL(iommu_get_dma_cookie);
+
+/**
+ * iommu_put_dma_cookie - Release a domain's DMA mapping resources
+ * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
+ *
+ * IOMMU drivers should normally call this from their domain_free callback.
+ */
+void iommu_put_dma_cookie(struct iommu_domain *domain)
+{
+   struct iova_domain *iovad = domain->dma_api_cookie;
+
+   if (!iovad)
+   return;
+
+   put_iova_domain(iovad);
+   kfree(iovad);
+   domain->dma_api_cookie = NULL;
+}
+EXPORT_SYMBOL(iommu_put_dma_cookie);
+
+/**
+ * iommu_dma_init_domain - Initialise a DMA mapping domain
+ * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
+ * @base: IOVA at which the mappable address space starts
+ * @size: Size of IOVA space
+ *
+ * @base and @size should be exact multiples of IOMMU page granularity to
+ * avoid rounding surprises. If necessary, we reserve the page at address 0
+ * to ensure it is an invalid IOVA.
+ */
+int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size)
+{
+   struct iova_domain *iovad = domain->dma_api_cookie;
+   unsigned long order, base_pfn, end_pfn;
+
+   if (!iovad)
+   return -ENODEV;
+
+   /* Use the smallest supported page size for IOVA granularity */
+   order = __ffs(domain->ops->pgsize_bitmap);
+   base_pfn = max_t(unsigned long, 1, base >> order);
+   end_pfn = (base + size - 1) >> order;
+
+   /* Check the domain allows at least some access to the device... */
+   if (domain->geometry.force_aperture) {
+   if (base > domain->geometry.aperture_end ||
+   base + size <= domain->geometry.aperture_start) {
+   pr_warn("specified DMA range outside IOMMU capabil

[PATCH v5 2/3] arm64: Add IOMMU dma_ops

2015-07-31 Thread Robin Murphy
Taking some inspiration from the arch/arm code, implement the
arch-specific side of the DMA mapping ops using the new IOMMU-DMA layer.

Unfortunately the device setup code has to start out as a big ugly mess
in order to work usefully right now, as 'proper' operation depends on
changes to device probe and DMA configuration ordering, IOMMU groups for
platform devices, and default domain support in arm/arm64 IOMMU drivers.
The workarounds here need only exist until that work is finished.

Signed-off-by: Robin Murphy 
---
 arch/arm64/mm/dma-mapping.c | 425 
 1 file changed, 425 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e5d74cd..c735f45 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -534,3 +534,428 @@ static int __init dma_debug_do_init(void)
return 0;
 }
 fs_initcall(dma_debug_do_init);
+
+
+#ifdef CONFIG_IOMMU_DMA
+#include 
+#include 
+#include 
+
+/* Thankfully, all cache ops are by VA so we can ignore phys here */
+static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
+{
+   __dma_flush_range(virt, virt + PAGE_SIZE);
+}
+
+static void *__iommu_alloc_attrs(struct device *dev, size_t size,
+dma_addr_t *handle, gfp_t gfp,
+struct dma_attrs *attrs)
+{
+   bool coherent = is_device_dma_coherent(dev);
+   int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent);
+   void *addr;
+
+   if (WARN(!dev, "cannot create IOMMU mapping for unknown device\n"))
+   return NULL;
+   /*
+* Some drivers rely on this, and we probably don't want the
+* possibility of stale kernel data being read by devices anyway.
+*/
+   gfp |= __GFP_ZERO;
+
+   if (gfp & __GFP_WAIT) {
+   struct page **pages;
+   pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
+
+   pages = iommu_dma_alloc(dev, size, gfp, ioprot, handle,
+   flush_page);
+   if (!pages)
+   return NULL;
+
+   addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
+ __builtin_return_address(0));
+   if (!addr)
+   iommu_dma_free(dev, pages, size, handle);
+   } else {
+   struct page *page;
+   /*
+* In atomic context we can't remap anything, so we'll only
+* get the virtually contiguous buffer we need by way of a
+* physically contiguous allocation.
+*/
+   if (coherent) {
+   page = alloc_pages(gfp, get_order(size));
+   addr = page ? page_address(page) : NULL;
+   } else {
+   addr = __alloc_from_pool(size, &page, gfp);
+   }
+   if (!addr)
+   return NULL;
+
+   *handle = iommu_dma_map_page(dev, page, 0, size, ioprot);
+   if (iommu_dma_mapping_error(dev, *handle)) {
+   if (coherent)
+   __free_pages(page, get_order(size));
+   else
+   __free_from_pool(addr, size);
+   addr = NULL;
+   }
+   }
+   return addr;
+}
+
+static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
+  dma_addr_t handle, struct dma_attrs *attrs)
+{
+   /*
+* @cpu_addr will be one of 3 things depending on how it was allocated:
+* - A remapped array of pages from iommu_dma_alloc(), for all
+*   non-atomic allocations.
+* - A non-cacheable alias from the atomic pool, for atomic
+*   allocations by non-coherent devices.
+* - A normal lowmem address, for atomic allocations by
+*   coherent devices.
+* Hence how dodgy the below logic looks...
+*/
+   if (__in_atomic_pool(cpu_addr, size)) {
+   iommu_dma_unmap_page(dev, handle, size, 0, NULL);
+   __free_from_pool(cpu_addr, size);
+   } else if (is_vmalloc_addr(cpu_addr)){
+   struct vm_struct *area = find_vm_area(cpu_addr);
+
+   if (WARN_ON(!area || !area->pages))
+   return;
+   iommu_dma_free(dev, area->pages, size, &handle);
+   dma_common_free_remap(cpu_addr, size, VM_USERMAP);
+   } else {
+   iommu_dma_unmap_page(dev, handle, size, 0, NULL);
+   __free_pages(virt_to_page(cpu_addr), get_order(size));
+   }
+}
+
+static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
+ void *cpu_addr, dma_addr_t dma_addr, size_t size,
+ struct dma_attrs *attrs)

[PATCH v5 3/3] arm64: Hook up IOMMU dma_ops

2015-07-31 Thread Robin Murphy
With iommu_dma_ops in place, hook them up to the configuration code, so
IOMMU-fronted devices will get them automatically.

Acked-by: Catalin Marinas 
Signed-off-by: Robin Murphy 
---
 arch/arm64/Kconfig   |  1 +
 arch/arm64/include/asm/dma-mapping.h | 15 +++
 arch/arm64/mm/dma-mapping.c  | 24 
 3 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c8933dc..81584ef 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -73,6 +73,7 @@ config ARM64
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
+   select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select MODULES_USE_ELF_RELA
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index f0d6d0b..7f9edcb 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -56,16 +56,15 @@ static inline struct dma_map_ops *get_dma_ops(struct device 
*dev)
return __generic_dma_ops(dev);
 }
 
-static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 
size,
- struct iommu_ops *iommu, bool coherent)
-{
-   if (!acpi_disabled && !dev->archdata.dma_ops)
-   dev->archdata.dma_ops = dma_ops;
-
-   dev->archdata.dma_coherent = coherent;
-}
+void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+   struct iommu_ops *iommu, bool coherent);
 #define arch_setup_dma_ops arch_setup_dma_ops
 
+#ifdef CONFIG_IOMMU_DMA
+void arch_teardown_dma_ops(struct device *dev);
+#define arch_teardown_dma_ops  arch_teardown_dma_ops
+#endif
+
 /* do not use this function in a driver */
 static inline bool is_device_dma_coherent(struct device *dev)
 {
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index c735f45..175dcb2 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -952,6 +952,20 @@ out_no_domain:
pr_warn("Failed to set up IOMMU domain for device %s\n", dev_name(dev));
 }
 
+void arch_teardown_dma_ops(struct device *dev)
+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+
+   if (domain) {
+   iommu_detach_device(domain, dev);
+   if (domain->type & __IOMMU_DOMAIN_ARM64_IOVA)
+   iommu_put_dma_cookie(domain);
+   if (domain->type & __IOMMU_DOMAIN_ARM64)
+   iommu_domain_free(domain);
+   dev->archdata.dma_ops = NULL;
+   }
+}
+
 #else
 
 static void __iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
@@ -959,3 +973,13 @@ static void __iommu_setup_dma_ops(struct device *dev, u64 
dma_base, u64 size,
 { }
 
 #endif  /* CONFIG_IOMMU_DMA */
+
+void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+   struct iommu_ops *iommu, bool coherent)
+{
+   if (!acpi_disabled && !dev->archdata.dma_ops)
+   dev->archdata.dma_ops = dma_ops;
+
+   dev->archdata.dma_coherent = coherent;
+   __iommu_setup_dma_ops(dev, dma_base, size, iommu);
+}
-- 
1.9.1
