Re: [PATCH v1 0/2] Misc fixes for ARM page table allocator (long/short descriptor format)

2017-03-16 Thread Oleksandr Tyshchenko
Hi, all.

Any comments?

On Mon, Feb 27, 2017 at 2:30 PM, Oleksandr Tyshchenko
 wrote:
> Hi.
>
> There is a small patch series which contains the same fix for both
> descriptor formats based on the preceding RFC patches:
> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020411.html
> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020477.html
>
> Oleksandr Tyshchenko (2):
>   iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
>   iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
>
>  drivers/iommu/io-pgtable-arm-v7s.c | 6 +-
>  drivers/iommu/io-pgtable-arm.c | 6 +-
>  2 files changed, 10 insertions(+), 2 deletions(-)
>
> --
> 2.7.4
>
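
For context, the shape of the fix in both patches is roughly as follows (a
sketch, not the exact hunks; io-pgtable-arm-v7s gets the equivalent check
with its own iopte helpers):

    /* In __arm_lpae_map(): don't dereference a block/page (leaf) entry
     * as if it were a next-level table pointer. */
    pte = *ptep;
    if (!pte) {
            /* ... allocate and install the next-level table ... */
    } else if (!iopte_leaf(pte, lvl)) {
            cptep = iopte_deref(pte, data);
    } else {
            /* We require an unmap first */
            WARN_ON(!selftest_running);
            return -EEXIST;
    }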



-- 
Regards,

Oleksandr Tyshchenko


Re: [PATCH v1 0/2] Misc fixes for ARM page table allocator (long/short descriptor format)

2017-03-16 Thread Robin Murphy
On 16/03/17 09:19, Oleksandr Tyshchenko wrote:
> Hi, all.
> 
> Any comments?

Er, no, but in a good way ;)

Patches look fine to me, and I see Will's already queued them anyway.

Thanks for the respin, and apologies for losing it in my inbox!

Robin.

> 
> On Mon, Feb 27, 2017 at 2:30 PM, Oleksandr Tyshchenko
>  wrote:
>> Hi.
>>
>> There is a small patch series which contains the same fix for both
>> descriptor formats based on the preceding RFC patches:
>> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020411.html
>> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020477.html
>>
>> Oleksandr Tyshchenko (2):
>>   iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
>>   iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
>>
>>  drivers/iommu/io-pgtable-arm-v7s.c | 6 +-
>>  drivers/iommu/io-pgtable-arm.c | 6 +-
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> --
>> 2.7.4
>>
> 
> 
> 



Re: [PATCH v1 0/2] Misc fixes for ARM page table allocator (long/short descriptor format)

2017-03-16 Thread Oleksandr Tyshchenko
On Thu, Mar 16, 2017 at 1:19 PM, Robin Murphy  wrote:
> On 16/03/17 09:19, Oleksandr Tyshchenko wrote:
>> Hi, all.
>>
>> Any comments?
>
> Er, no, but in a good way ;)
>
> Patches look fine to me, and I see Will's already queued them anyway.
>
> Thanks for the respin, and apologies for losing it in my inbox!
>
> Robin.

Great, thank you.

>
>>
>> On Mon, Feb 27, 2017 at 2:30 PM, Oleksandr Tyshchenko
>>  wrote:
>>> Hi.
>>>
>>> There is a small patch series which contains the same fix for both
>>> descriptor formats based on the preceding RFC patches:
>>> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020411.html
>>> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020477.html
>>>
>>> Oleksandr Tyshchenko (2):
>>>   iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
>>>   iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
>>>
>>>  drivers/iommu/io-pgtable-arm-v7s.c | 6 +-
>>>  drivers/iommu/io-pgtable-arm.c | 6 +-
>>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>>
>>> --
>>> 2.7.4
>>>
>>
>>
>>
>



-- 
Regards,

Oleksandr Tyshchenko


Re: [PATCH 0/3] IOVA allocation improvements for iommu-dma

2017-03-16 Thread Sunil Kovvuri
On Wed, Mar 15, 2017 at 7:03 PM, Robin Murphy  wrote:
> Hi all,
>
> Here's the first bit of lock contention removal to chew on - feedback
> welcome! Note that for the current users of the io-pgtable framework,
> this is most likely to simply push more contention onto the io-pgtable
> lock, so may not show a great improvement alone. Will and I both have
> rough proof-of-concept implementations of lock-free io-pgtable code
> which we need to sit down and agree on at some point, hopefully fairly
> soon.

Thanks for working on this.
As you said, it does indeed push the lock contention down from the IOVA
rbtree lock to the pgtable lock, but now, more than the lock, the issue I
see is with yielding the CPU while waiting for tlb_sync. Below are some
numbers.

I have tweaked '__arm_smmu_tlb_sync' in the SMMUv2 driver, i.e. basically
removed cpu_relax() and udelay() to make it a busy loop.
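
Roughly, the change was along these lines (just a sketch of the experiment,
not a proposed patch; register/macro names as in arm-smmu.c, from memory):

    static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
    {
            void __iomem *gr0_base = ARM_SMMU_GR0(smmu);

            writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC);
            /* Pure busy-wait: no cpu_relax()/udelay(1) back-off, no timeout */
            while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) &
                   sTLBGSTATUS_GSACTIVE)
                    ;
    }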

Before: 1.1 Gbps
With your patches: 1.45 Gbps
With your patches + busy loop in tlb_sync: 7 Gbps

If we reduce pgtable contention a bit:
With your patches + busy loop in tlb_sync + iperf threads reduced from 16
to 8: ~9 Gbps

So it looks like, along with the pgtable lock, some optimization can be done
to the tlb_sync code as well.

Thanks,
Sunil.


[PATCH v2 2/5] iommu/dmar: Return directly from a loop in dmar_dev_scope_status()

2017-03-16 Thread Andy Shevchenko
There is no need to have a temporary variable.

Signed-off-by: Andy Shevchenko 
---
 drivers/iommu/dmar.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index edcf7410f736..71d774f1d406 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -557,11 +557,10 @@ static int __init dmar_table_detect(void)
 static int dmar_walk_remapping_entries(struct acpi_dmar_header *start,
   size_t len, struct dmar_res_callback *cb)
 {
-   int ret = 0;
struct acpi_dmar_header *iter, *next;
struct acpi_dmar_header *end = ((void *)start) + len;
 
-   for (iter = start; iter < end && ret == 0; iter = next) {
+   for (iter = start; iter < end; iter = next) {
next = (void *)iter + iter->length;
if (iter->length == 0) {
/* Avoid looping forever on bad ACPI tables */
@@ -570,8 +569,7 @@ static int dmar_walk_remapping_entries(struct 
acpi_dmar_header *start,
} else if (next > end) {
/* Avoid passing table end */
pr_warn(FW_BUG "Record passes table end\n");
-   ret = -EINVAL;
-   break;
+   return -EINVAL;
}
 
if (cb->print_entry)
@@ -582,15 +580,19 @@ static int dmar_walk_remapping_entries(struct 
acpi_dmar_header *start,
pr_debug("Unknown DMAR structure type %d\n",
 iter->type);
} else if (cb->cb[iter->type]) {
+   int ret;
+
ret = cb->cb[iter->type](iter, cb->arg[iter->type]);
+   if (ret)
+   return ret;
} else if (!cb->ignore_unhandled) {
pr_warn("No handler for DMAR structure type %d\n",
iter->type);
-   ret = -EINVAL;
+   return -EINVAL;
}
}
 
-   return ret;
+   return 0;
 }
 
 static inline int dmar_walk_dmar_table(struct acpi_table_dmar *dmar,
-- 
2.11.0



[PATCH v2 5/5] iommu/vt-d: Use lo_hi_readq() / lo_hi_writeq()

2017-03-16 Thread Andy Shevchenko
There are already helper functions to do 64-bit I/O on 32-bit machines or
buses, so we don't need to reinvent the wheel.

Signed-off-by: Andy Shevchenko 
---
 include/linux/intel-iommu.h | 18 ++
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index c573a52ae440..485a5b48f038 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -30,6 +30,8 @@
 #include 
 #include 
 #include 
+#include 
+
 #include 
 #include 
 
@@ -72,24 +74,8 @@
 
 #define OFFSET_STRIDE  (9)
 
-#ifdef CONFIG_64BIT
 #define dmar_readq(a) readq(a)
 #define dmar_writeq(a,v) writeq(v,a)
-#else
-static inline u64 dmar_readq(void __iomem *addr)
-{
-   u32 lo, hi;
-   lo = readl(addr);
-   hi = readl(addr + 4);
-   return (((u64) hi) << 32) + lo;
-}
-
-static inline void dmar_writeq(void __iomem *addr, u64 val)
-{
-   writel((u32)val, addr);
-   writel((u32)(val >> 32), addr + 4);
-}
-#endif
 
 #define DMAR_VER_MAJOR(v)  (((v) & 0xf0) >> 4)
 #define DMAR_VER_MINOR(v)  ((v) & 0x0f)
-- 
2.11.0



[PATCH v2 4/5] iommu/dmar: Remove redundant ' != 0' when check return code

2017-03-16 Thread Andy Shevchenko
The usual pattern when we check a return code, which might be a negative
errno, is either (ret) or (!ret).

Remove extra ' != 0' from condition.

Signed-off-by: Andy Shevchenko 
---
 drivers/iommu/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 9a44e20d7d88..cbf7763d8091 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -311,7 +311,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info 
*info)
((void *)drhd) + drhd->header.length,
dmaru->segment,
dmaru->devices, dmaru->devices_cnt);
-   if (ret != 0)
+   if (ret)
break;
}
if (ret >= 0)
-- 
2.11.0



[PATCH v2 3/5] iommu/dmar: Remove redundant assignment of ret

2017-03-16 Thread Andy Shevchenko
In some cases there is no need to assign 0 to ret. Moreover, it might
shadow errors in the future.

Remove such assignments.

Signed-off-by: Andy Shevchenko 
---
 drivers/iommu/dmar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 71d774f1d406..9a44e20d7d88 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -391,7 +391,7 @@ static int dmar_parse_one_drhd(struct acpi_dmar_header 
*header, void *arg)
 {
struct acpi_dmar_hardware_unit *drhd;
struct dmar_drhd_unit *dmaru;
-   int ret = 0;
+   int ret;
 
drhd = (struct acpi_dmar_hardware_unit *)header;
dmaru = dmar_find_dmaru(drhd);
@@ -609,8 +609,8 @@ static int __init
 parse_dmar_table(void)
 {
struct acpi_table_dmar *dmar;
-   int ret = 0;
int drhd_count = 0;
+   int ret;
struct dmar_res_callback cb = {
.print_entry = true,
.ignore_unhandled = true,
-- 
2.11.0



[PATCH v2 1/5] iommu/dmar: Rectify return code handling in detect_intel_iommu()

2017-03-16 Thread Andy Shevchenko
There is inconsistency in return codes across the functions called from
detect_intel_iommu().

Make it consistent and propagate the return code to the caller.

Signed-off-by: Andy Shevchenko 
---
 drivers/iommu/dmar.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 36e3f430d265..edcf7410f736 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -551,7 +551,7 @@ static int __init dmar_table_detect(void)
status = AE_NOT_FOUND;
}
 
-   return (ACPI_SUCCESS(status) ? 1 : 0);
+   return ACPI_SUCCESS(status) ? 0 : -ENOENT;
 }
 
 static int dmar_walk_remapping_entries(struct acpi_dmar_header *start,
@@ -891,17 +891,17 @@ int __init detect_intel_iommu(void)
 
down_write(&dmar_global_lock);
ret = dmar_table_detect();
-   if (ret)
-   ret = !dmar_walk_dmar_table((struct acpi_table_dmar *)dmar_tbl,
-   &validate_drhd_cb);
-   if (ret && !no_iommu && !iommu_detected && !dmar_disabled) {
+   if (!ret)
+   ret = dmar_walk_dmar_table((struct acpi_table_dmar *)dmar_tbl,
+  &validate_drhd_cb);
+   if (!ret && !no_iommu && !iommu_detected && !dmar_disabled) {
iommu_detected = 1;
/* Make sure ACS will be enabled */
pci_request_acs();
}
 
 #ifdef CONFIG_X86
-   if (ret)
+   if (!ret)
x86_init.iommu.iommu_init = intel_iommu_init;
 #endif
 
@@ -911,10 +911,9 @@ int __init detect_intel_iommu(void)
}
up_write(&dmar_global_lock);
 
-   return ret ? 1 : -ENODEV;
+   return ret ? ret : 1;
 }
 
-
 static void unmap_iommu(struct intel_iommu *iommu)
 {
iounmap(iommu->reg);
-- 
2.11.0



Re: [PATCH v2 4/5] iommu/arm-smmu-v3: Install bypass STEs for IOMMU_DOMAIN_IDENTITY domains

2017-03-16 Thread Nate Watterson

Hi Will,

On 2017-03-10 15:49, Will Deacon wrote:

In preparation for allowing the default domain type to be overridden,
this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
ARM SMMUv3 driver.

An identity domain is created by placing the corresponding stream table
entries into "bypass" mode, which allows transactions to flow through
the SMMU without any translation.



What about masters that require SMMU intervention to override their
native memory attributes to make them consistent with the CCA (acpi)
or dma-coherent (dt) values specified in FW? To make sure those cases
are handled, you could store away the master's coherency setting in
its strtab_ent at attach time and then setup STE[MemAttr/ALLOCCFG/SHCFG]
so the attributes are forced to the correct values while still
bypassing translation.
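
A minimal sketch of that idea (the dev_coherent field and the
MEMATTR/ALLOCCFG macros below are hypothetical placeholders; only the SHCFG
encoding exists in the driver as posted):

    struct arm_smmu_strtab_ent {
            bool                    assigned;
            bool                    dev_coherent;   /* hypothetical: CCA/dma-coherent, captured at attach */
            struct arm_smmu_s1_cfg  *s1_cfg;
            struct arm_smmu_s2_cfg  *s2_cfg;
    };

    /* STE word 1 for the bypass case */
    static u64 arm_smmu_bypass_ste_1(struct arm_smmu_strtab_ent *ste)
    {
            if (!ste->dev_coherent)
                    /* Pass the incoming attributes through untouched */
                    return STRTAB_STE_1_SHCFG_INCOMING << STRTAB_STE_1_SHCFG_SHIFT;

            /* Hypothetical override: force Normal-WB Inner Shareable */
            return STRTAB_STE_1_MEMATTR_OIWB << STRTAB_STE_1_MEMATTR_SHIFT |
                   STRTAB_STE_1_ALLOCCFG_RW  << STRTAB_STE_1_ALLOCCFG_SHIFT |
                   STRTAB_STE_1_SHCFG_ISH    << STRTAB_STE_1_SHCFG_SHIFT;
    }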


Signed-off-by: Will Deacon 
---
 drivers/iommu/arm-smmu-v3.c | 58 
+

 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e18dbcd26f66..75fa4809f49e 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -554,9 +554,14 @@ struct arm_smmu_s2_cfg {
 };

 struct arm_smmu_strtab_ent {
-   boolvalid;
-
-   boolbypass; /* Overrides s1/s2 config */
+   /*
+* An STE is "assigned" if the master emitting the corresponding SID
+* is attached to a domain. The behaviour of an unassigned STE is
+* determined by the disable_bypass parameter, whereas an assigned
+* STE behaves according to s1_cfg/s2_cfg, which themselves are
+* configured according to the domain type.
+*/
+   boolassigned;
struct arm_smmu_s1_cfg  *s1_cfg;
struct arm_smmu_s2_cfg  *s2_cfg;
 };
@@ -632,6 +637,7 @@ enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
ARM_SMMU_DOMAIN_NESTED,
+   ARM_SMMU_DOMAIN_BYPASS,
 };

 struct arm_smmu_domain {
@@ -1005,9 +1011,9 @@ static void arm_smmu_write_strtab_ent(struct
arm_smmu_device *smmu, u32 sid,
 * This is hideously complicated, but we only really care about
 * three cases at the moment:
 *
-* 1. Invalid (all zero) -> bypass  (init)
-* 2. Bypass -> translation (attach)
-* 3. Translation -> bypass (detach)
+* 1. Invalid (all zero) -> bypass/fault (init)
+* 2. Bypass/fault -> translation/bypass (attach)
+* 3. Translation/bypass -> bypass/fault (detach)
 *
 * Given that we can't update the STE atomically and the SMMU
 * doesn't read the thing in a defined order, that leaves us
@@ -1046,11 +1052,15 @@ static void arm_smmu_write_strtab_ent(struct
arm_smmu_device *smmu, u32 sid,
}

/* Nuke the existing STE_0 value, as we're going to rewrite it */
-   val = ste->valid ? STRTAB_STE_0_V : 0;
+   val = STRTAB_STE_0_V;
+
+   /* Bypass/fault */
+   if (!ste->assigned || !(ste->s1_cfg || ste->s2_cfg)) {
+   if (!ste->assigned && disable_bypass)
+   val |= STRTAB_STE_0_CFG_ABORT;
+   else
+   val |= STRTAB_STE_0_CFG_BYPASS;

-   if (ste->bypass) {
-   val |= disable_bypass ? STRTAB_STE_0_CFG_ABORT
- : STRTAB_STE_0_CFG_BYPASS;
dst[0] = cpu_to_le64(val);
dst[1] = cpu_to_le64(STRTAB_STE_1_SHCFG_INCOMING
 << STRTAB_STE_1_SHCFG_SHIFT);
@@ -1111,10 +1121,7 @@ static void arm_smmu_write_strtab_ent(struct
arm_smmu_device *smmu, u32 sid,
 static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
 {
unsigned int i;
-   struct arm_smmu_strtab_ent ste = {
-   .valid  = true,
-   .bypass = true,
-   };
+   struct arm_smmu_strtab_ent ste = { .assigned = false };

for (i = 0; i < nent; ++i) {
arm_smmu_write_strtab_ent(NULL, -1, strtab, &ste);
@@ -1378,7 +1385,9 @@ static struct iommu_domain
*arm_smmu_domain_alloc(unsigned type)
 {
struct arm_smmu_domain *smmu_domain;

-   if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+   if (type != IOMMU_DOMAIN_UNMANAGED &&
+   type != IOMMU_DOMAIN_DMA &&
+   type != IOMMU_DOMAIN_IDENTITY)
return NULL;

/*
@@ -1509,6 +1518,11 @@ static int arm_smmu_domain_finalise(struct
iommu_domain *domain)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;

+   if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+   smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
+   return 0;
+   }
+
/* Restrict the stage to what we can actually support */
if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))

Re: [PATCH v2 3/5] iommu/arm-smmu-v3: Make arm_smmu_install_ste_for_dev return void

2017-03-16 Thread Nate Watterson

Hi Will,

On 2017-03-10 15:49, Will Deacon wrote:

arm_smmu_install_ste_for_dev cannot fail and always returns 0, however
the fact that it returns int means that callers end up implementing
redundant error handling code which complicates STE tracking and is
never executed.

This patch changes the return type of arm_smmu_install_ste_for_dev
to avoid, to make it explicit that it cannot fail.

Did you mean "a void" or just "void" instead of "avoid"?



Signed-off-by: Will Deacon 
---
 drivers/iommu/arm-smmu-v3.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 3d38e682071a..e18dbcd26f66 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1579,7 +1579,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct
arm_smmu_device *smmu, u32 sid)
return step;
 }

-static int arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
+static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 {
int i;
struct arm_smmu_master_data *master = fwspec->iommu_priv;
@@ -1591,8 +1591,6 @@ static int arm_smmu_install_ste_for_dev(struct
iommu_fwspec *fwspec)

arm_smmu_write_strtab_ent(smmu, sid, step, &master->ste);
}
-
-   return 0;
 }

 static void arm_smmu_detach_dev(struct device *dev)
@@ -1600,8 +1598,7 @@ static void arm_smmu_detach_dev(struct device 
*dev)

struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;

master->ste.bypass = true;
-   if (arm_smmu_install_ste_for_dev(dev->iommu_fwspec) < 0)
-   dev_warn(dev, "failed to install bypass STE\n");
+   arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
 }

 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct 
device *dev)

@@ -1653,10 +1650,7 @@ static int arm_smmu_attach_dev(struct
iommu_domain *domain, struct device *dev)
ste->s2_cfg = &smmu_domain->s2_cfg;
}

-   ret = arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
-   if (ret < 0)
-   ste->valid = false;
-
+   arm_smmu_install_ste_for_dev(dev->iommu_fwspec);
 out_unlock:
mutex_unlock(&smmu_domain->init_mutex);
return ret;


--
Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


[PATCH v2 1/4] iommu: Disambiguate MSI region types

2017-03-16 Thread Robin Murphy
The introduction of reserved regions has left a couple of rough edges
which we could do with sorting out sooner rather than later. Since we
are not yet addressing the potential dynamic aspect of software-managed
reservations and presenting them at arbitrary fixed addresses, it is
incongruous that we end up displaying hardware vs. software-managed MSI
regions to userspace differently, especially since ARM-based systems may
actually require one or the other, or even potentially both at once,
(which iommu-dma currently has no hope of dealing with at all). Let's
resolve the former user-visible inconsistency ASAP before the ABI has
been baked into a kernel release, in a way that also lays the groundwork
for the latter shortcoming to be addressed by follow-up patches.

For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use
IOMMU_RESV_MSI to describe the hardware type, and document everything a
little bit. Since the x86 MSI remapping hardware falls squarely under
this meaning of IOMMU_RESV_MSI, apply that type to their regions as well,
so that we tell the same story to userspace across all platforms.

Secondly, as the various region types require quite different handling,
and it really makes little sense to ever try combining them, convert the
bitfield-esque #defines to a plain enum in the process before anyone
gets the wrong impression.

Fixes: d30ddcaa7b02 ("iommu: Add a new type field in iommu_resv_region")
Reviewed-by: Eric Auger 
CC: Alex Williamson 
CC: David Woodhouse 
CC: k...@vger.kernel.org
Signed-off-by: Robin Murphy 
---

Notes:
v2:
- Reword commit message
- Convert type to an enum [Eric]
- Rename vfio_iommu_has_resv_msi() for clarity [Eric]

 drivers/iommu/amd_iommu.c   |  2 +-
 drivers/iommu/arm-smmu-v3.c |  2 +-
 drivers/iommu/arm-smmu.c|  2 +-
 drivers/iommu/intel-iommu.c |  2 +-
 drivers/iommu/iommu.c   |  5 +++--
 drivers/vfio/vfio_iommu_type1.c |  7 +++
 include/linux/iommu.h   | 18 +-
 7 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 98940d1392cb..b17536d6e69b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3202,7 +3202,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 
region = iommu_alloc_resv_region(MSI_RANGE_START,
 MSI_RANGE_END - MSI_RANGE_START + 1,
-0, IOMMU_RESV_RESERVED);
+0, IOMMU_RESV_MSI);
if (!region)
return;
list_add_tail(®ion->list, head);
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5806a6acc94e..591bb96047c9 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1888,7 +1888,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index abf6496843a6..b493c99e17f7 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1608,7 +1608,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 238ad3447712..f1611fd6f5b0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5249,7 +5249,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
 
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
  IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
- 0, IOMMU_RESV_RESERVED);
+ 0, IOMMU_RESV_MSI);
if (!reg)
return;
list_add_tail(®->list, head);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8ea14f41a979..3b67144dead2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -72,6 +72,7 @@ static const char * const iommu_group_resv_type_string[] = {
[IOMMU_RESV_DIRECT] = "direct",
[IOMMU_RESV_RESERVED]   = "reserved",
[IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
 };
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
@@ -1743,8 +1744,8 @@ void iommu_put_resv_regions(struct device *dev, struct 
list_head *list)
 }
 
 stru
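
The include/linux/iommu.h side of the change, which is cut off above,
amounts to turning the #defines into a plain enum, roughly:

    enum iommu_resv_type {
            /* Memory regions which must be mapped 1:1 at all times */
            IOMMU_RESV_DIRECT,
            /* Arbitrary "never map this or give it to a device" address ranges */
            IOMMU_RESV_RESERVED,
            /* Hardware MSI region (untranslated) */
            IOMMU_RESV_MSI,
            /* Software-managed MSI translation window */
            IOMMU_RESV_SW_MSI,
    };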

[PATCH v2 2/4] iommu/dma: Don't reserve PCI I/O windows

2017-03-16 Thread Robin Murphy
Even if a host controller's CPU-side MMIO windows into PCI I/O space do
happen to leak into PCI memory space such that it might treat them as
peer addresses, trying to reserve the corresponding I/O space addresses
doesn't do anything to help solve that problem. Stop doing a silly thing.

Fixes: fade1ec055dc ("iommu/dma: Avoid PCI host bridge windows")
Reviewed-by: Eric Auger 
Signed-off-by: Robin Murphy 
---

Notes:
v2:
- No change

 drivers/iommu/dma-iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 48d36ce59efb..1e0983488a8d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -175,8 +175,7 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
unsigned long lo, hi;
 
resource_list_for_each_entry(window, &bridge->windows) {
-   if (resource_type(window->res) != IORESOURCE_MEM &&
-   resource_type(window->res) != IORESOURCE_IO)
+   if (resource_type(window->res) != IORESOURCE_MEM)
continue;
 
lo = iova_pfn(iovad, window->res->start - window->offset);
-- 
2.11.0.dirty



[PATCH v2 0/4] IOMMU reserved region tweaks

2017-03-16 Thread Robin Murphy
Hi Joerg,

Here's v2 incorporating Eric's feedback and R-b tags, with patch #4 added
to finish the job. Whilst presented as a logical series because it's all
touching the same code, it doesn't necessarily need merging as such - I'd
be happier with patch #1 going into 4.11 before anyone starts using the
sysfs region info in anger, but it's your call; #2 is really a standalone
fix (other than context conflicts) but is non-urgent; #3 and #4 can
happily wait for 4.12.

Robin.

Robin Murphy (4):
  iommu: Disambiguate MSI region types
  iommu/dma: Don't reserve PCI I/O windows
  iommu/dma: Handle IOMMU API reserved regions
  iommu/dma: Make PCI window reservation generic

 drivers/iommu/amd_iommu.c   |   2 +-
 drivers/iommu/arm-smmu-v3.c |   4 +-
 drivers/iommu/arm-smmu.c|   4 +-
 drivers/iommu/dma-iommu.c   | 111 ++--
 drivers/iommu/intel-iommu.c |   2 +-
 drivers/iommu/iommu.c   |   5 +-
 drivers/vfio/vfio_iommu_type1.c |   7 ++-
 include/linux/dma-iommu.h   |   5 ++
 include/linux/iommu.h   |  18 +--
 9 files changed, 127 insertions(+), 31 deletions(-)

-- 
2.11.0.dirty



[PATCH v2 3/4] iommu/dma: Handle IOMMU API reserved regions

2017-03-16 Thread Robin Murphy
Now that it's simple to discover the necessary reservations for a given
device/IOMMU combination, let's wire up the appropriate handling. Basic
reserved regions and direct-mapped regions we simply have to carve out
of IOVA space (the IOMMU core having already mapped the latter before
attaching the device). For hardware MSI regions, we also pre-populate
the cookie with matching msi_pages. That way, irqchip drivers which
normally assume MSIs to require mapping at the IOMMU can keep working
without having to special-case their iommu_dma_map_msi_msg() hook, or
indeed be aware at all of quirks preventing the IOMMU from translating
certain addresses.

Reviewed-by: Eric Auger 
Signed-off-by: Robin Murphy 
---

Notes:
v2:
- Make the IOVA alignment logic safer [Eric]
- Don't duplicate direct region mapping [Eric]
- Don't wait for the subsequent patch to start using the return value [Eric]
- Fix the sneaky off-by-one for pfn_hi
- Minor cosmetic tweaks

 drivers/iommu/dma-iommu.c | 76 ++-
 1 file changed, 69 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1e0983488a8d..5787f919f4ec 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -184,6 +184,66 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
}
 }
 
+static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
+   phys_addr_t start, phys_addr_t end)
+{
+   struct iova_domain *iovad = &cookie->iovad;
+   struct iommu_dma_msi_page *msi_page;
+   int i, num_pages;
+
+   start -= iova_offset(iovad, start);
+   num_pages = iova_align(iovad, end - start) >> iova_shift(iovad);
+
+   msi_page = kcalloc(num_pages, sizeof(*msi_page), GFP_KERNEL);
+   if (!msi_page)
+   return -ENOMEM;
+
+   for (i = 0; i < num_pages; i++) {
+   msi_page[i].phys = start;
+   msi_page[i].iova = start;
+   INIT_LIST_HEAD(&msi_page[i].list);
+   list_add(&msi_page[i].list, &cookie->msi_page_list);
+   start += iovad->granule;
+   }
+
+   return 0;
+}
+
+static int iova_reserve_iommu_regions(struct device *dev,
+   struct iommu_domain *domain)
+{
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   struct iova_domain *iovad = &cookie->iovad;
+   struct iommu_resv_region *region;
+   LIST_HEAD(resv_regions);
+   int ret = 0;
+
+   if (dev_is_pci(dev))
+   iova_reserve_pci_windows(to_pci_dev(dev), iovad);
+
+   iommu_get_resv_regions(dev, &resv_regions);
+   list_for_each_entry(region, &resv_regions, list) {
+   unsigned long lo, hi;
+
+   /* We ARE the software that manages these! */
+   if (region->type == IOMMU_RESV_SW_MSI)
+   continue;
+
+   lo = iova_pfn(iovad, region->start);
+   hi = iova_pfn(iovad, region->start + region->length - 1);
+   reserve_iova(iovad, lo, hi);
+
+   if (region->type == IOMMU_RESV_MSI)
+   ret = cookie_init_hw_msi_region(cookie, region->start,
+   region->start + region->length);
+   if (ret)
+   break;
+   }
+   iommu_put_resv_regions(dev, &resv_regions);
+
+   return ret;
+}
+
 /**
  * iommu_dma_init_domain - Initialise a DMA mapping domain
  * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -202,7 +262,6 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base,
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
unsigned long order, base_pfn, end_pfn;
-   bool pci = dev && dev_is_pci(dev);
 
if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
return -EINVAL;
@@ -232,7 +291,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base,
 * leave the cache limit at the top of their range to save an rb_last()
 * traversal on every allocation.
 */
-   if (pci)
+   if (dev && dev_is_pci(dev))
end_pfn &= DMA_BIT_MASK(32) >> order;
 
/* start_pfn is always nonzero for an already-initialised domain */
@@ -247,12 +306,15 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base,
 * area cache limit down for the benefit of the smaller one.
 */
iovad->dma_32bit_pfn = min(end_pfn, iovad->dma_32bit_pfn);
-   } else {
-   init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
-   if (pci)
-   iova_reserve_pci_windows(to_pci_dev(dev), iovad);
+
+   return 0;
}
-   return 0;
+
+   init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
+   if (!dev)
+   

[PATCH v2 4/4] iommu/dma: Make PCI window reservation generic

2017-03-16 Thread Robin Murphy
Now that we're applying the IOMMU API reserved regions to our IOVA
domains, we shouldn't need to privately special-case PCI windows, or
indeed anything else which isn't specific to our iommu-dma layer.
However, since those aren't IOMMU-specific either, rather than start
duplicating code into IOMMU drivers let's transform the existing
function into an iommu_get_resv_regions() helper that they can share.

Signed-off-by: Robin Murphy 
---

Notes:
v2:
- New

 drivers/iommu/arm-smmu-v3.c |  2 ++
 drivers/iommu/arm-smmu.c|  2 ++
 drivers/iommu/dma-iommu.c   | 38 --
 include/linux/dma-iommu.h   |  5 +
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 591bb96047c9..bbd46efbe075 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1893,6 +1893,8 @@ static void arm_smmu_get_resv_regions(struct device *dev,
return;
 
list_add_tail(®ion->list, head);
+
+   iommu_dma_get_resv_regions(dev, head);
 }
 
 static void arm_smmu_put_resv_regions(struct device *dev,
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b493c99e17f7..9b33700b7c69 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1613,6 +1613,8 @@ static void arm_smmu_get_resv_regions(struct device *dev,
return;
 
list_add_tail(®ion->list, head);
+
+   iommu_dma_get_resv_regions(dev, head);
 }
 
 static void arm_smmu_put_resv_regions(struct device *dev,
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5787f919f4ec..85652110c8ff 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -167,22 +167,43 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
 
-static void iova_reserve_pci_windows(struct pci_dev *dev,
-   struct iova_domain *iovad)
+/**
+ * iommu_dma_get_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @list: Reserved region list from iommu_get_resv_regions()
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions callback
+ * for general non-IOMMU-specific reservations. Currently, this covers host
+ * bridge windows for PCI devices.
+ */
+void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
-   struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+   struct pci_host_bridge *bridge;
struct resource_entry *window;
-   unsigned long lo, hi;
 
+   if (!dev_is_pci(dev))
+   return;
+
+   bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
resource_list_for_each_entry(window, &bridge->windows) {
+   struct iommu_resv_region *region;
+   phys_addr_t start;
+   size_t length;
+
if (resource_type(window->res) != IORESOURCE_MEM)
continue;
 
-   lo = iova_pfn(iovad, window->res->start - window->offset);
-   hi = iova_pfn(iovad, window->res->end - window->offset);
-   reserve_iova(iovad, lo, hi);
+   start = window->res->start - window->offset;
+   length = window->res->end - window->res->start + 1;
+   region = iommu_alloc_resv_region(start, length, 0,
+   IOMMU_RESV_RESERVED);
+   if (!region)
+   return;
+
+   list_add_tail(®ion->list, list);
}
 }
+EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
 static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
phys_addr_t start, phys_addr_t end)
@@ -218,9 +239,6 @@ static int iova_reserve_iommu_regions(struct device *dev,
LIST_HEAD(resv_regions);
int ret = 0;
 
-   if (dev_is_pci(dev))
-   iova_reserve_pci_windows(to_pci_dev(dev), iovad);
-
iommu_get_resv_regions(dev, &resv_regions);
list_for_each_entry(region, &resv_regions, list) {
unsigned long lo, hi;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 5725c94b1f12..b6635a46fc7c 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -71,6 +71,7 @@ int iommu_dma_mapping_error(struct device *dev, dma_addr_t 
dma_addr);
 
 /* The DMA API isn't _quite_ the whole story, though... */
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
+void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
 #else
 
@@ -100,6 +101,10 @@ static inline void iommu_dma_map_msi_msg(int irq, struct 
msi_msg *msg)
 {
 }
 
+static inline void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
+{
+}
+
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* __KERNEL__ */
 #endif /* __DMA_IOMMU_H */
-- 
2.11.0.dirty


Re: [PATCH v2 4/5] iommu/arm-smmu-v3: Install bypass STEs for IOMMU_DOMAIN_IDENTITY domains

2017-03-16 Thread Robin Murphy
Hi Nate, Will,

On 16/03/17 16:24, Nate Watterson wrote:
> Hi Will,
> 
> On 2017-03-10 15:49, Will Deacon wrote:
>> In preparation for allowing the default domain type to be overridden,
>> this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
>> ARM SMMUv3 driver.
>>
>> An identity domain is created by placing the corresponding stream table
>> entries into "bypass" mode, which allows transactions to flow through
>> the SMMU without any translation.
>>
> 
> What about masters that require SMMU intervention to override their
> native memory attributes to make them consistent with the CCA (acpi)
> or dma-coherent (dt) values specified in FW?

Well, we've already broken them ;) My interpretation of "dma-coherent"
is as the equivalent of DACS=1,CPM=1, i.e. not dependent on SMMU
override. For the CCA=1,DACS=0 case (I'm going to pretend the DT
equivalent will never exist...) the first problem to solve is how to
inherit the appropriate configuration from the firmware, because right
now we're not even pretending to support that.

>  To make sure those cases
> are handled, you could store away the master's coherency setting in
> its strtab_ent at attach time and then setup STE[MemAttr/ALLOCCFG/SHCFG]
> so the attributes are forced to the correct values while still
> bypassing translation.

However, for this particular piece of the puzzle, that does sound about
right - the attribute overrides are pretty much orthogonal to the stage
of translation (or bypass), so the master's strtab_ent will indeed
probably be the most appropriate place to keep them once we get there
(cf. the master's s2cr in SMMUv2).

Now, while I'm here...

>> Signed-off-by: Will Deacon 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 58
>> +
>>  1 file changed, 37 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index e18dbcd26f66..75fa4809f49e 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -554,9 +554,14 @@ struct arm_smmu_s2_cfg {
>>  };
>>
>>  struct arm_smmu_strtab_ent {
>> -boolvalid;
>> -
>> -boolbypass;/* Overrides s1/s2 config */
>> +/*
>> + * An STE is "assigned" if the master emitting the corresponding SID
>> + * is attached to a domain. The behaviour of an unassigned STE is
>> + * determined by the disable_bypass parameter, whereas an assigned
>> + * STE behaves according to s1_cfg/s2_cfg, which themselves are
>> + * configured according to the domain type.
>> + */
>> +boolassigned;
>>  struct arm_smmu_s1_cfg*s1_cfg;
>>  struct arm_smmu_s2_cfg*s2_cfg;
>>  };
>> @@ -632,6 +637,7 @@ enum arm_smmu_domain_stage {
>>  ARM_SMMU_DOMAIN_S1 = 0,
>>  ARM_SMMU_DOMAIN_S2,
>>  ARM_SMMU_DOMAIN_NESTED,
>> +ARM_SMMU_DOMAIN_BYPASS,
>>  };
>>
>>  struct arm_smmu_domain {
>> @@ -1005,9 +1011,9 @@ static void arm_smmu_write_strtab_ent(struct
>> arm_smmu_device *smmu, u32 sid,
>>   * This is hideously complicated, but we only really care about
>>   * three cases at the moment:
>>   *
>> - * 1. Invalid (all zero) -> bypass  (init)
>> - * 2. Bypass -> translation (attach)
>> - * 3. Translation -> bypass (detach)
>> + * 1. Invalid (all zero) -> bypass/fault (init)
>> + * 2. Bypass/fault -> translation/bypass (attach)
>> + * 3. Translation/bypass -> bypass/fault (detach)
>>   *
>>   * Given that we can't update the STE atomically and the SMMU
>>   * doesn't read the thing in a defined order, that leaves us
>> @@ -1046,11 +1052,15 @@ static void arm_smmu_write_strtab_ent(struct
>> arm_smmu_device *smmu, u32 sid,
>>  }
>>
>>  /* Nuke the existing STE_0 value, as we're going to rewrite it */
>> -val = ste->valid ? STRTAB_STE_0_V : 0;
>> +val = STRTAB_STE_0_V;
>> +
>> +/* Bypass/fault */
>> +if (!ste->assigned || !(ste->s1_cfg || ste->s2_cfg)) {
>> +if (!ste->assigned && disable_bypass)

...yuck. After about 5 minutes of staring at that, I've convinced myself
that it would make much more sense to always clear the strtab_ent
configs on detach, such that you never need the outer !ste->assigned
check here...

>> +val |= STRTAB_STE_0_CFG_ABORT;
>> +else
>> +val |= STRTAB_STE_0_CFG_BYPASS;
>>
>> -if (ste->bypass) {
>> -val |= disable_bypass ? STRTAB_STE_0_CFG_ABORT
>> -  : STRTAB_STE_0_CFG_BYPASS;
>>  dst[0] = cpu_to_le64(val);
>>  dst[1] = cpu_to_le64(STRTAB_STE_1_SHCFG_INCOMING
>>   << STRTAB_STE_1_SHCFG_SHIFT);
>> @@ -1111,10 +1121,7 @@ static void arm_smmu_write_strtab_ent(struct
>> arm_smmu_device *smmu, u32 sid,
>>  static void arm_smmu_init_bypass_stes(u64 *strtab, unsigned int nent)
>>  {
>>  unsigned int i;
>> -struct arm_smmu_strtab_ent ste = {
>> -.valid= true,
>> -.bypass= true,
>> 

Re: [PATCH V3 2/5] iommu/arm-smmu: Add support for MMU40x/500 clocks

2017-03-16 Thread Rob Herring
On Thu, Mar 09, 2017 at 09:05:45PM +0530, Sricharan R wrote:
> The MMU400x/500 is an implementation of the SMMUv2
> arch specification. It is split into two blocks,
> TBU and TCU. The TBU caches the page tables and is
> instantiated locally for each master, clocked by the
> TBUn_clk. The TCU manages the address translation with
> the PTW and also has the programming interface,
> clocked using the TCU_CLK. The TBU can also share the
> same clock domain as the TCU, in which case both are
> clocked using the TCU_CLK.
> 
> This defines the clock bindings for the same and adds the
> init, enable and disable functions for handling the
> clocks.
> 
> Signed-off-by: Sricharan R 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt | 27 ++
>  drivers/iommu/arm-smmu.c   | 95 
> +-
>  2 files changed, 121 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 6cdf32d..b369c13 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -60,6 +60,28 @@ conditions.
>aliases of secure registers have to be used during
>SMMU configuration.
>  
> +- clock-names:Should be "tbu_clk" and "tcu_clk" and "cfg_clk" for

_clk is redundant

> +  "arm,mmu-400", "arm,mmu-401" and "arm,mmu-500"
> +
> +  "tcu_clk" is required for smmu's register access using the
> +  programming interface and ptw for downstream bus access.
> +
> +  "tbu_clk" is required for access to the TBU connected to 
> the
> +  master locally. This clock is optional and not required 
> when
> +  TBU is in the same clock domain as the TCU or when the TBU 
> is
> +  clocked along with the master.
> +
> +  "cfg_clk" is optional if required to access the TCU's 
> programming
> +  interface, apart from the "tcu_clk".

Clocks generally shouldn't be optional. Either the h/w has a clock or it 
doesn't.

> +
> +- clocks: Phandles for respective clocks described by clock-names.
> +
> +- power-domains:  Phandles to SMMU's power domain specifier. This is
> +  required even if SMMU belongs to the master's power
> +  domain, as the SMMU will have to be enabled and
> +  accessed before master gets enabled and linked to its
> +  SMMU.
> +
>  ** Deprecated properties:
>  
>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
> @@ -84,6 +106,11 @@ conditions.
>   <0 36 4>,
>   <0 37 4>;
>  #iommu-cells = <1>;
> +clocks = <&gcc GCC_SMMU_CFG_CLK>,
> + <&gcc GCC_APSS_TCU_CLK>,
> +  <&gcc GCC_MDP_TBU_CLK>;
> +
> + clock-names = "cfg_clk", "tcu_clk", "tbu_clk";
>  };
>  
>  /* device with two stream IDs, 0 and 7 */
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index f7e11d3..720a1ef 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -341,6 +341,12 @@ struct arm_smmu_master_cfg {
>  #define for_each_cfg_sme(fw, i, idx) \
>   for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)
>  
> +struct mmu500_clk {
> + struct clk *cfg_clk;
> + struct clk *tcu_clk;
> + struct clk *tbu_clk;
> +};
> +
>  struct arm_smmu_clks {
>   void *clks;
>   int (*init_clocks)(struct arm_smmu_device *smmu);
> @@ -455,6 +461,92 @@ static struct arm_smmu_domain *to_smmu_domain(struct 
> iommu_domain *dom)
>   return container_of(dom, struct arm_smmu_domain, domain);
>  }
>  
> +static int mmu500_enable_clocks(struct arm_smmu_device *smmu)
> +{
> + int ret = 0;
> + struct mmu500_clk *sclks = smmu->smmu_clks.clks;
> +
> + if (!sclks)
> + return 0;
> +
> + ret = clk_prepare_enable(sclks->cfg_clk);
> + if (ret) {
> + dev_err(smmu->dev, "Couldn't enable cfg_clk");
> + return ret;
> + }
> +
> + ret = clk_prepare_enable(sclks->tcu_clk);
> + if (ret) {
> + dev_err(smmu->dev, "Couldn't enable tcu_clk");
> + clk_disable_unprepare(sclks->cfg_clk);
> + return ret;
> + }
> +
> + ret = clk_prepare_enable(sclks->tbu_clk);
> + if (ret) {
> + dev_err(smmu->dev, "Couldn't enable tbu_clk");
> + clk_disable_unprepare(sclks->tcu_clk);
> + clk_disable_unprepare(sclks->cfg_clk);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static void mmu500_disable_clocks(struct arm_smmu_device *smmu)
> +{
> + struct mmu500_clk *sclks = smmu->smmu_clks.clks;
> +
> + if (!sclks) {
> + clk_disable_unprepare(sclks->tbu_clk);
> +

Re: [PATCH V3 3/5] drivers: arm-smmu: Add clock support for QCOM_SMMUV2

2017-03-16 Thread Rob Herring
On Thu, Mar 09, 2017 at 09:05:46PM +0530, Sricharan R wrote:
> The QCOM_SMMUV2 is an implementation of the arm,smmu-v2 architecture.
> The qcom,smmu is instantiated for each of the multimedia cores, e.g.
> Venus (video encoder/decoder), MDP (display) etc., and they are connected
> to the Multimedia Aggregator Interconnect (MMAGIC). So access to
> any of the MMU's registers, as well as the MMU's downstream bus access,
> requires the specified MMAGIC clocks to be enabled. So add a new
> binding for the qcom,smmu-v2 and the required mmagic clock bindings for
> it, and add support for enabling the qcom,smmu-v2 clocks in
> the driver.
> 
>     -----------             -----------
>    |   VENUS   |           |    MDP    |
>     -----------             -----------
>          |                       |
>     -----------             -----------
>    |   SMMU    |           |   SMMU    |
>     -----------             -----------
>          |                       |
>     ------------------------------------
>    |   MMAGIC INTERCONNECT (MMSS NOC)   |
>     ------------------------------------
>         |                      |
>     -------             ----------------
>    |  DDR  |           |   SYSTEM NOC   |
>     -------             ----------------
>        |                       |
>        |                 --------------
>        |<---------------|     CPU      |
>                          --------------
> 
> Signed-off-by: Sricharan R 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt |   8 ++
>  drivers/iommu/arm-smmu.c   | 124 
> +
>  2 files changed, 132 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index b369c13..88e02d6 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -17,6 +17,7 @@ conditions.
>  "arm,mmu-401"
>  "arm,mmu-500"
>  "cavium,smmu-v2"
> + "qcom,smmu-v2"

I know Cavium did it, but I'd prefer to see SoC specific compatibles 
here.

>  
>depending on the particular implementation and/or the
>version of the architecture implemented.
> @@ -74,6 +75,13 @@ conditions.
>"cfg_clk" is optional if required to access the TCU's 
> programming
>interface, apart from the "tcu_clk".
>  
> +   Should have "mmagic_ahb_clk", "mmagic_cfg_ahb_clk",
> +  "smmu_core_ahb_clk", "smmu_core_axi_clk",
> +  "mmagic_core_axi_clk" for "qcom,smmu-v2"

This is instead of the above clocks?

Are these clocks all really part of the SMMU or are the mmagic clocks 
working around no proper driver for the mmagic?

> +
> +   "mmagic_core_axi_clk" is required for smmu's access to the
> +   downstream bus and rest for the smmu's register group 
> access.
> +
>  - clocks: Phandles for respective clocks described by clock-names.
>  
>  - power-domains:  Phandles to SMMU's power domain specifier. This is


Re: [PATCH V3 2/5] iommu/arm-smmu: Add support for MMU40x/500 clocks

2017-03-16 Thread Rob Clark
On Thu, Mar 9, 2017 at 10:35 AM, Sricharan R  wrote:
> The MMU400x/500 is an implementation of the SMMUv2
> arch specification. It is split into two blocks,
> TBU and TCU. The TBU caches the page tables and is
> instantiated locally for each master, clocked by the
> TBUn_clk. The TCU manages the address translation with
> the PTW and also has the programming interface,
> clocked using the TCU_CLK. The TBU can also share the
> same clock domain as the TCU, in which case both are
> clocked using the TCU_CLK.
>
> This defines the clock bindings for the same and adds the
> init, enable and disable functions for handling the
> clocks.
>
> Signed-off-by: Sricharan R 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt | 27 ++
>  drivers/iommu/arm-smmu.c   | 95 
> +-
>  2 files changed, 121 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 6cdf32d..b369c13 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -60,6 +60,28 @@ conditions.
>aliases of secure registers have to be used during
>SMMU configuration.
>
> +- clock-names:Should be "tbu_clk" and "tcu_clk" and "cfg_clk" for
> +  "arm,mmu-400", "arm,mmu-401" and "arm,mmu-500"

I guess that should be: "Should be "tbu_clk" *or* "tcu_clk" and
"cfg_clk" for..."

Also, possibly we should define our own compat strings for the various
SoCs that require these clks so we can properly describe when they
are required?  I guess that would address Rob H's comment.

BR,
-R

> +  "tcu_clk" is required for smmu's register access using the
> +  programming interface and ptw for downstream bus access.
> +
> +  "tbu_clk" is required for access to the TBU connected to 
> the
> +  master locally. This clock is optional and not required 
> when
> +  TBU is in the same clock domain as the TCU or when the TBU 
> is
> +  clocked along with the master.
> +
> +  "cfg_clk" is optional if required to access the TCU's 
> programming
> +  interface, apart from the "tcu_clk".
> +
> +- clocks: Phandles for respective clocks described by clock-names.
> +
> +- power-domains:  Phandles to SMMU's power domain specifier. This is
> +  required even if SMMU belongs to the master's power
> +  domain, as the SMMU will have to be enabled and
> +  accessed before master gets enabled and linked to its
> +  SMMU.
> +
>  ** Deprecated properties:
>
>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
> @@ -84,6 +106,11 @@ conditions.
>   <0 36 4>,
>   <0 37 4>;
>  #iommu-cells = <1>;
> +clocks = <&gcc GCC_SMMU_CFG_CLK>,
> + <&gcc GCC_APSS_TCU_CLK>,
> +<&gcc GCC_MDP_TBU_CLK>;
> +
> +   clock-names = "cfg_clk", "tcu_clk", "tbu_clk";
>  };
>
>  /* device with two stream IDs, 0 and 7 */
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index f7e11d3..720a1ef 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -341,6 +341,12 @@ struct arm_smmu_master_cfg {
>  #define for_each_cfg_sme(fw, i, idx) \
> for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)
>
> +struct mmu500_clk {
> +   struct clk *cfg_clk;
> +   struct clk *tcu_clk;
> +   struct clk *tbu_clk;
> +};
> +
>  struct arm_smmu_clks {
> void *clks;
> int (*init_clocks)(struct arm_smmu_device *smmu);
> @@ -455,6 +461,92 @@ static struct arm_smmu_domain *to_smmu_domain(struct 
> iommu_domain *dom)
> return container_of(dom, struct arm_smmu_domain, domain);
>  }
>
> +static int mmu500_enable_clocks(struct arm_smmu_device *smmu)
> +{
> +   int ret = 0;
> +   struct mmu500_clk *sclks = smmu->smmu_clks.clks;
> +
> +   if (!sclks)
> +   return 0;
> +
> +   ret = clk_prepare_enable(sclks->cfg_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable cfg_clk");
> +   return ret;
> +   }
> +
> +   ret = clk_prepare_enable(sclks->tcu_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable tcu_clk");
> +   clk_disable_unprepare(sclks->cfg_clk);
> +   return ret;
> +   }
> +
> +   ret = clk_prepare_enable(sclks->tbu_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable tbu_clk");
> +   clk_disable_unprepare(sclks->tcu_clk);
> +   clk_disable_unprepare(sclks->cfg_clk);
> +   return ret;
> +   }
> +
> +   return 0;
>

Re: [PATCH V3 2/5] iommu/arm-smmu: Add support for MMU40x/500 clocks

2017-03-16 Thread Sricharan R

Hi,

On 3/17/2017 4:22 AM, Rob Clark wrote:

On Thu, Mar 9, 2017 at 10:35 AM, Sricharan R  wrote:

The MMU400x/500 is an implementation of the SMMUv2
arch specification. It is split into two blocks,
TBU and TCU. The TBU caches the page tables and is
instantiated locally for each master, clocked by the
TBUn_clk. The TCU manages the address translation with
the PTW and also has the programming interface,
clocked using the TCU_CLK. The TBU can also share the
same clock domain as the TCU, in which case both are
clocked using the TCU_CLK.

This defines the clock bindings for the same and adds the
init, enable and disable functions for handling the
clocks.

Signed-off-by: Sricharan R 
---
 .../devicetree/bindings/iommu/arm,smmu.txt | 27 ++
 drivers/iommu/arm-smmu.c   | 95 +-
 2 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 6cdf32d..b369c13 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -60,6 +60,28 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.

+- clock-names:Should be "tbu_clk" and "tcu_clk" and "cfg_clk" for
+  "arm,mmu-400", "arm,mmu-401" and "arm,mmu-500"


I guess that should be: "Should be "tbu_clk" *or* "tcu_clk" and
"cfg_clk" for..."

Also, possibly we should define our own compat strings for various
SoC's that require these clks so we can properly describe when they
are required?  I guess that would address Rob H's comment.



Ok, calling it "optional" does not look correct. I was trying to define
something that would be right for all implementations of MMU-500, but it
looks like that's not the right way. Making it SoC-specific would give an
exact description, except that the number of compatibles grows, and the
code has to be changed so that none of the tbu, tcu or cfg clocks is
mandatory, to allow adding support later for other compatibles which are
mostly similar. So I am trying to set up the clocks using functions
specific to each compatible. I am not sure whether, at this point, some
SoC-specific callback from the driver, where these things can be
initialized, might be better.
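
A rough sketch of that per-compatible direction, layering an init hook on
the driver's existing of_device_id match data (the init_clocks member and
the mmu500/qcom init function names are assumptions, not code from this
series as posted):

    struct arm_smmu_match_data {
            enum arm_smmu_arch_version version;
            enum arm_smmu_implementation model;
            int (*init_clocks)(struct arm_smmu_device *smmu); /* assumed hook */
    };

    static const struct arm_smmu_match_data arm_mmu500 = {
            .version     = ARM_SMMU_V2,
            .model       = ARM_MMU500,
            .init_clocks = mmu500_init_clocks,       /* assumed name */
    };

    static const struct arm_smmu_match_data qcom_smmuv2 = {
            .version     = ARM_SMMU_V2,
            .model       = GENERIC_SMMU,
            .init_clocks = qcom_smmuv2_init_clocks,  /* assumed name */
    };

    static const struct of_device_id arm_smmu_of_match[] = {
            { .compatible = "arm,mmu-500",  .data = &arm_mmu500 },
            { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
            /* ... */
            { },
    };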


Regards,
 Sricharan





BR,
-R


+  "tcu_clk" is required for smmu's register access using the
+  programming interface and ptw for downstream bus access.
+
+  "tbu_clk" is required for access to the TBU connected to the
+  master locally. This clock is optional and not required when
+  TBU is in the same clock domain as the TCU or when the TBU is
+  clocked along with the master.
+
+  "cfg_clk" is optional if required to access the TCU's 
programming
+  interface, apart from the "tcu_clk".
+
+- clocks: Phandles for respective clocks described by clock-names.
+
+- power-domains:  Phandles to SMMU's power domain specifier. This is
+  required even if SMMU belongs to the master's power
+  domain, as the SMMU will have to be enabled and
+  accessed before master gets enabled and linked to its
+  SMMU.
+
 ** Deprecated properties:

 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
@@ -84,6 +106,11 @@ conditions.
  <0 36 4>,
  <0 37 4>;
 #iommu-cells = <1>;
+clocks = <&gcc GCC_SMMU_CFG_CLK>,
+ <&gcc GCC_APSS_TCU_CLK>,
+<&gcc GCC_MDP_TBU_CLK>;
+
+   clock-names = "cfg_clk", "tcu_clk", "tbu_clk";
 };

 /* device with two stream IDs, 0 and 7 */
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7e11d3..720a1ef 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -341,6 +341,12 @@ struct arm_smmu_master_cfg {
 #define for_each_cfg_sme(fw, i, idx) \
for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)

+struct mmu500_clk {
+   struct clk *cfg_clk;
+   struct clk *tcu_clk;
+   struct clk *tbu_clk;
+};
+
 struct arm_smmu_clks {
void *clks;
int (*init_clocks)(struct arm_smmu_device *smmu);
@@ -455,6 +461,92 @@ static struct arm_smmu_domain *to_smmu_domain(struct 
iommu_domain *dom)
return container_of(dom, struct arm_smmu_domain, domain);
 }

+static int mmu500_enable_clocks(struct arm_smmu_device *smmu)
+{
+   int ret = 0;
+   struct mmu500_clk *sclks = smmu->smmu_clks.clks;
+
+   if (!sclks)
+   return 0;
+
+   ret = clk_prepare_enable(sclks->cfg_clk);
+   if (ret) {
+   dev_err(smmu->dev, "Couldn't enable cfg_clk");
+   return ret;
+   }
+
+   ret = clk_prepare_enable(s

[RFC PATCH] iommu/dma: account pci host bridge dma_mask for IOVA allocation

2017-03-16 Thread Oza Pawandeep via iommu
It is possible that a PCI device supports 64-bit DMA addressing,
and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64);
however, the PCI host bridge may have limitations on inbound
transaction addressing. As an example, consider an NVMe SSD device
connected to an iproc-PCIe controller.

Currently, the IOMMU DMA ops only consider the PCI device's dma_mask
when allocating an IOVA. This is particularly problematic on
ARM/ARM64 SoCs, where the IOMMU (i.e. the SMMU) translates IOVA to
PA for inbound transactions only after the PCI host has forwarded
these transactions onto the SoC IO bus. This means that on such
ARM/ARM64 SoCs the IOVA of inbound transactions has to honor the
addressing restrictions of the PCI host.

This patch is inspired by
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1306545.html
http://www.spinics.net/lists/arm-kernel/msg566947.html

but the above only solves half of the problem. The rest of the problem,
which we face on iproc-based SoCs, is described below.

The current PCIe framework and OF framework integration assumes
dma-ranges in the form that memory-mapped devices use to define them:
dma-ranges: (child-bus-address, parent-bus-address, length).

But iproc-based SoCs, and even R-Car based SoCs, have PCI-style dma-ranges:
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

of_dma_configure() is specifically written to take care of memory-mapped
devices, but no implementation exists for PCI to take care of PCIe-based
memory ranges; in fact the PCI world does not seem to define standard
dma-ranges. Because of that absence, the dma_mask used to remain 32-bit,
because of the 0 size returned (as parsed by of_dma_configure()).

this patch also implements of_pci_get_dma_ranges to cater to pci world 
dma-ranges.
so then the returned size get best possible (largest) dma_mask.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.
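
The arithmetic behind that claim, as a tiny standalone check (the size
of 0x80_0000_0000 and a PCI base of 0 are assumptions; the dma-ranges
cells above appear truncated in the archive):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t pci_base = 0x0;                /* assumed PCI base   */
	uint64_t size = 0x8000000000ULL;        /* assumed range size */

	/* the highest reachable PCI address is the best possible mask */
	uint64_t mask = pci_base + size - 1;

	printf("coherent_dma_mask = %#llx\n", (unsigned long long)mask);
	/* prints: coherent_dma_mask = 0x7fffffffff */
	return 0;
}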

Conclusion: there are the following problems.
1) The Linux PCI and IOMMU framework integration has glitches with
   respect to dma-ranges.
2) The Linux PCI framework looks very uncertain about dma-ranges; the
   binding is not defined the way it is for memory-mapped devices.
   R-Car and iproc-based SOCs each use their own custom dma-ranges
   (which could instead be standardised).
3) Even the default parser, of_dma_get_range(), throws the error
   "no dma-ranges found for node" because of an existing bug: the
   following lines should be moved to the end of the while(1) loop
   (a simplified sketch follows below).
839 node = of_get_next_parent(node);
840 if (!node)
841 break;
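
A simplified sketch of the suggested ordering (not the verbatim
drivers/of/address.c code): the parent advance sits at the tail of the
loop, so "dma-ranges" is checked on the current node before walking up.

static const __be32 *find_dma_ranges(struct device_node *node, int *len)
{
	const __be32 *ranges = NULL;

	while (1) {
		ranges = of_get_property(node, "dma-ranges", len);
		if (ranges)
			break;

		/* advance to the parent only at the end of the loop */
		node = of_get_next_parent(node);
		if (!node)
			break;
	}

	return ranges;
}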

Reviewed-by: Anup Patel 
Reviewed-by: Scott Branden 
Signed-off-by: Oza Pawandeep 

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8c7c244..20cfff7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -217,6 +217,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 73d5bab..64b4dc3 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -20,6 +20,7 @@ struct dev_archdata {
 #ifdef CONFIG_IOMMU_API
void *iommu;/* private IOMMU data */
 #endif
+   u64 parent_dma_mask;
bool dma_coherent;
 };
 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 81cdb2e..5845ecd 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -564,6 +564,7 @@ static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
__dma_flush_area(virt, PAGE_SIZE);
 }
 
+
 static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 dma_addr_t *handle, gfp_t gfp,
 unsigned long attrs)
@@ -795,6 +796,20 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
 }
 
+static int __iommu_set_dma_mask(struct device *dev, u64 mask)
+{
+   /* device is not DMA capable */
+   if (!dev->dma_mask)
+   return -EIO;
+
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   *dev->dma_mask = mask;
+
+   return 0;
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
.alloc = __iommu_alloc_attrs,
.free = __iommu_free_attrs,
@@ -811,8 +826,21 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
.map_resource = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
.mapping_error = iommu_dma_mapping_error,
+   .set_dma_mask = __iommu_set_dma_mask,
 };
 
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (get_dma_ops(dev) == &iommu_dma_ops &&
+   mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.p
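
The archived message is cut off at this point. Judging from the
__iommu_set_dma_mask() hunk above, the coherent-mask override presumably
applies the same clamp; a hedged sketch of that shape (an assumption,
not the actual remainder of the patch):

int dma_set_coherent_mask(struct device *dev, u64 mask)
{
	/* presumed continuation, mirroring __iommu_set_dma_mask() */
	if (get_dma_ops(dev) == &iommu_dma_ops &&
	    mask > dev->archdata.parent_dma_mask)
		mask = dev->archdata.parent_dma_mask;

	dev->coherent_dma_mask = mask;
	return 0;
}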

RE: [RFC PATCH] iommu/dma: account pci host bridge dma_mask for IOVA allocation

2017-03-16 Thread Oza Oza via iommu
Hi,

There are certain areas which require contemplation.
This problem needs more attention from both the PCI/OF framework and
the IOMMU framework, and from the integration of the two.

Regards,
Oza.

-Original Message-
From: Oza Pawandeep [mailto:oza@broadcom.com]
Sent: Friday, March 17, 2017 11:41 AM
To: Joerg Roedel; Robin Murphy
Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
bcm-kernel-feedback-l...@broadcom.com; Oza Pawandeep
Subject: [RFC PATCH] iommu/dma: account pci host bridge dma_mask for IOVA
allocation

It is possible that PCI device supports 64-bit DMA addressing, and thus
it's driver sets device's dma_mask to DMA_BIT_MASK(64), however PCI host
bridge may have limitations on the inbound transaction addressing. As an
example, consider NVME SSD device connected to iproc-PCIe controller.

Currently, the IOMMU DMA ops only considers PCI device dma_mask when
allocating an IOVA. This is particularly problematic on
ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to PA for
in-bound transactions only after PCI Host has forwarded these transactions
on SOC IO bus. This means on such ARM/ARM64 SOCs the IOVA of in-bound
transactions has to honor the addressing restrictions of the PCI Host.

This patch is inspired by
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1306545.html
http://www.spinics.net/lists/arm-kernel/msg566947.html

However, the above only solves half of the problem; the rest of the
problem, which we face on iproc-based SOCs, is described below.

The current integration of the PCIe and OF frameworks assumes dma-ranges
in the form used by memory-mapped devices:
dma-ranges: (child-bus-address, parent-bus-address, length).

iproc-based SOCs (and R-Car based SOCs as well) instead carry PCI-world
dma-ranges:
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

of_dma_configure() is written specifically for memory-mapped devices;
no implementation exists for PCI to handle PCI-based memory ranges.
In fact the PCI world does not seem to define a standard dma-ranges
binding, and in the absence of one the dma_mask used to remain 32-bit,
because of_dma_configure() parsed a size of 0.

This patch therefore also implements of_pci_get_dma_ranges() to cater to
PCI-world dma-ranges, so that the returned size yields the best possible
(largest) dma_mask. For example, with
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>; we should get
dev->coherent_dma_mask=0x7f.

Conclusion: there are the following problems.
1) The Linux PCI and IOMMU framework integration has glitches with
   respect to dma-ranges.
2) The Linux PCI framework looks very uncertain about dma-ranges; the
   binding is not defined the way it is for memory-mapped devices.
   R-Car and iproc-based SOCs each use their own custom dma-ranges
   (which could instead be standardised).
3) Even the default parser, of_dma_get_range(), throws the error
   "no dma-ranges found for node" because of an existing bug: the
   following lines should be moved to the end of the while(1) loop.
839 node = of_get_next_parent(node);
840 if (!node)
841 break;

Reviewed-by: Anup Patel 
Reviewed-by: Scott Branden 
Signed-off-by: Oza Pawandeep 

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8c7c244..20cfff7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -217,6 +217,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y

+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y

diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 73d5bab..64b4dc3 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -20,6 +20,7 @@ struct dev_archdata {
 #ifdef CONFIG_IOMMU_API
void *iommu;/* private IOMMU data */
 #endif
+   u64 parent_dma_mask;
bool dma_coherent;
 };

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 81cdb2e..5845ecd 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -564,6 +564,7 @@ static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
__dma_flush_area(virt, PAGE_SIZE);
 }

+
 static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 dma_addr_t *handle, gfp_t gfp,
 unsigned long attrs)
@@ -795,6 +796,20 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
 	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
 }

+static int __iommu_set_dma_mask(struct device *dev, u64 mask)
+{
+   /* device is not DMA capable */
+   if (!dev->dma_mask)
+   return -EIO;
+
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   *dev->dma_mask = mask;
+
+   return 0;
+}
+
 static const struc