Re: [PATCH 2/3] iommu/pci: reserve iova for PCI masters

2017-05-05 Thread Oza Oza via iommu
On Fri, May 5, 2017 at 9:21 PM, Robin Murphy  wrote:
> On 04/05/17 19:52, Oza Oza wrote:
>> On Thu, May 4, 2017 at 11:50 PM, Robin Murphy  wrote:
>>> On 03/05/17 05:46, Oza Pawandeep wrote:
 this patch reserves the iova for PCI masters.
 ARM64 based SOCs may have scattered memory banks.
 such as iproc based SOC has

 <0x 0x8000 0x0 0x8000>, /* 2G @ 2G */
 <0x0008 0x8000 0x3 0x8000>, /* 14G @ 34G */
 <0x0090 0x 0x4 0x>, /* 16G @ 576G */
 <0x00a0 0x 0x4 0x>; /* 16G @ 640G */

 but incoming PCI transcation addressing capability is limited
 by host bridge, for example if max incoming window capability
 is 512 GB, then 0x0090 and 0x00a0 will fall beyond it.

 to address this problem, iommu has to avoid allocating iovas which
 are reserved. which inturn does not allocate iova if it falls into hole.
>>>
>>> I don't necessarily disagree with doing this, as we could do with facing
>>> up to the issue of discontiguous DMA ranges in particular (I too have a
>>> platform with this problem), but I'm still not overly keen on pulling DT
>>> specifics into this layer. More than that, though, if we are going to do
>>> it, then we should do it for all devices with a restrictive
>>> "dma-ranges", not just PCI ones.
>>>
>>
>> How do you propose to do it ?
>>
>> my thinking is this:
>> iova_reserve_pci_windows is written specific for PCI, and I am adding there.
>>
>> ideally
>> struct pci_host_bridge should have new member:
>>
>> struct list_head inbound_windows; /* resource_entry */
>>
>> but somehow this resource have to be filled much before
>> iommu_dma_init_domain happens.
>> and use brdge resource directly in iova_reserve_pci_windows as it is
>> already doing it for outbound memory.
>>
>> this will detach the DT specifics from dma-immu layer.
>> let me know how this sounds.
>
> Please check the state of the code currently queued in Joerg's tree and
> in linux-next - iommu_dma_get_resv_regions() has room for
> device-agnostic stuff before the if (!dev_is_pci(dev)) check.
>
> Furthermore, with the probe-deferral changes we end up with a common
> dma_configure() routine to abstract the firmware-specifics of
> of_dma_configure() vs. acpi_dma_configure(), so it would make sense to
> give drivers etc. a similar interface for interrogating ranges. i.e.
> some common function that abstracts the difference between parsing DT
> dma-ranges vs. the ACPI _DMA object, either with a list-based get/put
> model or perhaps an iterator with a user-provided callback (so users
> could process in-place or create their own list as necessary). Unless of
> course we go all the way to making the ranges an inherent part of the
> device layer like some MIPS platforms currently do.
>
> Robin.
>

you are suggesting to wait till iommu_dma_get_resv_regions gets in ?

Oza.


 Bug: SOC-5216
 Change-Id: Icbfc99a045d730be143fef427098c937b9d46353
 Signed-off-by: Oza Pawandeep 
 Reviewed-on: http://gerrit-ccxsw.broadcom.net/40760
 Reviewed-by: vpx_checkpatch status 
 Reviewed-by: CCXSW 
 Tested-by: vpx_autobuild status 
 Tested-by: vpx_smoketest status 
 Tested-by: CCXSW 
 Reviewed-by: Scott Branden 

 diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
 index 48d36ce..08764b0 100644
 --- a/drivers/iommu/dma-iommu.c
 +++ b/drivers/iommu/dma-iommu.c
 @@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
 +#include 
  #include 
  #include 
  #include 
 @@ -171,8 +172,12 @@ static void iova_reserve_pci_windows(struct pci_dev 
 *dev,
   struct iova_domain *iovad)
  {
   struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
 + struct device_node *np = bridge->dev.parent->of_node;
   struct resource_entry *window;
   unsigned long lo, hi;
 + int ret;
 + dma_addr_t tmp_dma_addr = 0, dma_addr;
 + LIST_HEAD(res);

   resource_list_for_each_entry(window, &bridge->windows) {
   if (resource_type(window->res) != IORESOURCE_MEM &&
 @@ -183,6 +188,36 @@ static void iova_reserve_pci_windows(struct pci_dev 
 *dev,
   hi = iova_pfn(iovad, window->res->end - window->offset);
   reserve_iova(iovad, lo, hi);
   }
 +
 + /* PCI inbound memory reservation. */
 + ret = of_pci_get_dma_ranges(np, &res);
 + if (!ret) {
 + resource_list_for_each_entry(window, &res) {
 + struct resource *res_dma = window->res;
 +
 + dma_addr = res_dma->start - window->offset;
 + if (tmp_dma_addr > dma_addr) {
 + pr_warn("PCI: failed to reserve iovas; 
 ranges should be sorted\n");
>>>
>>> I don't see anything in the DT spec about the entries ha

Re: [PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Oza via iommu
On Fri, May 5, 2017 at 8:55 PM, Robin Murphy  wrote:
> On 04/05/17 19:41, Oza Oza wrote:
> [...]
 5) leaves scope of adding PCI flag handling for inbound memory
 by the new function.
>>>
>>> Which flags would ever actually matter? DMA windows aren't going to be
>>> to config or I/O space, so the memory type can be assumed, and the
>>> 32/64-bit distinction is irrelevant as it's not a relocatable BAR;
>>> DMA-able system memory isn't going to be read-sensitive, so the
>>> prefetchable flag shouldn't matter; and not being a BAR none of the
>>> others would be relevant either.
>>>
>>
>> Thanks Robin; for your reply and attention:
>>
>> agree with you, at present it would not matter,
>> but it does not mean that we do not scope it to make it matter in future.
>>
>> now where it could matter:
>> there is Relaxed Ordering for inbound memory for PCI.
>> According to standard, Relaxed Ordering (RO) bit can be set only for
>> Memory requests and completions (if present in the original request).
>> Also, according to transaction ordering rules, I/O and configuration
>> requests can still be re-ordered ahead of each other.
>> and we would like to make use of it.
>> for e.g. lets say we mark memory as Relaxed Ordered with flag.
>> the special about this memory is incoming PCI transactions can be
>> reordered and rest memory has to be strongly ordered.
>
> Please look at "PCI Bus Binding to: IEEE Std 1275-1994 Standard for Boot
> (Initialization Configuration) Firmware" (as referenced in DTSpec) and
> explain how PCIe Relaxed Order has anything to do with the DT binding.
>
>> how it our SOC would make use of this is out of scope for the
>> discussion at this point of time, but I am just bringing in the
>> idea/point how flags could be useful
>> for inbound memory, since we would not like throw-away flags completely.
>
> The premise for implementing a PCI-specific parser is that you claim we
> need to do something with the phys.hi cell of a DT PCI address, rather
> than just taking the numerical part out of the phys.mid and phys.lo
> cells. Please make that argument in reference to the flags which that
> upper cell actually encodes, not unrelated things.
>

I think, the whole discussion around inbound flags is not what I
intended to bring.
this patch does nothing about inbound flag and never intends to solve
anything regarding inbound flags.
infact I would like to remove point 5 form the commit message. which
should keep it out of discussion completely.

let met tell what this patch is trying to address/solve 2 BUGs
1) fix wrong size return from of_dma_configure for PCI masters. (which
is right now BUG)

2) handles multiple dma-ranges cleanly

3) It takes care of dma-ranges being optional.

4) following is the comment on function of_dma_get_range (which is also a BUG)
"It returns -ENODEV if "dma-ranges" property was not found
 * for this device in DT."
which I think is wrong for PCI device, because if dma-ranges are
absent then instead of returning  -ENODEV,
it should return 0 with largest possible host memory.

it solves all the above 4 problems.

> [...]
 +int of_pci_get_dma_ranges(struct device_node *np, struct list_head 
 *resources)
 +{
 + struct device_node *node = of_node_get(np);
 + int rlen;
 + int ret = 0;
 + const int na = 3, ns = 2;
 + struct resource *res;
 + struct of_pci_range_parser parser;
 + struct of_pci_range range;
 +
 + if (!node)
 + return -EINVAL;
 +
 + parser.node = node;
 + parser.pna = of_n_addr_cells(node);
 + parser.np = parser.pna + na + ns;
 +
 + parser.range = of_get_property(node, "dma-ranges", &rlen);
 +
 + if (!parser.range) {
 + pr_debug("pcie device has no dma-ranges defined for 
 node(%s)\n",
 +   np->full_name);
 + ret = -EINVAL;
 + goto out;
 + }
 +
 + parser.end = parser.range + rlen / sizeof(__be32);
 +
 + for_each_of_pci_range(&parser, &range) {
>>>
>>> This is plain wrong - of_pci_range_parser_one() will translate upwards
>>> through parent "ranges" properties, which is completely backwards for
>>> DMA addresses.
>>>
>>> Robin.
>>>
>>
>> No it does not, this patch is thoroughly tested on our SOC and it works.
>> of_pci_range_parser_one does not translate upwards through parent. it
>> just sticks to given PCI parser.
>
> Frankly, I'm losing patience with this attitude. Please look at the code
> you call:
>
> #define for_each_of_pci_range(parser, range) \
> for (; of_pci_range_parser_one(parser, range);)
>
>
> struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser
> *parser,
> struct of_pci_range *range)
> {
> const int na = 3, ns = 2;
>
> if (!range)
> return NULL;
>
> if (!pa

Re: [PATCH v3 1/7] iommu/arm-smmu-v3: Introduce SMMU option PAGE0_REGS_ONLY for ThunderX2 errata #74

2017-05-05 Thread Robert Richter
On 05.05.17 17:38:05, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> Cavium ThunderX2 SMMU implementation doesn't support page 1 register space
> and PAGE0_REGS_ONLY option will be enabled as an errata workaround.
> 
> This option when turned on, replaces all page 1 offsets used for
> EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.
> 
> Signed-off-by: Linu Cherian 
> Signed-off-by: Geetha Sowjanya 
> ---
>  .../devicetree/bindings/iommu/arm,smmu-v3.txt  |  6 +++
>  drivers/iommu/arm-smmu-v3.c| 44 
> --
>  2 files changed, 38 insertions(+), 12 deletions(-)

> @@ -1995,8 +2011,10 @@ static int arm_smmu_init_queues(struct arm_smmu_device 
> *smmu)
>   if (!(smmu->features & ARM_SMMU_FEAT_PRI))
>   return 0;
>  
> - return arm_smmu_init_one_queue(smmu, &smmu->priq.q, ARM_SMMU_PRIQ_PROD,
> -ARM_SMMU_PRIQ_CONS, PRIQ_ENT_DWORDS);
> + return arm_smmu_init_one_queue(smmu, &smmu->priq.q,
> +ARM_SMMU_PRIQ_PROD(smmu),
> +ARM_SMMU_PRIQ_CONS(smmu),
> +PRIQ_ENT_DWORDS);

I would also suggest Robin's idea from the v1 review here. This works
if we rework arm_smmu_init_one_queue() to pass addresses instead of
offsets.

This would make these widespread offset calculations obsolete.

-Robert
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/7] iommu/arm-smmu-v3: Introduce SMMU option PAGE0_REGS_ONLY for ThunderX2 errata #74

2017-05-05 Thread Robert Richter
On 05.05.17 17:38:05, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> Cavium ThunderX2 SMMU implementation doesn't support page 1 register space
> and PAGE0_REGS_ONLY option will be enabled as an errata workaround.
> 
> This option when turned on, replaces all page 1 offsets used for
> EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.
> 
> Signed-off-by: Linu Cherian 
> Signed-off-by: Geetha Sowjanya 
> ---
>  .../devicetree/bindings/iommu/arm,smmu-v3.txt  |  6 +++
>  drivers/iommu/arm-smmu-v3.c| 44 
> --
>  2 files changed, 38 insertions(+), 12 deletions(-)

> @@ -412,6 +412,9 @@
>  #define MSI_IOVA_BASE0x800
>  #define MSI_IOVA_LENGTH  0x10
>  
> +#define ARM_SMMU_PAGE0_REGS_ONLY(smmu)   \
> + ((smmu)->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)

Why hide the check behind this macro? Maybe make
ARM_SMMU_OPT_PAGE0_REGS_ONLY shorter a bit instead?

-Robert
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 0/7] Cavium ThunderX2 SMMUv3 errata workarounds

2017-05-05 Thread Robert Richter
On 05.05.17 17:38:04, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> Cavium ThunderX2 SMMUv3 implementation has two Silicon Erratas.
> 1. Errata ID #74
>SMMU register alias Page 1 is not implemented
> 2. Errata ID #126
>SMMU doesnt support unique IRQ lines and also MSI for gerror, 
>eventq and cmdq-sync
> 
> The following patchset does software workaround for these two erratas.
> 
> This series is based on patchset. 
> https://www.spinics.net/lists/arm-kernel/msg578443.html
> 
> Changes from v1:
>  Since the use of MIDR register is rejected and SMMU_IIDR is broken on this 
>  silicon, as suggested by Will Deacon modified the patches to use ThunderX2 
>  SMMUv3 IORT model number to enable errata workaround.
> 
> Changes from v2:
>  Updated "Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt" document 
> with 
>  new SMMU option used to enable errata workaround.
>  
> Geetha Sowjanya (1):
>   iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
> 
> Linu Cherian (6):
>   iommu/arm-smmu-v3: Introduce smmu option PAGE0_REGS_ONLY for ThunderX2
> errata#74.
>   iommu/arm-smmu-v3: Do resource size checks based on SMMU option
> PAGE0_REGS_ONLY
>   ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.
>   iommu/arm-smmu-v3: For ACPI based device probing, set PAGE0_REGS_ONLY
> option for ThunderX2 SMMUv3 implementations.
>   ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3
> model
>   arm64: Documentation: Add Cavium ThunderX2 SMMUv3 erratas

This split into patches does not look reasonable to me. 1 patch only
for each workaround should be sufficient.

-Robert

> 
>  Documentation/arm64/silicon-errata.txt |   2 +
>  drivers/acpi/arm64/iort.c  |  10 ++-
>  drivers/iommu/arm-smmu-v3.c| 122 
> ++---
>  include/acpi/actbl2.h  |   2 +
>  4 files changed, 110 insertions(+), 26 deletions(-)
> 
> -- 
> 1.8.3.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 5/7] ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model

2017-05-05 Thread Robert Richter
On 05.05.17 17:38:09, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> Cavium ThunderX2 implementation doesn't support second page in SMMU
> register space. Hence, resource size is set as 64k for this model.
> 
> Signed-off-by: Linu Cherian 
> Signed-off-by: Geetha Sowjanya 
> ---
>  drivers/acpi/arm64/iort.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)

The whole patch can be dropped. See my comment in #2.

-Robert

> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index c5fecf9..23c5350 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -833,12 +833,20 @@ static void __init arm_smmu_v3_init_resources(struct 
> resource *res,
>  {
>   struct acpi_iort_smmu_v3 *smmu;
>   int num_res = 0;
> + unsigned long size = SZ_128K;
>  
>   /* Retrieve SMMUv3 specific data */
>   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
>  
> + /*
> +  * Override the size, for Cavium ThunderX2 implementation
> +  * which doesn't support the page 1 SMMU register space.
> +  */
> + if (smmu->model == ACPI_IORT_SMMU_V3_CAVIUM_CN99XX)
> + size = SZ_64K;
> +
>   res[num_res].start = smmu->base_address;
> - res[num_res].end = smmu->base_address + SZ_128K - 1;
> + res[num_res].end = smmu->base_address + size - 1;
>   res[num_res].flags = IORESOURCE_MEM;
>  
>   num_res++;
> -- 
> 1.8.3.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 2/7] iommu/arm-smmu-v3: Do resource size checks based on SMMU

2017-05-05 Thread Robert Richter
On 05.05.17 17:38:06, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> With implementations supporting only page 0 register space,
> resource size can be 64k as well and hence perform size checks
> based on SMMU option PAGE0_REGS_ONLY.
> 
> For this, arm_smmu_device_dt_probe/acpi_probe has been moved before
> platform_get_resource call, so that SMMU options are set beforehand.
> 
> Signed-off-by:  Linu Cherian 
> Signed-off-by:  Geetha Sowjanya 
> ---
>  drivers/iommu/arm-smmu-v3.c | 26 +-
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 107b4a6..f027676 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2672,6 +2672,14 @@ static int arm_smmu_device_dt_probe(struct 
> platform_device *pdev,
>   return ret;
>  }
>  
> +static unsigned long arm_smmu_resource_size(struct arm_smmu_device *smmu)
> +{
> + if (ARM_SMMU_PAGE0_REGS_ONLY(smmu))
> + return SZ_64K;
> + else
> + return SZ_128K;
> +}
> +

I think this can be dropped. See below.

>  static int arm_smmu_device_probe(struct platform_device *pdev)
>  {
>   int irq, ret;
> @@ -2688,9 +2696,17 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   }
>   smmu->dev = dev;
>  
> + if (dev->of_node) {
> + ret = arm_smmu_device_dt_probe(pdev, smmu);
> + } else {
> + ret = arm_smmu_device_acpi_probe(pdev, smmu);
> + if (ret == -ENODEV)
> + return ret;
> + }
> +
>   /* Base address */
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - if (resource_size(res) + 1 < SZ_128K) {
> + if (resource_size(res) + 1 < arm_smmu_resource_size(smmu)) {
>   dev_err(dev, "MMIO region too small (%pr)\n", res);
>   return -EINVAL;
>   }

Why not just do the follwoing here:

/* Base address */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (resource_size(res) + 1 < arm_smmu_resource_size(smmu)) {
dev_err(dev, "MMIO region too small (%pr)\n", res);
return -EINVAL;
}
ioaddr = res->start;

+   /*
+* Override the size, for Cavium ThunderX2 implementation
+* which doesn't support the page 1 SMMU register space.
+*/
+   if (smmu->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)
+   res->end = res->size + SZ_64K -1;
+
smmu->base = devm_ioremap_resource(dev, res);
if (IS_ERR(smmu->base))
return PTR_ERR(smmu->base);

Since we can drop patch #5 then, the fix would be isolated to this
file only. And we can use smmu->options as the onle check for this.

-Robert

> @@ -2717,14 +2733,6 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   if (irq > 0)
>   smmu->gerr_irq = irq;
>  
> - if (dev->of_node) {
> - ret = arm_smmu_device_dt_probe(pdev, smmu);
> - } else {
> - ret = arm_smmu_device_acpi_probe(pdev, smmu);
> - if (ret == -ENODEV)
> - return ret;
> - }
> -
>   /* Set bypass mode according to firmware probing result */
>   bypass = !!ret;
>  
> -- 
> 1.8.3.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 3:50 PM, Rob Herring  wrote:
> On Fri, May 5, 2017 at 2:37 PM, Rob Clark  wrote:
>> On Fri, May 5, 2017 at 3:04 PM, Rob Herring  wrote:
>>> On Fri, May 5, 2017 at 1:21 PM, Rob Clark  wrote:
 An iommu driver for Qualcomm "B" family devices which do not completely
 implement the ARM SMMU spec.  These devices have context-bank register
 layout that is similar to ARM SMMU, but no global register space (or at
 least not one that is accessible).

 Signed-off-by: Rob Clark 
 ---
 v1: original
 v2: bindings cleanups and kconfig issues that kbuild robot pointed out
 v4: fix issues pointed out by Rob H. and actually make device removal
 work
 v3: fix WARN_ON() splats reported by Archit

  drivers/iommu/Kconfig  |   9 +
  drivers/iommu/Makefile |   1 +
  drivers/iommu/qcom_iommu.c | 833 
 +
  3 files changed, 843 insertions(+)
  create mode 100644 drivers/iommu/qcom_iommu.c

 diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
 index 37e204f..f8f79a4 100644
 --- a/drivers/iommu/Kconfig
 +++ b/drivers/iommu/Kconfig
 @@ -359,4 +359,13 @@ config MTK_IOMMU_V1

   if unsure, say N here.

 +config QCOM_IOMMU
 +   bool "Qualcomm IOMMU Support"
>>>
>>> Either this needs to be tristate or...
>>>
>>> [...]
 +#include 
>>>
>>> this include and the things that need it should go. Or some
>>> explanation like "once X happens, then we can enable as module" and
>>> leave it all for now.
>>
>> tbh, I'm not sure what the issue is for modules (other than
>> potentially that you'd want the iommu driver fairly early in boot if
>> you didn't have an initrd).  I just saw that the other iommu drivers
>> are all bool.  (Sorry, I don't really follow iommu-devel so not
>> familiar with the history.)  With my distro hat on, I would prefer
>> them to be modules eventually.
>
> For starters, does it even build as a module if you allow that? It
> might not work because of some run-time ordering, but that's good
> enough for this discussion.

It does in fact build as a module..  I suppose I need to figure out a
more convenient way to re-pack modules in an initrd to actually test
it and see what does or does not explode..

BR,
-R

>>
>>> See this[1] for some background.
>>
>> it mentions there are some downsides, but I can't see where those
>> downsides are listed ;-)
>>
>> I would kinda prefer to leave the MODULE_*() stuff in place unless
>> modular iommu drivers are never going to happen.
>
> Maybe a note, so the module police don't fix it.
>
> Rob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 3:58 PM, Greg KH  wrote:
> On Fri, May 05, 2017 at 02:56:00PM -0400, Rob Clark wrote:
>> On Fri, May 5, 2017 at 2:24 PM, Greg KH  wrote:
>> > On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
>> >> It looks like it *used* to make sense to free the device.  But now it is
>> >> embedded in 'struct iommu' (which is allocated or embedded in something
>> >> that the device allocated).
>> >>
>> >> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
>> >>
>> >> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
>> >> Signed-off-by: Rob Clark 
>> >> ---
>> >>  drivers/iommu/iommu-sysfs.c | 1 -
>> >>  1 file changed, 1 deletion(-)
>> >>
>> >> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
>> >> index c58351e..ad19cbb 100644
>> >> --- a/drivers/iommu/iommu-sysfs.c
>> >> +++ b/drivers/iommu/iommu-sysfs.c
>> >> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] 
>> >> = {
>> >>
>> >>  static void iommu_release_device(struct device *dev)
>> >>  {
>> >> - kfree(dev);
>> >>  }
>> >
>> > As per the documentation in the kernel tree, I now get to make fun of
>> > you for doing such a crazh and foolish thing!
>> >
>> > Come on, don't do that, a release function _HAS_ to free the memory
>> > involved.  If not, then it is really broken...
>>
>> There are shenanigans going on.. so release isn't counterpoint to a
>> _probe() which allocates some memory.  See iommu_device_sysfs_add().
>> So I'm not the one you get to make fun of ;-)
>>
>> This the correct thing to do.  Whether the way the extra fake device
>> embedded in something allocated in the iommu driver's probe (and
>> free'd it *it's* _release()) stuff for iommu sysfs stuff works is
>> bonkers or not, I'll let someone else decide..  it was like that when
>> I got here.
>
> If you have multiple reference counts in the same structure, your code
> is wrong.  That is the root issue here that needs to be resolved.  Yes,
> your patch papers over that, but again, it isn't right either.
>

fair enough, I should have been more precise and said that this patch
is "the correct thing to do for how the code works now".. as far as
bigger refactoring, I'll leave that to someone who understands why the
code works the way it currently does.  My patch at least makes things
less wrong.  (But removing an iommu is kind of a crazy thing to do so
it's perhaps a rather theoretical problem.)

BR,
-R
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: intel: Flush the IOTLB to get rid of the initial kdump mappings

2017-05-05 Thread David Woodhouse
On Fri, 2017-05-05 at 11:39 -0700, KarimAllah Ahmed wrote:
> Ever since commit 091d42e43d ("iommu/vt-d: Copy translation tables from
> old kernel") the kdump kernel copies the IOMMU context tables from the
> previous kernel. Each device mappings will be destroyed once the driver
> for the respective device takes over.
> 
> This unfortunately breaks the workflow of mapping and unmapping a new
> context to the IOMMU. The mapping function assumes that either:
> 
> 1) Unmapping did the proper IOMMU flushing and it only ever flush if the
>    IOMMU unit supports caching invalid entries.
> 2) The system just booted and the initialization code took care of
>    flushing all IOMMU caches.
> 
> This assumption is not true for the kdump kernel since the context
> tables have been copied from the previous kernel and translations could
> have been cached ever since. So make sure to flush the IOTLB as well
> when we destroy these old copied mappings.
> 
> Cc: Joerg Roedel 
> Cc: David Woodhouse 
> Cc: David Woodhouse 
> Cc: Anthony Liguori 
> Signed-off-by: KarimAllah Ahmed 

Acked-by: David Woodhouse 
Cc: sta...@vger.kernel.org  v4.2+

I'm still moderately unhappy about the whole "preserve existing
mappings during kdump" thing, and wanted to have a PCI quirk for the
known-broken-can't-be-reset-after-fault devices, and trigger this
behaviour only then. Although I have a vague recollection of there
being a slightly saner justification for it... perhaps this should be
documented, if there is one?

smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Greg KH
On Fri, May 05, 2017 at 02:56:00PM -0400, Rob Clark wrote:
> On Fri, May 5, 2017 at 2:24 PM, Greg KH  wrote:
> > On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
> >> It looks like it *used* to make sense to free the device.  But now it is
> >> embedded in 'struct iommu' (which is allocated or embedded in something
> >> that the device allocated).
> >>
> >> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
> >>
> >> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
> >> Signed-off-by: Rob Clark 
> >> ---
> >>  drivers/iommu/iommu-sysfs.c | 1 -
> >>  1 file changed, 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
> >> index c58351e..ad19cbb 100644
> >> --- a/drivers/iommu/iommu-sysfs.c
> >> +++ b/drivers/iommu/iommu-sysfs.c
> >> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] 
> >> = {
> >>
> >>  static void iommu_release_device(struct device *dev)
> >>  {
> >> - kfree(dev);
> >>  }
> >
> > As per the documentation in the kernel tree, I now get to make fun of
> > you for doing such a crazh and foolish thing!
> >
> > Come on, don't do that, a release function _HAS_ to free the memory
> > involved.  If not, then it is really broken...
> 
> There are shenanigans going on.. so release isn't counterpoint to a
> _probe() which allocates some memory.  See iommu_device_sysfs_add().
> So I'm not the one you get to make fun of ;-)
> 
> This the correct thing to do.  Whether the way the extra fake device
> embedded in something allocated in the iommu driver's probe (and
> free'd it *it's* _release()) stuff for iommu sysfs stuff works is
> bonkers or not, I'll let someone else decide..  it was like that when
> I got here.

If you have multiple reference counts in the same structure, your code
is wrong.  That is the root issue here that needs to be resolved.  Yes,
your patch papers over that, but again, it isn't right either.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Herring
On Fri, May 5, 2017 at 2:37 PM, Rob Clark  wrote:
> On Fri, May 5, 2017 at 3:04 PM, Rob Herring  wrote:
>> On Fri, May 5, 2017 at 1:21 PM, Rob Clark  wrote:
>>> An iommu driver for Qualcomm "B" family devices which do not completely
>>> implement the ARM SMMU spec.  These devices have context-bank register
>>> layout that is similar to ARM SMMU, but no global register space (or at
>>> least not one that is accessible).
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> v1: original
>>> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
>>> v4: fix issues pointed out by Rob H. and actually make device removal
>>> work
>>> v3: fix WARN_ON() splats reported by Archit
>>>
>>>  drivers/iommu/Kconfig  |   9 +
>>>  drivers/iommu/Makefile |   1 +
>>>  drivers/iommu/qcom_iommu.c | 833 
>>> +
>>>  3 files changed, 843 insertions(+)
>>>  create mode 100644 drivers/iommu/qcom_iommu.c
>>>
>>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>>> index 37e204f..f8f79a4 100644
>>> --- a/drivers/iommu/Kconfig
>>> +++ b/drivers/iommu/Kconfig
>>> @@ -359,4 +359,13 @@ config MTK_IOMMU_V1
>>>
>>>   if unsure, say N here.
>>>
>>> +config QCOM_IOMMU
>>> +   bool "Qualcomm IOMMU Support"
>>
>> Either this needs to be tristate or...
>>
>> [...]
>>> +#include 
>>
>> this include and the things that need it should go. Or some
>> explanation like "once X happens, then we can enable as module" and
>> leave it all for now.
>
> tbh, I'm not sure what the issue is for modules (other than
> potentially that you'd want the iommu driver fairly early in boot if
> you didn't have an initrd).  I just saw that the other iommu drivers
> are all bool.  (Sorry, I don't really follow iommu-devel so not
> familiar with the history.)  With my distro hat on, I would prefer
> them to be modules eventually.

For starters, does it even build as a module if you allow that? It
might not work because of some run-time ordering, but that's good
enough for this discussion.

>
>> See this[1] for some background.
>
> it mentions there are some downsides, but I can't see where those
> downsides are listed ;-)
>
> I would kinda prefer to leave the MODULE_*() stuff in place unless
> modular iommu drivers are never going to happen.

Maybe a note, so the module police don't fix it.

Rob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 3:04 PM, Rob Herring  wrote:
> On Fri, May 5, 2017 at 1:21 PM, Rob Clark  wrote:
>> An iommu driver for Qualcomm "B" family devices which do not completely
>> implement the ARM SMMU spec.  These devices have context-bank register
>> layout that is similar to ARM SMMU, but no global register space (or at
>> least not one that is accessible).
>>
>> Signed-off-by: Rob Clark 
>> ---
>> v1: original
>> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
>> v4: fix issues pointed out by Rob H. and actually make device removal
>> work
>> v3: fix WARN_ON() splats reported by Archit
>>
>>  drivers/iommu/Kconfig  |   9 +
>>  drivers/iommu/Makefile |   1 +
>>  drivers/iommu/qcom_iommu.c | 833 
>> +
>>  3 files changed, 843 insertions(+)
>>  create mode 100644 drivers/iommu/qcom_iommu.c
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 37e204f..f8f79a4 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -359,4 +359,13 @@ config MTK_IOMMU_V1
>>
>>   if unsure, say N here.
>>
>> +config QCOM_IOMMU
>> +   bool "Qualcomm IOMMU Support"
>
> Either this needs to be tristate or...
>
> [...]
>> +#include 
>
> this include and the things that need it should go. Or some
> explanation like "once X happens, then we can enable as module" and
> leave it all for now.

tbh, I'm not sure what the issue is for modules (other than
potentially that you'd want the iommu driver fairly early in boot if
you didn't have an initrd).  I just saw that the other iommu drivers
are all bool.  (Sorry, I don't really follow iommu-devel so not
familiar with the history.)  With my distro hat on, I would prefer
them to be modules eventually.

> See this[1] for some background.

it mentions there are some downsides, but I can't see where those
downsides are listed ;-)

I would kinda prefer to leave the MODULE_*() stuff in place unless
modular iommu drivers are never going to happen.

BR,
-R

> Rob
>
> [1] https://lwn.net/Articles/643854/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: intel: Flush the IOTLB to get rid of the initial kdump mappings

2017-05-05 Thread KarimAllah Ahmed via iommu
Ever since commit 091d42e43d ("iommu/vt-d: Copy translation tables from
old kernel") the kdump kernel copies the IOMMU context tables from the
previous kernel. Each device mappings will be destroyed once the driver
for the respective device takes over.

This unfortunately breaks the workflow of mapping and unmapping a new
context to the IOMMU. The mapping function assumes that either:

1) Unmapping did the proper IOMMU flushing and it only ever flush if the
   IOMMU unit supports caching invalid entries.
2) The system just booted and the initialization code took care of
   flushing all IOMMU caches.

This assumption is not true for the kdump kernel since the context
tables have been copied from the previous kernel and translations could
have been cached ever since. So make sure to flush the IOTLB as well
when we destroy these old copied mappings.

Cc: Joerg Roedel 
Cc: David Woodhouse 
Cc: David Woodhouse 
Cc: Anthony Liguori 
Signed-off-by: KarimAllah Ahmed 
---
 drivers/iommu/intel-iommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d412a31..478130d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2050,11 +2050,14 @@ static int domain_context_mapping_one(struct 
dmar_domain *domain,
if (context_copied(context)) {
u16 did_old = context_domain_id(context);
 
-   if (did_old >= 0 && did_old < cap_ndoms(iommu->cap))
+   if (did_old >= 0 && did_old < cap_ndoms(iommu->cap)) {
iommu->flush.flush_context(iommu, did_old,
   (((u16)bus) << 8) | devfn,
   DMA_CCMD_MASK_NOBIT,
   DMA_CCMD_DEVICE_INVL);
+   iommu->flush.flush_iotlb(iommu, did_old, 0, 0,
+DMA_TLB_DSI_FLUSH);
+   }
}
 
pgd = domain->pgd;
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Herring
On Fri, May 5, 2017 at 1:21 PM, Rob Clark  wrote:
> An iommu driver for Qualcomm "B" family devices which do not completely
> implement the ARM SMMU spec.  These devices have context-bank register
> layout that is similar to ARM SMMU, but no global register space (or at
> least not one that is accessible).
>
> Signed-off-by: Rob Clark 
> ---
> v1: original
> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
> v4: fix issues pointed out by Rob H. and actually make device removal
> work
> v3: fix WARN_ON() splats reported by Archit
>
>  drivers/iommu/Kconfig  |   9 +
>  drivers/iommu/Makefile |   1 +
>  drivers/iommu/qcom_iommu.c | 833 
> +
>  3 files changed, 843 insertions(+)
>  create mode 100644 drivers/iommu/qcom_iommu.c
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 37e204f..f8f79a4 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -359,4 +359,13 @@ config MTK_IOMMU_V1
>
>   if unsure, say N here.
>
> +config QCOM_IOMMU
> +   bool "Qualcomm IOMMU Support"

Either this needs to be tristate or...

[...]
> +#include 

this include and the things that need it should go. Or some
explanation like "once X happens, then we can enable as module" and
leave it all for now.

See this[1] for some background.

Rob

[1] https://lwn.net/Articles/643854/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 2:24 PM, Greg KH  wrote:
> On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
>> It looks like it *used* to make sense to free the device.  But now it is
>> embedded in 'struct iommu' (which is allocated or embedded in something
>> that the device allocated).
>>
>> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
>>
>> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
>> Signed-off-by: Rob Clark 
>> ---
>>  drivers/iommu/iommu-sysfs.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
>> index c58351e..ad19cbb 100644
>> --- a/drivers/iommu/iommu-sysfs.c
>> +++ b/drivers/iommu/iommu-sysfs.c
>> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
>>
>>  static void iommu_release_device(struct device *dev)
>>  {
>> - kfree(dev);
>>  }
>
> As per the documentation in the kernel tree, I now get to make fun of
> you for doing such a crazh and foolish thing!
>
> Come on, don't do that, a release function _HAS_ to free the memory
> involved.  If not, then it is really broken...

There are shenanigans going on.. so release isn't counterpoint to a
_probe() which allocates some memory.  See iommu_device_sysfs_add().
So I'm not the one you get to make fun of ;-)

This the correct thing to do.  Whether the way the extra fake device
embedded in something allocated in the iommu driver's probe (and
free'd it *it's* _release()) stuff for iommu sysfs stuff works is
bonkers or not, I'll let someone else decide..  it was like that when
I got here.

BR,
-R

> greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Greg KH
On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
> It looks like it *used* to make sense to free the device.  But now it is
> embedded in 'struct iommu' (which is allocated or embedded in something
> that the device allocated).
> 
> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
> 
> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/iommu-sysfs.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
> index c58351e..ad19cbb 100644
> --- a/drivers/iommu/iommu-sysfs.c
> +++ b/drivers/iommu/iommu-sysfs.c
> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
>  
>  static void iommu_release_device(struct device *dev)
>  {
> - kfree(dev);
>  }

As per the documentation in the kernel tree, I now get to make fun of
you for doing such a crazh and foolish thing!

Come on, don't do that, a release function _HAS_ to free the memory
involved.  If not, then it is really broken...

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark 
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v4: fix issues pointed out by Rob H. and actually make device removal
work
v3: fix WARN_ON() splats reported by Archit

 drivers/iommu/Kconfig  |   9 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 833 +
 3 files changed, 843 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..f8f79a4 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,13 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..025780a
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,833 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context 
*/
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in 
qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+   struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+   WARN_ON(!qcom_iommu);
+   return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned 
asid)
+{
+   struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+   struct qcom_iommu_ctx *ctx;
+
+   if (!qcom_iommu)
+   return NULL;
+
+   list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+   if (ctx->asid == asid)
+   return ctx;
+
+   W

[PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
It looks like it *used* to make sense to free the device.  But now it is
embedded in 'struct iommu' (which is allocated or embedded in something
that the device allocated).

Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.

Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
Signed-off-by: Rob Clark 
---
 drivers/iommu/iommu-sysfs.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
index c58351e..ad19cbb 100644
--- a/drivers/iommu/iommu-sysfs.c
+++ b/drivers/iommu/iommu-sysfs.c
@@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
 
 static void iommu_release_device(struct device *dev)
 {
-   kfree(dev);
 }
 
 static struct class iommu_class = {
-- 
2.9.3

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-05-05 Thread Alex Williamson
On Wed, 3 May 2017 12:28:35 -0400
Nick Sarnie  wrote:

> On Wed, May 3, 2017 at 10:37 AM, Matthias Ehrenfeuchter  
> wrote:
> > Hi,
> >
> > There are a lot of messages/threads out there about bad performance while
> > using AMDs Ryzen with KVM GPU passthrough. It revolves all on
> > enabling/disabling npt, while enabled overall VM performance is nice but the
> > GPU performance gives me about 20% (and a lot of drops to zero GPU usage,
> > while CPU/Disk/Ram also doing nothing) compared to npt disabled. But while
> > npt is disabled overall VM performance is like beeing on 4x86 with floppy
> > disk as only storage. (Ex. it takes 2 seconds just to open startmenu while
> > host and vm are in idle, and neither CPU pinning, changing CPU model,
> > changing storage device nor using hugepages changed anything).
> >
> > So everything I read pointed to a bug in the npt implementation? Anything I
> > could do to get closer to the "thing" issuing this?
> >
> > Best Regards
> >
> > efeu
> > ___
> > iommu mailing list
> > iommu@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu  
> 
> I heard from Joerg that it might be related to a lower intercept rate
> being used when NPT is enabled, but we haven't been able to find a way
> to trace that to confirm.

Joerg/Paolo, any ideas how we might debug this?  Anyone from AMD
watching?  Thanks,

Alex
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 13/32] x86/boot/e820: Add support to determine the E820 type of an address

2017-05-05 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 04:18:31PM -0500, Tom Lendacky wrote:
> Add a function that will return the E820 type associated with an address
> range.

...

> @@ -110,9 +111,28 @@ bool __init e820__mapped_all(u64 start, u64 end, enum 
> e820_type type)
>* coverage of the desired range exists:
>*/
>   if (start >= end)
> - return 1;
> + return entry;
>   }
> - return 0;
> +
> + return NULL;
> +}
> +
> +/*
> + * This function checks if the entire range  is mapped with type.
> + */
> +bool __init e820__mapped_all(u64 start, u64 end, enum e820_type type)
> +{
> + return __e820__mapped_all(start, end, type) ? 1 : 0;

return !!__e820__mapped_all(start, end, type);

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu/pci: reserve iova for PCI masters

2017-05-05 Thread Robin Murphy
On 04/05/17 19:52, Oza Oza wrote:
> On Thu, May 4, 2017 at 11:50 PM, Robin Murphy  wrote:
>> On 03/05/17 05:46, Oza Pawandeep wrote:
>>> this patch reserves the iova for PCI masters.
>>> ARM64 based SOCs may have scattered memory banks.
>>> such as iproc based SOC has
>>>
>>> <0x 0x8000 0x0 0x8000>, /* 2G @ 2G */
>>> <0x0008 0x8000 0x3 0x8000>, /* 14G @ 34G */
>>> <0x0090 0x 0x4 0x>, /* 16G @ 576G */
>>> <0x00a0 0x 0x4 0x>; /* 16G @ 640G */
>>>
>>> but incoming PCI transcation addressing capability is limited
>>> by host bridge, for example if max incoming window capability
>>> is 512 GB, then 0x0090 and 0x00a0 will fall beyond it.
>>>
>>> to address this problem, iommu has to avoid allocating iovas which
>>> are reserved. which inturn does not allocate iova if it falls into hole.
>>
>> I don't necessarily disagree with doing this, as we could do with facing
>> up to the issue of discontiguous DMA ranges in particular (I too have a
>> platform with this problem), but I'm still not overly keen on pulling DT
>> specifics into this layer. More than that, though, if we are going to do
>> it, then we should do it for all devices with a restrictive
>> "dma-ranges", not just PCI ones.
>>
> 
> How do you propose to do it ?
> 
> my thinking is this:
> iova_reserve_pci_windows is written specific for PCI, and I am adding there.
> 
> ideally
> struct pci_host_bridge should have new member:
> 
> struct list_head inbound_windows; /* resource_entry */
> 
> but somehow this resource have to be filled much before
> iommu_dma_init_domain happens.
> and use brdge resource directly in iova_reserve_pci_windows as it is
> already doing it for outbound memory.
> 
> this will detach the DT specifics from dma-immu layer.
> let me know how this sounds.

Please check the state of the code currently queued in Joerg's tree and
in linux-next - iommu_dma_get_resv_regions() has room for
device-agnostic stuff before the if (!dev_is_pci(dev)) check.

Furthermore, with the probe-deferral changes we end up with a common
dma_configure() routine to abstract the firmware-specifics of
of_dma_configure() vs. acpi_dma_configure(), so it would make sense to
give drivers etc. a similar interface for interrogating ranges. i.e.
some common function that abstracts the difference between parsing DT
dma-ranges vs. the ACPI _DMA object, either with a list-based get/put
model or perhaps an iterator with a user-provided callback (so users
could process in-place or create their own list as necessary). Unless of
course we go all the way to making the ranges an inherent part of the
device layer like some MIPS platforms currently do.

Robin.

>>> Bug: SOC-5216
>>> Change-Id: Icbfc99a045d730be143fef427098c937b9d46353
>>> Signed-off-by: Oza Pawandeep 
>>> Reviewed-on: http://gerrit-ccxsw.broadcom.net/40760
>>> Reviewed-by: vpx_checkpatch status 
>>> Reviewed-by: CCXSW 
>>> Tested-by: vpx_autobuild status 
>>> Tested-by: vpx_smoketest status 
>>> Tested-by: CCXSW 
>>> Reviewed-by: Scott Branden 
>>>
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index 48d36ce..08764b0 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -27,6 +27,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  #include 
>>>  #include 
>>>  #include 
>>> @@ -171,8 +172,12 @@ static void iova_reserve_pci_windows(struct pci_dev 
>>> *dev,
>>>   struct iova_domain *iovad)
>>>  {
>>>   struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
>>> + struct device_node *np = bridge->dev.parent->of_node;
>>>   struct resource_entry *window;
>>>   unsigned long lo, hi;
>>> + int ret;
>>> + dma_addr_t tmp_dma_addr = 0, dma_addr;
>>> + LIST_HEAD(res);
>>>
>>>   resource_list_for_each_entry(window, &bridge->windows) {
>>>   if (resource_type(window->res) != IORESOURCE_MEM &&
>>> @@ -183,6 +188,36 @@ static void iova_reserve_pci_windows(struct pci_dev 
>>> *dev,
>>>   hi = iova_pfn(iovad, window->res->end - window->offset);
>>>   reserve_iova(iovad, lo, hi);
>>>   }
>>> +
>>> + /* PCI inbound memory reservation. */
>>> + ret = of_pci_get_dma_ranges(np, &res);
>>> + if (!ret) {
>>> + resource_list_for_each_entry(window, &res) {
>>> + struct resource *res_dma = window->res;
>>> +
>>> + dma_addr = res_dma->start - window->offset;
>>> + if (tmp_dma_addr > dma_addr) {
>>> + pr_warn("PCI: failed to reserve iovas; ranges 
>>> should be sorted\n");
>>
>> I don't see anything in the DT spec about the entries having to be
>> sorted, and it's not exactly impossible to sort a list if you need it so
>> (and if I'm being really pedantic, one could still trigger this with a
>> list that *is* sorted, only by different criteria).
>>
> 
> we have to sort it the way we wan

Re: [PATCH v3 3/7] ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.

2017-05-05 Thread Jon Masters
On 05/05/2017 10:58 AM, Will Deacon wrote:
> On Fri, May 05, 2017 at 07:56:17AM -0700, David Daney wrote:
>> On 05/05/2017 06:53 AM, Hanjun Guo wrote:
>>> On 2017/5/5 20:08, Geetha sowjanya wrote:
 From: Linu Cherian 

 +#define ACPI_IORT_SMMU_V3_CAVIUM_CN99XX 0x0002 /* Cavium ThunderX2
 SMMUv3 */
>>>
>>> There are some other model numbers in the unreleased spec,
>>> I think we need to wait for the updated IORT spec to
>>> be released.

Indeed. I've synced with the author on this and he's got it in hand.

>> ... or if we are fairly confident that the identifier will not need to
>> change, we can merge this as is and establish a de facto specification that
>> the Real IORT specification will then be forced to follow.

Can't do that - this always causes trouble ;) But if there's any delay
I'll ask that the IDs at least be listed somewhere public or something.

>> Is there anything other than bureaucratic inertia holding up the real
>> specification?
> 
> My understanding is that IORT is going to be published imminently (i.e.
> before the next kernel release), so it makes sense to wait rather than fork
> the spec.

Let's track this and get the updated patches posted next week once the
new ID drops. Meanwhile, I suggest reviewing them as-is for other
issues. I'm tracking this for internal purposes and require this to be
upstream asap so I'll be sitting on this thread for updates ;)

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Robin Murphy
On 04/05/17 19:41, Oza Oza wrote:
[...]
>>> 5) leaves scope of adding PCI flag handling for inbound memory
>>> by the new function.
>>
>> Which flags would ever actually matter? DMA windows aren't going to be
>> to config or I/O space, so the memory type can be assumed, and the
>> 32/64-bit distinction is irrelevant as it's not a relocatable BAR;
>> DMA-able system memory isn't going to be read-sensitive, so the
>> prefetchable flag shouldn't matter; and not being a BAR none of the
>> others would be relevant either.
>>
> 
> Thanks Robin; for your reply and attention:
> 
> agree with you, at present it would not matter,
> but it does not mean that we do not scope it to make it matter in future.
> 
> now where it could matter:
> there is Relaxed Ordering for inbound memory for PCI.
> According to standard, Relaxed Ordering (RO) bit can be set only for
> Memory requests and completions (if present in the original request).
> Also, according to transaction ordering rules, I/O and configuration
> requests can still be re-ordered ahead of each other.
> and we would like to make use of it.
> for e.g. lets say we mark memory as Relaxed Ordered with flag.
> the special about this memory is incoming PCI transactions can be
> reordered and rest memory has to be strongly ordered.

Please look at "PCI Bus Binding to: IEEE Std 1275-1994 Standard for Boot
(Initialization Configuration) Firmware" (as referenced in DTSpec) and
explain how PCIe Relaxed Order has anything to do with the DT binding.

> how it our SOC would make use of this is out of scope for the
> discussion at this point of time, but I am just bringing in the
> idea/point how flags could be useful
> for inbound memory, since we would not like throw-away flags completely.

The premise for implementing a PCI-specific parser is that you claim we
need to do something with the phys.hi cell of a DT PCI address, rather
than just taking the numerical part out of the phys.mid and phys.lo
cells. Please make that argument in reference to the flags which that
upper cell actually encodes, not unrelated things.

[...]
>>> +int of_pci_get_dma_ranges(struct device_node *np, struct list_head 
>>> *resources)
>>> +{
>>> + struct device_node *node = of_node_get(np);
>>> + int rlen;
>>> + int ret = 0;
>>> + const int na = 3, ns = 2;
>>> + struct resource *res;
>>> + struct of_pci_range_parser parser;
>>> + struct of_pci_range range;
>>> +
>>> + if (!node)
>>> + return -EINVAL;
>>> +
>>> + parser.node = node;
>>> + parser.pna = of_n_addr_cells(node);
>>> + parser.np = parser.pna + na + ns;
>>> +
>>> + parser.range = of_get_property(node, "dma-ranges", &rlen);
>>> +
>>> + if (!parser.range) {
>>> + pr_debug("pcie device has no dma-ranges defined for 
>>> node(%s)\n",
>>> +   np->full_name);
>>> + ret = -EINVAL;
>>> + goto out;
>>> + }
>>> +
>>> + parser.end = parser.range + rlen / sizeof(__be32);
>>> +
>>> + for_each_of_pci_range(&parser, &range) {
>>
>> This is plain wrong - of_pci_range_parser_one() will translate upwards
>> through parent "ranges" properties, which is completely backwards for
>> DMA addresses.
>>
>> Robin.
>>
> 
> No it does not, this patch is thoroughly tested on our SOC and it works.
> of_pci_range_parser_one does not translate upwards through parent. it
> just sticks to given PCI parser.

Frankly, I'm losing patience with this attitude. Please look at the code
you call:

#define for_each_of_pci_range(parser, range) \
for (; of_pci_range_parser_one(parser, range);)


struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser
*parser,
struct of_pci_range *range)
{
const int na = 3, ns = 2;

if (!range)
return NULL;

if (!parser->range || parser->range + parser->np > parser->end)
return NULL;

range->pci_space = parser->range[0];
range->flags = of_bus_pci_get_flags(parser->range);
range->pci_addr = of_read_number(parser->range + 1, ns);
range->cpu_addr = of_translate_address(parser->node,
parser->range + na);
...


u64 of_translate_address(struct device_node *dev, const __be32 *in_addr)
{
return __of_translate_address(dev, in_addr, "ranges");
}


I don't doubt that you still manage to get the right result on *your*
SoC, because you probably have neither further "ranges" nor "dma-ranges"
translations above your host controller node anyway. That does not
change the fact that the proposed code is still obviously wrong for more
complex DT topologies that do.

We're doing upstream work in core code here: I don't particularly care
about making your SoC work; I don't really care about making Juno work
properly either; what I do care about is that code to parse dma-ranges
actually parses dma-ranges *correctly* for all possible valid uses of
dm

Re: [PATCH v3 3/7] ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.

2017-05-05 Thread Will Deacon
On Fri, May 05, 2017 at 07:56:17AM -0700, David Daney wrote:
> On 05/05/2017 06:53 AM, Hanjun Guo wrote:
> >On 2017/5/5 20:08, Geetha sowjanya wrote:
> >>From: Linu Cherian 
> >>
> >>Add SMMUv3 model definition for ThunderX2.
> >>
> >>Signed-off-by: Linu Cherian 
> >>Signed-off-by: Geetha Sowjanya 
> >>---
> >> include/acpi/actbl2.h | 2 ++
> >> 1 file changed, 2 insertions(+)
> >>
> >>diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
> >>index faa9f2c..76a6f5d 100644
> >>--- a/include/acpi/actbl2.h
> >>+++ b/include/acpi/actbl2.h
> >>@@ -779,6 +779,8 @@ struct acpi_iort_smmu {
> >> #define ACPI_IORT_SMMU_CORELINK_MMU400  0x0002/* ARM Corelink
> >>MMU-400 */
> >> #define ACPI_IORT_SMMU_CORELINK_MMU500  0x0003/* ARM Corelink
> >>MMU-500 */
> >>
> >>+#define ACPI_IORT_SMMU_V3_CAVIUM_CN99XX 0x0002 /* Cavium ThunderX2
> >>SMMUv3 */
> >
> >There are some other model numbers in the unreleased spec,
> >I think we need to wait for the updated IORT spec to
> >be released.
> >
> 
> ... or if we are fairly confident that the identifier will not need to
> change, we can merge this as is and establish a de facto specification that
> the Real IORT specification will then be forced to follow.
> 
> Is there anything other than bureaucratic inertia holding up the real
> specification?

My understanding is that IORT is going to be published imminently (i.e.
before the next kernel release), so it makes sense to wait rather than fork
the spec.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 3/7] ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.

2017-05-05 Thread David Daney

On 05/05/2017 06:53 AM, Hanjun Guo wrote:

On 2017/5/5 20:08, Geetha sowjanya wrote:

From: Linu Cherian 

Add SMMUv3 model definition for ThunderX2.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 include/acpi/actbl2.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index faa9f2c..76a6f5d 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -779,6 +779,8 @@ struct acpi_iort_smmu {
 #define ACPI_IORT_SMMU_CORELINK_MMU400  0x0002/* ARM Corelink 
MMU-400 */
 #define ACPI_IORT_SMMU_CORELINK_MMU500  0x0003/* ARM Corelink 
MMU-500 */


+#define ACPI_IORT_SMMU_V3_CAVIUM_CN99XX 0x0002 /* Cavium 
ThunderX2 SMMUv3 */


There are some other model numbers in the unreleased spec,
I think we need to wait for the updated IORT spec to
be released.



... or if we are fairly confident that the identifier will not need to 
change, we can merge this as is and establish a de facto specification 
that the Real IORT specification will then be forced to follow.


Is there anything other than bureaucratic inertia holding up the real 
specification?



David.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 3/3] PCI/of fix of_dma_get_range; get PCI specific dma-ranges

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch fixes the bug in of_dma_get_range, which with as is,
parses the PCI memory ranges and return wrong size as 0.

in order to get largest possible dma_mask. this patch also
retuns the largest possible size based on dma-ranges,

for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

based on which IOVA allocation space will honour PCI host
bridge limitations.

the implementation hooks bus specific callbacks for getting
dma-ranges.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903..b43e347 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@ struct of_bus {
int na, int ns, int pna);
int (*translate)(__be32 *addr, u64 offset, int na);
unsigned int(*get_flags)(const __be32 *addr);
+   int (*get_dma_ranges)(struct device_node *np,
+ u64 *dma_addr, u64 *paddr, u64 *size);
 };
 
 /*
@@ -171,6 +174,144 @@ static int of_bus_pci_translate(__be32 *addr, u64 offset, 
int na)
 {
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
+
+static int of_bus_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr,
+u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int ret = 0;
+   struct resource_entry *window;
+   LIST_HEAD(res);
+
+   if (!node)
+   return -EINVAL;
+
+   *size = 0;
+   /*
+* PCI dma-ranges is not mandatory property.
+* many devices do no need to have it, since
+* host bridge does not require inbound memory
+* configuration or rather have design limitations.
+* so we look for dma-ranges, if missing we
+* just return the caller full size, and also
+* no dma-ranges suggests that, host bridge allows
+* whatever comes in, so we set dma_addr to 0.
+*/
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   if (*size < resource_size(res_dma)) {
+   *dma_addr = res_dma->start - window->offset;
+   *paddr = res_dma->start;
+   *size = resource_size(res_dma);
+   }
+   }
+   }
+   pci_free_resource_list(&res);
+
+   /*
+* return the largest possible size,
+* since PCI master allows everything.
+*/
+   if (*size == 0) {
+   pr_debug("empty/zero size dma-ranges found for node(%s)\n",
+   np->full_name);
+   *size = DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1;
+   *dma_addr = *paddr = 0;
+   ret = 0;
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+
+   of_node_put(node);
+
+   return ret;
+}
+
+static int get_dma_ranges(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   const __be32 *ranges = NULL;
+   int len, naddr, nsize, pna;
+   int ret = 0;
+   u64 dmaaddr;
+
+   if (!node)
+   return -EINVAL;
+
+   while (1) {
+   naddr = of_n_addr_cells(node);
+   nsize = of_n_size_cells(node);
+   node = of_get_next_parent(node);
+   if (!node)
+   break;
+
+   ranges = of_get_property(node, "dma-ranges", &len);
+
+   /* Ignore empty ranges, they imply no translation required */
+   if (ranges && len > 0)
+   break;
+
+   /*
+* At least empty ranges has to be defined for parent node if
+* DMA is supported
+*/
+   if (!ranges)
+   break;
+   }
+
+   if (!ranges) {
+   pr_debug("no dma-ranges found for node(%s)\n", np->full_name);
+   ret = -ENODEV;
+   goto out;
+   }
+
+   len /= sizeof(u32);
+
+   pna = of_n_addr_cells(node);
+
+   /* dma-ranges format:
+* DMA addr : naddr cells

[PATCH v5 2/3] iommu/pci: reserve IOVA for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
this patch reserves the IOVA for PCI masters.
ARM64 based SOCs may have scattered memory banks.
such as iproc based SOC has

<0x 0x8000 0x0 0x8000>, /* 2G @ 2G */
<0x0008 0x8000 0x3 0x8000>, /* 14G @ 34G */
<0x0090 0x 0x4 0x>, /* 16G @ 576G */
<0x00a0 0x 0x4 0x>; /* 16G @ 640G */

but incoming PCI transcation addressing capability is limited
by host bridge, for example if max incoming window capability
is 512 GB, then 0x0090 and 0x00a0 will fall beyond it.

to address this problem, iommu has to avoid allocating IOVA which
are reserved. which inturn does not allocate IOVA if it falls into hole.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 48d36ce..08764b0 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -171,8 +172,12 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
struct iova_domain *iovad)
 {
struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+   struct device_node *np = bridge->dev.parent->of_node;
struct resource_entry *window;
unsigned long lo, hi;
+   int ret;
+   dma_addr_t tmp_dma_addr = 0, dma_addr;
+   LIST_HEAD(res);
 
resource_list_for_each_entry(window, &bridge->windows) {
if (resource_type(window->res) != IORESOURCE_MEM &&
@@ -183,6 +188,36 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
hi = iova_pfn(iovad, window->res->end - window->offset);
reserve_iova(iovad, lo, hi);
}
+
+   /* PCI inbound memory reservation. */
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   dma_addr = res_dma->start - window->offset;
+   if (tmp_dma_addr > dma_addr) {
+   pr_warn("PCI: failed to reserve iovas; ranges 
should be sorted\n");
+   return;
+   }
+   if (tmp_dma_addr != dma_addr) {
+   lo = iova_pfn(iovad, tmp_dma_addr);
+   hi = iova_pfn(iovad, dma_addr - 1);
+   reserve_iova(iovad, lo, hi);
+   }
+   tmp_dma_addr = window->res->end - window->offset;
+   }
+   /*
+* the last dma-range should honour based on the
+* 32/64-bit dma addresses.
+*/
+   if (tmp_dma_addr < DMA_BIT_MASK(sizeof(dma_addr_t) * 8)) {
+   lo = iova_pfn(iovad, tmp_dma_addr);
+   hi = iova_pfn(iovad,
+ DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1);
+   reserve_iova(iovad, lo, hi);
+   }
+   }
 }
 
 /**
-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v4:
- something wrong with my mail client: not all teh patch went out, attempting 
again:
Changes since v3:
- minor change, redudant checkes removed
Changes since v2:
- removed internal review
Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin

Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve IOVA for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c |  35 
 drivers/of/address.c  | 216 --
 drivers/of/of_pci.c   |  77 +
 include/linux/of_pci.h|   7 ++
 4 files changed, 272 insertions(+), 63 deletions(-)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch serves following:

1) exposes interface to the pci host driver for their
inbound memory ranges

2) provide an interface to callers such as of_dma_get_ranges.
so then the returned size get best possible (largest) dma_mask.
because PCI RC drivers do not call APIs such as
dma_set_coherent_mask() and hence rather it shows its addressing
capabilities based on dma-ranges.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

3) this patch handles multiple inbound windows and dma-ranges.
it is left to the caller, how it wants to use them.
the new function returns the resources in a standard and unform way

4) this way the callers of for e.g. of_dma_get_ranges
does not need to change.

5) leaves scope of adding PCI flag handling for inbound memory
by the new function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..ed6e69a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,83 @@ int of_pci_get_host_bridge_resources(struct device_node 
*dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+/**
+ * of_pci_get_dma_ranges - Parse PCI host bridge inbound resources from DT
+ * @np: device node of the host bridge having the dma-ranges property
+ * @resources: list where the range of resources will be added after DT parsing
+ *
+ * It is the caller's job to free the @resources list.
+ *
+ * This function will parse the "dma-ranges" property of a
+ * PCI host bridge device node and setup the resource mapping based
+ * on its content.
+ *
+ * It returns zero if the range parsing has been successful or a standard error
+ * value if it failed.
+ */
+
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen;
+   int ret = 0;
+   const int na = 3, ns = 2;
+   struct resource *res;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
+ np->full_name);
+   ret = -EINVAL;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+
+   for_each_of_pci_range(&parser, &range) {
+   /*
+* If we failed translation or got a zero-sized region
+* then skip this range
+*/
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+   continue;
+
+   res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+   if (!res) {
+   ret = -ENOMEM;
+   goto parse_failed;
+   }
+
+   ret = of_pci_range_to_resource(&range, np, res);
+   if (ret) {
+   kfree(res);
+   continue;
+   }
+
+   pci_add_resource_offset(resources, res,
+   res->start - range.pci_addr);
+   }
+
+   return ret;
+
+parse_failed:
+   pci_free_resource_list(resources);
+out:
+   of_node_put(node);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..617b90d 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,12 @@ static inline int of_pci_get_host_bridge_resources(struct 
device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np,
+

Re: [PATCH v3 3/7] ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.

2017-05-05 Thread Hanjun Guo

On 2017/5/5 20:08, Geetha sowjanya wrote:

From: Linu Cherian 

Add SMMUv3 model definition for ThunderX2.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 include/acpi/actbl2.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index faa9f2c..76a6f5d 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -779,6 +779,8 @@ struct acpi_iort_smmu {
 #define ACPI_IORT_SMMU_CORELINK_MMU400  0x0002 /* ARM Corelink MMU-400 
*/
 #define ACPI_IORT_SMMU_CORELINK_MMU500  0x0003 /* ARM Corelink MMU-500 
*/

+#define ACPI_IORT_SMMU_V3_CAVIUM_CN99XX 0x0002 /* Cavium ThunderX2 SMMUv3 
*/


There are some other model numbers in the unreleased spec,
I think we need to wait for the updated IORT spec to
be released.

Thanks
Hanjun
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 3/3] PCI/of fix of_dma_get_range; get PCI specific dma-ranges

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch fixes the bug in of_dma_get_range, which with as is,
parses the PCI memory ranges and return wrong size as 0.

in order to get largest possible dma_mask. this patch also
retuns the largest possible size based on dma-ranges,

for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

based on which IOVA allocation space will honour PCI host
bridge limitations.

the implementation hooks bus specific callbacks for getting
dma-ranges.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903..b43e347 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@ struct of_bus {
int na, int ns, int pna);
int (*translate)(__be32 *addr, u64 offset, int na);
unsigned int(*get_flags)(const __be32 *addr);
+   int (*get_dma_ranges)(struct device_node *np,
+ u64 *dma_addr, u64 *paddr, u64 *size);
 };
 
 /*
@@ -171,6 +174,144 @@ static int of_bus_pci_translate(__be32 *addr, u64 offset, 
int na)
 {
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
+
+static int of_bus_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr,
+u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int ret = 0;
+   struct resource_entry *window;
+   LIST_HEAD(res);
+
+   if (!node)
+   return -EINVAL;
+
+   *size = 0;
+   /*
+* PCI dma-ranges is not mandatory property.
+* many devices do no need to have it, since
+* host bridge does not require inbound memory
+* configuration or rather have design limitations.
+* so we look for dma-ranges, if missing we
+* just return the caller full size, and also
+* no dma-ranges suggests that, host bridge allows
+* whatever comes in, so we set dma_addr to 0.
+*/
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   if (*size < resource_size(res_dma)) {
+   *dma_addr = res_dma->start - window->offset;
+   *paddr = res_dma->start;
+   *size = resource_size(res_dma);
+   }
+   }
+   }
+   pci_free_resource_list(&res);
+
+   /*
+* return the largest possible size,
+* since PCI master allows everything.
+*/
+   if (*size == 0) {
+   pr_debug("empty/zero size dma-ranges found for node(%s)\n",
+   np->full_name);
+   *size = DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1;
+   *dma_addr = *paddr = 0;
+   ret = 0;
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+
+   of_node_put(node);
+
+   return ret;
+}
+
+static int get_dma_ranges(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   const __be32 *ranges = NULL;
+   int len, naddr, nsize, pna;
+   int ret = 0;
+   u64 dmaaddr;
+
+   if (!node)
+   return -EINVAL;
+
+   while (1) {
+   naddr = of_n_addr_cells(node);
+   nsize = of_n_size_cells(node);
+   node = of_get_next_parent(node);
+   if (!node)
+   break;
+
+   ranges = of_get_property(node, "dma-ranges", &len);
+
+   /* Ignore empty ranges, they imply no translation required */
+   if (ranges && len > 0)
+   break;
+
+   /*
+* At least empty ranges has to be defined for parent node if
+* DMA is supported
+*/
+   if (!ranges)
+   break;
+   }
+
+   if (!ranges) {
+   pr_debug("no dma-ranges found for node(%s)\n", np->full_name);
+   ret = -ENODEV;
+   goto out;
+   }
+
+   len /= sizeof(u32);
+
+   pna = of_n_addr_cells(node);
+
+   /* dma-ranges format:
+* DMA addr : naddr cells

[PATCH v4 2/3] iommu/pci: reserve IOVA for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
this patch reserves the IOVA for PCI masters.
ARM64 based SOCs may have scattered memory banks.
such as iproc based SOC has

<0x 0x8000 0x0 0x8000>, /* 2G @ 2G */
<0x0008 0x8000 0x3 0x8000>, /* 14G @ 34G */
<0x0090 0x 0x4 0x>, /* 16G @ 576G */
<0x00a0 0x 0x4 0x>; /* 16G @ 640G */

but incoming PCI transcation addressing capability is limited
by host bridge, for example if max incoming window capability
is 512 GB, then 0x0090 and 0x00a0 will fall beyond it.

to address this problem, iommu has to avoid allocating IOVA which
are reserved. which inturn does not allocate IOVA if it falls into hole.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 48d36ce..08764b0 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -171,8 +172,12 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
struct iova_domain *iovad)
 {
struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
+   struct device_node *np = bridge->dev.parent->of_node;
struct resource_entry *window;
unsigned long lo, hi;
+   int ret;
+   dma_addr_t tmp_dma_addr = 0, dma_addr;
+   LIST_HEAD(res);
 
resource_list_for_each_entry(window, &bridge->windows) {
if (resource_type(window->res) != IORESOURCE_MEM &&
@@ -183,6 +188,36 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
hi = iova_pfn(iovad, window->res->end - window->offset);
reserve_iova(iovad, lo, hi);
}
+
+   /* PCI inbound memory reservation. */
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   dma_addr = res_dma->start - window->offset;
+   if (tmp_dma_addr > dma_addr) {
+   pr_warn("PCI: failed to reserve iovas; ranges 
should be sorted\n");
+   return;
+   }
+   if (tmp_dma_addr != dma_addr) {
+   lo = iova_pfn(iovad, tmp_dma_addr);
+   hi = iova_pfn(iovad, dma_addr - 1);
+   reserve_iova(iovad, lo, hi);
+   }
+   tmp_dma_addr = window->res->end - window->offset;
+   }
+   /*
+* the last dma-range should honour based on the
+* 32/64-bit dma addresses.
+*/
+   if (tmp_dma_addr < DMA_BIT_MASK(sizeof(dma_addr_t) * 8)) {
+   lo = iova_pfn(iovad, tmp_dma_addr);
+   hi = iova_pfn(iovad,
+ DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1);
+   reserve_iova(iovad, lo, hi);
+   }
+   }
 }
 
 /**
-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v3:
- redudant check removed
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
Changes since v2:
- internal review names removed.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
  
Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve iova for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c | 35 +
 drivers/of/address.c  | 52 
 drivers/of/of_pci.c   | 77 +++
 include/linux/of_pci.h|  7 +
 4 files changed, 171 insertions(+)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch serves following:

1) exposes interface to the pci host driver for their
inbound memory ranges

2) provide an interface to callers such as of_dma_get_ranges.
so then the returned size get best possible (largest) dma_mask.
because PCI RC drivers do not call APIs such as
dma_set_coherent_mask() and hence rather it shows its addressing
capabilities based on dma-ranges.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

3) this patch handles multiple inbound windows and dma-ranges.
it is left to the caller, how it wants to use them.
the new function returns the resources in a standard and unform way

4) this way the callers of for e.g. of_dma_get_ranges
does not need to change.

5) leaves scope of adding PCI flag handling for inbound memory
by the new function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..ed6e69a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,83 @@ int of_pci_get_host_bridge_resources(struct device_node 
*dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+/**
+ * of_pci_get_dma_ranges - Parse PCI host bridge inbound resources from DT
+ * @np: device node of the host bridge having the dma-ranges property
+ * @resources: list where the range of resources will be added after DT parsing
+ *
+ * It is the caller's job to free the @resources list.
+ *
+ * This function will parse the "dma-ranges" property of a
+ * PCI host bridge device node and setup the resource mapping based
+ * on its content.
+ *
+ * It returns zero if the range parsing has been successful or a standard error
+ * value if it failed.
+ */
+
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen;
+   int ret = 0;
+   const int na = 3, ns = 2;
+   struct resource *res;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
+ np->full_name);
+   ret = -EINVAL;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+
+   for_each_of_pci_range(&parser, &range) {
+   /*
+* If we failed translation or got a zero-sized region
+* then skip this range
+*/
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+   continue;
+
+   res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+   if (!res) {
+   ret = -ENOMEM;
+   goto parse_failed;
+   }
+
+   ret = of_pci_range_to_resource(&range, np, res);
+   if (ret) {
+   kfree(res);
+   continue;
+   }
+
+   pci_add_resource_offset(resources, res,
+   res->start - range.pci_addr);
+   }
+
+   return ret;
+
+parse_failed:
+   pci_free_resource_list(resources);
+out:
+   of_node_put(node);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..617b90d 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,12 @@ static inline int of_pci_get_host_bridge_resources(struct 
device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np,
+

Re: [PATCH V8 07/11] iommu: of: Handle IOMMU lookup failure with deferred probing or error

2017-05-05 Thread Geert Uytterhoeven
Hi Sricharan, Robin,

On Wed, May 3, 2017 at 12:24 PM, Sricharan R  wrote:
> On 5/3/2017 3:24 PM, Robin Murphy wrote:
>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R  
>>> wrote:
 From: Laurent Pinchart 

 Failures to look up an IOMMU when parsing the DT iommus property need to
 be handled separately from the .of_xlate() failures to support deferred
 probing.

 The lack of a registered IOMMU can be caused by the lack of a driver for
 the IOMMU, the IOMMU device probe not having been performed yet, having
 been deferred, or having failed.

 The first case occurs when the device tree describes the bus master and
 IOMMU topology correctly but no device driver exists for the IOMMU yet
 or the device driver has not been compiled in. Return NULL, the caller
 will configure the device without an IOMMU.

 The second and third cases are handled by deferring the probe of the bus
 master device which will eventually get reprobed after the IOMMU.

 The last case is currently handled by deferring the probe of the bus
 master device as well. A mechanism to either configure the bus master
 device without an IOMMU or to fail the bus master device probe depending
 on whether the IOMMU is optional or mandatory would be a good
 enhancement.

 Tested-by: Marek Szyprowski 
 Signed-off-by: Laurent Pichart 
 Signed-off-by: Sricharan R 
>>>
>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>>> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
>>> properties in DT now fail to probe.
>>
>> How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
>> registered then they should merely defer until we reach the point of
>> giving up and ignoring the IOMMU. Is it just that you have no other
>> late-probing drivers or post-init module loads to kick the deferred
>> queue after that point? I did try to find a way to explicitly kick it
>> from a suitably late initcall, but there didn't seem to be any obvious
>> public interface - anyone have any suggestions?
>>
>> I think that's more of a general problem with the probe deferral
>> mechanism itself (I've seen the same thing happen with some of the
>> CoreSight stuff on Juno due to the number of inter-component
>> dependencies) rather than any specific fault of this series.

I had a deeper look into the issue.

What changed, is that of_dma_configure() now returns an error code,
and dma_configure() looks at it.

Actually there are two failure modes:
  1. Devices with an iommus property pointing to a disabled IOMMU node.
 These return -EPROBE_DEFER, and are now retried forever.
  2. Devices that are blacklisted in the IPMMU driver, as we don't want to
 use them with an IOMMU yet.
 These return -ENODEV, due to ipmmu_of_xlate_dma().

> I was thinking of an additional check like below to avoid the
> situation ?
>
> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> From: Sricharan R 
> Date: Wed, 3 May 2017 13:16:59 +0530
> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
>
> While returning EPROBE_DEFER for iommu masters
> take in to account of iommu nodes that could be
> marked in DT as 'status=disabled', in which case
> simply return NULL and let the master's probe
> continue rather than deferring.
>
> Signed-off-by: Sricharan R 
> ---
>  drivers/iommu/of_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 9f44ee8..e6e9bec 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node 
> *np)
>
> ops = iommu_ops_from_fwnode(fwnode);
> if ((ops && !ops->of_xlate) ||
> +   !of_device_is_available(iommu_spec->np) ||
> (!ops && !of_iommu_driver_present(iommu_spec->np)))
> return NULL;

Thanks, this fixes the first class of failures.

The second class can be worked around using:

--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -196,6 +196,11 @@ static const struct iommu_ops
ops = of_iommu_xlate(dev, &iommu_spec);
of_node_put(iommu_spec.np);
idx++;
+   if (PTR_ERR(ops) == -ENODEV) {
+   dev_info(dev, "%s: Ignoring -ENODEV => NULL\n",
+__func__);
+   return NULL;
+   }
if (IS_ERR_OR_NULL(ops))
break;
}

but obviously that's too hackish to apply...
Magnus, do you have a suggestion?

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists

Re: [PATCH 3/4] iommu: add qcom_iommu

2017-05-05 Thread Sricharan R
< snip ..>
>> +
>> +static struct platform_driver qcom_iommu_driver = {
>> +   .driver = {
>> +   .name   = "qcom-iommu",
>> +   .of_match_table = of_match_ptr(qcom_iommu_of_match),
>> +   .pm = &qcom_iommu_pm_ops,
>> +   },
>> +   .probe  = qcom_iommu_device_probe,
>> +   .remove = qcom_iommu_device_remove,
>> +};
>> +module_platform_driver(qcom_iommu_driver);
>> +
>> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
> 
> Is this needed any more with deferred probe now?

Yes, because the __iommu_of_table is still used for to find out
the presence of the driver.

Regards,
 Sricharan

> 
>> +
>> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
>> +MODULE_LICENSE("GPL v2");
>> --
>> 2.9.3
>>

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 7/7] arm64: Documentation: Add Cavium ThunderX2 SMMUv3 erratas

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Add Cavium ThunderX2 SMMUv3 erratas to the errata list.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 Documentation/arm64/silicon-errata.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/arm64/silicon-errata.txt 
b/Documentation/arm64/silicon-errata.txt
index 10f2ddd..42422f6 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -62,6 +62,8 @@ stable kernels.
 | Cavium | ThunderX GICv3  | #23154  | CAVIUM_ERRATUM_23154
|
 | Cavium | ThunderX Core   | #27456  | CAVIUM_ERRATUM_27456
|
 | Cavium | ThunderX SMMUv2 | #27704  | N/A 
|
+| Cavium | ThunderX2 SMMUv3| #74 | N/A 
|
+| Cavium | ThunderX2 SMMUv3| #126| N/A 
|
 || | | 
|
 | Freescale/NXP  | LS2080A/LS1043A | A-008585| FSL_ERRATUM_A008585 
|
 || | | 
|
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 6/7] iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126

2017-05-05 Thread Geetha sowjanya
From: Geetha Sowjanya 

Cavium ThunderX2 SMMU doesn't support MSI and also doesn't have unique irq
lines for gerror, eventq and cmdq-sync.

This patch addresses the issue by checking if any interrupt sources are
using same irq number, then they are registered as shared irqs.

Signed-off-by: Geetha Sowjanya 
---
 drivers/iommu/arm-smmu-v3.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 016b702..46428e7 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2236,10 +2236,30 @@ static void arm_smmu_setup_msis(struct arm_smmu_device 
*smmu)
devm_add_action(dev, arm_smmu_free_msis, dev);
 }
 
+static int get_irq_flags(struct arm_smmu_device *smmu, int irq)
+{
+   int match_count = 0;
+
+   if (irq == smmu->evtq.q.irq)
+   match_count++;
+   if (irq == smmu->cmdq.q.irq)
+   match_count++;
+   if (irq == smmu->gerr_irq)
+   match_count++;
+   if (irq == smmu->priq.q.irq)
+   match_count++;
+
+   if (match_count > 1)
+   return IRQF_SHARED | IRQF_ONESHOT;
+
+   return 0;
+}
+
 static int arm_smmu_setup_irqs(struct arm_smmu_device *smmu)
 {
int ret, irq;
u32 irqen_flags = IRQ_CTRL_EVTQ_IRQEN | IRQ_CTRL_GERROR_IRQEN;
+   u32 irqflags = 0;
 
/* Disable IRQs first */
ret = arm_smmu_write_reg_sync(smmu, 0, ARM_SMMU_IRQ_CTRL,
@@ -2254,9 +2274,10 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device 
*smmu)
/* Request interrupt lines */
irq = smmu->evtq.q.irq;
if (irq) {
+   irqflags = get_irq_flags(smmu, irq);
ret = devm_request_threaded_irq(smmu->dev, irq, NULL,
arm_smmu_evtq_thread,
-   IRQF_ONESHOT,
+   IRQF_ONESHOT | irqflags,
"arm-smmu-v3-evtq", smmu);
if (ret < 0)
dev_warn(smmu->dev, "failed to enable evtq irq\n");
@@ -2264,8 +2285,9 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device 
*smmu)
 
irq = smmu->cmdq.q.irq;
if (irq) {
+   irqflags = get_irq_flags(smmu, irq);
ret = devm_request_irq(smmu->dev, irq,
-  arm_smmu_cmdq_sync_handler, 0,
+  arm_smmu_cmdq_sync_handler, irqflags,
   "arm-smmu-v3-cmdq-sync", smmu);
if (ret < 0)
dev_warn(smmu->dev, "failed to enable cmdq-sync irq\n");
@@ -2273,8 +2295,9 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device 
*smmu)
 
irq = smmu->gerr_irq;
if (irq) {
+   irqflags = get_irq_flags(smmu, irq);
ret = devm_request_irq(smmu->dev, irq, arm_smmu_gerror_handler,
-  0, "arm-smmu-v3-gerror", smmu);
+  irqflags, "arm-smmu-v3-gerror", smmu);
if (ret < 0)
dev_warn(smmu->dev, "failed to enable gerror irq\n");
}
@@ -2282,9 +2305,10 @@ static int arm_smmu_setup_irqs(struct arm_smmu_device 
*smmu)
if (smmu->features & ARM_SMMU_FEAT_PRI) {
irq = smmu->priq.q.irq;
if (irq) {
+   irqflags = get_irq_flags(smmu, irq);
ret = devm_request_threaded_irq(smmu->dev, irq, NULL,
arm_smmu_priq_thread,
-   IRQF_ONESHOT,
+   IRQF_ONESHOT | irqflags,
"arm-smmu-v3-priq",
smmu);
if (ret < 0)
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 5/7] ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Cavium ThunderX2 implementation doesn't support second page in SMMU
register space. Hence, resource size is set as 64k for this model.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 drivers/acpi/arm64/iort.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index c5fecf9..23c5350 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -833,12 +833,20 @@ static void __init arm_smmu_v3_init_resources(struct 
resource *res,
 {
struct acpi_iort_smmu_v3 *smmu;
int num_res = 0;
+   unsigned long size = SZ_128K;
 
/* Retrieve SMMUv3 specific data */
smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
 
+   /*
+* Override the size, for Cavium ThunderX2 implementation
+* which doesn't support the page 1 SMMU register space.
+*/
+   if (smmu->model == ACPI_IORT_SMMU_V3_CAVIUM_CN99XX)
+   size = SZ_64K;
+
res[num_res].start = smmu->base_address;
-   res[num_res].end = smmu->base_address + SZ_128K - 1;
+   res[num_res].end = smmu->base_address + size - 1;
res[num_res].flags = IORESOURCE_MEM;
 
num_res++;
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 4/7] iommu/arm-smmu-v3: For ACPI based device probing, set PAGE0_REGS_ONLY option for ThunderX2 SMMUv3 implementation.

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Enable PAGE0_REGS_ONLY option for Cavium ThunderX2 SMMUv3 model.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 drivers/iommu/arm-smmu-v3.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index f027676..8f7d8ad 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2625,6 +2625,14 @@ static int arm_smmu_device_hw_probe(struct 
arm_smmu_device *smmu)
 }
 
 #ifdef CONFIG_ACPI
+static void acpi_smmu_get_options(u32 model, struct arm_smmu_device *smmu)
+{
+   if (model == ACPI_IORT_SMMU_V3_CAVIUM_CN99XX)
+   smmu->options |= ARM_SMMU_OPT_PAGE0_REGS_ONLY;
+
+   dev_notice(smmu->dev, "option mask 0x%x\n", smmu->options);
+}
+
 static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
  struct arm_smmu_device *smmu)
 {
@@ -2637,6 +2645,8 @@ static int arm_smmu_device_acpi_probe(struct 
platform_device *pdev,
/* Retrieve SMMUv3 specific data */
iort_smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
 
+   acpi_smmu_get_options(iort_smmu->model, smmu);
+
if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
smmu->features |= ARM_SMMU_FEAT_COHERENCY;
 
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 2/7] iommu/arm-smmu-v3: Do resource size checks based on SMMU

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

With implementations supporting only page 0 register space,
resource size can be 64k as well and hence perform size checks
based on SMMU option PAGE0_REGS_ONLY.

For this, arm_smmu_device_dt_probe/acpi_probe has been moved before
platform_get_resource call, so that SMMU options are set beforehand.

Signed-off-by:  Linu Cherian 
Signed-off-by:  Geetha Sowjanya 
---
 drivers/iommu/arm-smmu-v3.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 107b4a6..f027676 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2672,6 +2672,14 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev,
return ret;
 }
 
+static unsigned long arm_smmu_resource_size(struct arm_smmu_device *smmu)
+{
+   if (ARM_SMMU_PAGE0_REGS_ONLY(smmu))
+   return SZ_64K;
+   else
+   return SZ_128K;
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
int irq, ret;
@@ -2688,9 +2696,17 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
}
smmu->dev = dev;
 
+   if (dev->of_node) {
+   ret = arm_smmu_device_dt_probe(pdev, smmu);
+   } else {
+   ret = arm_smmu_device_acpi_probe(pdev, smmu);
+   if (ret == -ENODEV)
+   return ret;
+   }
+
/* Base address */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   if (resource_size(res) + 1 < SZ_128K) {
+   if (resource_size(res) + 1 < arm_smmu_resource_size(smmu)) {
dev_err(dev, "MMIO region too small (%pr)\n", res);
return -EINVAL;
}
@@ -2717,14 +2733,6 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
if (irq > 0)
smmu->gerr_irq = irq;
 
-   if (dev->of_node) {
-   ret = arm_smmu_device_dt_probe(pdev, smmu);
-   } else {
-   ret = arm_smmu_device_acpi_probe(pdev, smmu);
-   if (ret == -ENODEV)
-   return ret;
-   }
-
/* Set bypass mode according to firmware probing result */
bypass = !!ret;
 
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 3/7] ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Add SMMUv3 model definition for ThunderX2.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 include/acpi/actbl2.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index faa9f2c..76a6f5d 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -779,6 +779,8 @@ struct acpi_iort_smmu {
 #define ACPI_IORT_SMMU_CORELINK_MMU400  0x0002 /* ARM Corelink MMU-400 
*/
 #define ACPI_IORT_SMMU_CORELINK_MMU500  0x0003 /* ARM Corelink MMU-500 
*/
 
+#define ACPI_IORT_SMMU_V3_CAVIUM_CN99XX 0x0002 /* Cavium ThunderX2 SMMUv3 
*/
+
 /* Masks for Flags field above */
 
 #define ACPI_IORT_SMMU_DVM_SUPPORTED(1)
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 0/7] Cavium ThunderX2 SMMUv3 errata workarounds

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Cavium ThunderX2 SMMUv3 implementation has two Silicon Erratas.
1. Errata ID #74
   SMMU register alias Page 1 is not implemented
2. Errata ID #126
   SMMU doesnt support unique IRQ lines and also MSI for gerror, 
   eventq and cmdq-sync

The following patchset does software workaround for these two erratas.

This series is based on patchset. 
https://www.spinics.net/lists/arm-kernel/msg578443.html

Changes from v1:
 Since the use of MIDR register is rejected and SMMU_IIDR is broken on this 
 silicon, as suggested by Will Deacon modified the patches to use ThunderX2 
 SMMUv3 IORT model number to enable errata workaround.

Changes from v2:
 Updated "Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt" document 
with 
 new SMMU option used to enable errata workaround.
 
Geetha Sowjanya (1):
  iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126

Linu Cherian (6):
  iommu/arm-smmu-v3: Introduce smmu option PAGE0_REGS_ONLY for ThunderX2
errata#74.
  iommu/arm-smmu-v3: Do resource size checks based on SMMU option
PAGE0_REGS_ONLY
  ACPICA: IORT: Add Cavium ThunderX2 SMMUv3 model definition.
  iommu/arm-smmu-v3: For ACPI based device probing, set PAGE0_REGS_ONLY
option for ThunderX2 SMMUv3 implementations.
  ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3
model
  arm64: Documentation: Add Cavium ThunderX2 SMMUv3 erratas

 Documentation/arm64/silicon-errata.txt |   2 +
 drivers/acpi/arm64/iort.c  |  10 ++-
 drivers/iommu/arm-smmu-v3.c| 122 ++---
 include/acpi/actbl2.h  |   2 +
 4 files changed, 110 insertions(+), 26 deletions(-)

-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 1/7] iommu/arm-smmu-v3: Introduce SMMU option PAGE0_REGS_ONLY for ThunderX2 errata #74

2017-05-05 Thread Geetha sowjanya
From: Linu Cherian 

Cavium ThunderX2 SMMU implementation doesn't support page 1 register space
and PAGE0_REGS_ONLY option will be enabled as an errata workaround.

This option when turned on, replaces all page 1 offsets used for
EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.

Signed-off-by: Linu Cherian 
Signed-off-by: Geetha Sowjanya 
---
 .../devicetree/bindings/iommu/arm,smmu-v3.txt  |  6 +++
 drivers/iommu/arm-smmu-v3.c| 44 --
 2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
index be57550..e6da62b 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
@@ -49,6 +49,12 @@ the PCIe specification.
 - hisilicon,broken-prefetch-cmd
 : Avoid sending CMD_PREFETCH_* commands to the SMMU.
 
+- cavium-cn99xx,broken-page1-regspace
+: Replaces all page 1 offsets used for EVTQ_PROD/CONS,
+   PRIQ_PROD/CONS register access 
with page 0 offsets.
+   Set for Caviun ThunderX2 
silicon that doesn't support
+   SMMU page1 register space.
+
 ** Example
 
 smmu@2b40 {
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 380969a..107b4a6 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -176,15 +176,15 @@
 #define ARM_SMMU_CMDQ_CONS 0x9c
 
 #define ARM_SMMU_EVTQ_BASE 0xa0
-#define ARM_SMMU_EVTQ_PROD 0x100a8
-#define ARM_SMMU_EVTQ_CONS 0x100ac
+#define ARM_SMMU_EVTQ_PROD(smmu)   (page1_offset_adjust(0x100a8, smmu))
+#define ARM_SMMU_EVTQ_CONS(smmu)   (page1_offset_adjust(0x100ac, smmu))
 #define ARM_SMMU_EVTQ_IRQ_CFG0 0xb0
 #define ARM_SMMU_EVTQ_IRQ_CFG1 0xb8
 #define ARM_SMMU_EVTQ_IRQ_CFG2 0xbc
 
 #define ARM_SMMU_PRIQ_BASE 0xc0
-#define ARM_SMMU_PRIQ_PROD 0x100c8
-#define ARM_SMMU_PRIQ_CONS 0x100cc
+#define ARM_SMMU_PRIQ_PROD(smmu)   (page1_offset_adjust(0x100c8, smmu))
+#define ARM_SMMU_PRIQ_CONS(smmu)   (page1_offset_adjust(0x100cc, smmu))
 #define ARM_SMMU_PRIQ_IRQ_CFG0 0xd0
 #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
@@ -412,6 +412,9 @@
 #define MSI_IOVA_BASE  0x800
 #define MSI_IOVA_LENGTH0x10
 
+#define ARM_SMMU_PAGE0_REGS_ONLY(smmu) \
+   ((smmu)->options & ARM_SMMU_OPT_PAGE0_REGS_ONLY)
+
 static bool disable_bypass;
 module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
@@ -597,6 +600,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
+#define ARM_SMMU_OPT_PAGE0_REGS_ONLY(1 << 1)
u32 options;
 
struct arm_smmu_cmdqcmdq;
@@ -663,9 +667,19 @@ struct arm_smmu_option_prop {
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
+   { ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium-cn99xx,broken-page1-regspace"},
{ 0, NULL},
 };
 
+static inline unsigned long page1_offset_adjust(
+   unsigned long off, struct arm_smmu_device *smmu)
+{
+   if (!ARM_SMMU_PAGE0_REGS_ONLY(smmu))
+   return off;
+   else
+   return (off - SZ_64K);
+}
+
 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
return container_of(dom, struct arm_smmu_domain, domain);
@@ -1986,8 +2000,10 @@ static int arm_smmu_init_queues(struct arm_smmu_device 
*smmu)
return ret;
 
/* evtq */
-   ret = arm_smmu_init_one_queue(smmu, &smmu->evtq.q, ARM_SMMU_EVTQ_PROD,
- ARM_SMMU_EVTQ_CONS, EVTQ_ENT_DWORDS);
+   ret = arm_smmu_init_one_queue(smmu, &smmu->evtq.q,
+ ARM_SMMU_EVTQ_PROD(smmu),
+ ARM_SMMU_EVTQ_CONS(smmu),
+ EVTQ_ENT_DWORDS);
if (ret)
return ret;
 
@@ -1995,8 +2011,10 @@ static int arm_smmu_init_queues(struct arm_smmu_device 
*smmu)
if (!(smmu->features & ARM_SMMU_FEAT_PRI))
return 0;
 
-   return arm_smmu_init_one_queue(smmu, &smmu->priq.q, ARM_SMMU_PRIQ_PROD,
-  ARM_SMMU_PRIQ_CONS, PRIQ_ENT_DWORDS);
+   return arm_smmu_init_one_queue(smmu, &smmu->priq.q,
+  ARM_SMMU_PRIQ_PROD(smmu),
+  ARM_SMMU_PRIQ_CONS(smmu),
+  PRIQ_E

Re: [PATCH 2/3] iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74

2017-05-05 Thread Geetha Akula
Hi Jon,

Charles Garcia-Tobin has confirmed that the new IORT spec includes Cavium
ThunderX model number.
I have resubmitted patches based on IORT model number.
https://www.spinics.net/lists/arm-kernel/msg579511.html


Thank you,
Geetha.


On Fri, May 5, 2017 at 5:06 AM, Jon Masters  wrote:

> On 05/03/2017 05:47 AM, Will Deacon wrote:
> > Hi Geetha,
> >
> > On Tue, May 02, 2017 at 12:01:15PM +0530, Geetha Akula wrote:
> >> SMMU_IIDR register is broken on T99, that the reason we are using MIDR.
> >
> > Urgh, that's unfortunate. In what way is it broken?
> >
> >> If using MIDR is not accepted, can we enable errata based on SMMU
> resource size?
> >> some thing like below.
> >
> > No, you need to get your model number added to IORT after all if the IIDR
> > can't uniqely identify the part.
> >
> > Sorry
>
> [I've pinged the IORT author directly with a copy of the above message]
>
> Can folks please take action urgently if the IORT spec needs updating to
> accommodate additional vendor IDs.
>
> Jon.
>
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: AMD Ryzen KVM/NPT/IOMMU issue

2017-05-05 Thread Matthias Ehrenfeuchter
I recognized (with npt disabled) the VM is getting slower over time, 
like in Windows the system process is taking more and more CPU usage. A 
soft restart does help makeing it "usable" again. Also wondering if this 
is an hardware related issue in Ryzen, so the upcoming Naples does have 
it too? This would be a nogo for the server platform and, in my eyes, 
the death even pre-released.


Regards


Am 03.05.2017 um 18:28 schrieb Nick Sarnie:

On Wed, May 3, 2017 at 10:37 AM, Matthias Ehrenfeuchter  wrote:

Hi,

There are a lot of messages/threads out there about bad performance while
using AMDs Ryzen with KVM GPU passthrough. It revolves all on
enabling/disabling npt, while enabled overall VM performance is nice but the
GPU performance gives me about 20% (and a lot of drops to zero GPU usage,
while CPU/Disk/Ram also doing nothing) compared to npt disabled. But while
npt is disabled overall VM performance is like beeing on 4x86 with floppy
disk as only storage. (Ex. it takes 2 seconds just to open startmenu while
host and vm are in idle, and neither CPU pinning, changing CPU model,
changing storage device nor using hugepages changed anything).

So everything I read pointed to a bug in the npt implementation? Anything I
could do to get closer to the "thing" issuing this?

Best Regards

efeu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

I heard from Joerg that it might be related to a lower intercept rate
being used when NPT is enabled, but we haven't been able to find a way
to trace that to confirm.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v2:
- Remove internal detailes such as reviewed by  

Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
  
Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve iova for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c | 35 +
 drivers/of/address.c  | 52 
 drivers/of/of_pci.c   | 77 +++
 include/linux/of_pci.h|  7 +
 4 files changed, 171 insertions(+)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v2:
- Remove internal detailes such as reviewed by  

Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
  
Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve iova for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c | 35 +
 drivers/of/address.c  | 52 
 drivers/of/of_pci.c   | 77 +++
 include/linux/of_pci.h|  7 +
 4 files changed, 171 insertions(+)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch serves following:

1) exposes interface to the pci host driver for their
inbound memory ranges

2) provide an interface to callers such as of_dma_get_ranges.
so then the returned size get best possible (largest) dma_mask.
because PCI RC drivers do not call APIs such as
dma_set_coherent_mask() and hence rather it shows its addressing
capabilities based on dma-ranges.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

3) this patch handles multiple inbound windows and dma-ranges.
it is left to the caller, how it wants to use them.
the new function returns the resources in a standard and unform way

4) this way the callers of for e.g. of_dma_get_ranges
does not need to change.

5) leaves scope of adding PCI flag handling for inbound memory
by the new function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..ed6e69a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,83 @@ int of_pci_get_host_bridge_resources(struct device_node 
*dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+/**
+ * of_pci_get_dma_ranges - Parse PCI host bridge inbound resources from DT
+ * @np: device node of the host bridge having the dma-ranges property
+ * @resources: list where the range of resources will be added after DT parsing
+ *
+ * It is the caller's job to free the @resources list.
+ *
+ * This function will parse the "dma-ranges" property of a
+ * PCI host bridge device node and setup the resource mapping based
+ * on its content.
+ *
+ * It returns zero if the range parsing has been successful or a standard error
+ * value if it failed.
+ */
+
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen;
+   int ret = 0;
+   const int na = 3, ns = 2;
+   struct resource *res;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
+ np->full_name);
+   ret = -EINVAL;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+
+   for_each_of_pci_range(&parser, &range) {
+   /*
+* If we failed translation or got a zero-sized region
+* then skip this range
+*/
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+   continue;
+
+   res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+   if (!res) {
+   ret = -ENOMEM;
+   goto parse_failed;
+   }
+
+   ret = of_pci_range_to_resource(&range, np, res);
+   if (ret) {
+   kfree(res);
+   continue;
+   }
+
+   pci_add_resource_offset(resources, res,
+   res->start - range.pci_addr);
+   }
+
+   return ret;
+
+parse_failed:
+   pci_free_resource_list(resources);
+out:
+   of_node_put(node);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..617b90d 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,12 @@ static inline int of_pci_get_host_bridge_resources(struct 
device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np,
+

[PATCH v3 3/3] PCI/of fix of_dma_get_range; get PCI specific dma-ranges

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch fixes the bug in of_dma_get_range, which with as is,
parses the PCI memory ranges and return wrong size as 0.

in order to get largest possible dma_mask. this patch also
retuns the largest possible size based on dma-ranges,

for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

based on which IOVA allocation space will honour PCI host
bridge limitations.

the implementation hooks bus specific callbacks for getting
dma-ranges.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903..cc0fc28 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@ struct of_bus {
int na, int ns, int pna);
int (*translate)(__be32 *addr, u64 offset, int na);
unsigned int(*get_flags)(const __be32 *addr);
+   int (*get_dma_ranges)(struct device_node *np,
+ u64 *dma_addr, u64 *paddr, u64 *size);
 };
 
 /*
@@ -171,6 +174,146 @@ static int of_bus_pci_translate(__be32 *addr, u64 offset, 
int na)
 {
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
+
+static int of_bus_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr,
+u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int ret = 0;
+   struct resource_entry *window;
+   LIST_HEAD(res);
+
+   if (!node)
+   return -EINVAL;
+
+   if (of_bus_pci_match(np)) {
+   *size = 0;
+   /*
+* PCI dma-ranges is not mandatory property.
+* many devices do no need to have it, since
+* host bridge does not require inbound memory
+* configuration or rather have design limitations.
+* so we look for dma-ranges, if missing we
+* just return the caller full size, and also
+* no dma-ranges suggests that, host bridge allows
+* whatever comes in, so we set dma_addr to 0.
+*/
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   if (*size < resource_size(res_dma)) {
+   *dma_addr = res_dma->start - window->offset;
+   *paddr = res_dma->start;
+   *size = resource_size(res_dma);
+   }
+   }
+   }
+   pci_free_resource_list(&res);
+
+   /*
+* return the largest possible size,
+* since PCI master allows everything.
+*/
+   if (*size == 0) {
+   pr_debug("empty/zero size dma-ranges found for 
node(%s)\n",
+   np->full_name);
+   *size = DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1;
+   *dma_addr = *paddr = 0;
+   ret = 0;
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+   }
+
+   of_node_put(node);
+
+   return ret;
+}
+
+static int get_dma_ranges(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   const __be32 *ranges = NULL;
+   int len, naddr, nsize, pna;
+   int ret = 0;
+   u64 dmaaddr;
+
+   if (!node)
+   return -EINVAL;
+
+   while (1) {
+   naddr = of_n_addr_cells(node);
+   nsize = of_n_size_cells(node);
+   node = of_get_next_parent(node);
+   if (!node)
+   break;
+
+   ranges = of_get_property(node, "dma-ranges", &len);
+
+   /* Ignore empty ranges, they imply no translation required */
+   if (ranges && len > 0)
+   break;
+
+   /*
+* At least empty ranges has to be defined for parent node if
+* DMA is supported
+*/
+   if (!ranges)
+   

[PATCH v2] iommu/arm-smmu-v3: Increase CMDQ drain timeout value

2017-05-05 Thread sunil . kovvuri
From: Sunil Goutham 

Processing queue full of TLB invalidation commands might
take more time on some platforms than current timeout
of 100us. So increased drain timeout value.

Also now udelay time is increased exponentially for each poll.

Signed-off-by: Sunil Goutham 
---

v2
- Addressed Will's and Robin's comments i.e
  increased poll timeout only for drain case and removed spin loop
  whose logic anyway is screwed up.
- Also changed commit subject and message as this patch no longer
  exactly reflects what is done in SMMU driver i.e
  8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively

 drivers/iommu/arm-smmu-v3.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d412bdd..cbc8309 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -379,6 +379,8 @@
 #define CMDQ_SYNC_0_CS_NONE(0UL << CMDQ_SYNC_0_CS_SHIFT)
 #define CMDQ_SYNC_0_CS_SEV (2UL << CMDQ_SYNC_0_CS_SHIFT)
 
+#define CMDQ_DRAIN_TIMEOUT_US  1000
+
 /* Event queue */
 #define EVTQ_ENT_DWORDS4
 #define EVTQ_MAX_SZ_SHIFT  7
@@ -737,7 +739,12 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
  */
 static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
 {
-   ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
+   ktime_t timeout;
+   unsigned int delay = 1;
+
+   /* Wait longer if it's queue drain */
+   timeout = ktime_add_us(ktime_get(), drain ? CMDQ_DRAIN_TIMEOUT_US :
+   ARM_SMMU_POLL_TIMEOUT_US);
 
while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
if (ktime_compare(ktime_get(), timeout) > 0)
@@ -747,7 +754,11 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool 
drain, bool wfe)
wfe();
} else {
cpu_relax();
-   udelay(1);
+   udelay(delay);
+   /* No point in sleeping for fixed time,
+* if cons isn't moving fast.
+*/
+   delay *= 2;
}
}
 
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/dma: Don't touch invalid iova_domain members

2017-05-05 Thread Robin Murphy
When __iommu_dma_map() and iommu_dma_free_iova() are called from
iommu_dma_get_msi_page(), various iova_*() helpers are still invoked in
the process, whcih is unwise since they access a different member of the
union (the iova_domain) from that which was last written, and there's no
guarantee that sensible values will result anyway.

CLean up the code paths that are valid for an MSI cookie to ensure we
only do iova_domain-specific things when we're actually dealing with one.

Reported-by: Nate Watterson 
Tested-by: Shanker Donthineni 
Tested-by: Bharat Bhushan 
Signed-off-by: Robin Murphy 
---

I've taken the liberty of upgrading the prose testing confirmations
into tested-by tags, hope that's OK.

Joerg; I'm happy to resend this after -rc1 with a fixes tag if you'd
rather - I'm just throwing it out now for the sake of catching up with
things.

Robin.

 drivers/iommu/dma-iommu.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 8348f366ddd1..62618e77bedc 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -396,13 +396,13 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie 
*cookie,
dma_addr_t iova, size_t size)
 {
struct iova_domain *iovad = &cookie->iovad;
-   unsigned long shift = iova_shift(iovad);
 
/* The MSI case is only ever cleaning up its most recent allocation */
if (cookie->type == IOMMU_DMA_MSI_COOKIE)
cookie->msi_iova -= size;
else
-   free_iova_fast(iovad, iova >> shift, size >> shift);
+   free_iova_fast(iovad, iova_pfn(iovad, iova),
+   size >> iova_shift(iovad));
 }
 
 static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr,
@@ -617,11 +617,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
 {
struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
-   struct iova_domain *iovad = &cookie->iovad;
-   size_t iova_off = iova_offset(iovad, phys);
+   size_t iova_off = 0;
dma_addr_t iova;
 
-   size = iova_align(iovad, size + iova_off);
+   if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
+   iova_off = iova_offset(&cookie->iovad, phys);
+   size = iova_align(&cookie->iovad, size + iova_off);
+   }
+
iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
return DMA_ERROR_CODE;
-- 
2.11.0.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Amir Goldstein
On Fri, May 5, 2017 at 12:57 PM, Christoph Hellwig  wrote:
> On Fri, May 05, 2017 at 12:50:31PM +0300, Amir Goldstein wrote:
>> To complete the picture for folks not cc'ed on my patches,
>> xfs use case suggests there is also justification for the additional helpers:
>>
>> uuid_is_null() / uuid_equal()
>> guid_is_null() / guid_equal()
>
> The is_null is useful and I bet we'll find other instances. I'm
> not sure _equals really adds much value over the existing _cmp
> helpers, but on the other hand they are so trivial that we might as
> well add them.

Exactly. The fact that not only xfs used the same helper name
(drivers/md/md.c) suggests that it useful.

>
> The other thing XFS has is uuid_copy.

Andy already listed uuid_copy.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Christoph Hellwig
On Fri, May 05, 2017 at 12:50:31PM +0300, Amir Goldstein wrote:
> To complete the picture for folks not cc'ed on my patches,
> xfs use case suggests there is also justification for the additional helpers:
> 
> uuid_is_null() / uuid_equal()
> guid_is_null() / guid_equal()

The is_null is useful and I bet we'll find other instances.  I'm
not sure _equals really adds much value over the existing _cmp
helpers, but on the other hand they are so trivial that we might as
well add them.

The other thing XFS has is uuid_copy.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Amir Goldstein
On Fri, May 5, 2017 at 12:24 PM, Andy Shevchenko
 wrote:
> On Fri, 2017-05-05 at 10:06 +0300, Amir Goldstein wrote:
[...]
>> I think with this semantic change, our proposals can reach common
>> grounds
>> and satisfy a wider group of users (i.e. filesystem developers).
>>
>> Christoph also suggested a similar treatment to typedef guid_t to
>> struct uuid_le.
>> I don't know the use cases enough to comment on that.
>
> We may go this way. But I wouldn't prevent current users of uuid_le to
> continue using it without conversion (it may be done case by case after
> we settle an API)
>
> So, summarize what Christoph said it will look like
>
> typedef uuid_be uuid_t;
> typedef uuid_le guid_t
>
> uuid_cmp() / uuid_copy() / uuid_to_bin() / etc
> guid_cmp() / guid_copy() / guid_to_bin() / etc
>
> Correct? Christoph?
>

That looks right to me.

To complete the picture for folks not cc'ed on my patches,
xfs use case suggests there is also justification for the additional helpers:

uuid_is_null() / uuid_equal()
guid_is_null() / guid_equal()

Cheers,
Amir.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Andy Shevchenko
On Fri, 2017-05-05 at 10:06 +0300, Amir Goldstein wrote:
> On Fri, May 5, 2017 at 9:20 AM, Dan Williams  > wrote:
> > On Thu, May 4, 2017 at 2:21 AM, Andy Shevchenko
> >  wrote:


> > > for (i = 0; i < NFIT_UUID_MAX; i++)
> > > -   if (memcmp(to_nfit_uuid(i), spa->range_guid, 16)
> > > == 0)
> > > +   if (!uuid_le_cmp_pp(to_nfit_uuid(i), (uuid_le
> > > *)spa->range_guid))
> > 
> > What is _cmp_pp? Why not _compare?

Dan, it's a typo. In this patch it should be just ..._cmp(), which is
already a part of API.

> > 
> 
> I second that.
> 
> Andy,

Amir, just to be clear. This patch can be applied without any addons to
an existing API. Above is just a typo due to rebase in my tree. I will
replace it to just uuid_le_cmp().

> I much rather that you sort out uuid helpers in a way that will
> satisfy the filesystem
> needs (just provide the helpers don't need to convert filesystems
> code).

> The only reason I took a swing at hoisting the xfs uuid helpers is
> because it didn't
> seem like your proposal was going to be posted soon or wasn't going to
> satisfy
> the filesystems use case.

> 
> My opinion now, is that your suggestion is probably much closer to the
> real deal
> than mine.
> 
> IMO, you should acknowledge that the common use case for filesystems
> is
> to handle an opaque char[16] which most likely holds a uuid_be and you
> should provide 'neutral' helpers to satisfy this use case.
> 
> The simplest would be to typedef uuid_t to struct uuid_be and to name
> 'neutral'
> helpers' uuid_cmp/uuid_copy(uuid_t *, uuid_t *), similar to my
> proposal.

> I think with this semantic change, our proposals can reach common
> grounds
> and satisfy a wider group of users (i.e. filesystem developers).
> 
> Christoph also suggested a similar treatment to typedef guid_t to
> struct uuid_le.
> I don't know the use cases enough to comment on that.

We may go this way. But I wouldn't prevent current users of uuid_le to
continue using it without conversion (it may be done case by case after
we settle an API)

So, summarize what Christoph said it will look like

typedef uuid_be uuid_t;
typedef uuid_le guid_t

uuid_cmp() / uuid_copy() / uuid_to_bin() / etc
guid_cmp() / guid_copy() / guid_to_bin() / etc

Correct? Christoph?

-- 
Andy Shevchenko 
Intel Finland Oy
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v2 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
  
Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve iova for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c | 35 +
 drivers/of/address.c  | 52 
 drivers/of/of_pci.c   | 77 +++
 include/linux/of_pci.h|  7 +
 4 files changed, 171 insertions(+)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/3] PCI/of fix of_dma_get_range; get PCI specific dma-ranges

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch fixes the bug in of_dma_get_range, which with as is,
parses the PCI memory ranges and return wrong size as 0.

in order to get largest possible dma_mask. this patch also
retuns the largest possible size based on dma-ranges,

for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

based on which IOVA allocation space will honour PCI host
bridge limitations.

the implementation hooks bus specific callbacks for getting
dma-ranges.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903..cc0fc28 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@ struct of_bus {
int na, int ns, int pna);
int (*translate)(__be32 *addr, u64 offset, int na);
unsigned int(*get_flags)(const __be32 *addr);
+   int (*get_dma_ranges)(struct device_node *np,
+ u64 *dma_addr, u64 *paddr, u64 *size);
 };
 
 /*
@@ -171,6 +174,146 @@ static int of_bus_pci_translate(__be32 *addr, u64 offset, 
int na)
 {
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
+
+static int of_bus_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr,
+u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int ret = 0;
+   struct resource_entry *window;
+   LIST_HEAD(res);
+
+   if (!node)
+   return -EINVAL;
+
+   if (of_bus_pci_match(np)) {
+   *size = 0;
+   /*
+* PCI dma-ranges is not mandatory property.
+* many devices do no need to have it, since
+* host bridge does not require inbound memory
+* configuration or rather have design limitations.
+* so we look for dma-ranges, if missing we
+* just return the caller full size, and also
+* no dma-ranges suggests that, host bridge allows
+* whatever comes in, so we set dma_addr to 0.
+*/
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   if (*size < resource_size(res_dma)) {
+   *dma_addr = res_dma->start - window->offset;
+   *paddr = res_dma->start;
+   *size = resource_size(res_dma);
+   }
+   }
+   }
+   pci_free_resource_list(&res);
+
+   /*
+* return the largest possible size,
+* since PCI master allows everything.
+*/
+   if (*size == 0) {
+   pr_debug("empty/zero size dma-ranges found for 
node(%s)\n",
+   np->full_name);
+   *size = DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1;
+   *dma_addr = *paddr = 0;
+   ret = 0;
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+   }
+
+   of_node_put(node);
+
+   return ret;
+}
+
+static int get_dma_ranges(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   const __be32 *ranges = NULL;
+   int len, naddr, nsize, pna;
+   int ret = 0;
+   u64 dmaaddr;
+
+   if (!node)
+   return -EINVAL;
+
+   while (1) {
+   naddr = of_n_addr_cells(node);
+   nsize = of_n_size_cells(node);
+   node = of_get_next_parent(node);
+   if (!node)
+   break;
+
+   ranges = of_get_property(node, "dma-ranges", &len);
+
+   /* Ignore empty ranges, they imply no translation required */
+   if (ranges && len > 0)
+   break;
+
+   /*
+* At least empty ranges has to be defined for parent node if
+* DMA is supported
+*/
+   if (!ranges)
+   

[PATCH v2 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch serves following:

1) exposes interface to the pci host driver for their
inbound memory ranges

2) provide an interface to callers such as of_dma_get_ranges.
so then the returned size get best possible (largest) dma_mask.
because PCI RC drivers do not call APIs such as
dma_set_coherent_mask() and hence rather it shows its addressing
capabilities based on dma-ranges.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

3) this patch handles multiple inbound windows and dma-ranges.
it is left to the caller, how it wants to use them.
the new function returns the resources in a standard and unform way

4) this way the callers of for e.g. of_dma_get_ranges
does not need to change.

5) leaves scope of adding PCI flag handling for inbound memory
by the new function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..ed6e69a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,83 @@ int of_pci_get_host_bridge_resources(struct device_node 
*dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+/**
+ * of_pci_get_dma_ranges - Parse PCI host bridge inbound resources from DT
+ * @np: device node of the host bridge having the dma-ranges property
+ * @resources: list where the range of resources will be added after DT parsing
+ *
+ * It is the caller's job to free the @resources list.
+ *
+ * This function will parse the "dma-ranges" property of a
+ * PCI host bridge device node and setup the resource mapping based
+ * on its content.
+ *
+ * It returns zero if the range parsing has been successful or a standard error
+ * value if it failed.
+ */
+
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen;
+   int ret = 0;
+   const int na = 3, ns = 2;
+   struct resource *res;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
+ np->full_name);
+   ret = -EINVAL;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+
+   for_each_of_pci_range(&parser, &range) {
+   /*
+* If we failed translation or got a zero-sized region
+* then skip this range
+*/
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+   continue;
+
+   res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+   if (!res) {
+   ret = -ENOMEM;
+   goto parse_failed;
+   }
+
+   ret = of_pci_range_to_resource(&range, np, res);
+   if (ret) {
+   kfree(res);
+   continue;
+   }
+
+   pci_add_resource_offset(resources, res,
+   res->start - range.pci_addr);
+   }
+
+   return ret;
+
+parse_failed:
+   pci_free_resource_list(resources);
+out:
+   of_node_put(node);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..617b90d 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,12 @@ static inline int of_pci_get_host_bridge_resources(struct 
device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np,
+

[PATCH v2 0/3] OF/PCI address PCI inbound memory limitations

2017-05-05 Thread Oza Pawandeep via iommu
It is possible that PCI device supports 64-bit DMA addressing,
and thus it's driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound 
transaction addressing.

This is particularly problematic on ARM/ARM64 SOCs where the 
IOMMU (i.e. SMMU) translates IOVA to PA for in-bound transactions
only after PCI Host has forwarded these transactions on SOC IO bus. 

This means, on such ARM/ARM64 SOCs the IOVA of in-bound transactions
has to honor the addressing restrictions of the PCI Host.

Current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices but no implementation exists for pci devices.

For e.g. iproc based SOCs and other SOCs (such as rcar) have
PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

This patchset
reserves the IOVA ranges for PCI masters based
on the PCI world dma-ranges.
fix of_dma_get_range to cater to PCI dma-ranges.
fix of_dma_get_range which currently returns size 0 for PCI devices.

IOVA allocation patch:
[PATCH 2/3] iommu/pci: reserve iova for PCI masters

Fix of_dma_get_range bug and address PCI master.
[PATCH 3/3] PCI/of fix of_dma_get_range; get PCI specific

Base patch for both of the above patches:
[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Changes since v1:
- Remove internal GERRIT details from patch descriptions
- address Rob's comments.
- Add a get_dma_ranges() function to of_bus struct..
- Convert existing contents of this function to of_bus_default_dma_get_ranges
  and adding that to the default of_bus struct.
- Make of_dma_get_range call of_bus_match() and then bus->get_dma_ranges.
- no revison for [PATCH 2/3] iommu/pci: reserve iova for PCI masters; since 
under discussion with Robin
  
Oza Pawandeep (3):
  of/pci/dma: fix DMA configuration for PCI masters
  iommu/pci: reserve iova for PCI masters
  PCI/of fix of_dma_get_range; get PCI specific dma-ranges

 drivers/iommu/dma-iommu.c | 35 +
 drivers/of/address.c  | 52 
 drivers/of/of_pci.c   | 77 +++
 include/linux/of_pci.h|  7 +
 4 files changed, 171 insertions(+)

-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/3] PCI/of fix of_dma_get_range; get PCI specific dma-ranges

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch fixes the bug in of_dma_get_range, which with as is,
parses the PCI memory ranges and return wrong size as 0.

in order to get largest possible dma_mask. this patch also
retuns the largest possible size based on dma-ranges,

for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

based on which IOVA allocation space will honour PCI host
bridge limitations.

the implementation hooks bus specific callbacks for getting
dma-ranges.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 02b2903..cc0fc28 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,8 @@ struct of_bus {
int na, int ns, int pna);
int (*translate)(__be32 *addr, u64 offset, int na);
unsigned int(*get_flags)(const __be32 *addr);
+   int (*get_dma_ranges)(struct device_node *np,
+ u64 *dma_addr, u64 *paddr, u64 *size);
 };
 
 /*
@@ -171,6 +174,146 @@ static int of_bus_pci_translate(__be32 *addr, u64 offset, 
int na)
 {
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
+
+static int of_bus_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr,
+u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int ret = 0;
+   struct resource_entry *window;
+   LIST_HEAD(res);
+
+   if (!node)
+   return -EINVAL;
+
+   if (of_bus_pci_match(np)) {
+   *size = 0;
+   /*
+* PCI dma-ranges is not mandatory property.
+* many devices do no need to have it, since
+* host bridge does not require inbound memory
+* configuration or rather have design limitations.
+* so we look for dma-ranges, if missing we
+* just return the caller full size, and also
+* no dma-ranges suggests that, host bridge allows
+* whatever comes in, so we set dma_addr to 0.
+*/
+   ret = of_pci_get_dma_ranges(np, &res);
+   if (!ret) {
+   resource_list_for_each_entry(window, &res) {
+   struct resource *res_dma = window->res;
+
+   if (*size < resource_size(res_dma)) {
+   *dma_addr = res_dma->start - window->offset;
+   *paddr = res_dma->start;
+   *size = resource_size(res_dma);
+   }
+   }
+   }
+   pci_free_resource_list(&res);
+
+   /*
+* return the largest possible size,
+* since PCI master allows everything.
+*/
+   if (*size == 0) {
+   pr_debug("empty/zero size dma-ranges found for 
node(%s)\n",
+   np->full_name);
+   *size = DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1;
+   *dma_addr = *paddr = 0;
+   ret = 0;
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+   }
+
+   of_node_put(node);
+
+   return ret;
+}
+
+static int get_dma_ranges(struct device_node *np, u64 *dma_addr,
+ u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   const __be32 *ranges = NULL;
+   int len, naddr, nsize, pna;
+   int ret = 0;
+   u64 dmaaddr;
+
+   if (!node)
+   return -EINVAL;
+
+   while (1) {
+   naddr = of_n_addr_cells(node);
+   nsize = of_n_size_cells(node);
+   node = of_get_next_parent(node);
+   if (!node)
+   break;
+
+   ranges = of_get_property(node, "dma-ranges", &len);
+
+   /* Ignore empty ranges, they imply no translation required */
+   if (ranges && len > 0)
+   break;
+
+   /*
+* At least empty ranges has to be defined for parent node if
+* DMA is supported
+*/
+   if (!ranges)
+   

[PATCH v2 1/3] of/pci/dma: fix DMA configuration for PCI masters

2017-05-05 Thread Oza Pawandeep via iommu
current device framework and OF framework integration assumes
dma-ranges in a way where memory-mapped devices define their
dma-ranges. (child-bus-address, parent-bus-address, length).

of_dma_configure is specifically written to take care of memory
mapped devices. but no implementation exists for pci to take
care of pcie based memory ranges.

for e.g. iproc based SOCs and other SOCs(suc as rcar) have PCI
world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

this patch serves following:

1) exposes interface to the pci host driver for their
inbound memory ranges

2) provide an interface to callers such as of_dma_get_ranges.
so then the returned size get best possible (largest) dma_mask.
because PCI RC drivers do not call APIs such as
dma_set_coherent_mask() and hence rather it shows its addressing
capabilities based on dma-ranges.
for e.g.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

3) this patch handles multiple inbound windows and dma-ranges.
it is left to the caller, how it wants to use them.
the new function returns the resources in a standard and unform way

4) this way the callers of for e.g. of_dma_get_ranges
does not need to change.

5) leaves scope of adding PCI flag handling for inbound memory
by the new function.

Signed-off-by: Oza Pawandeep 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 

diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..ed6e69a 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,83 @@ int of_pci_get_host_bridge_resources(struct device_node 
*dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+/**
+ * of_pci_get_dma_ranges - Parse PCI host bridge inbound resources from DT
+ * @np: device node of the host bridge having the dma-ranges property
+ * @resources: list where the range of resources will be added after DT parsing
+ *
+ * It is the caller's job to free the @resources list.
+ *
+ * This function will parse the "dma-ranges" property of a
+ * PCI host bridge device node and setup the resource mapping based
+ * on its content.
+ *
+ * It returns zero if the range parsing has been successful or a standard error
+ * value if it failed.
+ */
+
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen;
+   int ret = 0;
+   const int na = 3, ns = 2;
+   struct resource *res;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
+ np->full_name);
+   ret = -EINVAL;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+
+   for_each_of_pci_range(&parser, &range) {
+   /*
+* If we failed translation or got a zero-sized region
+* then skip this range
+*/
+   if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
+   continue;
+
+   res = kzalloc(sizeof(struct resource), GFP_KERNEL);
+   if (!res) {
+   ret = -ENOMEM;
+   goto parse_failed;
+   }
+
+   ret = of_pci_range_to_resource(&range, np, res);
+   if (ret) {
+   kfree(res);
+   continue;
+   }
+
+   pci_add_resource_offset(resources, res,
+   res->start - range.pci_addr);
+   }
+
+   return ret;
+
+parse_failed:
+   pci_free_resource_list(resources);
+out:
+   of_node_put(node);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..617b90d 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,12 @@ static inline int of_pci_get_host_bridge_resources(struct 
device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ra

Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Amir Goldstein
On Fri, May 5, 2017 at 9:20 AM, Dan Williams  wrote:
> On Thu, May 4, 2017 at 2:21 AM, Andy Shevchenko
>  wrote:
>> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
>> bytes. Instead we convert them to use uuid_le type. At the same time we
>> convert current users.
>>
>> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
>> get rid of it.
>>
>> The conversion fixes a potential bug in int340x_thermal as well since
>> we have to use memcmp() on binary data.
>>
>> Cc: Rafael J. Wysocki 
>> Cc: Mika Westerberg 
>> Cc: Borislav Petkov 
>> Cc: Dan Williams 
>> Cc: Amir Goldstein 
>> Cc: Jarkko Sakkinen 
>> Cc: Jani Nikula 
>> Cc: Ben Skeggs 
>> Cc: Benjamin Tissoires 
>> Cc: Joerg Roedel 
>> Cc: Adrian Hunter 
>> Cc: Yisen Zhuang 
>> Cc: Bjorn Helgaas 
>> Cc: Zhang Rui 
>> Cc: Felipe Balbi 
>> Cc: Mathias Nyman 
>> Cc: Heikki Krogerus 
>> Cc: Liam Girdwood 
>> Cc: Mark Brown 
>> Signed-off-by: Andy Shevchenko 
> [..]
>> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
>> index 0f7982a1caaf..bd3e45ede056 100644
>> --- a/drivers/acpi/nfit/core.c
>> +++ b/drivers/acpi/nfit/core.c
>> @@ -74,11 +74,11 @@ struct nfit_table_prev {
>> struct list_head flushes;
>>  };
>>
>> -static u8 nfit_uuid[NFIT_UUID_MAX][16];
>> +static uuid_le nfit_uuid[NFIT_UUID_MAX];
>>
>> -const u8 *to_nfit_uuid(enum nfit_uuids id)
>> +const uuid_le *to_nfit_uuid(enum nfit_uuids id)
>>  {
>> -   return nfit_uuid[id];
>> +   return &nfit_uuid[id];
>>  }
>>  EXPORT_SYMBOL(to_nfit_uuid);
>>
>> @@ -207,7 +207,7 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, 
>> struct nvdimm *nvdimm,
>> u32 offset, fw_status = 0;
>> acpi_handle handle;
>> unsigned int func;
>> -   const u8 *uuid;
>> +   const uuid_le *uuid;
>> int rc, i;
>>
>> func = cmd;
>> @@ -394,7 +394,7 @@ int nfit_spa_type(struct acpi_nfit_system_address *spa)
>> int i;
>>
>> for (i = 0; i < NFIT_UUID_MAX; i++)
>> -   if (memcmp(to_nfit_uuid(i), spa->range_guid, 16) == 0)
>> +   if (!uuid_le_cmp_pp(to_nfit_uuid(i), (uuid_le 
>> *)spa->range_guid))
>
> What is _cmp_pp? Why not _compare?
>

I second that.

Andy,

I much rather that you sort out uuid helpers in a way that will
satisfy the filesystem
needs (just provide the helpers don't need to convert filesystems code).

The only reason I took a swing at hoisting the xfs uuid helpers is
because it didn't
seem like your proposal was going to be posted soon or wasn't going to satisfy
the filesystems use case.

My opinion now, is that your suggestion is probably much closer to the real deal
than mine.

IMO, you should acknowledge that the common use case for filesystems is
to handle an opaque char[16] which most likely holds a uuid_be and you
should provide 'neutral' helpers to satisfy this use case.

The simplest would be to typedef uuid_t to struct uuid_be and to name 'neutral'
helpers' uuid_cmp/uuid_copy(uuid_t *, uuid_t *), similar to my proposal.

I think with this semantic change, our proposals can reach common grounds
and satisfy a wider group of users (i.e. filesystem developers).

Christoph also suggested a similar treatment to typedef guid_t to
struct uuid_le.
I don't know the use cases enough to comment on that.

Cheers,
Amir.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu/pci: reserve iova for PCI masters

2017-05-05 Thread Oza Oza via iommu
On Thu, May 4, 2017 at 11:50 PM, Robin Murphy  wrote:
> On 03/05/17 05:46, Oza Pawandeep wrote:
>> this patch reserves the iova for PCI masters.
>> ARM64 based SOCs may have scattered memory banks.
>> such as iproc based SOC has
>>
>> <0x 0x8000 0x0 0x8000>, /* 2G @ 2G */
>> <0x0008 0x8000 0x3 0x8000>, /* 14G @ 34G */
>> <0x0090 0x 0x4 0x>, /* 16G @ 576G */
>> <0x00a0 0x 0x4 0x>; /* 16G @ 640G */
>>
>> but incoming PCI transcation addressing capability is limited
>> by host bridge, for example if max incoming window capability
>> is 512 GB, then 0x0090 and 0x00a0 will fall beyond it.
>>
>> to address this problem, iommu has to avoid allocating iovas which
>> are reserved. which inturn does not allocate iova if it falls into hole.
>
> I don't necessarily disagree with doing this, as we could do with facing
> up to the issue of discontiguous DMA ranges in particular (I too have a
> platform with this problem), but I'm still not overly keen on pulling DT
> specifics into this layer. More than that, though, if we are going to do
> it, then we should do it for all devices with a restrictive
> "dma-ranges", not just PCI ones.
>

pci_create_root_bus allocates host bridge, and currently it takes only
oubound resources.

if inbound memory is also added as a part of  pci_create_root_bus params,
then IOVA allocation can directly make use of inbound_windows member
of structure pci_host_bridge.

struct pci_host_bridge {
struct device dev;
struct pci_bus *bus; /* root bus */
struct list_head windows; /* resource_entry */
struct list_head inbound_windows; /* resource_entry */
.
.
}

so iova_reserve_pci_windows can iterate throough
resource_list_for_each_entry(window, &bridge->inbound_windows)
this way we can remove the dependency of dma-iommu.c on OF layer.

but only thing is:
pci_create_root_bus is called by handful of RC drivers, which needs to change.
ideally if you see both inbound and outbound resource should belong to
pci_host_bridge anyway.
and inbound is completely missing.

let me know your thoughts on this, Robin.

>> Bug: SOC-5216
>> Change-Id: Icbfc99a045d730be143fef427098c937b9d46353
>> Signed-off-by: Oza Pawandeep 
>> Reviewed-on: http://gerrit-ccxsw.broadcom.net/40760
>> Reviewed-by: vpx_checkpatch status 
>> Reviewed-by: CCXSW 
>> Tested-by: vpx_autobuild status 
>> Tested-by: vpx_smoketest status 
>> Tested-by: CCXSW 
>> Reviewed-by: Scott Branden 
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 48d36ce..08764b0 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -27,6 +27,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -171,8 +172,12 @@ static void iova_reserve_pci_windows(struct pci_dev 
>> *dev,
>>   struct iova_domain *iovad)
>>  {
>>   struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
>> + struct device_node *np = bridge->dev.parent->of_node;
>>   struct resource_entry *window;
>>   unsigned long lo, hi;
>> + int ret;
>> + dma_addr_t tmp_dma_addr = 0, dma_addr;
>> + LIST_HEAD(res);
>>
>>   resource_list_for_each_entry(window, &bridge->windows) {
>>   if (resource_type(window->res) != IORESOURCE_MEM &&
>> @@ -183,6 +188,36 @@ static void iova_reserve_pci_windows(struct pci_dev 
>> *dev,
>>   hi = iova_pfn(iovad, window->res->end - window->offset);
>>   reserve_iova(iovad, lo, hi);
>>   }
>> +
>> + /* PCI inbound memory reservation. */
>> + ret = of_pci_get_dma_ranges(np, &res);
>> + if (!ret) {
>> + resource_list_for_each_entry(window, &res) {
>> + struct resource *res_dma = window->res;
>> +
>> + dma_addr = res_dma->start - window->offset;
>> + if (tmp_dma_addr > dma_addr) {
>> + pr_warn("PCI: failed to reserve iovas; ranges 
>> should be sorted\n");
>
> I don't see anything in the DT spec about the entries having to be
> sorted, and it's not exactly impossible to sort a list if you need it so
> (and if I'm being really pedantic, one could still trigger this with a
> list that *is* sorted, only by different criteria).
>
> Robin.
>
>> + return;
>> + }
>> + if (tmp_dma_addr != dma_addr) {
>> + lo = iova_pfn(iovad, tmp_dma_addr);
>> + hi = iova_pfn(iovad, dma_addr - 1);
>> + reserve_iova(iovad, lo, hi);
>> + }
>> + tmp_dma_addr = window->res->end - window->offset;
>> + }
>> + /*
>> +  * the last dma-range should honour based on the
>> +  * 32/64-bit dma addresses.
>> +  */
>> + if (tmp_dma_a

Re: [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-05 Thread Christoph Hellwig
On Fri, May 05, 2017 at 10:06:06AM +0300, Amir Goldstein wrote:
> I much rather that you sort out uuid helpers in a way that will
> satisfy the filesystem
> needs (just provide the helpers don't need to convert filesystems code).

Yeah.

> IMO, you should acknowledge that the common use case for filesystems is
> to handle an opaque char[16] which most likely holds a uuid_be and you
> should provide 'neutral' helpers to satisfy this use case.
> 
> The simplest would be to typedef uuid_t to struct uuid_be and to name 
> 'neutral'
> helpers' uuid_cmp/uuid_copy(uuid_t *, uuid_t *), similar to my proposal.

It's not jut neutral, it's the right thing to to.  The Apollo / DCE
uuids (later also specified in RFC 4122) are big endian, so we should
use the name there.

> Christoph also suggested a similar treatment to typedef guid_t to
> struct uuid_le.

Exactly.  The whole idea of "little endian UUIDs" comes from the
Wintel world, and if you look at the relevant specs they are almost
exclusively referred to as GUIDs.

The magic XFS and AFS types for specific interpretations of one of
the RFC4122 formats don't really help, but I'll just send a patch
to kill them off for XFS ASAP to at least get that out, and we
probably should revert at least

"afs: Move UUID struct to linux/uuid.h"

That moved the AFS mess to common code as a start, and then
also clean up the way we generate random UUIDs, where we currently
have le helper, a be helper and then also generate_random_uuid
just to confuse the heck out of people.  With no description of
either of them.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu