[PATCH v2] iommu/vt-d: add NUMA awareness to intel_alloc_coherent()

2018-01-31 Thread Eric Dumazet
From: Eric Dumazet 

Some devices (like mlx4) try hard to allocate memory on selected
NUMA node, but it turns out intel_alloc_coherent() is not NUMA
aware yet.

Note that dma_generic_alloc_coherent() in arch/x86/kernel/pci-dma.c
gets this right.

Signed-off-by: Eric Dumazet 
Cc: Benjamin Serebrin 
Cc: David Woodhouse 
Cc: Joerg Roedel 
---
v2: no fallback to alloc_pages(), this is not needed and might even
hurt in OOM cases.

 drivers/iommu/intel-iommu.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 
a1373cf343269455808f66ad18dc0a2fb7aa73f2..3c538466a98bdb8fffdca688462b1350d536791b
 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3735,7 +3735,7 @@ static void *intel_alloc_coherent(struct device *dev, 
size_t size,
}
 
if (!page)
-   page = alloc_pages(flags, order);
+   page = alloc_pages_node(dev_to_node(dev), flags, order);
if (!page)
return NULL;
memset(page_address(page), 0, size);

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-01-31 Thread Vivek Gautam



On 1/31/2018 5:53 PM, Robin Murphy wrote:

On 19/01/18 11:43, Vivek Gautam wrote:

From: Sricharan R 

The smmu needs to be functional only when the respective
master's using it are active. The device_link feature
helps to track such functional dependencies, so that the
iommu gets powered when the master device enables itself
using pm_runtime. So by adapting the smmu driver for
runtime pm, above said dependency can be addressed.

This patch adds the pm runtime/sleep callbacks to the
driver and also the functions to parse the smmu clocks
from DT and enable them in resume/suspend.

Signed-off-by: Sricharan R 
Signed-off-by: Archit Taneja 
[vivek: Clock rework to request bulk of clocks]
Signed-off-by: Vivek Gautam 
---
  drivers/iommu/arm-smmu.c | 55 
++--

  1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 78d4c6b8f1ba..21acffe91a1c 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -48,6 +48,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  @@ -205,6 +206,9 @@ struct arm_smmu_device {
  u32    num_global_irqs;
  u32    num_context_irqs;
  unsigned int    *irqs;
+    struct clk_bulk_data    *clocks;
+    int    num_clks;
+    const char * const    *clk_names;


This seems unnecessary, as we use it a grand total of of once, during 
initialisation when we have the source data directly to hand. Just 
pass data->clks into arm_smmu_init_clks() as an additional argument.


Sure, will do that.


Otherwise, I think this looks reasonable; it's about as unobtrusive as 
it's going to get.


Thanks for reviewing.

regards
Vivek



Robin.


  u32    cavium_id_base; /* Specific to Cavium */
  @@ -1685,6 +1689,25 @@ static int arm_smmu_id_size_to_bits(int size)
  }
  }
  +static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)
+{
+    int i;
+    int num = smmu->num_clks;
+
+    if (num < 1)
+    return 0;
+
+    smmu->clocks = devm_kcalloc(smmu->dev, num,
+    sizeof(*smmu->clocks), GFP_KERNEL);
+    if (!smmu->clocks)
+    return -ENOMEM;
+
+    for (i = 0; i < num; i++)
+    smmu->clocks[i].id = smmu->clk_names[i];
+
+    return devm_clk_bulk_get(smmu->dev, num, smmu->clocks);
+}
+
  static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
  {
  unsigned long size;
@@ -1897,10 +1920,12 @@ static int arm_smmu_device_cfg_probe(struct 
arm_smmu_device *smmu)

  struct arm_smmu_match_data {
  enum arm_smmu_arch_version version;
  enum arm_smmu_implementation model;
+    const char * const *clks;
+    int num_clks;
  };
    #define ARM_SMMU_MATCH_DATA(name, ver, imp)    \
-static struct arm_smmu_match_data name = { .version = ver, .model = 
imp }
+static const struct arm_smmu_match_data name = { .version = ver, 
.model = imp }

    ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
  ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
@@ -2001,6 +2026,8 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev,

  data = of_device_get_match_data(dev);
  smmu->version = data->version;
  smmu->model = data->model;
+    smmu->clk_names = data->clks;
+    smmu->num_clks = data->num_clks;
    parse_driver_options(smmu);
  @@ -2099,6 +2126,10 @@ static int arm_smmu_device_probe(struct 
platform_device *pdev)

  smmu->irqs[i] = irq;
  }
  +    err = arm_smmu_init_clocks(smmu);
+    if (err)
+    return err;
+
  err = arm_smmu_device_cfg_probe(smmu);
  if (err)
  return err;
@@ -2197,7 +2228,27 @@ static int __maybe_unused 
arm_smmu_pm_resume(struct device *dev)

  return 0;
  }
  -static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+    struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+    return clk_bulk_prepare_enable(smmu->num_clks, smmu->clocks);
+}
+
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+    struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+    clk_bulk_disable_unprepare(smmu->num_clks, smmu->clocks);
+
+    return 0;
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+    SET_SYSTEM_SLEEP_PM_OPS(NULL, arm_smmu_pm_resume)
+    SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+   arm_smmu_runtime_resume, NULL)
+};
    static struct platform_driver arm_smmu_driver = {
  .driver    = {



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] iommu: Fix iommu_unmap and iommu_unmap_fast return type

2018-01-31 Thread Suravee Suthikulpanit

Hi Robin,

On 2/1/18 1:02 AM, Robin Murphy wrote:

Hi Suravee,

On 31/01/18 01:48, Suravee Suthikulpanit wrote:

Currently, iommu_unmap and iommu_unmap_fast return unmapped
pages with size_t.  However, the actual value returned could
be error codes (< 0), which can be misinterpreted as large
number of unmapped pages. Therefore, change the return type to ssize_t.

Cc: Joerg Roedel 
Cc: Alex Williamson 
Signed-off-by: Suravee Suthikulpanit 
---
  drivers/iommu/amd_iommu.c   |  6 +++---
  drivers/iommu/intel-iommu.c |  4 ++--


Er, there are a few more drivers than that implementing iommu_ops ;)


Ahh right.


It seems like it might be more sensible to fix the single instance of a driver returning 
-EINVAL (which appears to be a "should never happen if used correctly" kinda 
thing anyway) and leave the API-internal callback prototype as-is. I do agree the 
inconsistency of iommu_unmap() itself wants sorting, though (particularly the !IOMMU_API 
stubs which are wrong either way).

Robin.


Make sense. I'll leave the API alone, and change the code to not returning 
error then.
There are a few places to fix.

Thanks,
Suravee
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/vt-d: add NUMA awareness to intel_alloc_coherent()

2018-01-31 Thread Eric Dumazet
On Wed, 2018-01-31 at 14:45 -0800, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> Some devices (like mlx4) try hard to allocate memory on selected
> NUMA node, but it turns out intel_alloc_coherent() is not NUMA
> aware yet.
> 
> Note that dma_generic_alloc_coherent() in arch/x86/kernel/pci-dma.c
> gets this right.
> 
> Signed-off-by: Eric Dumazet 
> Cc: Benjamin Serebrin 
> Cc: David Woodhouse 
> Cc: Joerg Roedel 
> ---
>  drivers/iommu/intel-iommu.c |7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 
> a1373cf343269455808f66ad18dc0a2fb7aa73f2..0efef077abc099eb29ebc5cefdd1b996f025dffd
>  100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3734,8 +3734,11 @@ static void *intel_alloc_coherent(struct device *dev, 
> size_t size,
>   }
>   }
>  
> - if (!page)
> - page = alloc_pages(flags, order);
> + if (!page) {
> + page = alloc_pages_node(dev_to_node(dev), flags, order);
> + if (!page)
> + page = alloc_pages(flags, order);

I'll send a V2 without the fallback to alloc_pages()

This seems not necessary at all.


> + }
>   if (!page)
>   return NULL;
>   memset(page_address(page), 0, size);



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: ARM64 SMMU Setup

2018-01-31 Thread Ard Biesheuvel


> On 31 Jan 2018, at 23:31, Thor Thayer  wrote:
> 
> Hi Ard,
> 
>> On 01/31/2018 04:21 PM, Ard Biesheuvel wrote:
>>> On 31 January 2018 at 22:09, Thor Thayer  
>>> wrote:
>>> Hi,
>>> 
>>> I'm enabling the ARM SMMU-500 on an ARM64 A53. I'm hitting a data abort in
>>> the probe functions because I'm accessing the registers in EL1 mode.
>>> 
>> Why do you think the fact that you are running at EL1 is causing the data 
>> abort?
> Yes, I may not be diagnosing this correctly. I narrowed it down to failing on 
> the first read in arm_smmu_device_cfg_probe().
> 
> When I read the same register with a debugger (I'm in EL1), it fails. I just 
> realized the read also fails in EL2.
> 
> If I elevate the read to EL3, I can read the default reset value from the 
> register but I mistakenly thought this was EL2.
> 

This confirms my suspicion. It is up to the secure firmware running in EL3 (or 
secure EL1) to configure the SMMU itself and the interconnect in a way that 
allows code running in EL2 or non-secure EL1 to access it (note that EL3 is 
always secure and EL2 is always non-secure)

I don’t know the details of the MMU-500 or the way it is integrated into your 
platform, so I can’t be of any more help, unfortunately.


> 
>>> Linux starts in EL2 mode but drops down to EL1 mode by the time it reaches
>>> the arm-smmu probe function.
>>> 
>>> Is there something else I need to add to either move the arm-smmu probe
>>> earlier or remain in EL2 until after the arm-smmu probe?
>>> 
>> No. I am pretty sure EL2 vs EL1 is not causing your issue. Could you
>> elaborate on the nature of the exception you are getting?
> The error I'm getting is:
> 
> Unhandled fault: synchronous external abort (0x9610) at 0x08e40020
> Internal error: : 9610 [#1] PREEMPT SMP

> To be honest I got lost trying to decode 0x9610 and I may have used the 
> wrong ARM document. I ended up interpreting this as Data Abort taken without 
> an Exception level from the ARMv8 ARM.
> 
> Additionally, the SMMU documentation seemed to indicate the Hypervisor would 
> need to run at EL2 so I ran with that.
> 
> Thank you for the reply and I'll go back and examine my assumptions again.
> 
> Thor
> 
>>> A Google search didn't turn up anything so I hoped posting to the list would
>>> help.
>>> 
>>> Thanks in advance,
>>> 
>>> Thor
>>> 
>>> 
>>> ___
>>> linux-arm-kernel mailing list
>>> linux-arm-ker...@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: ARM64 SMMU Setup

2018-01-31 Thread Thor Thayer

Hi Ard,

On 01/31/2018 04:21 PM, Ard Biesheuvel wrote:

On 31 January 2018 at 22:09, Thor Thayer  wrote:

Hi,

I'm enabling the ARM SMMU-500 on an ARM64 A53. I'm hitting a data abort in
the probe functions because I'm accessing the registers in EL1 mode.



Why do you think the fact that you are running at EL1 is causing the data abort?

Yes, I may not be diagnosing this correctly. I narrowed it down to 
failing on the first read in arm_smmu_device_cfg_probe().


When I read the same register with a debugger (I'm in EL1), it fails. I 
just realized the read also fails in EL2.


If I elevate the read to EL3, I can read the default reset value from 
the register but I mistakenly thought this was EL2.




Linux starts in EL2 mode but drops down to EL1 mode by the time it reaches
the arm-smmu probe function.

Is there something else I need to add to either move the arm-smmu probe
earlier or remain in EL2 until after the arm-smmu probe?



No. I am pretty sure EL2 vs EL1 is not causing your issue. Could you
elaborate on the nature of the exception you are getting?


The error I'm getting is:

Unhandled fault: synchronous external abort (0x9610) at 
0x08e40020

Internal error: : 9610 [#1] PREEMPT SMP

To be honest I got lost trying to decode 0x9610 and I may have used 
the wrong ARM document. I ended up interpreting this as Data Abort taken 
without an Exception level from the ARMv8 ARM.


Additionally, the SMMU documentation seemed to indicate the Hypervisor 
would need to run at EL2 so I ran with that.


Thank you for the reply and I'll go back and examine my assumptions again.

Thor


A Google search didn't turn up anything so I hoped posting to the list would
help.

Thanks in advance,

Thor


___
linux-arm-kernel mailing list
linux-arm-ker...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/vt-d: add NUMA awareness to intel_alloc_coherent()

2018-01-31 Thread Eric Dumazet
From: Eric Dumazet 

Some devices (like mlx4) try hard to allocate memory on selected
NUMA node, but it turns out intel_alloc_coherent() is not NUMA
aware yet.

Note that dma_generic_alloc_coherent() in arch/x86/kernel/pci-dma.c
gets this right.

Signed-off-by: Eric Dumazet 
Cc: Benjamin Serebrin 
Cc: David Woodhouse 
Cc: Joerg Roedel 
---
 drivers/iommu/intel-iommu.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 
a1373cf343269455808f66ad18dc0a2fb7aa73f2..0efef077abc099eb29ebc5cefdd1b996f025dffd
 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3734,8 +3734,11 @@ static void *intel_alloc_coherent(struct device *dev, 
size_t size,
}
}
 
-   if (!page)
-   page = alloc_pages(flags, order);
+   if (!page) {
+   page = alloc_pages_node(dev_to_node(dev), flags, order);
+   if (!page)
+   page = alloc_pages(flags, order);
+   }
if (!page)
return NULL;
memset(page_address(page), 0, size);

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: ARM64 SMMU Setup

2018-01-31 Thread Ard Biesheuvel
On 31 January 2018 at 22:09, Thor Thayer  wrote:
> Hi,
>
> I'm enabling the ARM SMMU-500 on an ARM64 A53. I'm hitting a data abort in
> the probe functions because I'm accessing the registers in EL1 mode.
>

Why do you think the fact that you are running at EL1 is causing the data abort?

> Linux starts in EL2 mode but drops down to EL1 mode by the time it reaches
> the arm-smmu probe function.
>
> Is there something else I need to add to either move the arm-smmu probe
> earlier or remain in EL2 until after the arm-smmu probe?
>

No. I am pretty sure EL2 vs EL1 is not causing your issue. Could you
elaborate on the nature of the exception you are getting?

> A Google search didn't turn up anything so I hoped posting to the list would
> help.
>
> Thanks in advance,
>
> Thor
>
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


ARM64 SMMU Setup

2018-01-31 Thread Thor Thayer

Hi,

I'm enabling the ARM SMMU-500 on an ARM64 A53. I'm hitting a data abort 
in the probe functions because I'm accessing the registers in EL1 mode.


Linux starts in EL2 mode but drops down to EL1 mode by the time it 
reaches the arm-smmu probe function.


Is there something else I need to add to either move the arm-smmu probe 
earlier or remain in EL2 until after the arm-smmu probe?


A Google search didn't turn up anything so I hoped posting to the list 
would help.


Thanks in advance,

Thor

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu: Fix iommu_unmap and iommu_unmap_fast return type

2018-01-31 Thread Robin Murphy

Hi Suravee,

On 31/01/18 01:48, Suravee Suthikulpanit wrote:

Currently, iommu_unmap and iommu_unmap_fast return unmapped
pages with size_t.  However, the actual value returned could
be error codes (< 0), which can be misinterpreted as large
number of unmapped pages. Therefore, change the return type to ssize_t.

Cc: Joerg Roedel 
Cc: Alex Williamson 
Signed-off-by: Suravee Suthikulpanit 
---
  drivers/iommu/amd_iommu.c   |  6 +++---
  drivers/iommu/intel-iommu.c |  4 ++--


Er, there are a few more drivers than that implementing iommu_ops ;)

It seems like it might be more sensible to fix the single instance of a 
driver returning -EINVAL (which appears to be a "should never happen if 
used correctly" kinda thing anyway) and leave the API-internal callback 
prototype as-is. I do agree the inconsistency of iommu_unmap() itself 
wants sorting, though (particularly the !IOMMU_API stubs which are wrong 
either way).


Robin.


  drivers/iommu/iommu.c   | 16 
  include/linux/iommu.h   | 20 ++--
  4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 7d5eb00..3609f51 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3030,11 +3030,11 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
return ret;
  }
  
-static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,

-  size_t page_size)
+static ssize_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
+  size_t page_size)
  {
struct protection_domain *domain = to_pdomain(dom);
-   size_t unmap_size;
+   ssize_t unmap_size;
  
  	if (domain->mode == PAGE_MODE_NONE)

return -EINVAL;
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4a2de34..15ba866 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5068,8 +5068,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
return ret;
  }
  
-static size_t intel_iommu_unmap(struct iommu_domain *domain,

-   unsigned long iova, size_t size)
+static ssize_t intel_iommu_unmap(struct iommu_domain *domain,
+unsigned long iova, size_t size)
  {
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct page *freelist = NULL;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3de5c0b..8f7da8a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1557,12 +1557,12 @@ int iommu_map(struct iommu_domain *domain, unsigned 
long iova,
  }
  EXPORT_SYMBOL_GPL(iommu_map);
  
-static size_t __iommu_unmap(struct iommu_domain *domain,

-   unsigned long iova, size_t size,
-   bool sync)
+static ssize_t __iommu_unmap(struct iommu_domain *domain,
+unsigned long iova, size_t size,
+bool sync)
  {
const struct iommu_ops *ops = domain->ops;
-   size_t unmapped_page, unmapped = 0;
+   ssize_t unmapped_page, unmapped = 0;
unsigned long orig_iova = iova;
unsigned int min_pagesz;
  
@@ -1617,15 +1617,15 @@ static size_t __iommu_unmap(struct iommu_domain *domain,

return unmapped;
  }
  
-size_t iommu_unmap(struct iommu_domain *domain,

-  unsigned long iova, size_t size)
+ssize_t iommu_unmap(struct iommu_domain *domain,
+   unsigned long iova, size_t size)
  {
return __iommu_unmap(domain, iova, size, true);
  }
  EXPORT_SYMBOL_GPL(iommu_unmap);
  
-size_t iommu_unmap_fast(struct iommu_domain *domain,

-   unsigned long iova, size_t size)
+ssize_t iommu_unmap_fast(struct iommu_domain *domain,
+unsigned long iova, size_t size)
  {
return __iommu_unmap(domain, iova, size, false);
  }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 41b8c57..78df048 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -199,8 +199,8 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot);
-   size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
-size_t size);
+   ssize_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
+size_t size);
size_t (*map_sg)(struct iommu_domain *domain, unsigned long iova,
 struct scatterlist *sg, unsigned int nents, int prot);
void (*flush_iotlb_all)(struct iommu_domain *domain);
@@ -299,10 +299,10 @@ extern void iommu_detach_device(struct iommu_domain 
*domain,
  extern struct 

Re: [PATCH v5 08/13] iommu/rockchip: Control clocks needed to access the IOMMU

2018-01-31 Thread Rob Herring
On Wed, Jan 31, 2018 at 1:52 AM, Tomasz Figa  wrote:
> Hi Rob,
>
> On Wed, Jan 31, 2018 at 2:05 AM, Rob Herring  wrote:
>> On Wed, Jan 24, 2018 at 06:35:11PM +0800, Jeffy Chen wrote:
>>> From: Tomasz Figa 
>>>
>>> Current code relies on master driver enabling necessary clocks before
>>> IOMMU is accessed, however there are cases when the IOMMU should be
>>> accessed while the master is not running yet, for example allocating
>>> V4L2 videobuf2 buffers, which is done by the VB2 framework using DMA
>>> mapping API and doesn't engage the master driver at all.
>>>
>>> This patch fixes the problem by letting clocks needed for IOMMU
>>> operation to be listed in Device Tree and making the driver enable them
>>> for the time of accessing the hardware.
>>>
>>> Signed-off-by: Jeffy Chen 
>>> Signed-off-by: Tomasz Figa 
>>> ---
>>>
>>> Changes in v5:
>>> Use clk_bulk APIs.
>>>
>>> Changes in v4: None
>>> Changes in v3: None
>>> Changes in v2: None
>>>
>>>  .../devicetree/bindings/iommu/rockchip,iommu.txt   |  8 +++
>>
>> Please split binding patches to a separate patch.
>>
>>>  drivers/iommu/rockchip-iommu.c | 74 
>>> --
>>>  2 files changed, 76 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt 
>>> b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
>>> index 2098f7732264..33dd853359fa 100644
>>> --- a/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
>>> +++ b/Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
>>> @@ -14,6 +14,13 @@ Required properties:
>>>  "single-master" device, and needs no additional 
>>> information
>>>  to associate with its master device.  See:
>>>  Documentation/devicetree/bindings/iommu/iommu.txt
>>> +Optional properties:
>>> +- clocks : A list of master clocks requires for the IOMMU to be accessible
>>> +   by the host CPU. The number of clocks depends on the master
>>> +   block and might as well be zero. See [1] for generic clock
>>> +   bindings description.
>>
>> Hardware blocks don't have a variable number of clock connections.
>
> I think you underestimate the imagination of hardware designers. :)

Learned long ago to never do that. If there are 2 ways to do
something, they will find a 3rd way.

> For Rockchip IOMMU, there is a set of clocks, which all need to be
> enabled for IOMMU register access to succeed. The clocks are not
> directly fed to the IOMMU, but they are needed for the various buses
> and intermediate blocks on the way to the IOMMU to work.

The binding should describe the clock connections, not what clocks a
driver needs (currently). It sounds like a lack of managing bus clocks
to me.

In any case, the binding must be written so it can be verified. If you
can have any number of clocks with any names, there's no point in
documenting.

Rob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 11/13] drm/amdgpu: add DRM_AMDGPU_ATC config option

2018-01-31 Thread Jean-Philippe Brucker
Hi Christian,

On 31/01/18 13:04, Christian König wrote:
> Adding Jean and the IOMMU list as well.
> 
> Am 31.01.2018 um 13:43 schrieb Oded Gabbay:
>> On Tue, Jan 30, 2018 at 6:53 PM, Felix Kuehling  
>> wrote:
>>> [SNIP]
>> There was some discussion last year about a generic PASID allocator in
>> the iommu subsystem:
>> https://lists.linuxfoundation.org/pipermail/iommu/2017-June/022159.html
>>
>> Unfortunately, I don't think it got any further then that.

I also sent an RFCv2 in October:
https://www.spinics.net/lists/arm-kernel/msg609771.html

> Yeah, that sounds exactly like what I had in mind as well.
> 
> Jean any updates on that topic? We are rather interested in common PASID 
> handling as well.

Good to hear, I plan to send a new version of that series after the merge
window.

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 4/6] iommu/arm-smmu: Add the device_link between masters and smmu

2018-01-31 Thread Robin Murphy

On 19/01/18 11:43, Vivek Gautam wrote:

From: Sricharan R 

Finally add the device link between the master device and
smmu, so that the smmu gets runtime enabled/disabled only when the
master needs it. This is done from add_device callback which gets
called once when the master is added to the smmu.


Don't we need to balance this with a device_link_del() in .remove_device 
(like exynos-iommu does)?


Robin.


Signed-off-by: Sricharan R 
---
  drivers/iommu/arm-smmu.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 95478bfb182c..33bbcfedb896 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1367,6 +1367,7 @@ static int arm_smmu_add_device(struct device *dev)
struct arm_smmu_device *smmu;
struct arm_smmu_master_cfg *cfg;
struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+   struct device_link *link;
int i, ret;
  
  	if (using_legacy_binding) {

@@ -1428,6 +1429,16 @@ static int arm_smmu_add_device(struct device *dev)
  
  	pm_runtime_put_sync(smmu->dev);
  
+	/*

+* Establish the link between smmu and master, so that the
+* smmu gets runtime enabled/disabled as per the master's
+* needs.
+*/
+   link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME);
+   if (!link)
+   dev_warn(smmu->dev, "Unable to create device link between %s and 
%s\n",
+dev_name(smmu->dev), dev_name(dev));
+
return 0;
  
  out_cfg_free:



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-01-31 Thread Robin Murphy

On 19/01/18 11:43, Vivek Gautam wrote:

From: Sricharan R 

The smmu device probe/remove and add/remove master device callbacks
gets called when the smmu is not linked to its master, that is without
the context of the master device. So calling runtime apis in those places
separately.

Signed-off-by: Sricharan R 
[vivek: Cleanup pm runtime calls]
Signed-off-by: Vivek Gautam 
---
  drivers/iommu/arm-smmu.c | 45 +
  1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 21acffe91a1c..95478bfb182c 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -914,11 +914,15 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_cfg *cfg = _domain->cfg;
-   int irq;
+   int ret, irq;
  
  	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)

return;
  
+	ret = pm_runtime_get_sync(smmu->dev);

+   if (ret)
+   return;
+
/*
 * Disable the context bank and free the page tables before freeing
 * it.
@@ -933,6 +937,8 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
  
  	free_io_pgtable_ops(smmu_domain->pgtbl_ops);

__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+
+   pm_runtime_put_sync(smmu->dev);
  }
  
  static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)

@@ -1408,12 +1414,20 @@ static int arm_smmu_add_device(struct device *dev)
while (i--)
cfg->smendx[i] = INVALID_SMENDX;
  
-	ret = arm_smmu_master_alloc_smes(dev);

+   ret = pm_runtime_get_sync(smmu->dev);
if (ret)
goto out_cfg_free;
  
+	ret = arm_smmu_master_alloc_smes(dev);

+   if (ret) {
+   pm_runtime_put_sync(smmu->dev);
+   goto out_cfg_free;


Please keep to the existing pattern and put this on the cleanup path 
with a new label, rather than inline.



+   }
+
iommu_device_link(>iommu, dev);
  
+	pm_runtime_put_sync(smmu->dev);

+
return 0;
  
  out_cfg_free:

@@ -1428,7 +1442,7 @@ static void arm_smmu_remove_device(struct device *dev)
struct iommu_fwspec *fwspec = dev->iommu_fwspec;
struct arm_smmu_master_cfg *cfg;
struct arm_smmu_device *smmu;
-
+   int ret;
  
  	if (!fwspec || fwspec->ops != _smmu_ops)

return;
@@ -1436,8 +1450,21 @@ static void arm_smmu_remove_device(struct device *dev)
cfg  = fwspec->iommu_priv;
smmu = cfg->smmu;
  
+	/*

+* The device link between the master device and
+* smmu is already purged at this point.
+* So enable the power to smmu explicitly.
+*/


I don't understand this comment, especially since we don't even 
introduce device links until the following patch... :/



+
+   ret = pm_runtime_get_sync(smmu->dev);
+   if (ret)
+   return;
+
iommu_device_unlink(>iommu, dev);
arm_smmu_master_free_smes(fwspec);
+
+   pm_runtime_put_sync(smmu->dev);
+
iommu_group_remove_device(dev);
kfree(fwspec->iommu_priv);
iommu_fwspec_free(dev);
@@ -2130,6 +2157,14 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
if (err)
return err;
  
+	platform_set_drvdata(pdev, smmu);

+
+   pm_runtime_enable(dev);
+
+   err = pm_runtime_get_sync(dev);
+   if (err)
+   return err;
+
err = arm_smmu_device_cfg_probe(smmu);
if (err)
return err;
@@ -2171,9 +2206,9 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
return err;
}
  
-	platform_set_drvdata(pdev, smmu);

arm_smmu_device_reset(smmu);
arm_smmu_test_smr_masks(smmu);
+   pm_runtime_put_sync(dev);
  
  	/*

 * For ACPI and generic DT bindings, an SMMU will be probed before
@@ -2212,6 +2247,8 @@ static int arm_smmu_device_remove(struct platform_device 
*pdev)
  
  	/* Turn the thing off */

writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+   pm_runtime_force_suspend(smmu->dev);


Why do we need this? I guess it might be a Qualcomm-ism as I don't see 
anyone else calling it from .remove other than a couple of other qcom_* 
drivers. Given that we only get here during system shutdown (or the root 
user intentionally pissing about with driver unbinding), it doesn't seem 
like a point where power saving really matters all that much.


I'd also naively expect that anything this device was the last consumer 
off would get turned off by core code anyway once it's removed, but 
maybe things aren't that slick; I dunno :/


Robin.


+
return 0;
  }
  



Re: [PATCH 11/13] drm/amdgpu: add DRM_AMDGPU_ATC config option

2018-01-31 Thread Christian König

Adding Jean and the IOMMU list as well.

Am 31.01.2018 um 13:43 schrieb Oded Gabbay:

On Tue, Jan 30, 2018 at 6:53 PM, Felix Kuehling  wrote:

[SNIP]

There was some discussion last year about a generic PASID allocator in
the iommu subsystem:
https://lists.linuxfoundation.org/pipermail/iommu/2017-June/022159.html

Unfortunately, I don't think it got any further then that.


Yeah, that sounds exactly like what I had in mind as well.

Jean any updates on that topic? We are rather interested in common PASID 
handling as well.


Anybody already working on this? Sounds mostly like moving code around 
and creating a common interface.


Thanks,
Christian.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-01-31 Thread Robin Murphy

On 19/01/18 11:43, Vivek Gautam wrote:

From: Sricharan R 

The smmu needs to be functional only when the respective
master's using it are active. The device_link feature
helps to track such functional dependencies, so that the
iommu gets powered when the master device enables itself
using pm_runtime. So by adapting the smmu driver for
runtime pm, above said dependency can be addressed.

This patch adds the pm runtime/sleep callbacks to the
driver and also the functions to parse the smmu clocks
from DT and enable them in resume/suspend.

Signed-off-by: Sricharan R 
Signed-off-by: Archit Taneja 
[vivek: Clock rework to request bulk of clocks]
Signed-off-by: Vivek Gautam 
---
  drivers/iommu/arm-smmu.c | 55 ++--
  1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 78d4c6b8f1ba..21acffe91a1c 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -48,6 +48,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -205,6 +206,9 @@ struct arm_smmu_device {

u32 num_global_irqs;
u32 num_context_irqs;
unsigned int*irqs;
+   struct clk_bulk_data*clocks;
+   int num_clks;
+   const char * const  *clk_names;


This seems unnecessary, as we use it a grand total of of once, during 
initialisation when we have the source data directly to hand. Just pass 
data->clks into arm_smmu_init_clks() as an additional argument.


Otherwise, I think this looks reasonable; it's about as unobtrusive as 
it's going to get.


Robin.


u32 cavium_id_base; /* Specific to Cavium */
  
@@ -1685,6 +1689,25 @@ static int arm_smmu_id_size_to_bits(int size)

}
  }
  
+static int arm_smmu_init_clocks(struct arm_smmu_device *smmu)

+{
+   int i;
+   int num = smmu->num_clks;
+
+   if (num < 1)
+   return 0;
+
+   smmu->clocks = devm_kcalloc(smmu->dev, num,
+   sizeof(*smmu->clocks), GFP_KERNEL);
+   if (!smmu->clocks)
+   return -ENOMEM;
+
+   for (i = 0; i < num; i++)
+   smmu->clocks[i].id = smmu->clk_names[i];
+
+   return devm_clk_bulk_get(smmu->dev, num, smmu->clocks);
+}
+
  static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
  {
unsigned long size;
@@ -1897,10 +1920,12 @@ static int arm_smmu_device_cfg_probe(struct 
arm_smmu_device *smmu)
  struct arm_smmu_match_data {
enum arm_smmu_arch_version version;
enum arm_smmu_implementation model;
+   const char * const *clks;
+   int num_clks;
  };
  
  #define ARM_SMMU_MATCH_DATA(name, ver, imp)	\

-static struct arm_smmu_match_data name = { .version = ver, .model = imp }
+static const struct arm_smmu_match_data name = { .version = ver, .model = imp }
  
  ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);

  ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
@@ -2001,6 +2026,8 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev,
data = of_device_get_match_data(dev);
smmu->version = data->version;
smmu->model = data->model;
+   smmu->clk_names = data->clks;
+   smmu->num_clks = data->num_clks;
  
  	parse_driver_options(smmu);
  
@@ -2099,6 +2126,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)

smmu->irqs[i] = irq;
}
  
+	err = arm_smmu_init_clocks(smmu);

+   if (err)
+   return err;
+
err = arm_smmu_device_cfg_probe(smmu);
if (err)
return err;
@@ -2197,7 +2228,27 @@ static int __maybe_unused arm_smmu_pm_resume(struct 
device *dev)
return 0;
  }
  
-static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);

+static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
+{
+   struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+   return clk_bulk_prepare_enable(smmu->num_clks, smmu->clocks);
+}
+
+static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
+{
+   struct arm_smmu_device *smmu = dev_get_drvdata(dev);
+
+   clk_bulk_disable_unprepare(smmu->num_clks, smmu->clocks);
+
+   return 0;
+}
+
+static const struct dev_pm_ops arm_smmu_pm_ops = {
+   SET_SYSTEM_SLEEP_PM_OPS(NULL, arm_smmu_pm_resume)
+   SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
+  arm_smmu_runtime_resume, NULL)
+};
  
  static struct platform_driver arm_smmu_driver = {

.driver = {


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant

2018-01-31 Thread Vivek Gautam



On 1/30/2018 1:12 AM, Rob Herring wrote:

On Fri, Jan 19, 2018 at 05:13:42PM +0530, Vivek Gautam wrote:

qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
clock and power requirements. This smmu core is used with
multiple masters on msm8996, viz. mdss, video, etc.
Add bindings for the same.

Signed-off-by: Vivek Gautam 
---
  .../devicetree/bindings/iommu/arm,smmu.txt | 43 ++
  drivers/iommu/arm-smmu.c   | 13 +++
  2 files changed, 56 insertions(+)

Reviewed-by: Rob Herring 


Thanks Rob.


--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v12 0/3] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI)

2018-01-31 Thread Shameerali Kolothum Thodi
Hi Joerg,

> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: Monday, January 29, 2018 4:42 PM
> To: 'Will Deacon' ; Joerg Roedel 
> Cc: lorenzo.pieral...@arm.com; robin.mur...@arm.com;
> marc.zyng...@arm.com; j...@8bytes.org; John Garry
> ; xuwei (O) ; Guohanjun
> (Hanjun Guo) ; iommu@lists.linux-foundation.org;
> linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> devicet...@vger.kernel.org; Linuxarm 
> Subject: RE: [PATCH v12 0/3] iommu/smmu-v3: Workaround for hisilicon
> 161010801 erratum(reserve HW MSI)
> 
> Hi Joerg,
> 
> > -Original Message-
> > From: Will Deacon [mailto:will.dea...@arm.com]
> > Sent: Monday, January 29, 2018 4:21 PM
> > To: Shameerali Kolothum Thodi 
> > Cc: lorenzo.pieral...@arm.com; robin.mur...@arm.com;
> > marc.zyng...@arm.com; j...@8bytes.org; John Garry
> > ; xuwei (O) ; Guohanjun
> > (Hanjun Guo) ; iommu@lists.linux-foundation.org;
> > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > devicet...@vger.kernel.org; Linuxarm 
> > Subject: Re: [PATCH v12 0/3] iommu/smmu-v3: Workaround for hisilicon
> > 161010801 erratum(reserve HW MSI)
> >
> > On Mon, Jan 29, 2018 at 04:16:33PM +, Shameerali Kolothum Thodi
> wrote:
> > > > -Original Message-
> > > > From: Will Deacon [mailto:will.dea...@arm.com]
> > > > Sent: Monday, January 29, 2018 3:40 PM
> > > > To: Shameerali Kolothum Thodi
> > > > 
> > > > Cc: lorenzo.pieral...@arm.com; robin.mur...@arm.com;
> > > > marc.zyng...@arm.com; j...@8bytes.org; John Garry
> > > > ; xuwei (O) ;
> Guohanjun
> > > > (Hanjun Guo) ; iommu@lists.linux-
> > foundation.org;
> > > > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > > devicet...@vger.kernel.org; Linuxarm 
> > > > Subject: Re: [PATCH v12 0/3] iommu/smmu-v3: Workaround for
> > > > hisilicon
> > > > 161010801 erratum(reserve HW MSI)
> > > > On Thu, Dec 14, 2017 at 04:09:54PM +, Shameer Kolothum wrote:
> > > > > On certain HiSilicon platforms (hip06/hip07) the GIC ITS and
> > > > > PCIe RC deviates from the standard implementation and this
> > > > > breaks PCIe MSI functionality when SMMU is enabled.
> > > > >
> > > > > The HiSilicon erratum 161010801 describes this limitation of
> > > > > certain HiSilicon platforms to support the SMMU mappings for MSI
> transactions.
> > > > > On these platforms GICv3 ITS translator is presented with the
> > > > > deviceID by extending the MSI payload data to 64 bits to include the
> deviceID.
> > > > > Hence, the PCIe controller on this platforms has to
> > > > > differentiate the MSI payload against other DMA payload and has
> > > > > to modify the MSI
> > payload.
> > > > > This basically makes it difficult for this platforms to have a
> > > > > SMMU translation for MSI.
> > > > >
> > > > > This patch implements an ACPI based quirk to reserve the hw msi
> > > > > regions in the smmu-v3 driver which means these address regions
> > > > > will not be translated and will be excluded from iova allocations.
> > > > >
> > > > > To implement this quirk, the following changes are incorporated:
> > > > > 1. Added a generic helper function to IORT code to retrieve and 
> > > > > reserve
> > > > >the associated ITS base address from a device IORT node. The 
> > > > > function
> > > > >has a check for smmu model to determine whether the platform
> > requires
> > > > >the HW MSI reservation or not.
> > > > > 2. Added smmu node entries and explicitly disabled them in hip06/hip07
> > > > > dts files so that users are warned about the non-DT support for 
> > > > > this
> > > > > erratum.
> > > >
> > > > [...]
> > > >
> > > > >  arch/arm64/boot/dts/hisilicon/hip06.dtsi |  56 
> > > > > arch/arm64/boot/dts/hisilicon/hip07.dtsi |  25 +++
> > > > >  drivers/acpi/arm64/iort.c| 111
> > > > ++-
> > > > >  drivers/iommu/dma-iommu.c|   8 ++-
> > > > >  drivers/irqchip/irq-gic-v3-its.c |   3 +-
> > > > >  include/linux/acpi_iort.h|   7 +-
> > > > >  6 files changed, 204 insertions(+), 6 deletions(-)
> > > >
> > > > It occurred to me this morning that this series probably isn't
> > > > queued
> > anywhere
> > > > because it's not obvious which tree is supposed to take it and I
> > > > can't see it in
> > -
> > > > next.
> > > >
> > > > Is this one for arm64, IOMMU, irqchip or something else? It's
> > > > probably
> > missed
> > > > the boat for 4.16 now...
> > > >
> > >
> > > I have been trying to ping you guys on this[1]. My expectation was
> > > that it will
> > be
> > > through IOMMU. Anyway