Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
On 2016/11/16 at 22:58, Myron Stowe wrote:
> On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang  wrote:
>> Ccing David
>> On 2016/11/16 at 17:02, Xunlei Pang wrote:
>>> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers
>>> under kdump, it can be steadily reproduced on several different machines,
>>> the dmesg log is like:
>>> HP HPSA Driver (v 3.4.16-0)
>>> hpsa :02:00.0: using doorbell to reset controller
>>> hpsa :02:00.0: board ready after hard reset.
>>> hpsa :02:00.0: Waiting for controller to respond to no-op
>>> DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff]
>>> DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff]
>>> DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff]
>>> DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff]
>>> DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff]
>>> DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff]
>>> DMAR: DRHD: handling fault status reg 2
>>> DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason 06] PTE Read access is not set
>>> hpsa :02:00.0: controller message 03:00 timed out
>>> hpsa :02:00.0: no-op failed; re-trying
>>>
>>> After some debugging, we found that the corresponding pte entry value
>>> is correct, and the value of the iommu caching mode is 0, the fault is
>>> probably due to the old iotlb cache of the in-flight DMA.
>>>
>>> Thus need to flush the old iotlb after context mapping is setup for the
>>> device, where the device is supposed to finish reset at its driver probe
>>> stage and no in-flight DMA exists hereafter.
>>>
>>> With this patch, all our problematic machines can survive the kdump tests.
>>>
>>> CC: Myron Stowe 
>>> CC: Don Brace 
>>> CC: Baoquan He 
>>> CC: Dave Young 
>>> Tested-by: Joseph Szczypek 
>>> Signed-off-by: Xunlei Pang 
>>> ---
>>>  drivers/iommu/intel-iommu.c | 11 +--
>>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 3965e73..eb79288 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct 
>>> dmar_domain *domain,
>>>* It's a non-present to present mapping. If hardware doesn't cache
>>>* non-present entry we only need to flush the write-buffer. If the
>>>* _does_ cache non-present entries, then it does so in the special
> If this does get accepted then we should fix the above grammar also -
>   "If the _does_ cache ..." -> "If the hardware _does_ cache ..."

Yes, but this reminds me of something.
As per the comment, the code here only needs to flush the context cache for
the special domain 0, which is used to tag non-present/erroneous entries.
According to the analysis, for kdump it seems we should instead flush the old
domain id of the present entries, rather than the newly allocated domain id.
Let me ponder more on this.

Regards,
Xunlei

>
>>> -  * domain #0, which we have to flush:
>>> +  * domain #0, which we have to flush.
>>> +  *
>>> +  * For kdump cases, present entries may be cached due to the in-flight
>>> +  * DMA and copied old pgtable, but there is no unmapping behaviour for
>>> +  * them, so we need an explicit iotlb flush for the newly-mapped device.
>>> +  * For kdump, at this point, the device is supposed to finish reset at
>>> +  * the driver probe stage, no in-flight DMA will exist, thus we do not
>>> +  * need to worry about that anymore hereafter.
>>>*/
>>> - if (cap_caching_mode(iommu->cap)) {
>>> + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) {
>>>   iommu->flush.flush_context(iommu, 0,
>>>  (((u16)bus) << 8) | devfn,
>>>  DMA_CCMD_MASK_NOBIT,
>> ___
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
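
The condition changed by the patch above can be modelled in user space. What
follows is a minimal sketch, not the driver code: needs_context_flush() and
pci_devfn() are hypothetical helpers, and the two booleans stand in for
is_kdump_kernel() and cap_caching_mode(iommu->cap). It also shows how the
source-id passed to flush_context() is put together from bus and devfn, as
in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: 5-bit slot and 3-bit function packed into a devfn. */
static uint8_t pci_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* Source-id as built in the hunk: (((u16)bus) << 8) | devfn. */
static uint16_t source_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/*
 * Hypothetical helper: the patched condition, with the kernel predicates
 * replaced by plain booleans so it can run outside the kernel.
 */
static bool needs_context_flush(bool kdump_kernel, bool caching_mode)
{
	return kdump_kernel || caching_mode;
}
```

With caching mode 0, as reported for the failing machines, the unpatched
condition skips the flush; under the patched condition a kdump kernel always
takes it. Device 02:00.0 from the log maps to source-id 0x0200.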


Re: [PATCH] iommu: mtk: add common-clk dependency

2016-11-16 Thread Honghui Zhang
On Wed, 2016-11-16 at 11:38 -0800, Stephen Boyd wrote:
> On 11/16, Arnd Bergmann wrote:
> > After the MT2701 clock driver was added, we get a harmless warning for
> > the iommu driver that selects it, when compile-testing without
> > COMMON_CLK.
> > 
> > warning: (MTK_IOMMU_V1) selects COMMON_CLK_MT2701_IMGSYS which has unmet 
> > direct dependencies (COMMON_CLK)
> > 
> > Adding a dependency on COMMON_CLK avoids the warning.
> > 
> > Fixes: e9862118272a ("clk: mediatek: Add MT2701 clock support")
> > Signed-off-by: Arnd Bergmann 
> 
> Hm.. why is an iommu driver selecting a clk driver? They should
> be using standard clk APIs so it's not like they need it for
> build time correctness. Shouldn't we drop the selects instead?
> Those look to have been introduced a few kernel versions ago, but
> they were selecting options that didn't exist until a few days
> ago when I merged the mediatek clk driver. The clk options are
> user-visible, so it should be possible to select them in the
> configuration phase.
> 

Hi, Stephen,
  I'm a bit out of date on the current clock code. The Mediatek IOMMU v1
driver needs the smi driver to enable iommu clients, and the smi driver is
also responsible for enabling/disabling the subsys clocks for multimedia HW.
The relationship between iommu and smi is shown in the diagram below[1].

              EMI (External Memory Interface)
               |
              m4u (Multimedia Memory Management Unit)
               |
           SMI Common(Smart Multimedia Interface Common)
               |
       +----------------+-------
       |                |
       |                |
   SMI larb0        SMI larb1   ... SoCs have several SMI local arbiter(larb).
   (display)         (vdec)
       |                |
       |                |
 +-----+-----+     +----+----+
 |     |     |     |    |    |
 |     |     |...  |    |    |  ... There are different ports in each larb.
 |     |     |     |    |    |
OVL0 RDMA0 WDMA0  MC   PP   VLD


Enabling the SMI driver requires those subsys clock providers, but those
clock providers are disabled by default. Since they are needed by the smi
driver, and smi is selected by MTK_IOMMU_V1, I figured they should be
selected by MTK_IOMMU_V1 too.

[1]Documentation/devicetree/bindings/iommu/mediatek,iommu.txt


thanks.

> 8<
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 8ee54d71c7eb..37e204f3d9be 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -352,9 +352,6 @@ config MTK_IOMMU_V1
>   select IOMMU_API
>   select MEMORY
>   select MTK_SMI
> - select COMMON_CLK_MT2701_MMSYS
> - select COMMON_CLK_MT2701_IMGSYS
> - select COMMON_CLK_MT2701_VDECSYS
>   help
> Support for the M4U on certain Mediatek SoCs. M4U generation 1 HW is
> Multimedia Memory Managememt Unit. This option enables remapping of
> 







Re: [PATCH] iommu: mtk: add common-clk dependency

2016-11-16 Thread Stephen Boyd
On 11/16, Arnd Bergmann wrote:
> After the MT2701 clock driver was added, we get a harmless warning for
> the iommu driver that selects it, when compile-testing without
> COMMON_CLK.
> 
> warning: (MTK_IOMMU_V1) selects COMMON_CLK_MT2701_IMGSYS which has unmet 
> direct dependencies (COMMON_CLK)
> 
> Adding a dependency on COMMON_CLK avoids the warning.
> 
> Fixes: e9862118272a ("clk: mediatek: Add MT2701 clock support")
> Signed-off-by: Arnd Bergmann 

Hm.. why is an iommu driver selecting a clk driver? They should
be using standard clk APIs so it's not like they need it for
build time correctness. Shouldn't we drop the selects instead?
Those look to have been introduced a few kernel versions ago, but
they were selecting options that didn't exist until a few days
ago when I merged the mediatek clk driver. The clk options are
user-visible, so it should be possible to select them in the
configuration phase.

8<
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 8ee54d71c7eb..37e204f3d9be 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -352,9 +352,6 @@ config MTK_IOMMU_V1
select IOMMU_API
select MEMORY
select MTK_SMI
-   select COMMON_CLK_MT2701_MMSYS
-   select COMMON_CLK_MT2701_IMGSYS
-   select COMMON_CLK_MT2701_VDECSYS
help
  Support for the M4U on certain Mediatek SoCs. M4U generation 1 HW is
  Multimedia Memory Managememt Unit. This option enables remapping of

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [RFC PATCH v3 08/20] x86: Add support for early encryption/decryption of memory

2016-11-16 Thread Tom Lendacky
On 11/16/2016 4:46 AM, Borislav Petkov wrote:
> Btw, for your next submission, this patch can be split in two exactly
> like the commit message paragraphs are:

I think I originally had it that way, I don't know why I combined them.
I'll split them out.

> 
> On Wed, Nov 09, 2016 at 06:36:10PM -0600, Tom Lendacky wrote:
>> Add support to be able to either encrypt or decrypt data in place during
>> the early stages of booting the kernel. This does not change the memory
>> encryption attribute - it is used for ensuring that data present in either
>> an encrypted or un-encrypted memory area is in the proper state (for
>> example the initrd will have been loaded by the boot loader and will not be
>> encrypted, but the memory that it resides in is marked as encrypted).
> 
> Patch 2: users of the new memmap change
> 
>> The early_memmap support is enhanced to specify encrypted and un-encrypted
>> mappings with and without write-protection. The use of write-protection is
>> necessary when encrypting data "in place". The write-protect attribute is
>> considered cacheable for loads, but not stores. This implies that the
>> hardware will never give the core a dirty line with this memtype.
> 
> Patch 1: change memmap
> 
> This makes this aspect of the patchset much clearer and is better for
> bisection.
> 
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/include/asm/fixmap.h|9 +++
>>  arch/x86/include/asm/mem_encrypt.h   |   15 +
>>  arch/x86/include/asm/pgtable_types.h |8 +++
>>  arch/x86/mm/ioremap.c|   28 +
>>  arch/x86/mm/mem_encrypt.c|  102 
>> ++
>>  include/asm-generic/early_ioremap.h  |2 +
>>  mm/early_ioremap.c   |   15 +
>>  7 files changed, 179 insertions(+)
> 
> ...
> 
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index d642cc5..06235b4 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -14,6 +14,9 @@
>>  #include 
>>  #include 
>>  
>> +#include 
>> +#include 
>> +
>>  extern pmdval_t early_pmd_flags;
>>  
>>  /*
>> @@ -24,6 +27,105 @@ extern pmdval_t early_pmd_flags;
>>  unsigned long sme_me_mask __section(.data) = 0;
>>  EXPORT_SYMBOL_GPL(sme_me_mask);
>>  
>> +/* Buffer used for early in-place encryption by BSP, no locking needed */
>> +static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
>> +
>> +/*
>> + * This routine does not change the underlying encryption setting of the
>> + * page(s) that map this memory. It assumes that eventually the memory is
>> + * meant to be accessed as encrypted but the contents are currently not
>> + * encrypted.
>> + */
>> +void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
>> +{
>> +void *src, *dst;
>> +size_t len;
>> +
>> +if (!sme_me_mask)
>> +return;
>> +
>> +local_flush_tlb();
>> +wbinvd();
>> +
>> +/*
>> + * There are limited number of early mapping slots, so map (at most)
>> + * one page at time.
>> + */
>> +while (size) {
>> +len = min_t(size_t, sizeof(sme_early_buffer), size);
>> +
>> +/* Create a mapping for non-encrypted write-protected memory */
>> +src = early_memremap_dec_wp(paddr, len);
>> +
>> +/* Create a mapping for encrypted memory */
>> +dst = early_memremap_enc(paddr, len);
>> +
>> +/*
>> + * If a mapping can't be obtained to perform the encryption,
>> + * then encrypted access to that area will end up causing
>> + * a crash.
>> + */
>> +BUG_ON(!src || !dst);
>> +
>> +memcpy(sme_early_buffer, src, len);
>> +memcpy(dst, sme_early_buffer, len);
> 
> I still am missing the short explanation why we need the temporary buffer.

Ok, I'll add that.

> 
> 
> Oh, and we can save us the code duplication a little. Diff ontop of yours:

Yup, makes sense.  I'll incorporate this.

Thanks,
Tom
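
The thread does not say why the temporary buffer is needed, and the sketch
below does not try to answer that. It only models the mechanics of the quoted
loop in user space: process at most one buffer-sized chunk at a time, copy it
out through one view, and write it back through the other. Everything here is
an assumption made for illustration: CHUNK stands in for PAGE_SIZE,
write_encrypted() is a hypothetical stand-in for a store through the
encrypted mapping, and the XOR is a placeholder, not real encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16	/* stand-in for PAGE_SIZE */

/* Models sme_early_buffer: one chunk-sized bounce buffer. */
static unsigned char bounce[CHUNK];

/* Hypothetical stand-in for writing through an encrypted mapping. */
static void write_encrypted(unsigned char *dst, const unsigned char *src,
			    size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;	/* placeholder transform */
}

/* Transform a region in place, at most CHUNK bytes per iteration. */
static void enc_in_place(unsigned char *mem, size_t size)
{
	while (size) {
		size_t len = size < sizeof(bounce) ? size : sizeof(bounce);

		memcpy(bounce, mem, len);		/* read via "plain" view */
		write_encrypted(mem, bounce, len);	/* write via "encrypted" view */

		mem += len;
		size -= len;
	}
}
```

Calling enc_in_place() on a 40-byte array walks it in chunks of 16, 16 and 8
bytes; the final partial chunk is the case the min_t() in the patch handles.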

> 
> ---
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 06235b477d7c..50e2c4fc7338 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -36,7 +36,8 @@ static char sme_early_buffer[PAGE_SIZE] 
> __aligned(PAGE_SIZE);
>   * meant to be accessed as encrypted but the contents are currently not
>   * encrypted.
>   */
> -void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
> +static void __init noinline
> +__mem_enc_dec(resource_size_t paddr, unsigned long size, bool enc)
>  {
>   void *src, *dst;
>   size_t len;
> @@ -54,15 +55,15 @@ void __init sme_early_mem_enc(resource_size_t paddr, 
> unsigned long size)
>   while (size) {
>   len = min_t(size_t, sizeof(sme_early_buffer), size);
>  
> - /* Create a mapping for non-encrypted write-protected memory */
> - src = early_memremap_dec_wp(paddr, len);
> + src = (enc ? early_memremap_dec_wp(pa

[PATCH] iommu: mtk: add common-clk dependency

2016-11-16 Thread Arnd Bergmann
After the MT2701 clock driver was added, we get a harmless warning for
the iommu driver that selects it, when compile-testing without
COMMON_CLK.

warning: (MTK_IOMMU_V1) selects COMMON_CLK_MT2701_IMGSYS which has unmet direct 
dependencies (COMMON_CLK)

Adding a dependency on COMMON_CLK avoids the warning.

Fixes: e9862118272a ("clk: mediatek: Add MT2701 clock support")
Signed-off-by: Arnd Bergmann 
---
 drivers/iommu/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 8ee54d71c7eb..bb537d06d319 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -346,7 +346,7 @@ config MTK_IOMMU
 
 config MTK_IOMMU_V1
bool "MTK IOMMU Version 1 (M4U gen1) Support"
-   depends on ARM
+   depends on ARM && COMMON_CLK
depends on ARCH_MEDIATEK || COMPILE_TEST
select ARM_DMA_USE_IOMMU
select IOMMU_API
-- 
2.9.0



[PATCH v8 13/16] drivers: iommu: arm-smmu: add IORT configuration

2016-11-16 Thread Lorenzo Pieralisi
In ACPI based systems, in order to be able to create platform
devices and initialize them for ARM SMMU components, the IORT
kernel implementation requires a set of static functions to be
used by the IORT kernel layer to configure platform devices for
ARM SMMU components.

Add static configuration functions to the IORT kernel layer for
the ARM SMMU components, so that the ARM SMMU driver can
initialize its respective platform device by relying on the IORT
kernel infrastructure and by adding a corresponding ACPI device
early probe section entry.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Joerg Roedel 
---
 drivers/acpi/arm64/iort.c | 81 +
 drivers/iommu/arm-smmu.c  | 83 ++-
 include/linux/acpi_iort.h |  3 ++
 3 files changed, 166 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index fd52e4c..4708806 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -548,6 +548,78 @@ static bool __init arm_smmu_v3_is_coherent(struct 
acpi_iort_node *node)
return smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE;
 }
 
+static int __init arm_smmu_count_resources(struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu *smmu;
+   int num_irqs;
+   u64 *glb_irq;
+
+   /* Retrieve SMMU specific data */
+   smmu = (struct acpi_iort_smmu *)node->node_data;
+
+   glb_irq = ACPI_ADD_PTR(u64, node, smmu->global_interrupt_offset);
+   if (!IORT_IRQ_MASK(glb_irq[1])) /* 0 means not implemented */
+   num_irqs = 1;
+   else
+   num_irqs = 2;
+
+   num_irqs += smmu->context_interrupt_count;
+
+   return num_irqs + 1;
+}
+
+static void __init arm_smmu_init_resources(struct resource *res,
+  struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu *smmu;
+   int i, hw_irq, trigger, num_res = 0;
+   u64 *ctx_irq, *glb_irq;
+
+   /* Retrieve SMMU specific data */
+   smmu = (struct acpi_iort_smmu *)node->node_data;
+
+   res[num_res].start = smmu->base_address;
+   res[num_res].end = smmu->base_address + smmu->span - 1;
+   res[num_res].flags = IORESOURCE_MEM;
+   num_res++;
+
+   glb_irq = ACPI_ADD_PTR(u64, node, smmu->global_interrupt_offset);
+   /* Global IRQs */
+   hw_irq = IORT_IRQ_MASK(glb_irq[0]);
+   trigger = IORT_IRQ_TRIGGER_MASK(glb_irq[0]);
+
+   acpi_iort_register_irq(hw_irq, "arm-smmu-global", trigger,
+&res[num_res++]);
+
+   /* Global IRQs */
+   hw_irq = IORT_IRQ_MASK(glb_irq[1]);
+   if (hw_irq) {
+   trigger = IORT_IRQ_TRIGGER_MASK(glb_irq[1]);
+   acpi_iort_register_irq(hw_irq, "arm-smmu-global", trigger,
+&res[num_res++]);
+   }
+
+   /* Context IRQs */
+   ctx_irq = ACPI_ADD_PTR(u64, node, smmu->context_interrupt_offset);
+   for (i = 0; i < smmu->context_interrupt_count; i++) {
+   hw_irq = IORT_IRQ_MASK(ctx_irq[i]);
+   trigger = IORT_IRQ_TRIGGER_MASK(ctx_irq[i]);
+
+   acpi_iort_register_irq(hw_irq, "arm-smmu-context", trigger,
+  &res[num_res++]);
+   }
+}
+
+static bool __init arm_smmu_is_coherent(struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu *smmu;
+
+   /* Retrieve SMMU specific data */
+   smmu = (struct acpi_iort_smmu *)node->node_data;
+
+   return smmu->flags & ACPI_IORT_SMMU_COHERENT_WALK;
+}
+
 struct iort_iommu_config {
const char *name;
int (*iommu_init)(struct acpi_iort_node *node);
@@ -564,12 +636,21 @@ static const struct iort_iommu_config 
iort_arm_smmu_v3_cfg __initconst = {
.iommu_init_resources = arm_smmu_v3_init_resources
 };
 
+static const struct iort_iommu_config iort_arm_smmu_cfg __initconst = {
+   .name = "arm-smmu",
+   .iommu_is_coherent = arm_smmu_is_coherent,
+   .iommu_count_resources = arm_smmu_count_resources,
+   .iommu_init_resources = arm_smmu_init_resources
+};
+
 static __init
 const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
 {
switch (node->type) {
case ACPI_IORT_NODE_SMMU_V3:
return &iort_arm_smmu_v3_cfg;
+   case ACPI_IORT_NODE_SMMU:
+   return &iort_arm_smmu_cfg;
default:
return NULL;
}
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 573b2b6..21d1892 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -28,6 +28,8 @@
 
 #define pr_fmt(fmt) "arm-smmu: " fmt
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -1904,6 +1906,70 @@ static const struct of_device_id arm_smmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_

[PATCH v8 16/16] drivers: acpi: iort: introduce iort_iommu_configure

2016-11-16 Thread Lorenzo Pieralisi
DT based systems have a generic kernel API to configure IOMMUs
for devices (ie of_iommu_configure()).

On ARM based ACPI systems, the of_iommu_configure() equivalent can
be implemented atop ACPI IORT kernel API, with the corresponding
functions to map device identifiers to IOMMUs and retrieve the
corresponding IOMMU operations necessary for DMA operations set-up.

By relying on the iommu_fwspec generic kernel infrastructure,
implement the IORT based IOMMU configuration for ARM ACPI systems
and hook it up in the ACPI kernel layer that implements DMA
configuration for a device.

Signed-off-by: Lorenzo Pieralisi 
Acked-by: Rafael J. Wysocki  [ACPI core]
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 98 +++
 drivers/acpi/scan.c   |  7 +++-
 include/linux/acpi_iort.h |  6 +++
 3 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 7d30605..5fa585d 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -28,6 +28,8 @@
 
 #define IORT_TYPE_MASK(type)   (1 << (type))
 #define IORT_MSI_TYPE  (1 << ACPI_IORT_NODE_ITS_GROUP)
+#define IORT_IOMMU_TYPE((1 << ACPI_IORT_NODE_SMMU) |   \
+   (1 << ACPI_IORT_NODE_SMMU_V3))
 
 struct iort_its_msi_chip {
struct list_headlist;
@@ -501,6 +503,102 @@ struct irq_domain *iort_get_device_domain(struct device 
*dev, u32 req_id)
return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
 }
 
+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+   u32 *rid = data;
+
+   *rid = alias;
+   return 0;
+}
+
+static int arm_smmu_iort_xlate(struct device *dev, u32 streamid,
+  struct fwnode_handle *fwnode,
+  const struct iommu_ops *ops)
+{
+   int ret = iommu_fwspec_init(dev, fwnode, ops);
+
+   if (!ret)
+   ret = iommu_fwspec_add_ids(dev, &streamid, 1);
+
+   return ret;
+}
+
+static const struct iommu_ops *iort_iommu_xlate(struct device *dev,
+   struct acpi_iort_node *node,
+   u32 streamid)
+{
+   const struct iommu_ops *ops = NULL;
+   int ret = -ENODEV;
+   struct fwnode_handle *iort_fwnode;
+
+   if (node) {
+   iort_fwnode = iort_get_fwnode(node);
+   if (!iort_fwnode)
+   return NULL;
+
+   ops = iommu_get_instance(iort_fwnode);
+   if (!ops)
+   return NULL;
+
+   ret = arm_smmu_iort_xlate(dev, streamid, iort_fwnode, ops);
+   }
+
+   return ret ? NULL : ops;
+}
+
+/**
+ * iort_iommu_configure - Set-up IOMMU configuration for a device.
+ *
+ * @dev: device to configure
+ *
+ * Returns: iommu_ops pointer on configuration success
+ *  NULL on configuration failure
+ */
+const struct iommu_ops *iort_iommu_configure(struct device *dev)
+{
+   struct acpi_iort_node *node, *parent;
+   const struct iommu_ops *ops = NULL;
+   u32 streamid = 0;
+
+   if (dev_is_pci(dev)) {
+   struct pci_bus *bus = to_pci_dev(dev)->bus;
+   u32 rid;
+
+   pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid,
+  &rid);
+
+   node = iort_scan_node(ACPI_IORT_NODE_PCI_ROOT_COMPLEX,
+ iort_match_node_callback, &bus->dev);
+   if (!node)
+   return NULL;
+
+   parent = iort_node_map_rid(node, rid, &streamid,
+  IORT_IOMMU_TYPE);
+
+   ops = iort_iommu_xlate(dev, parent, streamid);
+
+   } else {
+   int i = 0;
+
+   node = iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT,
+ iort_match_node_callback, dev);
+   if (!node)
+   return NULL;
+
+   parent = iort_node_get_id(node, &streamid,
+ IORT_IOMMU_TYPE, i++);
+
+   while (parent) {
+   ops = iort_iommu_xlate(dev, parent, streamid);
+
+   parent = iort_node_get_id(node, &streamid,
+ IORT_IOMMU_TYPE, i++);
+   }
+   }
+
+   return ops;
+}
+
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
  int trigger,
  struct resource *res)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 694e0b6..e5f7004 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #includ

[PATCH v8 11/16] drivers: iommu: arm-smmu-v3: add IORT configuration

2016-11-16 Thread Lorenzo Pieralisi
In ACPI based systems, in order to be able to create platform
devices and initialize them for ARM SMMU v3 components, the IORT
kernel implementation requires a set of static functions to be
used by the IORT kernel layer to configure platform devices for
ARM SMMU v3 components.

Add static configuration functions to the IORT kernel layer for
the ARM SMMU v3 components, so that the ARM SMMU v3 driver can
initialize its respective platform device by relying on the IORT
kernel infrastructure and by adding a corresponding ACPI device
early probe section entry.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Joerg Roedel 
---
 drivers/acpi/arm64/iort.c   | 103 +++-
 drivers/iommu/arm-smmu-v3.c |  49 -
 2 files changed, 150 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index ddf83b5..fd52e4c 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -459,6 +459,95 @@ struct irq_domain *iort_get_device_domain(struct device 
*dev, u32 req_id)
return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
 }
 
+static void __init acpi_iort_register_irq(int hwirq, const char *name,
+ int trigger,
+ struct resource *res)
+{
+   int irq = acpi_register_gsi(NULL, hwirq, trigger,
+   ACPI_ACTIVE_HIGH);
+
+   if (irq <= 0) {
+   pr_err("could not register gsi hwirq %d name [%s]\n", hwirq,
+ name);
+   return;
+   }
+
+   res->start = irq;
+   res->end = irq;
+   res->flags = IORESOURCE_IRQ;
+   res->name = name;
+}
+
+static int __init arm_smmu_v3_count_resources(struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu_v3 *smmu;
+   /* Always present mem resource */
+   int num_res = 1;
+
+   /* Retrieve SMMUv3 specific data */
+   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
+
+   if (smmu->event_gsiv)
+   num_res++;
+
+   if (smmu->pri_gsiv)
+   num_res++;
+
+   if (smmu->gerr_gsiv)
+   num_res++;
+
+   if (smmu->sync_gsiv)
+   num_res++;
+
+   return num_res;
+}
+
+static void __init arm_smmu_v3_init_resources(struct resource *res,
+ struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu_v3 *smmu;
+   int num_res = 0;
+
+   /* Retrieve SMMUv3 specific data */
+   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
+
+   res[num_res].start = smmu->base_address;
+   res[num_res].end = smmu->base_address + SZ_128K - 1;
+   res[num_res].flags = IORESOURCE_MEM;
+
+   num_res++;
+
+   if (smmu->event_gsiv)
+   acpi_iort_register_irq(smmu->event_gsiv, "eventq",
+  ACPI_EDGE_SENSITIVE,
+  &res[num_res++]);
+
+   if (smmu->pri_gsiv)
+   acpi_iort_register_irq(smmu->pri_gsiv, "priq",
+  ACPI_EDGE_SENSITIVE,
+  &res[num_res++]);
+
+   if (smmu->gerr_gsiv)
+   acpi_iort_register_irq(smmu->gerr_gsiv, "gerror",
+  ACPI_EDGE_SENSITIVE,
+  &res[num_res++]);
+
+   if (smmu->sync_gsiv)
+   acpi_iort_register_irq(smmu->sync_gsiv, "cmdq-sync",
+  ACPI_EDGE_SENSITIVE,
+  &res[num_res++]);
+}
+
+static bool __init arm_smmu_v3_is_coherent(struct acpi_iort_node *node)
+{
+   struct acpi_iort_smmu_v3 *smmu;
+
+   /* Retrieve SMMUv3 specific data */
+   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
+
+   return smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE;
+}
+
 struct iort_iommu_config {
const char *name;
int (*iommu_init)(struct acpi_iort_node *node);
@@ -468,10 +557,22 @@ struct iort_iommu_config {
 struct acpi_iort_node *node);
 };
 
+static const struct iort_iommu_config iort_arm_smmu_v3_cfg __initconst = {
+   .name = "arm-smmu-v3",
+   .iommu_is_coherent = arm_smmu_v3_is_coherent,
+   .iommu_count_resources = arm_smmu_v3_count_resources,
+   .iommu_init_resources = arm_smmu_v3_init_resources
+};
+
 static __init
 const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
 {
-   return NULL;
+   switch (node->type) {
+   case ACPI_IORT_NODE_SMMU_V3:
+   return &iort_arm_smmu_v3_cfg;
+   default:
+   return NULL;
+   }
 }
 
 /**
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index ed
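
The count/init pair added above follows a two-pass pattern: one function
reports how many resources a node needs, the other fills an array of that
size. The caller is not part of this excerpt, so the sketch below is only an
assumption about how the two fit together. struct v3_node and struct res are
simplified hypothetical stand-ins for acpi_iort_smmu_v3 and struct resource,
and the "mmio" name is invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct v3_node {
	uint32_t event_gsiv, pri_gsiv, gerr_gsiv, sync_gsiv;
};

struct res {
	uint32_t start;
	const char *name;
};

/* Pass 1: one memory resource, plus one per interrupt that is wired up. */
static int count_resources(const struct v3_node *n)
{
	int num = 1;

	num += n->event_gsiv != 0;
	num += n->pri_gsiv != 0;
	num += n->gerr_gsiv != 0;
	num += n->sync_gsiv != 0;
	return num;
}

/* Pass 2: fill the array in the same order; returns entries written. */
static int init_resources(struct res *res, const struct v3_node *n)
{
	int num = 0;

	res[num++] = (struct res){ 0, "mmio" };
	if (n->event_gsiv)
		res[num++] = (struct res){ n->event_gsiv, "eventq" };
	if (n->pri_gsiv)
		res[num++] = (struct res){ n->pri_gsiv, "priq" };
	if (n->gerr_gsiv)
		res[num++] = (struct res){ n->gerr_gsiv, "gerror" };
	if (n->sync_gsiv)
		res[num++] = (struct res){ n->sync_gsiv, "cmdq-sync" };
	return num;
}
```

The two functions have to test the same fields in the same way; if they
disagree, the second pass writes past the array sized by the first.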

[PATCH v8 10/16] drivers: iommu: arm-smmu-v3: split probe functions into DT/generic portions

2016-11-16 Thread Lorenzo Pieralisi
The current ARM SMMUv3 probe functions intermingle HW and DT probing in the
initialization functions that detect and program the ARM SMMU v3 driver
features. To allow probing the ARM SMMUv3 with firmware interfaces other
than DT, this patch splits the ARM SMMUv3 init functions into DT- and
HW-specific portions so that other FW interfaces (e.g. ACPI) can reuse the
HW probing functions and skip the DT portion accordingly.

This patch implements no functional change, only code reshuffling.

Signed-off-by: Lorenzo Pieralisi 
Acked-by: Will Deacon 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Robin Murphy 
Cc: Joerg Roedel 
---
 drivers/iommu/arm-smmu-v3.c | 46 +
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e6e1c87..ed563307 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2381,10 +2381,10 @@ static int arm_smmu_device_reset(struct arm_smmu_device 
*smmu, bool bypass)
return 0;
 }
 
-static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
+static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 {
u32 reg;
-   bool coherent;
+   bool coherent = smmu->features & ARM_SMMU_FEAT_COHERENCY;
 
/* IDR0 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0);
@@ -2436,13 +2436,9 @@ static int arm_smmu_device_probe(struct arm_smmu_device 
*smmu)
smmu->features |= ARM_SMMU_FEAT_HYP;
 
/*
-* The dma-coherent property is used in preference to the ID
+* The coherency feature as set by FW is used in preference to the ID
 * register, but warn on mismatch.
 */
-   coherent = of_dma_is_coherent(smmu->dev->of_node);
-   if (coherent)
-   smmu->features |= ARM_SMMU_FEAT_COHERENCY;
-
if (!!(reg & IDR0_COHACC) != coherent)
dev_warn(smmu->dev, "IDR0.COHACC overridden by dma-coherent 
property (%s)\n",
 coherent ? "true" : "false");
@@ -2563,21 +2559,37 @@ static int arm_smmu_device_probe(struct arm_smmu_device 
*smmu)
return 0;
 }
 
-static int arm_smmu_device_dt_probe(struct platform_device *pdev)
+static int arm_smmu_device_dt_probe(struct platform_device *pdev,
+   struct arm_smmu_device *smmu,
+   bool *bypass)
 {
-   int irq, ret;
-   struct resource *res;
-   struct arm_smmu_device *smmu;
struct device *dev = &pdev->dev;
-   bool bypass = true;
u32 cells;
 
+   *bypass = true;
+
if (of_property_read_u32(dev->of_node, "#iommu-cells", &cells))
dev_err(dev, "missing #iommu-cells property\n");
else if (cells != 1)
dev_err(dev, "invalid #iommu-cells value (%d)\n", cells);
else
-   bypass = false;
+   *bypass = false;
+
+   parse_driver_options(smmu);
+
+   if (of_dma_is_coherent(dev->of_node))
+   smmu->features |= ARM_SMMU_FEAT_COHERENCY;
+
+   return 0;
+}
+
+static int arm_smmu_device_probe(struct platform_device *pdev)
+{
+   int irq, ret;
+   struct resource *res;
+   struct arm_smmu_device *smmu;
+   struct device *dev = &pdev->dev;
+   bool bypass;
 
smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
if (!smmu) {
@@ -2614,10 +2626,12 @@ static int arm_smmu_device_dt_probe(struct 
platform_device *pdev)
if (irq > 0)
smmu->gerr_irq = irq;
 
-   parse_driver_options(smmu);
+   ret = arm_smmu_device_dt_probe(pdev, smmu, &bypass);
+   if (ret)
+   return ret;
 
/* Probe the h/w */
-   ret = arm_smmu_device_probe(smmu);
+   ret = arm_smmu_device_hw_probe(smmu);
if (ret)
return ret;
 
@@ -2679,7 +2693,7 @@ static struct platform_driver arm_smmu_driver = {
.name   = "arm-smmu-v3",
.of_match_table = of_match_ptr(arm_smmu_of_match),
},
-   .probe  = arm_smmu_device_dt_probe,
+   .probe  = arm_smmu_device_probe,
.remove = arm_smmu_device_remove,
 };
 
-- 
2.10.0
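
The split in this patch can be modelled outside the kernel. This is a minimal
sketch under stated assumptions, not the driver code: struct smmu, fw_probe()
and hw_probe() are hypothetical, the ID register is a plain field instead of
an MMIO read, the bit positions are chosen for the sketch, and a flag records
where the driver would call dev_warn(). The point it illustrates is that the
firmware step only records the coherency feature, and the hardware step
compares that feature with the ID register without knowing which firmware
interface set it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEAT_COHERENCY	(1u << 0)	/* stand-in for ARM_SMMU_FEAT_COHERENCY */
#define IDR0_COHACC	(1u << 4)	/* bit position assumed for the sketch */

/* Hypothetical, simplified device state. */
struct smmu {
	uint32_t features;
	uint32_t idr0;		/* simulated ID register value */
	bool warned;		/* set where the driver would dev_warn() */
};

/* Firmware-specific step (DT today, ACPI later): record what FW says. */
static void fw_probe(struct smmu *smmu, bool fw_coherent)
{
	if (fw_coherent)
		smmu->features |= FEAT_COHERENCY;
}

/* Firmware-agnostic step: the FW setting wins, but a mismatch is flagged. */
static void hw_probe(struct smmu *smmu)
{
	bool coherent = smmu->features & FEAT_COHERENCY;

	if (!!(smmu->idr0 & IDR0_COHACC) != coherent)
		smmu->warned = true;
}
```

Because hw_probe() reads only smmu->features, a second firmware front end
needs nothing more than its own fw_probe() equivalent.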



[PATCH v8 14/16] drivers: acpi: iort: replace rid map type with type mask

2016-11-16 Thread Lorenzo Pieralisi
IORT tables provide data that allow the kernel to carry out
device ID mappings between endpoints and system components
(eg interrupt controllers, IOMMUs). When the mapping for a
given device ID is carried out, the translation mechanism
is done on a per-subsystem basis rather than a component
subtype (ie the IOMMU kernel layer will look for mappings
from a device to all IORT node types corresponding to IOMMU
components), therefore the corresponding mapping API should
work on a range (ie mask) of IORT node types corresponding
to a common set of components (eg IOMMUs) rather than a
specific node type.

Upgrade the IORT iort_node_map_rid() API to work with a
type mask instead of a single node type so that it can
be used for mappings that span multiple component types
(ie IOMMUs).
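
A minimal userspace sketch of the type-mask check, for illustration only.
The node type values are assumed to follow the ACPICA enum ordering, and
node_matches() and IORT_IOMMU_TYPE are hypothetical names that are not
part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Node type values; assumed here to follow the ACPICA enum ordering. */
enum iort_node_type {
	NODE_ITS_GROUP = 0,
	NODE_NAMED_COMPONENT = 1,
	NODE_PCI_ROOT_COMPLEX = 2,
	NODE_SMMU = 3,
	NODE_SMMU_V3 = 4,
};

#define IORT_TYPE_MASK(type)	(1 << (type))
#define IORT_MSI_TYPE		IORT_TYPE_MASK(NODE_ITS_GROUP)
/* Hypothetical mask covering both SMMU flavours. */
#define IORT_IOMMU_TYPE		(IORT_TYPE_MASK(NODE_SMMU) | \
				 IORT_TYPE_MASK(NODE_SMMU_V3))

/* Hypothetical helper: true if the node type is selected by the mask. */
static bool node_matches(uint8_t node_type, uint8_t type_mask)
{
	assert(node_type < 8);	/* the mask is a u8, so only 8 types fit */
	return (IORT_TYPE_MASK(node_type) & type_mask) != 0;
}
```

With a single-type argument a caller had to pick one node type; with a
mask, one call can accept any node type belonging to the same subsystem.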

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 4708806..62057c6 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -26,6 +26,9 @@
 #include 
 #include 
 
+#define IORT_TYPE_MASK(type)   (1 << (type))
+#define IORT_MSI_TYPE  (1 << ACPI_IORT_NODE_ITS_GROUP)
+
 struct iort_its_msi_chip {
struct list_headlist;
struct fwnode_handle*fw_node;
@@ -317,7 +320,7 @@ static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
 
 static struct acpi_iort_node *iort_node_map_rid(struct acpi_iort_node *node,
u32 rid_in, u32 *rid_out,
-   u8 type)
+   u8 type_mask)
 {
u32 rid = rid_in;
 
@@ -326,7 +329,7 @@ static struct acpi_iort_node *iort_node_map_rid(struct acpi_iort_node *node,
struct acpi_iort_id_mapping *map;
int i;
 
-   if (node->type == type) {
+   if (IORT_TYPE_MASK(node->type) & type_mask) {
if (rid_out)
*rid_out = rid;
return node;
@@ -399,7 +402,7 @@ u32 iort_msi_map_rid(struct device *dev, u32 req_id)
if (!node)
return req_id;
 
-   iort_node_map_rid(node, req_id, &dev_id, ACPI_IORT_NODE_ITS_GROUP);
+   iort_node_map_rid(node, req_id, &dev_id, IORT_MSI_TYPE);
return dev_id;
 }
 
@@ -421,7 +424,7 @@ static int iort_dev_find_its_id(struct device *dev, u32 req_id,
if (!node)
return -ENXIO;
 
-   node = iort_node_map_rid(node, req_id, NULL, ACPI_IORT_NODE_ITS_GROUP);
+   node = iort_node_map_rid(node, req_id, NULL, IORT_MSI_TYPE);
if (!node)
return -ENXIO;
 
-- 
2.10.0



[PATCH v8 12/16] drivers: iommu: arm-smmu: split probe functions into DT/generic portions

2016-11-16 Thread Lorenzo Pieralisi
Current ARM SMMU probe functions intermingle HW and DT probing
in the initialization functions to detect and programme the ARM SMMU
driver features. In order to allow probing the ARM SMMU with firmware
interfaces other than DT, this patch splits the ARM SMMU init functions
into DT and HW specific portions so that other FW interfaces (ie ACPI)
can reuse the HW probing functions and skip the DT portion accordingly.

This patch implements no functional change, only code reshuffling.
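
The shape of the split can be sketched in userspace C as below. This is
an illustration under stated assumptions, not the driver code: every
fake_* name is hypothetical, and the firmware step is reduced to the
coherency flag only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_FEAT_COHERENT_WALK	(1u << 0)

/* Hypothetical stand-ins for the driver and firmware state. */
struct fake_smmu {
	unsigned int features;
	bool id0_cttw;		/* what the ID register claims */
};

struct fake_dt_node {
	bool dma_coherent;	/* "dma-coherent" property present */
};

/* Firmware-specific step: record the firmware view in smmu->features. */
static int fake_dt_probe(const struct fake_dt_node *np, struct fake_smmu *smmu)
{
	assert(smmu != NULL);
	if (!np)
		return -1;
	if (np->dma_coherent)
		smmu->features |= FAKE_FEAT_COHERENT_WALK;
	return 0;
}

/*
 * Firmware-agnostic step: it only reads smmu->features, so a different
 * firmware front end could reuse it unchanged. Returns true when the
 * firmware view overrides the ID register.
 */
static bool fake_hw_probe(const struct fake_smmu *smmu)
{
	bool cttw_fw = (smmu->features & FAKE_FEAT_COHERENT_WALK) != 0;

	return cttw_fw != smmu->id0_cttw;
}
```

The point of the sketch is the data flow: the firmware step writes a
feature flag, and the hardware step consumes the flag without asking
where it came from.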

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 62 +---
 1 file changed, 37 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 339a8d3..573b2b6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1668,7 +1668,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
unsigned long size;
void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
u32 id;
-   bool cttw_dt, cttw_reg;
+   bool cttw_reg, cttw_fw = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK;
int i;
 
dev_notice(smmu->dev, "probing hardware configuration...\n");
@@ -1713,20 +1713,17 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
/*
 * In order for DMA API calls to work properly, we must defer to what
-* the DT says about coherency, regardless of what the hardware claims.
+* the FW says about coherency, regardless of what the hardware claims.
 * Fortunately, this also opens up a workaround for systems where the
 * ID register value has ended up configured incorrectly.
 */
-   cttw_dt = of_dma_is_coherent(smmu->dev->of_node);
cttw_reg = !!(id & ID0_CTTW);
-   if (cttw_dt)
-   smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
-   if (cttw_dt || cttw_reg)
+   if (cttw_fw || cttw_reg)
dev_notice(smmu->dev, "\t%scoherent table walk\n",
-  cttw_dt ? "" : "non-");
-   if (cttw_dt != cttw_reg)
+  cttw_fw ? "" : "non-");
+   if (cttw_fw != cttw_reg)
dev_notice(smmu->dev,
-  "\t(IDR0.CTTW overridden by dma-coherent property)\n");
+  "\t(IDR0.CTTW overridden by FW configuration)\n");
 
/* Max. number of entries we have for stream matching/indexing */
size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
@@ -1907,15 +1904,25 @@ static const struct of_device_id arm_smmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 
-static int arm_smmu_device_dt_probe(struct platform_device *pdev)
+static int arm_smmu_device_dt_probe(struct platform_device *pdev,
+   struct arm_smmu_device *smmu)
 {
const struct arm_smmu_match_data *data;
-   struct resource *res;
-   struct arm_smmu_device *smmu;
struct device *dev = &pdev->dev;
-   int num_irqs, i, err;
bool legacy_binding;
 
+   if (of_property_read_u32(dev->of_node, "#global-interrupts",
+&smmu->num_global_irqs)) {
+   dev_err(dev, "missing #global-interrupts property\n");
+   return -ENODEV;
+   }
+
+   data = of_device_get_match_data(dev);
+   smmu->version = data->version;
+   smmu->model = data->model;
+
+   parse_driver_options(smmu);
+
legacy_binding = of_find_property(dev->of_node, "mmu-masters", NULL);
if (legacy_binding && !using_generic_binding) {
if (!using_legacy_binding)
@@ -1928,6 +1935,19 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (of_dma_is_coherent(dev->of_node))
+   smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
+
+   return 0;
+}
+
+static int arm_smmu_device_probe(struct platform_device *pdev)
+{
+   struct resource *res;
+   struct arm_smmu_device *smmu;
+   struct device *dev = &pdev->dev;
+   int num_irqs, i, err;
+
smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
if (!smmu) {
dev_err(dev, "failed to allocate arm_smmu_device\n");
@@ -1935,9 +1955,9 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
}
smmu->dev = dev;
 
-   data = of_device_get_match_data(dev);
-   smmu->version = data->version;
-   smmu->model = data->model;
+   err = arm_smmu_device_dt_probe(pdev, smmu);
+   if (err)
+   return err;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
smmu->base = devm_ioremap_resource(dev, res);
@@ -1945,12 +1965,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
return PTR_ERR(smmu->base);

[PATCH v8 15/16] drivers: acpi: iort: add single mapping function

2016-11-16 Thread Lorenzo Pieralisi
The current IORT id mapping API requires components to provide
an input requester ID (a Bus-Device-Function (BDF) identifier for
PCI devices) to translate an input identifier to an output
identifier through an IORT range mapping.

Named components do not have an identifiable source ID, therefore
their respective input/output mapping can only be defined in
IORT tables through single mappings, which provide a translation
that does not require any input identifier.

The current IORT interface for requester id mappings (iort_node_map_rid())
is not suitable for components that do not provide a requester id,
so it cannot be used for IORT named components.

Add an interface to the IORT API to enable retrieval of id
by allowing an indexed walk of the single mappings array for
a given component, therefore completing the IORT mapping API.
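
As a rough userspace illustration of an indexed walk over single
mappings, consider the sketch below. The struct layout and the flag
value are assumptions made for the example, and fake_get_single_id() is
a hypothetical helper, not the function added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_ID_SINGLE_MAPPING	(1u << 0)	/* assumed flag value */

/* Hypothetical, simplified ID mapping entry. */
struct fake_id_mapping {
	uint32_t input_base;
	uint32_t id_count;
	uint32_t output_base;
	uint32_t flags;
};

/*
 * Look at entry 'index' only; no input ID is needed. Returns 0 and
 * stores the output ID on success, -1 if the index is out of range or
 * the entry is not a single mapping.
 */
static int fake_get_single_id(const struct fake_id_mapping *map, size_t count,
			      size_t index, uint32_t *id_out)
{
	assert(id_out != NULL);
	if (!map || index >= count)
		return -1;
	if (!(map[index].flags & FAKE_ID_SINGLE_MAPPING))
		return -1;
	*id_out = map[index].output_base;
	return 0;
}
```

A caller would typically start at index 0 and increment the index until
the lookup fails, collecting one output ID per single mapping.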

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 62057c6..7d30605 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -318,6 +318,45 @@ static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
return 0;
 }
 
+static
+struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node,
+   u32 *id_out, u8 type_mask,
+   int index)
+{
+   struct acpi_iort_node *parent;
+   struct acpi_iort_id_mapping *map;
+
+   if (!node->mapping_offset || !node->mapping_count ||
+index >= node->mapping_count)
+   return NULL;
+
+   map = ACPI_ADD_PTR(struct acpi_iort_id_mapping, node,
+  node->mapping_offset);
+
+   /* Firmware bug! */
+   if (!map->output_reference) {
+   pr_err(FW_BUG "[node %p type %d] ID map has NULL parent reference\n",
+  node, node->type);
+   return NULL;
+   }
+
+   parent = ACPI_ADD_PTR(struct acpi_iort_node, iort_table,
+  map->output_reference);
+
+   if (!(IORT_TYPE_MASK(parent->type) & type_mask))
+   return NULL;
+
+   if (map[index].flags & ACPI_IORT_ID_SINGLE_MAPPING) {
+   if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT ||
+   node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX) {
+   *id_out = map[index].output_base;
+   return parent;
+   }
+   }
+
+   return NULL;
+}
+
 static struct acpi_iort_node *iort_node_map_rid(struct acpi_iort_node *node,
u32 rid_in, u32 *rid_out,
u8 type_mask)
-- 
2.10.0



[PATCH v8 08/16] drivers: acpi: iort: add node match function

2016-11-16 Thread Lorenzo Pieralisi
Device drivers (eg ARM SMMU) need to know if a specific component
is part of the IORT table, so that kernel data structures are not
initialized at initcall time if the respective component is not
part of the IORT table.

To this end, this patch adds a trivial function that allows detecting
if a given IORT node type is present or not in the ACPI table, providing
an ACPI IORT equivalent for of_find_matching_node().
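
The idea of reusing a generic scan with a callback that accepts every
node can be sketched in userspace C as follows. All fake_* names are
hypothetical, and the flat array stands in for the IORT table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fake_node {
	uint8_t type;
};

/* Callback convention assumed here: return 0 to accept the node. */
typedef int (*fake_scan_cb)(const struct fake_node *node, void *context);

/* Return the first node of the given type that the callback accepts. */
static const struct fake_node *fake_scan_node(const struct fake_node *nodes,
					      size_t count, uint8_t type,
					      fake_scan_cb cb, void *context)
{
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < count; i++)
		if (nodes[i].type == type && cb(&nodes[i], context) == 0)
			return &nodes[i];
	return NULL;
}

/* Accept any node: the type filter in the scan is the only criterion. */
static int fake_match_any(const struct fake_node *node, void *context)
{
	(void)node;
	(void)context;
	return 0;
}

static bool fake_node_match(const struct fake_node *nodes, size_t count,
			    uint8_t type)
{
	return fake_scan_node(nodes, count, type, fake_match_any, NULL) != NULL;
}
```

Because the callback always accepts, the presence check reduces to
whether the scan finds at least one node of the requested type.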

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 15 +++
 include/linux/acpi_iort.h |  2 ++
 2 files changed, 17 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 1ac2720..4bb6acb 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -227,6 +227,21 @@ static struct acpi_iort_node *iort_scan_node(enum acpi_iort_node_type type,
return NULL;
 }
 
+static acpi_status
+iort_match_type_callback(struct acpi_iort_node *node, void *context)
+{
+   return AE_OK;
+}
+
+bool iort_node_match(u8 type)
+{
+   struct acpi_iort_node *node;
+
+   node = iort_scan_node(type, iort_match_type_callback, NULL);
+
+   return node != NULL;
+}
+
 static acpi_status iort_match_node_callback(struct acpi_iort_node *node,
void *context)
 {
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index d16fdda..17bb078 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -28,10 +28,12 @@ void iort_deregister_domain_token(int trans_id);
 struct fwnode_handle *iort_find_domain_token(int trans_id);
 #ifdef CONFIG_ACPI_IORT
 void acpi_iort_init(void);
+bool iort_node_match(u8 type);
 u32 iort_msi_map_rid(struct device *dev, u32 req_id);
 struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id);
 #else
 static inline void acpi_iort_init(void) { }
+static inline bool iort_node_match(u8 type) { return false; }
 static inline u32 iort_msi_map_rid(struct device *dev, u32 req_id)
 { return req_id; }
 static inline struct irq_domain *iort_get_device_domain(struct device *dev,
-- 
2.10.0



[PATCH v8 09/16] drivers: acpi: iort: add support for ARM SMMU platform devices creation

2016-11-16 Thread Lorenzo Pieralisi
In ARM ACPI systems, IOMMU components are specified through static
IORT table entries. In order to create platform devices for the
corresponding ARM SMMU components, IORT kernel code should be made
able to parse IORT table entries and create platform devices
dynamically.

This patch adds the generic IORT infrastructure required to create
platform devices for ARM SMMUs.

ARM SMMU versions have different resource requirements; therefore this
patch also introduces an IORT specific structure (ie iort_iommu_config)
that contains hooks (to be defined when the corresponding ARM SMMU
driver support is added to the kernel) to be used to define the
platform device names, init the IOMMUs, count their resources and
finally initialize them.
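
A userspace sketch of how such a hook table might be filled in and
selected per node type. Everything here is hypothetical: the fake_*
types, the "demo-smmu" name, the type value and the 64 KiB register
window are assumptions for illustration, not values from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_NODE_SMMU	3	/* assumed node type value */

struct fake_node {
	uint8_t type;
	uint64_t base;
};

struct fake_resource {
	uint64_t start;
	uint64_t end;
};

/* Reduced version of the hook structure: name plus resource hooks. */
struct fake_iommu_config {
	const char *name;
	int (*count_resources)(const struct fake_node *node);
	void (*init_resources)(struct fake_resource *res,
			       const struct fake_node *node);
};

static int demo_count_resources(const struct fake_node *node)
{
	(void)node;
	return 1;	/* one MMIO window in this sketch */
}

static void demo_init_resources(struct fake_resource *res,
				const struct fake_node *node)
{
	res[0].start = node->base;
	res[0].end = node->base + 0x10000 - 1;	/* hypothetical 64 KiB */
}

static const struct fake_iommu_config demo_cfg = {
	.name = "demo-smmu",
	.count_resources = demo_count_resources,
	.init_resources = demo_init_resources,
};

/* Select the hooks by node type; NULL means "no driver support yet". */
static const struct fake_iommu_config *
fake_get_iommu_cfg(const struct fake_node *node)
{
	assert(node != NULL);
	return node->type == FAKE_NODE_SMMU ? &demo_cfg : NULL;
}
```

The generic code then only needs the sequence "look up config, count,
allocate, init", while each SMMU version supplies its own hooks.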

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 151 ++
 1 file changed, 151 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 4bb6acb..ddf83b5 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -19,9 +19,11 @@
 #define pr_fmt(fmt)"ACPI: IORT: " fmt
 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct iort_its_msi_chip {
@@ -457,6 +459,153 @@ struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id)
return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
 }
 
+struct iort_iommu_config {
+   const char *name;
+   int (*iommu_init)(struct acpi_iort_node *node);
+   bool (*iommu_is_coherent)(struct acpi_iort_node *node);
+   int (*iommu_count_resources)(struct acpi_iort_node *node);
+   void (*iommu_init_resources)(struct resource *res,
+struct acpi_iort_node *node);
+};
+
+static __init
+const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
+{
+   return NULL;
+}
+
+/**
+ * iort_add_smmu_platform_device() - Allocate a platform device for SMMU
+ * @node: Pointer to SMMU ACPI IORT node
+ *
+ * Returns: 0 on success, <0 failure
+ */
+static int __init iort_add_smmu_platform_device(struct acpi_iort_node *node)
+{
+   struct fwnode_handle *fwnode;
+   struct platform_device *pdev;
+   struct resource *r;
+   enum dev_dma_attr attr;
+   int ret, count;
+   const struct iort_iommu_config *ops = iort_get_iommu_cfg(node);
+
+   if (!ops)
+   return -ENODEV;
+
+   pdev = platform_device_alloc(ops->name, PLATFORM_DEVID_AUTO);
+   if (!pdev)
+   return -ENOMEM;
+
+   count = ops->iommu_count_resources(node);
+
+   r = kcalloc(count, sizeof(*r), GFP_KERNEL);
+   if (!r) {
+   ret = -ENOMEM;
+   goto dev_put;
+   }
+
+   ops->iommu_init_resources(r, node);
+
+   ret = platform_device_add_resources(pdev, r, count);
+   /*
+* Resources are duplicated in platform_device_add_resources,
+* free their allocated memory
+*/
+   kfree(r);
+
+   if (ret)
+   goto dev_put;
+
+   /*
+* Add a copy of IORT node pointer to platform_data to
+* be used to retrieve IORT data information.
+*/
+   ret = platform_device_add_data(pdev, &node, sizeof(node));
+   if (ret)
+   goto dev_put;
+
+   /*
+* We expect the dma masks to be equivalent for
+* all SMMUs set-ups
+*/
+   pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
+
+   fwnode = iort_get_fwnode(node);
+
+   if (!fwnode) {
+   ret = -ENODEV;
+   goto dev_put;
+   }
+
+   pdev->dev.fwnode = fwnode;
+
+   attr = ops->iommu_is_coherent(node) ?
+DEV_DMA_COHERENT : DEV_DMA_NON_COHERENT;
+
+   /* Configure DMA for the page table walker */
+   acpi_dma_configure(&pdev->dev, attr);
+
+   ret = platform_device_add(pdev);
+   if (ret)
+   goto dma_deconfigure;
+
+   return 0;
+
+dma_deconfigure:
+   acpi_dma_deconfigure(&pdev->dev);
+dev_put:
+   platform_device_put(pdev);
+
+   return ret;
+}
+
+static void __init iort_init_platform_devices(void)
+{
+   struct acpi_iort_node *iort_node, *iort_end;
+   struct acpi_table_iort *iort;
+   struct fwnode_handle *fwnode;
+   int i, ret;
+
+   /*
+* iort_table and iort both point to the start of IORT table, but
+* have different struct types
+*/
+   iort = (struct acpi_table_iort *)iort_table;
+
+   /* Get the first IORT node */
+   iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
+iort->node_offset);
+   iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
+   iort_table->length);
+
+   for (i = 0; i < iort->node_count; i++) {
+   

[PATCH v8 05/16] drivers: iommu: arm-smmu: convert struct device of_node to fwnode usage

2016-11-16 Thread Lorenzo Pieralisi
The current ARM SMMU driver relies on the struct device.of_node pointer for
device look-up and iommu_ops retrieval.

In preparation for ACPI probing enablement, convert the driver to use
the struct device.fwnode member for device and iommu_ops look-up so that
the driver infrastructure can be used also on systems that do not
associate an of_node pointer to a struct device (eg ACPI), making the
device look-up and iommu_ops retrieval firmware agnostic.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Robin Murphy 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8f72814..339a8d3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1379,13 +1379,14 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 
 static int arm_smmu_match_node(struct device *dev, void *data)
 {
-   return dev->of_node == data;
+   return dev->fwnode == data;
 }
 
-static struct arm_smmu_device *arm_smmu_get_by_node(struct device_node *np)
+static
+struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode)
 {
struct device *dev = driver_find_device(&arm_smmu_driver.driver, NULL,
-   np, arm_smmu_match_node);
+   fwnode, arm_smmu_match_node);
put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
 }
@@ -1403,7 +1404,7 @@ static int arm_smmu_add_device(struct device *dev)
if (ret)
goto out_free;
} else if (fwspec && fwspec->ops == &arm_smmu_ops) {
-   smmu = arm_smmu_get_by_node(to_of_node(fwspec->iommu_fwnode));
+   smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
} else {
return -ENODEV;
}
@@ -2007,7 +2008,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
}
}
 
-   of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
+   iommu_register_instance(dev->fwnode, &arm_smmu_ops);
platform_set_drvdata(pdev, smmu);
arm_smmu_device_reset(smmu);
 
-- 
2.10.0



[PATCH v8 04/16] drivers: iommu: make of_iommu_set/get_ops() DT agnostic

2016-11-16 Thread Lorenzo Pieralisi
The of_iommu_{set/get}_ops() API is used to associate a device
tree node with a specific set of IOMMU operations. The same
kernel interface is required on systems booting with ACPI, where
devices are not associated with a device tree node, therefore
the interface requires generalization.

The struct device fwnode member represents the fwnode token associated
with the device and the struct it points at is firmware specific;
regardless, it is initialized on both ACPI and DT systems and makes an
ideal candidate for associating a set of IOMMU operations with a
given device, through its struct device.fwnode member pointer, paving
the way for representing per-device iommu_ops (ie an iommu instance
associated with a device).

Convert the DT specific of_iommu_{set/get}_ops() interface to
use struct device.fwnode as a look-up token, making the interface
usable on ACPI systems and rename the data structures and the
registration API so that they are made to represent their usage
more clearly.
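
The look-up-by-token idea can be sketched in userspace C as below. This
is an illustration only: the fake_* names are hypothetical, the list is
a plain singly linked list pushed at the head, and the locking used by
the kernel code is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Opaque firmware token; only its address is used as the key. */
struct fake_fwnode {
	int type;
};

struct fake_iommu_ops {
	const char *name;
};

struct fake_instance {
	struct fake_instance *next;
	const struct fake_fwnode *fwnode;
	const struct fake_iommu_ops *ops;
};

static struct fake_instance *fake_instance_list;

/* Associate a firmware token with a set of ops. Returns 0 or -1. */
static int fake_register_instance(const struct fake_fwnode *fwnode,
				  const struct fake_iommu_ops *ops)
{
	struct fake_instance *inst;

	assert(ops != NULL);
	inst = calloc(1, sizeof(*inst));
	if (!inst)
		return -1;
	inst->fwnode = fwnode;
	inst->ops = ops;
	inst->next = fake_instance_list;
	fake_instance_list = inst;
	return 0;
}

/* Pointer comparison on the token; no DT or ACPI knowledge needed. */
static const struct fake_iommu_ops *
fake_get_instance(const struct fake_fwnode *fwnode)
{
	const struct fake_instance *inst;

	for (inst = fake_instance_list; inst; inst = inst->next)
		if (inst->fwnode == fwnode)
			return inst->ops;
	return NULL;
}
```

Since the key is just a pointer, the registry does not care whether the
token was created from a device tree node or from an ACPI table entry.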

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Robin Murphy 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Robin Murphy 
Cc: Joerg Roedel 
---
 drivers/iommu/iommu.c| 40 
 drivers/iommu/of_iommu.c | 39 ---
 include/linux/iommu.h| 14 ++
 include/linux/of_iommu.h | 12 ++--
 4 files changed, 64 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9a2f196..8d3e847 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1615,6 +1615,46 @@ int iommu_request_dm_for_dev(struct device *dev)
return ret;
 }
 
+struct iommu_instance {
+   struct list_head list;
+   struct fwnode_handle *fwnode;
+   const struct iommu_ops *ops;
+};
+static LIST_HEAD(iommu_instance_list);
+static DEFINE_SPINLOCK(iommu_instance_lock);
+
+void iommu_register_instance(struct fwnode_handle *fwnode,
+const struct iommu_ops *ops)
+{
+   struct iommu_instance *iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
+
+   if (WARN_ON(!iommu))
+   return;
+
+   of_node_get(to_of_node(fwnode));
+   INIT_LIST_HEAD(&iommu->list);
+   iommu->fwnode = fwnode;
+   iommu->ops = ops;
+   spin_lock(&iommu_instance_lock);
+   list_add_tail(&iommu->list, &iommu_instance_list);
+   spin_unlock(&iommu_instance_lock);
+}
+
+const struct iommu_ops *iommu_get_instance(struct fwnode_handle *fwnode)
+{
+   struct iommu_instance *instance;
+   const struct iommu_ops *ops = NULL;
+
+   spin_lock(&iommu_instance_lock);
+   list_for_each_entry(instance, &iommu_instance_list, list)
+   if (instance->fwnode == fwnode) {
+   ops = instance->ops;
+   break;
+   }
+   spin_unlock(&iommu_instance_lock);
+   return ops;
+}
+
 int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
  const struct iommu_ops *ops)
 {
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 5b82862..0f57ddc 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -96,45 +96,6 @@ int of_get_dma_window(struct device_node *dn, const char *prefix, int index,
 }
 EXPORT_SYMBOL_GPL(of_get_dma_window);
 
-struct of_iommu_node {
-   struct list_head list;
-   struct device_node *np;
-   const struct iommu_ops *ops;
-};
-static LIST_HEAD(of_iommu_list);
-static DEFINE_SPINLOCK(of_iommu_lock);
-
-void of_iommu_set_ops(struct device_node *np, const struct iommu_ops *ops)
-{
-   struct of_iommu_node *iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
-
-   if (WARN_ON(!iommu))
-   return;
-
-   of_node_get(np);
-   INIT_LIST_HEAD(&iommu->list);
-   iommu->np = np;
-   iommu->ops = ops;
-   spin_lock(&of_iommu_lock);
-   list_add_tail(&iommu->list, &of_iommu_list);
-   spin_unlock(&of_iommu_lock);
-}
-
-const struct iommu_ops *of_iommu_get_ops(struct device_node *np)
-{
-   struct of_iommu_node *node;
-   const struct iommu_ops *ops = NULL;
-
-   spin_lock(&of_iommu_lock);
-   list_for_each_entry(node, &of_iommu_list, list)
-   if (node->np == np) {
-   ops = node->ops;
-   break;
-   }
-   spin_unlock(&of_iommu_lock);
-   return ops;
-}
-
 static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
 {
struct of_phandle_args *iommu_spec = data;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 436dc21..f2960e4 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -351,6 +351,9 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
  const struct iommu_ops *ops);
 void iommu_fwspec_free(struct device *dev);
 int iommu_fwspec_add_i

[PATCH v8 07/16] drivers: acpi: implement acpi_dma_configure

2016-11-16 Thread Lorenzo Pieralisi
On DT based systems, the of_dma_configure() API implements DMA
configuration for a given device. On ACPI systems an API equivalent to
of_dma_configure() is missing, which implies that it is currently not
possible to set up DMA operations for devices through the ACPI generic
kernel layer.

This patch fills the gap by introducing acpi_dma_configure/deconfigure()
calls that for now are just wrappers around arch_setup_dma_ops() and
arch_teardown_dma_ops() and also updates ACPI and PCI core code to use
the newly introduced acpi_dma_configure/acpi_dma_deconfigure functions.

Since acpi_dma_configure() is used to configure DMA operations, the
function initializes the dma/coherent_dma masks to sane default values
if the current masks are uninitialized (also to keep the default values
consistent with DT systems) to make sure the device has a complete
default DMA set-up.

The DMA range size passed to arch_setup_dma_ops() is sized according
to the device coherent_dma_mask (starting at address 0x0), mirroring the
DT probing path behaviour when a dma-ranges property is not provided
for the device being probed; this changes the current arch_setup_dma_ops()
call parameters in the ACPI probing case, but since arch_setup_dma_ops()
is a NOP on all architectures but ARM/ARM64 this patch does not change
the current kernel behaviour on them.
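
The default-mask and range-size arithmetic can be checked with a small
userspace sketch. The fake_* names are hypothetical, and the function
returns the size it would hand to the architecture hook instead of
calling one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical, reduced device: just the two DMA mask fields. */
struct fake_dev {
	uint64_t *dma_mask;
	uint64_t coherent_dma_mask;
};

/*
 * Fill in defaults for unset masks, then return the size of the DMA
 * range starting at address 0 that covers coherent_dma_mask.
 * Note: a full 64-bit mask wraps to 0 in this plain arithmetic.
 */
static uint64_t fake_dma_configure(struct fake_dev *dev)
{
	assert(dev != NULL);

	/* Default to a 32-bit coherent mask if nothing was set. */
	if (!dev->coherent_dma_mask)
		dev->coherent_dma_mask = FAKE_DMA_BIT_MASK(32);

	/* Point the streaming mask at the coherent one if unset. */
	if (!dev->dma_mask)
		dev->dma_mask = &dev->coherent_dma_mask;

	return dev->coherent_dma_mask + 1;
}
```

For an uninitialized device this yields a 4 GiB range, which is the
"sane default" case the description refers to.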

Signed-off-by: Lorenzo Pieralisi 
Acked-by: Bjorn Helgaas  [pci]
Acked-by: Rafael J. Wysocki 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Bjorn Helgaas 
Cc: Robin Murphy 
Cc: Tomasz Nowicki 
Cc: Joerg Roedel 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/glue.c |  4 ++--
 drivers/acpi/scan.c | 40 
 drivers/pci/probe.c |  3 +--
 include/acpi/acpi_bus.h |  2 ++
 include/linux/acpi.h|  5 +
 5 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
index 5ea5dc2..f8d6564 100644
--- a/drivers/acpi/glue.c
+++ b/drivers/acpi/glue.c
@@ -227,8 +227,7 @@ int acpi_bind_one(struct device *dev, struct acpi_device *acpi_dev)
 
attr = acpi_get_dma_attr(acpi_dev);
if (attr != DEV_DMA_NOT_SUPPORTED)
-   arch_setup_dma_ops(dev, 0, 0, NULL,
-  attr == DEV_DMA_COHERENT);
+   acpi_dma_configure(dev, attr);
 
acpi_physnode_link_name(physical_node_name, node_id);
retval = sysfs_create_link(&acpi_dev->dev.kobj, &dev->kobj,
@@ -251,6 +250,7 @@ int acpi_bind_one(struct device *dev, struct acpi_device *acpi_dev)
return 0;
 
  err:
+   acpi_dma_deconfigure(dev);
ACPI_COMPANION_SET(dev, NULL);
put_device(dev);
put_device(&acpi_dev->dev);
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 035ac64..694e0b6 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1370,6 +1370,46 @@ enum dev_dma_attr acpi_get_dma_attr(struct acpi_device *adev)
return DEV_DMA_NON_COHERENT;
 }
 
+/**
+ * acpi_dma_configure - Set-up DMA configuration for the device.
+ * @dev: The pointer to the device
+ * @attr: device dma attributes
+ */
+void acpi_dma_configure(struct device *dev, enum dev_dma_attr attr)
+{
+   /*
+* Set default coherent_dma_mask to 32 bit.  Drivers are expected to
+* setup the correct supported mask.
+*/
+   if (!dev->coherent_dma_mask)
+   dev->coherent_dma_mask = DMA_BIT_MASK(32);
+
+   /*
+* Set it to coherent_dma_mask by default if the architecture
+* code has not set it.
+*/
+   if (!dev->dma_mask)
+   dev->dma_mask = &dev->coherent_dma_mask;
+
+   /*
+* Assume dma valid range starts at 0 and covers the whole
+* coherent_dma_mask.
+*/
+   arch_setup_dma_ops(dev, 0, dev->coherent_dma_mask + 1, NULL,
+  attr == DEV_DMA_COHERENT);
+}
+EXPORT_SYMBOL_GPL(acpi_dma_configure);
+
+/**
+ * acpi_dma_deconfigure - Tear-down DMA configuration for the device.
+ * @dev: The pointer to the device
+ */
+void acpi_dma_deconfigure(struct device *dev)
+{
+   arch_teardown_dma_ops(dev);
+}
+EXPORT_SYMBOL_GPL(acpi_dma_deconfigure);
+
 static void acpi_init_coherency(struct acpi_device *adev)
 {
unsigned long long cca = 0;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ab00267..c29e07a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1738,8 +1738,7 @@ static void pci_dma_configure(struct pci_dev *dev)
if (attr == DEV_DMA_NOT_SUPPORTED)
dev_warn(&dev->dev, "DMA not supported.\n");
else
-   arch_setup_dma_ops(&dev->dev, 0, 0, NULL,
-  attr == DEV_DMA_COHERENT);
+   acpi_dma_configure(&dev->dev, attr);
}
 
pci_put_host_bridge_device(bridge);
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index c1a52

[PATCH v8 06/16] drivers: iommu: arm-smmu-v3: convert struct device of_node to fwnode usage

2016-11-16 Thread Lorenzo Pieralisi
The current ARM SMMU v3 driver relies on the struct device.of_node pointer for
device look-up and iommu_ops retrieval.

In preparation for ACPI probing enablement, convert the driver to use
the struct device.fwnode member for device and iommu_ops look-up so that
the driver infrastructure can be used also on systems that do not
associate an of_node pointer to a struct device (eg ACPI), making the
device look-up and iommu_ops retrieval firmware agnostic.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Robin Murphy 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e6f9b2d..e6e1c87 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1723,13 +1723,14 @@ static struct platform_driver arm_smmu_driver;
 
 static int arm_smmu_match_node(struct device *dev, void *data)
 {
-   return dev->of_node == data;
+   return dev->fwnode == data;
 }
 
-static struct arm_smmu_device *arm_smmu_get_by_node(struct device_node *np)
+static
+struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle *fwnode)
 {
struct device *dev = driver_find_device(&arm_smmu_driver.driver, NULL,
-   np, arm_smmu_match_node);
+   fwnode, arm_smmu_match_node);
put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
 }
@@ -1765,7 +1766,7 @@ static int arm_smmu_add_device(struct device *dev)
master = fwspec->iommu_priv;
smmu = master->smmu;
} else {
-   smmu = arm_smmu_get_by_node(to_of_node(fwspec->iommu_fwnode));
+   smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
if (!smmu)
return -ENODEV;
master = kzalloc(sizeof(*master), GFP_KERNEL);
@@ -2634,7 +2635,8 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
return ret;
 
/* And we're up. Go go go! */
-   of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
+   iommu_register_instance(dev->fwnode, &arm_smmu_ops);
+
 #ifdef CONFIG_PCI
if (pci_bus_type.iommu_ops != &arm_smmu_ops) {
pci_request_acs();
-- 
2.10.0



[PATCH v8 01/16] drivers: acpi: add FWNODE_ACPI_STATIC fwnode type

2016-11-16 Thread Lorenzo Pieralisi
On systems booting with a device tree, every struct device is associated
with a struct device_node, that provides its DT firmware representation.
The device node can be used in generic kernel contexts (eg IRQ
translation, IOMMU streamid mapping), to retrieve the properties
associated with the device and carry out kernel operations accordingly.
Owing to the 1:1 relationship between the device and its device_node,
the device_node can also be used as a look-up token for the device (eg
looking up a device through its device_node), to retrieve the device in
kernel paths where the device_node is available.

On systems booting with ACPI, the same abstraction provided by
the device_node is required to provide look-up functionality.

The struct acpi_device, which represents firmware objects in the
ACPI namespace, already includes a struct fwnode_handle of
type FWNODE_ACPI as its member; the same abstraction is missing
though for devices that are instantiated out of static ACPI table
entries (eg ARM SMMU devices).

Add a new fwnode_handle type to associate devices created out
of static ACPI table entries to the respective firmware components
and create a simple ACPI core layer interface to dynamically allocate
and free the corresponding firmware nodes so that kernel subsystems
can use it to instantiate the nodes and associate them with the
respective devices.
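
The allocate/free pairing with a type check can be sketched in
userspace C as follows. The fake_* names and enum values are
hypothetical, and the free helper returns an error code where the
kernel version warns instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum fake_fwnode_type {
	FAKE_FWNODE_INVALID = 0,
	FAKE_FWNODE_OF,
	FAKE_FWNODE_ACPI,
	FAKE_FWNODE_ACPI_STATIC,
};

struct fake_fwnode {
	enum fake_fwnode_type type;
};

static bool fake_is_static(const struct fake_fwnode *fwnode)
{
	assert(fwnode != NULL);
	return fwnode->type == FAKE_FWNODE_ACPI_STATIC;
}

/* Allocate a zeroed node and tag it as coming from a static table. */
static struct fake_fwnode *fake_alloc_fwnode_static(void)
{
	struct fake_fwnode *fwnode = calloc(1, sizeof(*fwnode));

	if (!fwnode)
		return NULL;
	fwnode->type = FAKE_FWNODE_ACPI_STATIC;
	return fwnode;
}

/* Free only nodes this allocator handed out; refuse anything else. */
static int fake_free_fwnode_static(struct fake_fwnode *fwnode)
{
	if (!fwnode || !fake_is_static(fwnode))
		return -1;
	free(fwnode);
	return 0;
}
```

The type tag is what lets the free path reject nodes that belong to
another firmware representation, rather than freeing memory it does
not own.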

Signed-off-by: Lorenzo Pieralisi 
Acked-by: Rafael J. Wysocki 
Reviewed-by: Hanjun Guo 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 include/linux/acpi.h   | 21 +
 include/linux/fwnode.h |  3 ++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 689a8b9..6efb13c 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -56,6 +56,27 @@ static inline acpi_handle acpi_device_handle(struct acpi_device *adev)
acpi_fwnode_handle(adev) : NULL)
 #define ACPI_HANDLE(dev)   acpi_device_handle(ACPI_COMPANION(dev))
 
+static inline struct fwnode_handle *acpi_alloc_fwnode_static(void)
+{
+   struct fwnode_handle *fwnode;
+
+   fwnode = kzalloc(sizeof(struct fwnode_handle), GFP_KERNEL);
+   if (!fwnode)
+   return NULL;
+
+   fwnode->type = FWNODE_ACPI_STATIC;
+
+   return fwnode;
+}
+
+static inline void acpi_free_fwnode_static(struct fwnode_handle *fwnode)
+{
+   if (WARN_ON(!fwnode || fwnode->type != FWNODE_ACPI_STATIC))
+   return;
+
+   kfree(fwnode);
+}
+
 /**
  * ACPI_DEVICE_CLASS - macro used to describe an ACPI device with
  * the PCI-defined class-code information
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index 8516717..8bd28ce 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -17,8 +17,9 @@ enum fwnode_type {
    FWNODE_OF,
    FWNODE_ACPI,
    FWNODE_ACPI_DATA,
+   FWNODE_ACPI_STATIC,
    FWNODE_PDATA,
-   FWNODE_IRQCHIP,
+   FWNODE_IRQCHIP
 };
 
 struct fwnode_handle {
-- 
2.10.0
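
For readers without a kernel tree at hand, the allocate/tag/check-on-free pattern from the patch above can be sketched in userspace C. This is only an illustration: calloc()/free() stand in for kzalloc()/kfree(), the free helper returns an error code where the kernel version uses WARN_ON() and returns void, and the function names without the acpi_ prefix are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types touched by the patch. */
enum fwnode_type {
	FWNODE_INVALID = 0,	/* not visible in the hunk; assumed here */
	FWNODE_OF,
	FWNODE_ACPI,
	FWNODE_ACPI_DATA,
	FWNODE_ACPI_STATIC,
	FWNODE_PDATA,
	FWNODE_IRQCHIP
};

struct fwnode_handle {
	enum fwnode_type type;
};

/* Mirrors acpi_alloc_fwnode_static(): zeroed allocation, then tag the type. */
static struct fwnode_handle *alloc_fwnode_static(void)
{
	struct fwnode_handle *fwnode = calloc(1, sizeof(*fwnode));

	if (!fwnode)
		return NULL;

	fwnode->type = FWNODE_ACPI_STATIC;
	return fwnode;
}

/*
 * Mirrors acpi_free_fwnode_static(): refuse to free a handle that is NULL
 * or not of the static type. Returns 0 if freed, -1 if refused.
 */
static int free_fwnode_static(struct fwnode_handle *fwnode)
{
	if (!fwnode || fwnode->type != FWNODE_ACPI_STATIC) {
		fprintf(stderr, "refusing to free non-static fwnode\n");
		return -1;
	}

	free(fwnode);
	return 0;
}
```

The type check on free is what keeps a caller from handing a handle embedded in some other object (for example one of type FWNODE_ACPI) to a plain free routine.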

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v8 00/16] ACPI IORT ARM SMMU support

2016-11-16 Thread Lorenzo Pieralisi
This patch series is v8 of a previous posting:

https://lkml.org/lkml/2016/11/9/422

v7 -> v8
- Renamed fwnode iommu_ops registration API according to review
- Minor change in ARM SMMU driver DT/ACPI split
- Added review tags

v6 -> v7
- Rebased against v4.9-rc4
- Fixed IORT probing on ACPI systems with missing IORT table
- Fixed SMMUv1/v2 global interrupt detection
- Updated iommu_ops firmware look-up

v5 -> v6
- Rebased against v4.9-rc1
- Changed FWNODE_IOMMU to FWNODE_ACPI_STATIC
- Moved platform devices creation into IORT code
- Updated fwnode handling
- Added default dma masks initialization

v4 -> v5
- Added SMMUv1/v2 support
- Rebased against v4.8-rc5 and dependencies series
- Consolidated IORT platform devices creation

v3 -> v4
- Added single mapping API (for IORT named components)
- Fixed arm_smmu_iort_xlate() return value
- Reworked fwnode registration and platform device creation
  ordering to fix probe ordering dependencies
- Added code to keep device_node ref count with new iommu
  fwspec API
- Added patch to make iommu_fwspec arch agnostic
- Dropped RFC status
- Rebased against v4.8-rc2

v2 -> v3
- Rebased on top of dependencies series [1][2][3](v4.7-rc3)
- Added back reliance on ACPI early probing infrastructure
- Patch[1-3] merged through other dependent series
- Added back IOMMU fwnode generalization
- Move SMMU v3 static functions configuration to IORT code
- Implemented generic IOMMU fwspec API
- Added code to implement fwnode platform device look-up

v1 -> v2:
- Rebased on top of dependencies series [1][2][3](v4.7-rc1)
- Removed IOMMU fwnode generalization
- Implemented ARM SMMU v3 ACPI probing instead of ARM SMMU v2
  owing to patch series dependencies [1]
- Moved platform device creation logic to IORT code to
  generalize its usage for ARM SMMU v1-v2-v3 components
- Removed reliance on ACPI early device probing
- Created IORT specific iommu_xlate() translation hook leaving
  OF code unchanged according to v1 reviews

The ACPI IORT table provides information that allows instantiating
ARM SMMU devices and carrying out id mappings between components on
ARM based systems (devices, IOMMUs, interrupt controllers).

http://infocenter.arm.com/help/topic/com.arm.doc.den0049b/DEN0049B_IO_Remapping_Table.pdf

Building on basic IORT support, this patchset enables ARM SMMUs support
on ACPI systems.

Most of the code is aimed at building the required generic ACPI
infrastructure to create and enable IOMMU components and to bring
the IOMMU infrastructure for ACPI on par with DT, which is going to
make future ARM SMMU components easier to integrate.

PATCH (1) adds a FWNODE_ACPI_STATIC type to the struct fwnode_handle type.
  It is required to attach a fwnode identifier to platform
  devices allocated/detected through static ACPI table entries
  (ie IORT table entries).
  IOMMU devices need an identifier so that they can be looked up,
  eg by the IOMMU core layer when carrying out id translation. This
  can be done through a fwnode_handle (IOMMU platform devices created
  out of IORT tables are not ACPI devices, hence they can't be
  allocated as such; otherwise they would have a fwnode_handle of
  type FWNODE_ACPI).

PATCH (2) makes use of the ACPI early probing API to add a linker script
  section for probing devices via IORT ACPI kernel code.

PATCH (3) provides IORT support for registering IOMMU IORT nodes through
  their fwnode handle.

PATCH (4) makes the of_iommu_{set/get}_ops() functions DT agnostic and
  renames the registration API.

PATCH (5) converts the ARM SMMU driver to use fwnode instead of of_node as
  look-up and iommu_ops retrieval token.

PATCH (6) converts the ARM SMMU v3 driver to use fwnode instead of of_node as
  look-up and iommu_ops retrieval token.

PATCH (7) implements the of_dma_configure() API in ACPI world -
  acpi_dma_configure() - and patches PCI and ACPI core code to
  start making use of it.

PATCH (8) provides an IORT function to detect existence of specific type
  of IORT components.

PATCH (9) creates the kernel infrastructure required to create ARM SMMU
  platform devices for IORT nodes.

PATCH (10) refactors the ARM SMMU v3 driver so that the init functions are
   split in a way that groups together code that probes through DT
   and code that carries out HW registers FW agnostic probing, in
   preparation for adding the ACPI probing path.

PATCH (11) adds ARM SMMU v3 IORT IOMMU operations to create and probe
   ARM SMMU v3 components.

PATCH (12) refactors the ARM SMMU v1/v2 driver so that the init functio

[PATCH v8 02/16] drivers: acpi: iort: introduce linker section for IORT entries probing

2016-11-16 Thread Lorenzo Pieralisi
Since commit e647b532275b ("ACPI: Add early device probing
infrastructure") the kernel has gained the infrastructure that allows
adding linker script section entries to execute ACPI driver callbacks
(ie probe routines) for all subsystems that register a table entry
in the respective kernel section (eg clocksource, irqchip).

Since ARM IOMMU device data is described through IORT tables when
booting with ACPI, the ARM IOMMU drivers must be able to hook ACPI
callback routines that are called to probe IORT entries and initialize
the respective IOMMU devices.

To avoid adding driver-specific hooks into the IORT table initialization
code (which would break code modularity, ie the ACPI IORT code would have
to be made aware of the ARM SMMU drivers' ACPI init callbacks), this patch
adds code that allows ARM SMMU drivers to take advantage of the ACPI early
probing infrastructure, so that they can add linker script section entries
containing driver callbacks to be executed on IORT table detection.

Since IORT nodes are differentiated by a type, the callback routines
can easily parse the IORT table entries, check the IORT nodes and
carry out some actions whenever the IORT node type associated with
the driver specific callback is matched.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Hanjun Guo 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
Cc: Marc Zyngier 
---
 drivers/acpi/arm64/iort.c | 13 ++---
 include/asm-generic/vmlinux.lds.h |  1 +
 include/linux/acpi_iort.h |  3 +++
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 6b81746..2c46ebc 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -361,8 +361,15 @@ void __init acpi_iort_init(void)
    acpi_status status;
 
    status = acpi_get_table(ACPI_SIG_IORT, 0, &iort_table);
-   if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
-       const char *msg = acpi_format_exception(status);
-       pr_err("Failed to get table, %s\n", msg);
+   if (ACPI_FAILURE(status)) {
+       if (status != AE_NOT_FOUND) {
+           const char *msg = acpi_format_exception(status);
+
+           pr_err("Failed to get table, %s\n", msg);
+       }
+
+       return;
    }
+
+   acpi_probe_device_table(iort);
 }
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 3074796..f9c9f3c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -563,6 +563,7 @@
    IRQCHIP_OF_MATCH_TABLE()                \
    ACPI_PROBE_TABLE(irqchip)               \
    ACPI_PROBE_TABLE(clksrc)                \
+   ACPI_PROBE_TABLE(iort)                  \
    EARLYCON_TABLE()
 
 #define INIT_TEXT  \
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index 0e32dac..d16fdda 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -39,4 +39,7 @@ static inline struct irq_domain *iort_get_device_domain(struct device *dev,
 { return NULL; }
 #endif
 
+#define IORT_ACPI_DECLARE(name, table_id, fn)  \
+   ACPI_DECLARE_PROBE_ENTRY(iort, name, table_id, 0, NULL, 0, fn)
+
 #endif /* __ACPI_IORT_H__ */
-- 
2.10.0
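
As a rough userspace illustration of the idea described in the commit message above: each driver contributes a callback to a table, and the core runs every callback when the firmware table is found, with each callback picking out the node types it cares about. The sketch below uses a plain array where the kernel collects entries through a linker section, and every name in it (node types, struct layouts, function names) is hypothetical rather than taken from the IORT code.

```c
#include <stddef.h>

/* Hypothetical node types; the real IORT node type values are not used here. */
enum node_type {
	NODE_ROOT_COMPLEX,
	NODE_SMMU,
	NODE_SMMU_V3
};

struct node {
	enum node_type type;
};

struct table {
	const struct node *nodes;
	size_t count;
};

/* One entry per driver: a name and a callback run on table detection. */
struct probe_entry {
	const char *name;
	int (*probe)(const struct table *table);
};

static size_t smmu_found;
static size_t smmu_v3_found;

/* Count the nodes of a given type, as a driver callback might do. */
static size_t count_nodes(const struct table *table, enum node_type type)
{
	size_t i, n = 0;

	for (i = 0; i < table->count; i++)
		if (table->nodes[i].type == type)
			n++;
	return n;
}

static int smmu_probe(const struct table *table)
{
	smmu_found += count_nodes(table, NODE_SMMU);
	return 0;
}

static int smmu_v3_probe(const struct table *table)
{
	smmu_v3_found += count_nodes(table, NODE_SMMU_V3);
	return 0;
}

/* Plain array standing in for the linker-collected probe section. */
static const struct probe_entry probe_entries[] = {
	{ "smmu", smmu_probe },
	{ "smmu-v3", smmu_v3_probe },
};

/* Run every registered callback; return how many reported success. */
static int probe_device_table(const struct table *table)
{
	size_t i;
	int ok = 0;

	for (i = 0; i < sizeof(probe_entries) / sizeof(probe_entries[0]); i++)
		if (probe_entries[i].probe(table) == 0)
			ok++;
	return ok;
}
```

The point of the indirection is that the table-walking code never names a specific driver; adding a driver means adding an entry, not editing the core.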



[PATCH v8 03/16] drivers: acpi: iort: add support for IOMMU fwnode registration

2016-11-16 Thread Lorenzo Pieralisi
The ACPI IORT table provides entries for IOMMU (aka SMMU in ARM world)
components that allow creating the kernel data structures required to
probe and initialize the IOMMU devices.

This patch provides support in the IORT kernel code to register IOMMU
components and their respective fwnode.

Signed-off-by: Lorenzo Pieralisi 
Reviewed-by: Hanjun Guo 
Reviewed-by: Tomasz Nowicki 
Tested-by: Hanjun Guo 
Tested-by: Tomasz Nowicki 
Cc: Hanjun Guo 
Cc: Tomasz Nowicki 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 86 +++
 1 file changed, 86 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 2c46ebc..1ac2720 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -20,7 +20,9 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 struct iort_its_msi_chip {
    struct list_head    list;
@@ -28,6 +30,90 @@ struct iort_its_msi_chip {
    u32                 translation_id;
 };
 
+struct iort_fwnode {
+   struct list_head list;
+   struct acpi_iort_node *iort_node;
+   struct fwnode_handle *fwnode;
+};
+static LIST_HEAD(iort_fwnode_list);
+static DEFINE_SPINLOCK(iort_fwnode_lock);
+
+/**
+ * iort_set_fwnode() - Create iort_fwnode and use it to register
+ *iommu data in the iort_fwnode_list
+ *
+ * @node: IORT table node associated with the IOMMU
+ * @fwnode: fwnode associated with the IORT node
+ *
+ * Returns: 0 on success
+ *  <0 on failure
+ */
+static inline int iort_set_fwnode(struct acpi_iort_node *iort_node,
+                                  struct fwnode_handle *fwnode)
+{
+   struct iort_fwnode *np;
+
+   np = kzalloc(sizeof(struct iort_fwnode), GFP_ATOMIC);
+
+   if (WARN_ON(!np))
+       return -ENOMEM;
+
+   INIT_LIST_HEAD(&np->list);
+   np->iort_node = iort_node;
+   np->fwnode = fwnode;
+
+   spin_lock(&iort_fwnode_lock);
+   list_add_tail(&np->list, &iort_fwnode_list);
+   spin_unlock(&iort_fwnode_lock);
+
+   return 0;
+}
+
+/**
+ * iort_get_fwnode() - Retrieve fwnode associated with an IORT node
+ *
+ * @node: IORT table node to be looked-up
+ *
+ * Returns: fwnode_handle pointer on success, NULL on failure
+ */
+static inline
+struct fwnode_handle *iort_get_fwnode(struct acpi_iort_node *node)
+{
+   struct iort_fwnode *curr;
+   struct fwnode_handle *fwnode = NULL;
+
+   spin_lock(&iort_fwnode_lock);
+   list_for_each_entry(curr, &iort_fwnode_list, list) {
+       if (curr->iort_node == node) {
+           fwnode = curr->fwnode;
+           break;
+       }
+   }
+   spin_unlock(&iort_fwnode_lock);
+
+   return fwnode;
+}
+
+/**
+ * iort_delete_fwnode() - Delete fwnode associated with an IORT node
+ *
+ * @node: IORT table node associated with fwnode to delete
+ */
+static inline void iort_delete_fwnode(struct acpi_iort_node *node)
+{
+   struct iort_fwnode *curr, *tmp;
+
+   spin_lock(&iort_fwnode_lock);
+   list_for_each_entry_safe(curr, tmp, &iort_fwnode_list, list) {
+       if (curr->iort_node == node) {
+           list_del(&curr->list);
+           kfree(curr);
+           break;
+       }
+   }
+   spin_unlock(&iort_fwnode_lock);
+}
+
 typedef acpi_status (*iort_find_node_callback)
    (struct acpi_iort_node *node, void *context);
 
-- 
2.10.0
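
The set/get/delete trio in the patch above is a small registry keyed by the IORT node pointer. A minimal userspace sketch of the same shape follows; it is an illustration only. It uses a hand-rolled singly linked list where the patch uses the kernel list API, it takes no lock (the patch guards the list with a spinlock), and the stub types and registry_* names are hypothetical.

```c
#include <stdlib.h>

struct iort_node_stub { int id; };	/* stand-in for struct acpi_iort_node */
struct fwnode_stub { int type; };	/* stand-in for struct fwnode_handle */

struct registry_entry {
	struct registry_entry *next;
	const struct iort_node_stub *iort_node;
	struct fwnode_stub *fwnode;
};

static struct registry_entry *registry_head;

/* Append a node -> fwnode association; returns 0 on success, -1 on failure. */
static int registry_set(const struct iort_node_stub *node,
			struct fwnode_stub *fwnode)
{
	struct registry_entry **pp = &registry_head;
	struct registry_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return -1;

	e->iort_node = node;
	e->fwnode = fwnode;

	/* Walk to the tail so entries keep insertion order. */
	while (*pp)
		pp = &(*pp)->next;
	*pp = e;

	return 0;
}

/* Return the fwnode registered for this node, or NULL if there is none. */
static struct fwnode_stub *registry_get(const struct iort_node_stub *node)
{
	struct registry_entry *e;

	for (e = registry_head; e; e = e->next)
		if (e->iort_node == node)
			return e->fwnode;

	return NULL;
}

/* Drop the first entry registered for this node; the fwnode is not freed. */
static void registry_delete(const struct iort_node_stub *node)
{
	struct registry_entry **pp = &registry_head;

	while (*pp) {
		if ((*pp)->iort_node == node) {
			struct registry_entry *victim = *pp;

			*pp = victim->next;
			free(victim);
			return;
		}
		pp = &(*pp)->next;
	}
}
```

As in the patch, deleting an entry releases only the bookkeeping structure; the fwnode itself stays owned by whoever allocated it.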



Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Myron Stowe
On Wed, Nov 16, 2016 at 2:13 AM, Xunlei Pang  wrote:
> Ccing David
> On 2016/11/16 at 17:02, Xunlei Pang wrote:
>> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers
>> under kdump, it can be steadily reproduced on several different machines,
>> the dmesg log is like:
>> HP HPSA Driver (v 3.4.16-0)
>> hpsa :02:00.0: using doorbell to reset controller
>> hpsa :02:00.0: board ready after hard reset.
>> hpsa :02:00.0: Waiting for controller to respond to no-op
>> DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff]
>> DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff]
>> DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff]
>> DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff]
>> DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff]
>> DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff]
>> DMAR: DRHD: handling fault status reg 2
>> DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason 06] PTE Read access is not set
>> hpsa :02:00.0: controller message 03:00 timed out
>> hpsa :02:00.0: no-op failed; re-trying
>>
>> After some debugging, we found that the corresponding pte entry value
>> is correct, and the value of the iommu caching mode is 0, the fault is
>> probably due to the old iotlb cache of the in-flight DMA.
>>
>> Thus need to flush the old iotlb after context mapping is setup for the
>> device, where the device is supposed to finish reset at its driver probe
>> stage and no in-flight DMA exists hereafter.
>>
>> With this patch, all our problematic machines can survive the kdump tests.
>>
>> CC: Myron Stowe 
>> CC: Don Brace 
>> CC: Baoquan He 
>> CC: Dave Young 
>> Tested-by: Joseph Szczypek 
>> Signed-off-by: Xunlei Pang 
>> ---
>>  drivers/iommu/intel-iommu.c | 11 +--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 3965e73..eb79288 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
>>* It's a non-present to present mapping. If hardware doesn't cache
>>* non-present entry we only need to flush the write-buffer. If the
>>* _does_ cache non-present entries, then it does so in the special

If this does get accepted then we should fix the above grammar also -
  "If the _does_ cache ..." -> "If the hardware _does_ cache ..."

>> -  * domain #0, which we have to flush:
>> +  * domain #0, which we have to flush.
>> +  *
>> +  * For kdump cases, present entries may be cached due to the in-flight
>> +  * DMA and copied old pgtable, but there is no unmapping behaviour for
>> +  * them, so we need an explicit iotlb flush for the newly-mapped device.
>> +  * For kdump, at this point, the device is supposed to finish reset at
>> +  * the driver probe stage, no in-flight DMA will exist, thus we do not
>> +  * need to worry about that anymore hereafter.
>>*/
>> - if (cap_caching_mode(iommu->cap)) {
>> + if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) {
>>   iommu->flush.flush_context(iommu, 0,
>>  (((u16)bus) << 8) | devfn,
>>  DMA_CCMD_MASK_NOBIT,
>


Re: [PATCH v7 00/16] ACPI IORT ARM SMMU support

2016-11-16 Thread Tomasz Nowicki

On 09.11.2016 15:19, Lorenzo Pieralisi wrote:

> This patch series is v7 of a previous posting:
>
> https://lkml.org/lkml/2016/10/18/506

[...]

> The ACPI IORT table provides information that allows instantiating
> ARM SMMU devices and carrying out id mappings between components on
> ARM based systems (devices, IOMMUs, interrupt controllers).
>
> http://infocenter.arm.com/help/topic/com.arm.doc.den0049b/DEN0049B_IO_Remapping_Table.pdf
>
> Building on basic IORT support, this patchset enables ARM SMMUs support
> on ACPI systems.
>
> Most of the code is aimed at building the required generic ACPI
> infrastructure to create and enable IOMMU components and to bring
> the IOMMU infrastructure for ACPI on par with DT, which is going to
> make future ARM SMMU components easier to integrate.

[...]

> This patchset is provided for review/testing purposes here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux.git acpi/iort-smmu-v7
>
> Tested on Juno and FVP models for ARM SMMU v1 and v3 probing path.



For all series:
Reviewed-by: Tomasz Nowicki 

Thanks,
Tomasz


Re: [RFC PATCH v3 08/20] x86: Add support for early encryption/decryption of memory

2016-11-16 Thread Borislav Petkov
Btw, for your next submission, this patch can be split in two exactly
like the commit message paragraphs are:

On Wed, Nov 09, 2016 at 06:36:10PM -0600, Tom Lendacky wrote:
> Add support to be able to either encrypt or decrypt data in place during
> the early stages of booting the kernel. This does not change the memory
> encryption attribute - it is used for ensuring that data present in either
> an encrypted or un-encrypted memory area is in the proper state (for
> example the initrd will have been loaded by the boot loader and will not be
> encrypted, but the memory that it resides in is marked as encrypted).

Patch 2: users of the new memmap change

> The early_memmap support is enhanced to specify encrypted and un-encrypted
> mappings with and without write-protection. The use of write-protection is
> necessary when encrypting data "in place". The write-protect attribute is
> considered cacheable for loads, but not stores. This implies that the
> hardware will never give the core a dirty line with this memtype.

Patch 1: change memmap

This makes this aspect of the patchset much clearer and is better for
bisection.

> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/fixmap.h|9 +++
>  arch/x86/include/asm/mem_encrypt.h   |   15 +
>  arch/x86/include/asm/pgtable_types.h |8 +++
>  arch/x86/mm/ioremap.c|   28 +
>  arch/x86/mm/mem_encrypt.c|  102 ++
>  include/asm-generic/early_ioremap.h  |2 +
>  mm/early_ioremap.c   |   15 +
>  7 files changed, 179 insertions(+)

...

> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index d642cc5..06235b4 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -14,6 +14,9 @@
>  #include 
>  #include 
>  
> +#include 
> +#include 
> +
>  extern pmdval_t early_pmd_flags;
>  
>  /*
> @@ -24,6 +27,105 @@ extern pmdval_t early_pmd_flags;
>  unsigned long sme_me_mask __section(.data) = 0;
>  EXPORT_SYMBOL_GPL(sme_me_mask);
>  
> +/* Buffer used for early in-place encryption by BSP, no locking needed */
> +static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
> +
> +/*
> + * This routine does not change the underlying encryption setting of the
> + * page(s) that map this memory. It assumes that eventually the memory is
> + * meant to be accessed as encrypted but the contents are currently not
> + * encrypted.
> + */
> +void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
> +{
> + void *src, *dst;
> + size_t len;
> +
> + if (!sme_me_mask)
> + return;
> +
> + local_flush_tlb();
> + wbinvd();
> +
> + /*
> +  * There are limited number of early mapping slots, so map (at most)
> +  * one page at time.
> +  */
> + while (size) {
> + len = min_t(size_t, sizeof(sme_early_buffer), size);
> +
> + /* Create a mapping for non-encrypted write-protected memory */
> + src = early_memremap_dec_wp(paddr, len);
> +
> + /* Create a mapping for encrypted memory */
> + dst = early_memremap_enc(paddr, len);
> +
> + /*
> +  * If a mapping can't be obtained to perform the encryption,
> +  * then encrypted access to that area will end up causing
> +  * a crash.
> +  */
> + BUG_ON(!src || !dst);
> +
> + memcpy(sme_early_buffer, src, len);
> + memcpy(dst, sme_early_buffer, len);

I am still missing a short explanation of why we need the temporary buffer.


Oh, and we can save ourselves a little code duplication. Diff on top of yours:

---
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 06235b477d7c..50e2c4fc7338 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -36,7 +36,8 @@ static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
  * meant to be accessed as encrypted but the contents are currently not
  * encrypted.
  */
-void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
+static void __init noinline
+__mem_enc_dec(resource_size_t paddr, unsigned long size, bool enc)
 {
    void *src, *dst;
    size_t len;
@@ -54,15 +55,15 @@ void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
    while (size) {
        len = min_t(size_t, sizeof(sme_early_buffer), size);
 
-       /* Create a mapping for non-encrypted write-protected memory */
-       src = early_memremap_dec_wp(paddr, len);
+       src = (enc ? early_memremap_dec_wp(paddr, len)
+                  : early_memremap_enc_wp(paddr, len));
 
-       /* Create a mapping for encrypted memory */
-       dst = early_memremap_enc(paddr, len);
+       dst = (enc ? early_memremap_enc(paddr, len)
+                  : early_memremap_dec(paddr, len));
 
        /*
-* If 

Re: [PATCH v7 04/16] drivers: iommu: make of_iommu_set/get_ops() DT agnostic

2016-11-16 Thread Lorenzo Pieralisi
Hi Joerg,

On Mon, Nov 14, 2016 at 06:25:16PM +, Robin Murphy wrote:
> On 14/11/16 15:52, Joerg Roedel wrote:
> > On Mon, Nov 14, 2016 at 12:00:47PM +, Robin Murphy wrote:
> >> If we've already made the decision to move away from bus ops, I don't
> >> see that it makes sense to deliberately introduce new dependencies on
> >> them. Besides, as it stands, this patch literally implements "tell the
> >> iommu-core which hardware-iommus exist in the system and a separate
> >> iommu_ops ptr for each of them" straight off.
> > 
> > Not sure which code you are looking at, but as I see it we have only
> > per-device iommu-ops now (with this patch). That is different from
> > having core-visible hardware-iommu instances where devices could link
> > to.
> 
> The per-device IOMMU ops are already there since 57f98d2f61e1. This
> patch generalises the other end, moving the "registering an IOMMU
> instance" (i.e. iommu_fwentry) bit into the IOMMU core, from being
> OF-specific. I'd be perfectly happy if we rename iommu_fwentry to
> iommu_instance, fwnode_iommu_set_ops() to iommu_register_instance(), and
> such if that makes the design intent clearer.

I can easily make the changes Robin suggests above, but I need to know
what to do with this patch: it is the last blocking point for this
series and time is running out. I can revert to using dev->bus to
retrieve iommu_ops (even though I do not think it makes sense given
what Robin outlines below), but I need to know, please; we can't gate
an entire series on this patch, which is just syntactic sugar.

Thanks !
Lorenzo

> If you'd also prefer to replace iommu_fwspec::ops with an opaque
> iommu_fwspec::iommu_instance pointer so that things are a bit more
> centralised (and users are forced to go through the API rather than call
> ops directly), I'd have no major objection either. My main point is that
> we've been deliberately putting the relevant building blocks in place -
> the of_iommu_{get,set}_ops stuff was designed from the start to
> accommodate per-instance ops, via the ops pointer *being* the instance
> token; the iommu_fwspec stuff is deliberately intended to provide
> per-device ops on top of that. The raw functionality is either there in
> iommu.c already, or moving there in patches already written, so if it
> doesn't look right all we need to focus on is making it look right.
> 
> > Also the rest of iommu-core code still makes use of the per-bus ops. The
> > per-device ops are only used for the of_xlate fn-ptr.
> 
> Hence my aforementioned patches intended for 4.10, directly following on
> from introducing iommu_fwspec in 4.9:
> 
> http://www.mail-archive.com/iommu@lists.linux-foundation.org/msg14576.html
> 
> ...the purpose being to provide a smooth transition from per-bus ops to
> per-device, per-instance ops. Apply those and we're 90% of the way there
> for OF-based IOMMU drivers (not that any of those actually need
> per-instance ops, admittedly; I did prototype it for the ARM SMMU ages
> ago, but it didn't seem worth the bother). Lorenzo's series broadens the
> scope to ACPI-based systems and moves the generically-useful parts into
> the core where we can easily build on them further if necessary. The
> major remaining work is to convert external callers of the current
> bus-dependent functions like iommu_domain_alloc(), iommu_present(), etc.
> to device-based alternatives.
> 
> Robin.


Re: [PATCH v2] arm64: SMMU-v2: Workaround for Cavium ThunderX erratum 28168

2016-11-16 Thread Marc Zyngier
On 15/11/16 18:24, David Daney wrote:
> On 11/15/2016 01:26 AM, Marc Zyngier wrote:
>> On 15/11/16 07:00, Geetha sowjanya wrote:
>>> From: Tirumalesh Chalamarla 
>>>
>>>This patch implements Cavium ThunderX erratum 28168.
>>>
>>>PCI requires stores complete in order. Due to erratum #28168
>>>PCI-inbound MSI-X store to the interrupt controller are delivered
>>>to the interrupt controller before older PCI-inbound memory stores
>>>are committed.
>>>Doing a sync on SMMU will make sure all prior data transfers are
>>>completed before invoking ISR.
>>>
>>> Signed-off-by: Tirumalesh Chalamarla 
>>> Signed-off-by: Geetha sowjanya 
> [...]
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -28,6 +28,8 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>> +#include 
>>>
>>>   #include 
>>>   #include 
>>> @@ -736,6 +738,20 @@ static inline void gic_cpu_pm_init(void) { }
>>>
>>>   #define GIC_ID_NR (1U << gic_data.rdists.id_bits)
>>>
>>> +/*
>>> + * Due to #28168 erratum in ThunderX,
>>> + * we need to make sure DMA data transfer is done before MSIX.
>>> + */
>>> +static void cavium_irq_perflow_handler(struct irq_data *data)
>>> +{
>>> +   struct pci_dev *pdev;
>>> +
>>> +   pdev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data));
>>
>> What happens if this is not a PCI device?
>>
>>> +   if ((pdev->vendor != 0x177d) &&
>>> +   ((pdev->device & 0xA000) != 0xA000))
>>> +   cavium_arm_smmu_tlb_sync(&pdev->dev);
>>
>> I've asked that before. What makes Cavium devices so special that they
>> are not sensitive to this bug?
> 
> 
> This is a heuristic for devices connected to external PCIe buses as 
> opposed to on-SoC devices (which don't suffer from the erratum).
> 
> In any event what would happen if we got rid of this check and ...
> 
> 
>>
>>> +}
>>> +
>>>   static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
>>>   irq_hw_number_t hw)
>>>   {
>>> @@ -773,6 +789,9 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
>>> return -EPERM;
>>> irq_domain_set_info(d, irq, hw, chip, d->host_data,
>>> handle_fasteoi_irq, NULL, NULL);
>>> +   if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_28168))
>>> +   __irq_set_preflow_handler(irq,
>>> + cavium_irq_perflow_handler);
>>
> 
> ... move the registration of the preflow_handler into a 
> msi_domain_ops.msi_finish() handler in irq-gic-v3-its-pci-msi.c?

That's the kind of thing I was angling for. You'll have to store the
device pointer into the scratchpad (we still have plenty of space there)
so that msi_finish() can have a peek.

> There we will know that it is a pci device, and can walk up the bus 
> hierarchy to see if there is a Cavium PCIe root port present.  If such a 
> port is found, we know we are on an external Cavium PCIe bus, and can 
> register the preflow_handler without having to check the device identifiers.

Something like that (though I'm unclear why other devices wouldn't see a
root port, but that's probably me lacking some PCIe foo).

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v5 7/7] iommu/exynos: Use device dependency links to control runtime pm

2016-11-16 Thread Lukas Wunner
On Thu, Nov 10, 2016 at 12:56:14AM +0100, Rafael J. Wysocki wrote:
> The idea, roughly, is that if there is a single on/off switch acting
> on multiple devices, you can (a) set up a PM domain tracking all of
> those devices' runtime PM invocations and (b) maintain a reference
> counter of devices still not suspended.  This way it would only turn
> the switch off when all of the devices in question had been suspended.
> Analogously, it would turn the switch on before resuming the first
> device in the domain.  Of course, that code isn't available as a
> library, you would need to implement it (or use genpd, but chances are
> it is too heavy weight for the job).
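
The reference-counting scheme quoted above can be sketched in a few lines of plain C. This is a hedged illustration of the counting logic only, not runtime PM code: the struct, the function names and the toggles counter are all made up for the example, and it assumes callers serialize access themselves.

```c
#include <stdbool.h>

/* One shared on/off switch feeding several devices. */
struct shared_switch {
	unsigned int active;	/* devices currently not suspended */
	bool on;		/* state of the shared switch */
	unsigned int toggles;	/* how often the switch changed state */
};

/* Called before a device resumes: the first user turns the switch on. */
static void shared_switch_get(struct shared_switch *sw)
{
	if (sw->active == 0) {
		sw->on = true;
		sw->toggles++;
	}
	sw->active++;
}

/* Called after a device suspends: the last user turns the switch off. */
static void shared_switch_put(struct shared_switch *sw)
{
	if (sw->active == 0)
		return;		/* unbalanced call, nothing to release */

	if (--sw->active == 0) {
		sw->on = false;
		sw->toggles++;
	}
}
```

With two devices, the switch flips on at the first resume and off only after the second suspend, so it changes state twice however the calls in between are interleaved.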

My understanding is that the hierarchy of struct generic_pm_domain
is created by the platform on boot.  For an embedded platform, this
is encoded in the device tree, but what about ACPI which doesn't
know anything about struct generic_pm_domain?  I would have to lump
devices into generic_pm_domains after the fact, after the platform
has scanned the buses, but this seems to be forbidden according to
this slide deck, which calls that a "layering violation":

https://events.linuxfoundation.org/images/stories/pdf/lcjp2012_wysocki.pdf

(Quote: "Adding and Removing Devices [...] Supposed to be called by
the platform (calling one of them from a device driver is a layering
violation).")

So it seems that using struct generic_pm_domain is never an option
on ACPI, is that correct?

Thanks,

Lukas


Re: [PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
Ccing David
On 2016/11/16 at 17:02, Xunlei Pang wrote:
> We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers
> under kdump, it can be steadily reproduced on several different machines,
> the dmesg log is like:
> HP HPSA Driver (v 3.4.16-0)
> hpsa :02:00.0: using doorbell to reset controller
> hpsa :02:00.0: board ready after hard reset.
> hpsa :02:00.0: Waiting for controller to respond to no-op
> DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff]
> DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff]
> DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff]
> DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff]
> DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff]
> DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff]
> DMAR: DRHD: handling fault status reg 2
> DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason 06] PTE Read access is not set
> hpsa :02:00.0: controller message 03:00 timed out
> hpsa :02:00.0: no-op failed; re-trying
>
> After some debugging, we found that the corresponding pte entry value
> is correct, and the value of the iommu caching mode is 0, the fault is
> probably due to the old iotlb cache of the in-flight DMA.
>
> Thus need to flush the old iotlb after context mapping is setup for the
> device, where the device is supposed to finish reset at its driver probe
> stage and no in-flight DMA exists hereafter.
>
> With this patch, all our problematic machines can survive the kdump tests.
>
> CC: Myron Stowe 
> CC: Don Brace 
> CC: Baoquan He 
> CC: Dave Young 
> Tested-by: Joseph Szczypek 
> Signed-off-by: Xunlei Pang 
> ---
>  drivers/iommu/intel-iommu.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3965e73..eb79288 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
>  	 * It's a non-present to present mapping. If hardware doesn't cache
>  	 * non-present entry we only need to flush the write-buffer. If the
>  	 * _does_ cache non-present entries, then it does so in the special
> -	 * domain #0, which we have to flush:
> +	 * domain #0, which we have to flush.
> +	 *
> +	 * For kdump cases, present entries may be cached due to the in-flight
> +	 * DMA and copied old pgtable, but there is no unmapping behaviour for
> +	 * them, so we need an explicit iotlb flush for the newly-mapped device.
> +	 * For kdump, at this point, the device is supposed to finish reset at
> +	 * the driver probe stage, no in-flight DMA will exist, thus we do not
> +	 * need to worry about that anymore hereafter.
>  	 */
> -	if (cap_caching_mode(iommu->cap)) {
> +	if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) {
>  		iommu->flush.flush_context(iommu, 0,
>  					   (((u16)bus) << 8) | devfn,
>  					   DMA_CCMD_MASK_NOBIT,



[PATCH] iommu/vt-d: Flush old iotlb for kdump when the device gets context mapped

2016-11-16 Thread Xunlei Pang
We hit a DMAR fault on both hpsa P420i and P421 SmartArray controllers
under kdump. It reproduces reliably on several different machines, and
the dmesg log looks like this:
HP HPSA Driver (v 3.4.16-0)
hpsa :02:00.0: using doorbell to reset controller
hpsa :02:00.0: board ready after hard reset.
hpsa :02:00.0: Waiting for controller to respond to no-op
DMAR: Setting identity map for device :02:00.0 [0xe8000 - 0xe8fff]
DMAR: Setting identity map for device :02:00.0 [0xf4000 - 0xf4fff]
DMAR: Setting identity map for device :02:00.0 [0xbdf6e000 - 0xbdf6efff]
DMAR: Setting identity map for device :02:00.0 [0xbdf6f000 - 0xbdf7efff]
DMAR: Setting identity map for device :02:00.0 [0xbdf7f000 - 0xbdf82fff]
DMAR: Setting identity map for device :02:00.0 [0xbdf83000 - 0xbdf84fff]
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [02:00.0] fault addr f000 [fault reason 06] PTE Read access is not set
hpsa :02:00.0: controller message 03:00 timed out
hpsa :02:00.0: no-op failed; re-trying

After some debugging, we found that the corresponding PTE value is
correct and that the IOMMU caching mode is 0, so the fault is probably
caused by stale IOTLB entries left behind by in-flight DMA.

We therefore need to flush the old IOTLB entries after the context
mapping is set up for the device. By then the device is expected to have
finished its reset in the driver probe stage, so no in-flight DMA remains.

With this patch, all our problematic machines can survive the kdump tests.
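
To illustrate the suspected mechanism, here is a rough userspace toy model,
not kernel code. Every toy_* name is made up for this sketch, and it assumes
the stale cached entry simply lacks read permission. Page 15 stands in for
the 4K page at fault addr f000. The model shows how a read can keep faulting
although the PTE in memory is already correct, until the cache is flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_NR_PAGES 16

struct toy_iommu {
	bool pte_readable[TOY_NR_PAGES];	/* page table in memory */
	bool tlb_valid[TOY_NR_PAGES];		/* is a translation cached? */
	bool tlb_readable[TOY_NR_PAGES];	/* cached permission, maybe stale */
};

/* A DMA read uses the cached entry and walks the table only on a miss. */
bool toy_dma_read(struct toy_iommu *iommu, size_t page)
{
	assert(page < TOY_NR_PAGES);
	if (!iommu->tlb_valid[page]) {
		iommu->tlb_readable[page] = iommu->pte_readable[page];
		iommu->tlb_valid[page] = true;
	}
	return iommu->tlb_readable[page];
}

/* Installing a mapping updates memory only; the cache is not touched. */
void toy_map(struct toy_iommu *iommu, size_t page)
{
	assert(page < TOY_NR_PAGES);
	iommu->pte_readable[page] = true;
}

/* Drop every cached translation, forcing fresh table walks. */
void toy_flush(struct toy_iommu *iommu)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		iommu->tlb_valid[i] = false;
}

/* Replays the sequence on one page and returns how many reads fault. */
int toy_kdump_demo(bool flush_after_map)
{
	struct toy_iommu iommu = {0};
	const size_t page = 15;
	int faults = 0;

	/* Old kernel: in-flight DMA caches an entry without read access. */
	faults += !toy_dma_read(&iommu, page);

	/* kdump kernel: context mapping installs a correct, readable PTE. */
	toy_map(&iommu, page);
	if (flush_after_map)
		toy_flush(&iommu);

	/* DMA after device reset: faults only if the stale entry survived. */
	faults += !toy_dma_read(&iommu, page);
	return faults;
}
```

In this model toy_kdump_demo(false) counts a fault on the second read as
well, while toy_kdump_demo(true) only counts the first one, which is the
behaviour the patch aims for on real hardware.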

CC: Myron Stowe 
CC: Don Brace 
CC: Baoquan He 
CC: Dave Young 
Tested-by: Joseph Szczypek 
Signed-off-by: Xunlei Pang 
---
 drivers/iommu/intel-iommu.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3965e73..eb79288 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2067,9 +2067,16 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	 * It's a non-present to present mapping. If hardware doesn't cache
 	 * non-present entry we only need to flush the write-buffer. If the
 	 * _does_ cache non-present entries, then it does so in the special
-	 * domain #0, which we have to flush:
+	 * domain #0, which we have to flush.
+	 *
+	 * For kdump cases, present entries may be cached due to the in-flight
+	 * DMA and copied old pgtable, but there is no unmapping behaviour for
+	 * them, so we need an explicit iotlb flush for the newly-mapped device.
+	 * For kdump, at this point, the device is supposed to finish reset at
+	 * the driver probe stage, no in-flight DMA will exist, thus we do not
+	 * need to worry about that anymore hereafter.
 	 */
-	if (cap_caching_mode(iommu->cap)) {
+	if (is_kdump_kernel() || cap_caching_mode(iommu->cap)) {
 		iommu->flush.flush_context(iommu, 0,
 					   (((u16)bus) << 8) | devfn,
 					   DMA_CCMD_MASK_NOBIT,
-- 
1.8.3.1
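
For reference, the source-id passed to flush_context() in the hunk above
is built as (((u16)bus) << 8) | devfn. A small userspace sketch of that
arithmetic follows. The toy_* helpers are hypothetical names for this
example. toy_devfn() uses the usual PCI encoding of a 5-bit slot and a
3-bit function. For the device 02:00.0 from the log this gives 0x0200.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 5-bit slot and 3-bit function packed into devfn. */
uint8_t toy_devfn(uint8_t slot, uint8_t func)
{
	assert(slot < 32 && func < 8);
	return (uint8_t)((slot << 3) | func);
}

/* Mirrors the expression in the patch: (((u16)bus) << 8) | devfn */
uint16_t toy_source_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}
```

So toy_source_id(0x02, toy_devfn(0, 0)) identifies bus 02, slot 00,
function 0, matching the "Request device [02:00.0]" in the fault message.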
