RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-08-30 Thread Krishna Reddy
>> +if (smmu->impl->tlb_sync) {
>> +smmu->impl->tlb_sync(smmu, page, sync, status);

>What I'd hoped is that rather than needing a hook for this, you could just 
>override smmu_domain->tlb_ops from .init_context to wire up the alternate 
>.sync method directly. That would save this extra level of indirection.

Hi Robin, overriding tlb_ops->tlb_sync alone is not enough here.
arm-smmu.c still contains direct references to arm_smmu_tlb_sync_context()
and arm_smmu_tlb_sync_global(). We can replace those direct references with
tlb_ops->tlb_sync() everywhere except in arm_smmu_device_reset(): when
arm_smmu_device_reset() gets called, the domains are not initialized yet, so
tlb_ops is not available.
Should we add a new hook for arm_smmu_tlb_sync_global(), or make this the
responsibility of the impl->reset() hook, since it gets called at the same
place?
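
For illustration, the impl->reset() option could look roughly like this in
arm-smmu-nvidia.c (a sketch only, reusing nsmmu_tlb_sync() from patch 3 and
the GR0 sync/status offsets from arm-smmu.h):

static int nsmmu_reset(struct arm_smmu_device *smmu)
{
	/* Multi-instance equivalent of arm_smmu_tlb_sync_global():
	 * nsmmu_tlb_sync() writes the sync register on every instance
	 * and then polls each instance's status register.
	 */
	nsmmu_tlb_sync(smmu, ARM_SMMU_GR0, ARM_SMMU_GR0_sTLBGSYNC,
		       ARM_SMMU_GR0_sTLBGSTATUS);

	return 0;
}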

-KR 


RE: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks

2019-08-30 Thread Krishna Reddy
>> +static irqreturn_t nsmmu_context_fault_inst(int irq,
>> +struct arm_smmu_device *smmu,
>> +int idx, int inst);

>More of these signed integers that could be unsigned. Also why the need to 
>predeclare this? Can you not just put the definition of the function up here?

The signed integers follow the original function prototypes in arm-smmu.c;
inst can be updated to unsigned.
The predeclaration is there because I was checking for context faults from
the global fault handler as well. It can be avoided by using the interrupt
lines of all instances across the global and context irq entries. Let me
update.
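
With the definition moved above nsmmu_global_fault(), the forward declaration
can be dropped entirely and the index parameters become unsigned, i.e. the
signature would end up as something like:

static irqreturn_t nsmmu_context_fault_inst(int irq,
					    struct arm_smmu_device *smmu,
					    unsigned int idx,
					    unsigned int inst)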
 
>> +	if (smmu->impl->global_fault)
>> +		return smmu->impl->global_fault(irq, smmu);

>... and here about the extra level of indirection.

Fixing in next version.

-KR



RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-08-30 Thread Krishna Reddy
>Wouldn't it work if you replaced all calls of __arm_smmu_tlb_sync() by
>smmu->impl->tlb_sync() and assign __arm_smmu_tlb_sync() as default for
>devices that don't need to override it? That makes this patch slightly larger, 
>but it saves us one level of indirection.

The tlb_ops->tlb_sync can be overridden directly in the arm-smmu-nvidia.c
specific implementation, as Robin pointed out. I will update this in the next
patch.
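
A rough shape of that override (a sketch; nsmmu_gather_ops is a hypothetical
copy of the default ops with only .tlb_sync pointing at an nsmmu-aware
routine):

static int nsmmu_init_context(struct arm_smmu_domain *smmu_domain)
{
	/* Default flush callbacks, except that .tlb_sync fans the sync
	 * out to every SMMU instance before returning.
	 */
	smmu_domain->tlb_ops = &nsmmu_gather_ops;

	return 0;
}

wired up via .init_context = nsmmu_init_context in nsmmu_impl.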


>> +void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
>> + int status);

>Can't page, sync and status all be unsigned?

This keeps it uniform with the original tlb_sync definition in arm-smmu.c.
In any case, this hook is not necessary, as tlb_ops->tlb_sync can be
overridden directly.

-KR


RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Krishna Reddy
>> +smmu: iommu@1200 {
>> +compatible = "nvidia,smmu-v2";

>Should we have a compatibility string like "nvidia,tegra194-smmu" so that we 
>can have other chips with SMMUv2 that could be different?

As Robin also pointed out, since NVIDIA hasn't modified the ARM MMU-500
implementation itself, we can update the compatible string to be
chip-specific.
Let me update.
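
The node would then carry a chip-specific string, with whether the generic
one stays as a fallback being a separate question, e.g. (illustrative only):

	smmu: iommu@1200 {
		compatible = "nvidia,tegra194-smmu", "nvidia,smmu-v2";
	};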

-KR

 


RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Krishna Reddy
>The number of global interrupts has never been related to the number of 
>context interrupts :/

Yeah, they are not related. I was trying to use the minimum number of irq
entries in the DT node, since either description achieves the same
functionality.

>Clearly you have one combined interrupt output per SMMU - describing those as 
>one global interrupt and the first two context bank interrupts respectively 
>makes far less sense than calling them 3 global interrupts, not least because 
>the latter is strictly true.

Will update it to 3 in the next patch for better readability.

-KR


RE: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation

2019-08-30 Thread Krishna Reddy
>> +ARM_SMMU_MATCH_DATA(nvidia_smmuv2, ARM_SMMU_V2, NVIDIA_SMMUV2);

> From the previous discussions, I got the impression that other than the 
> 'novel' way they're integrated, the actual SMMU implementations were 
> unmodified Arm MMU-500s. Is that the case, or have I misread something?

The ARM MMU-500 implementation is unmodified. It is the way the instances are
integrated and used together (for interleaved accesses) that differs from a
regular ARM MMU-500.
I added the model number to be able to differentiate the SMMU implementation
in arm-smmu-impl.c.

-KR

 


RE: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2

2019-08-30 Thread Krishna Reddy
>> +"nidia,smmu-v2"
>>   "qcom,smmu-v2"

>I agree with Mikko that the compatible must be at least SoC-specific, but 
>potentially even instance-specific (e.g. "nvidia,tegra194-gpu-smmu")
> depending on how many of these parallel-SMMU configurations might be hiding 
> in current and future SoCs.

I am correcting the spelling mistake Mikko pointed out. The NVIDIA SMMUv2
implementation is used beyond the Tegra194 SoC.
"nvidia,smmu-v2" was chosen so that the SMMU compatible string can be shared
across multiple SoCs.
Are you suggesting we make it SoC-specific now and add further strings in the
future?

-KR
 


RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-08-30 Thread Krishna Reddy
>> +for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)

>It might make more sense to make this the innermost loop, i.e.:
>	for (i = 0; i < nsmmu->num_inst; i++)
>		reg &= readl_relaxed(nsmmu_page(smmu, i, page)...
>since polling the instances in parallel rather than in series seems like it 
>might be a bit more efficient.

The sync register is programmed at the same time for all instances; only the
status check is serialized.
I can update it to check the status of all instances at the same time.
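
Something along these lines (a sketch; note the |= accumulation, since the
sync is only complete once GSACTIVE is clear on every instance):

static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
			   int sync, int status)
{
	unsigned int spin_cnt, delay;
	u32 reg;
	int i;

	arm_smmu_writel(smmu, page, sync, 0);

	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
			reg = 0;
			/* Poll all instances on each spin iteration. */
			for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
				reg |= readl_relaxed(
					nsmmu_page(smmu, i, page) + status);
			if (!(reg & sTLBGSTATUS_GSACTIVE))
				return;
			cpu_relax();
		}
		udelay(delay);
	}
	dev_err_ratelimited(smmu->dev,
			    "TLB sync timed out -- SMMU may be deadlocked\n");
}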

>> +if (smmu->impl->tlb_sync) {
>> +smmu->impl->tlb_sync(smmu, page, sync, status);

>What I'd hoped is that rather than needing a hook for this, you could just 
>override smmu_domain->tlb_ops from .init_context to wire up the alternate 
>.sync method directly. That would save this extra level of indirection.

With arm_smmu_domain now available in arm-smmu.h, arm-smmu-nvidia.c can
update tlb_ops->tlb_sync directly and avoid the indirection.
Will update in the next version.

-KR 


Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Robin Murphy

On 30/08/2019 18:25, Krishna Reddy wrote:

+   #global-interrupts = <1>;



Shouldn't that be 3?


Interrupt line is shared between global and all context faults for each SMMU 
instance.
Nvidia implementation checks for both Global and context faults on each 
interrupt to an SMMU instance.
It can be either 1 or 3.  If we make it 3, we need to add two more irq entries 
in node for context faults.


The number of global interrupts has never been related to the number of 
context interrupts :/



In the future, we can update arm-smmu.c to support shared interrupt line 
between global and all context faults.


Clearly you have one combined interrupt output per SMMU - describing 
those as one global interrupt and the first two context bank interrupts 
respectively makes far less sense than calling them 3 global interrupts, 
not least because the latter is strictly true. Yes, the binding prevents 
us from describing the context bank interrupts for more than one 
instance, but at that point the fact that it *is* the combined output 
saves us - because the driver is aware of this specific integration it 
knows it can just register the "secondary" global interrupts as 
"secondary" context interrupts too. If we had separate IRQ lines per 
context bank per instance, then we'd have a really big problem and might 
have to redefine the binding, but as it is it happens to work out pretty 
neatly.
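
In DT terms that ends up shaped like this (shape only -- the SPI numbers here
are placeholders, not the real T194 values):

	#global-interrupts = <3>;
	interrupts = <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,	/* inst 0 combined */
		     <GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>,	/* inst 1 combined */
		     <GIC_SPI 172 IRQ_TYPE_LEVEL_HIGH>,	/* inst 2 combined */
		     /* ...then the context-bank entries for instance 0 */;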


Robin.


RE: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks

2019-08-30 Thread Krishna Reddy
>> +if (smmu->impl->global_fault)
>> +return smmu->impl->global_fault(irq, smmu);

>Can't we just register impl->global_fault (if set) instead of 
>arm_smmu_global_fault as the handler when we first set up the IRQs in 
>arm_smmu_device_probe()?
>Ideally we'd do the same for the context banks as well, although we might need 
>an additional hook from which to request the secondary IRQs that the main flow 
>can't accommodate.

When I first implemented this patch, there were compile issues accessing
struct arm_smmu_domain from arm-smmu-nvidia.c, as its definition lived in
arm-smmu.c.
To avoid those issues I added this hook for context faults, and did the same
for global faults for uniformity.
Now I see that the definition is part of arm-smmu.h. Let me update the code
to register the implementation hooks directly. Thanks
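
In arm_smmu_device_probe() that would be roughly (a sketch; it assumes the
hook is given the usual irq_handler_t shape so it can be passed straight
through):

	int i, err;
	irq_handler_t handler = arm_smmu_global_fault;

	if (smmu->impl && smmu->impl->global_fault)
		handler = smmu->impl->global_fault;

	for (i = 0; i < smmu->num_global_irqs; i++) {
		err = devm_request_irq(smmu->dev, smmu->irqs[i], handler,
				       IRQF_SHARED, "arm-smmu global fault",
				       smmu);
		if (err)
			return err;
	}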

-KR

 


RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Krishna Reddy
>> +#global-interrupts = <1>;

>Shouldn't that be 3?

The interrupt line is shared between global and all context faults for each
SMMU instance.
The Nvidia implementation checks for both global and context faults on each
interrupt to an SMMU instance.
It can be either 1 or 3. If we make it 3, we need to add two more irq entries
to the node for context faults.
In the future, we can update arm-smmu.c to support an interrupt line shared
between global and all context faults.


 -KR


Re: [PATCH v2 01/11] asm-generic: add dma_zone_size

2019-08-30 Thread Nicolas Saenz Julienne
On Fri, 2019-08-30 at 15:45 +0100, Catalin Marinas wrote:
> On Mon, Aug 26, 2019 at 03:46:52PM +0200, Nicolas Saenz Julienne wrote:
> > On Mon, 2019-08-26 at 09:09 +0200, Christoph Hellwig wrote:
> > > On Tue, Aug 20, 2019 at 04:58:09PM +0200, Nicolas Saenz Julienne wrote:
> > > > Some architectures have platform specific DMA addressing limitations.
> > > > This will allow for hardware description code to provide the constraints
> > > in a generic manner, so as for arch code to properly set up its memory
> > > > zones and DMA mask.
> > > 
> > > I know this just spreads the arm code, but I still kinda hate it.
> > 
> > Rob's main concern was finding a way to pass the constraint from HW
> > definition
> > to arch without widening fdt's architecture specific function surface. I'd
> > say
> > it's fair to argue that having a generic mechanism makes sense as it'll now
> > traverse multiple archs and subsystems.
> > 
> > I get adding globals like this is not very appealing, yet I went with it as
> > it
> > was the easier to integrate with arm's code. Any alternative suggestions?
> 
> In some discussion with Robin, since it's just RPi4 that we are aware of
> having such requirement on arm64, he suggested that we have a permanent
> ZONE_DMA on arm64 with a default size of 1GB. It should cover all arm64
> SoCs we know of without breaking the single Image binary. The arch/arm
> can use its current mach-* support.
> 
> I may like this more than the proposed early_init_dt_get_dma_zone_size()
> here which checks for specific SoCs (my preferred way was to build the
> mask from all buses described in DT but I hadn't realised the
> complications).

Hi Catalin, thanks for giving it a thought.

I'll be happy to implement it that way. I agree it's a good compromise.

@Christoph, do you still want the patch where I create 'zone_dma_bits'? With a
hardcoded ZONE_DMA it's not absolutely necessary. Though I remember you said it
was a first step towards being able to initialize dma-direct's min_mask in
meminit.

Regards,
Nicolas





Re: [PATCH] PCI: Move ATS declarations to linux/pci.h

2019-08-30 Thread Robin Murphy

On 30/08/2019 17:18, Christoph Hellwig wrote:

On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:

Move ATS function prototypes from include/linux/pci-ats.h to
include/linux/pci.h so users only need to include <linux/pci.h>:


Why is that so important?  Very few PCI(e) device drivers use ATS,
so keeping it out of everyone's include hell doesn't seem all bad.


Although to be fair it seems that all the actual ATS stuff already moved 
out 4 years ago, so at the very least maybe it would warrant renaming to 
pci-pri-pasid.h :)


Robin.


Re: [PATCH] PCI: Move ATS declarations to linux/pci.h

2019-08-30 Thread Christoph Hellwig
On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:
> Move ATS function prototypes from include/linux/pci-ats.h to
> include/linux/pci.h so users only need to include <linux/pci.h>:

Why is that so important?  Very few PCI(e) device drivers use ATS,
so keeping it out of everyone's include hell doesn't seem all bad.


Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Robin Murphy

On 29/08/2019 23:47, Krishna Reddy wrote:

Add DT node for T194 SMMU to enable SMMU support.

Signed-off-by: Krishna Reddy 
---
  arch/arm64/boot/dts/nvidia/tegra194.dtsi | 75 
  1 file changed, 75 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d906958..ad509bb 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1401,6 +1401,81 @@
  0x8200 0x0  0x4000 0x1f 0x4000 0x0 
0xc000>; /* non-prefetchable memory (3GB) */
};
  
+	smmu: iommu@1200 {

+   compatible = "nvidia,smmu-v2";
+   reg = <0 0x1200 0 0x80>,
+ <0 0x1100 0 0x80>,
+ <0 0x1000 0 0x80>;
+   interrupts = /* 65 interrupt specifiers, stripped by the list archive */;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <1>;


Shouldn't that be 3?

Robin.


+   #iommu-cells = <1>;
+   };
+
sysram@4000 {
compatible = "nvidia,tegra194-sysram", "mmio-sram";
reg = <0x0 0x4000 0x0 0x5>;



Re: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks

2019-08-30 Thread Robin Murphy

On 29/08/2019 23:47, Krishna Reddy wrote:

Add global/context fault hooks to allow Nvidia SMMU implementation
handle faults across multiple SMMUs.

Signed-off-by: Krishna Reddy 
---
  drivers/iommu/arm-smmu-nvidia.c | 127 
  drivers/iommu/arm-smmu.c|   6 ++
  drivers/iommu/arm-smmu.h|   4 ++
  3 files changed, 137 insertions(+)

diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
index a429b2c..b2a3c49 100644
--- a/drivers/iommu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -14,6 +14,10 @@
  
  #define NUM_SMMU_INSTANCES 3
  
+static irqreturn_t nsmmu_context_fault_inst(int irq,

+   struct arm_smmu_device *smmu,
+   int idx, int inst);
+
  struct nvidia_smmu {
struct arm_smmu_device  smmu;
int num_inst;
@@ -87,12 +91,135 @@ static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, 
int page,
nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
  }
  
+static irqreturn_t nsmmu_global_fault_inst(int irq,

+  struct arm_smmu_device *smmu,
+  int inst)
+{
+   u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
+
+   gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+   gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR0);
+   gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR1);
+   gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR2);
+
+   if (!gfsr)
+   return IRQ_NONE;
+
+   dev_err_ratelimited(smmu->dev,
+   "Unexpected global fault, this could be serious\n");
+   dev_err_ratelimited(smmu->dev,
+   "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 
0x%08x\n",
+   gfsr, gfsynr0, gfsynr1, gfsynr2);
+
+   writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_global_fault(int irq, struct arm_smmu_device *smmu)
+{
+   int i;
+   irqreturn_t irq_ret = IRQ_NONE;
+
+   /* Interrupt line is shared between global and context faults.
+* Check for both type of interrupts on either fault handlers.
+*/
+   for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) {
+   irq_ret = nsmmu_context_fault_inst(irq, smmu, 0, i);
+   if (irq_ret == IRQ_HANDLED)
+   return irq_ret;
+   }
+
+   for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) {
+   irq_ret = nsmmu_global_fault_inst(irq, smmu, i);
+   if (irq_ret == IRQ_HANDLED)
+   return irq_ret;
+   }
+
+   return irq_ret;
+}
+
+static irqreturn_t nsmmu_context_fault_bank(int irq,
+   struct arm_smmu_device *smmu,
+   int idx, int inst)
+{
+   u32 fsr, fsynr, cbfrsynra;
+   unsigned long iova;
+
+   fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
+   if (!(fsr & FSR_FAULT))
+   return IRQ_NONE;
+
+   fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+ ARM_SMMU_CB_FSYNR0);
+   iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+ARM_SMMU_CB_FAR);
+   cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) +
+ ARM_SMMU_GR1_CBFRSYNRA(idx));
+
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
cbfrsynra=0x%x, cb=%d\n",
+   fsr, iova, fsynr, cbfrsynra, idx);
+
+   writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) +
+   ARM_SMMU_CB_FSR);
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_context_fault_inst(int irq,
+   struct arm_smmu_device *smmu,
+   int idx, int inst)
+{
+   irqreturn_t irq_ret = IRQ_NONE;
+
+   /* Interrupt line shared between global and all context faults.
+* Check for faults across all contexts.
+*/
+   for (idx = 0; idx < smmu->num_context_banks; idx++) {
+   irq_ret = nsmmu_context_fault_bank(irq, smmu, idx, inst);
+
+   if (irq_ret == IRQ_HANDLED)
+   break;
+   }
+
+   return irq_ret;
+}
+
+static irqreturn_t nsmmu_context_fault(int irq,
+  struct arm_smmu_device *smmu,
+  int cbndx)
+{
+   int i;
+   irqreturn_t irq_ret = IRQ_NONE;
+
+   /* Interrupt line is shared between global and context faul

Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-08-30 Thread Robin Murphy

On 29/08/2019 23:47, Krishna Reddy wrote:

tlb_sync hook allows nvidia smmu handle tlb sync
across multiple SMMUs as necessary.

Signed-off-by: Krishna Reddy 
---
  drivers/iommu/arm-smmu-nvidia.c | 32 
  drivers/iommu/arm-smmu.c|  8 +---
  drivers/iommu/arm-smmu.h|  4 
  3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
index d93ceda..a429b2c 100644
--- a/drivers/iommu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -56,11 +56,43 @@ static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
  }
  
+static void nsmmu_tlb_sync_wait(struct arm_smmu_device *smmu, int page,

+   int sync, int status, int inst)
+{
+   u32 reg;
+   unsigned int spin_cnt, delay;
+
+   for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
+   for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
+   reg = readl_relaxed(
+ nsmmu_page(smmu, inst, page) + status);
+   if (!(reg & sTLBGSTATUS_GSACTIVE))
+   return;
+   cpu_relax();
+   }
+   udelay(delay);
+   }
+   dev_err_ratelimited(smmu->dev,
+   "TLB sync timed out -- SMMU may be deadlocked\n");
+}
+
+static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
+  int sync, int status)
+{
+   int i;
+
+   arm_smmu_writel(smmu, page, sync, 0);
+
+   for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)


It might make more sense to make this the innermost loop, i.e.:

for (i = 0; i < nsmmu->num_inst; i++)
reg &= readl_relaxed(nsmmu_page(smmu, i, page)...

since polling the instances in parallel rather than in series seems like 
it might be a bit more efficient.



+   nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
+}
+
  static const struct arm_smmu_impl nsmmu_impl = {
.read_reg = nsmmu_read_reg,
.write_reg = nsmmu_write_reg,
.read_reg64 = nsmmu_read_reg64,
.write_reg64 = nsmmu_write_reg64,
+   .tlb_sync = nsmmu_tlb_sync,
  };
  
  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 46e1641..f5454e71 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,9 +52,6 @@
   */
  #define QCOM_DUMMY_VAL -1
  
-#define TLB_LOOP_TIMEOUT		100	/* 1s! */

-#define TLB_SPIN_COUNT 10
-
  #define MSI_IOVA_BASE 0x800
  #define MSI_IOVA_LENGTH   0x10
  
@@ -244,6 +241,11 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,

unsigned int spin_cnt, delay;
u32 reg;
  
+	if (smmu->impl->tlb_sync) {

+   smmu->impl->tlb_sync(smmu, page, sync, status);


What I'd hoped is that rather than needing a hook for this, you could 
just override smmu_domain->tlb_ops from .init_context to wire up the 
alternate .sync method directly. That would save this extra level of 
indirection.


Robin.


+   return;
+   }
+
arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL);
for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 9645bf1..d3217f1 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -207,6 +207,8 @@ enum arm_smmu_cbar_type {
  /* Maximum number of context banks per SMMU */
  #define ARM_SMMU_MAX_CBS  128
  
+#define TLB_LOOP_TIMEOUT		100	/* 1s! */

+#define TLB_SPIN_COUNT 10
  
  /* Shared driver definitions */

  enum arm_smmu_arch_version {
@@ -336,6 +338,8 @@ struct arm_smmu_impl {
int (*cfg_probe)(struct arm_smmu_device *smmu);
int (*reset)(struct arm_smmu_device *smmu);
int (*init_context)(struct arm_smmu_domain *smmu_domain);
+   void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
+int status);
  };
  
  static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)




Re: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2

2019-08-30 Thread Robin Murphy

On 29/08/2019 23:47, Krishna Reddy wrote:

Add binding doc for Nvidia's smmu-v2 implementation.

Signed-off-by: Krishna Reddy 
---
  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 3133f3b..0de3759 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -17,6 +17,7 @@ conditions.
  "arm,mmu-401"
  "arm,mmu-500"
  "cavium,smmu-v2"
+"nidia,smmu-v2"
  "qcom,smmu-v2"


I agree with Mikko that the compatible must be at least SoC-specific, 
but potentially even instance-specific (e.g. "nvidia,tegra194-gpu-smmu") 
depending on how many of these parallel-SMMU configurations might be 
hiding in current and future SoCs.


Robin.

  
depending on the particular implementation and/or the




[PATCH] PCI: Move ATS declarations to linux/pci.h

2019-08-30 Thread Krzysztof Wilczynski
Move ATS function prototypes from include/linux/pci-ats.h to
include/linux/pci.h so users only need to include <linux/pci.h>:

Related to PRI capability:

  pci_enable_pri()
  pci_disable_pri()
  pci_restore_pri_state()
  pci_reset_pri()

Related to PASID capability:

  pci_enable_pasid()
  pci_disable_pasid()
  pci_restore_pasid_state()
  pci_pasid_features()
  pci_max_pasids()
  pci_prg_resp_pasid_required()

No functional changes intended.

Signed-off-by: Krzysztof Wilczynski 
---
 drivers/iommu/amd_iommu.c   |  1 -
 drivers/iommu/arm-smmu-v3.c |  1 -
 drivers/iommu/intel-iommu.c |  1 -
 drivers/iommu/intel-pasid.c |  1 -
 drivers/iommu/intel-svm.c   |  1 -
 drivers/pci/ats.c   |  1 -
 drivers/pci/pci.c   |  1 -
 include/linux/pci-ats.h | 77 -
 include/linux/pci.h | 34 
 9 files changed, 34 insertions(+), 84 deletions(-)
 delete mode 100644 include/linux/pci-ats.h

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 04a9f8443344..d43913386915 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 #include 
 #include 
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 0ad6d34d1e96..3bd9455efc39 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 
 #include 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4658cda6f3d2..362845b5c88a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 #include 
 #include 
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 040a445be300..f670315afa67 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 
 #include "intel-pasid.h"
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 780de0caafe8..ee9dfc84f925 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 #include 
 #include 
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index e18499243f84..3f5fb2d4a763 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -10,7 +10,6 @@
  */
 
 #include 
-#include <linux/pci-ats.h>
 #include 
 #include 
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f20a3de57d21..c8f2a05e6b37 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include <linux/pci-ats.h>
 #include 
 #include 
 #include 
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
deleted file mode 100644
index 1ebb88e7c184..
--- a/include/linux/pci-ats.h
+++ /dev/null
@@ -1,77 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef LINUX_PCI_ATS_H
-#define LINUX_PCI_ATS_H
-
-#include <linux/pci.h>
-
-#ifdef CONFIG_PCI_PRI
-
-int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
-void pci_disable_pri(struct pci_dev *pdev);
-void pci_restore_pri_state(struct pci_dev *pdev);
-int pci_reset_pri(struct pci_dev *pdev);
-
-#else /* CONFIG_PCI_PRI */
-
-static inline int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
-{
-   return -ENODEV;
-}
-
-static inline void pci_disable_pri(struct pci_dev *pdev)
-{
-}
-
-static inline void pci_restore_pri_state(struct pci_dev *pdev)
-{
-}
-
-static inline int pci_reset_pri(struct pci_dev *pdev)
-{
-   return -ENODEV;
-}
-
-#endif /* CONFIG_PCI_PRI */
-
-#ifdef CONFIG_PCI_PASID
-
-int pci_enable_pasid(struct pci_dev *pdev, int features);
-void pci_disable_pasid(struct pci_dev *pdev);
-void pci_restore_pasid_state(struct pci_dev *pdev);
-int pci_pasid_features(struct pci_dev *pdev);
-int pci_max_pasids(struct pci_dev *pdev);
-int pci_prg_resp_pasid_required(struct pci_dev *pdev);
-
-#else  /* CONFIG_PCI_PASID */
-
-static inline int pci_enable_pasid(struct pci_dev *pdev, int features)
-{
-   return -EINVAL;
-}
-
-static inline void pci_disable_pasid(struct pci_dev *pdev)
-{
-}
-
-static inline void pci_restore_pasid_state(struct pci_dev *pdev)
-{
-}
-
-static inline int pci_pasid_features(struct pci_dev *pdev)
-{
-   return -EINVAL;
-}
-
-static inline int pci_max_pasids(struct pci_dev *pdev)
-{
-   return -EINVAL;
-}
-
-static inline int pci_prg_resp_pasid_required(struct pci_dev *pdev)
-{
-   return 0;
-}
-#endif /* CONFIG_PCI_PASID */
-
-
-#endif /* LINUX_PCI_ATS_H*/
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 463486016290..8ac142801890 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2349,6 +2349,40 @@ static inline bool pci_is_thunderbolt_attached(struct 
pci_dev *pdev)
 void pci_uevent_ers(struct pci_dev *pdev, enum  pci_ers_result err_type);
 #endif
 
+#ifdef CONFIG_PCI_PRI
+int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
+void pci_disabl

Re: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation

2019-08-30 Thread Robin Murphy

On 29/08/2019 23:47, Krishna Reddy wrote:

Add Nvidia SMMUv2 implementation and model info.

Signed-off-by: Krishna Reddy 
---
  MAINTAINERS |  2 +
  drivers/iommu/Makefile  |  2 +-
  drivers/iommu/arm-smmu-impl.c   |  2 +
  drivers/iommu/arm-smmu-nvidia.c | 97 +
  drivers/iommu/arm-smmu.c|  2 +
  drivers/iommu/arm-smmu.h|  2 +
  6 files changed, 106 insertions(+), 1 deletion(-)
  create mode 100644 drivers/iommu/arm-smmu-nvidia.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 289fb06..b9d59e51 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15785,9 +15785,11 @@ F: drivers/i2c/busses/i2c-tegra.c
  
  TEGRA IOMMU DRIVERS

  M:Thierry Reding 
+R: Krishna Reddy 
  L:linux-te...@vger.kernel.org
  S:Supported
  F:drivers/iommu/tegra*
+F: drivers/iommu/arm-smmu-nvidia.c
  
  TEGRA KBC DRIVER

  M:Laxman Dewangan 
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index a2729aa..7f5489e 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
  obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
  obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
-obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o
+obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o
  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
  obj-$(CONFIG_DMAR_TABLE) += dmar.o
  obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 5c87a38..e5e595f 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -162,6 +162,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
break;
case CAVIUM_SMMUV2:
return cavium_smmu_impl_init(smmu);
+   case NVIDIA_SMMUV2:
+   return nvidia_smmu_impl_init(smmu);
default:
break;
}
diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
new file mode 100644
index 000..d93ceda
--- /dev/null
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Nvidia ARM SMMU v2 implementation quirks
+// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.
+
+#define pr_fmt(fmt) "nvidia-smmu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "arm-smmu.h"
+
+#define NUM_SMMU_INSTANCES 3
+
+struct nvidia_smmu {
+   struct arm_smmu_device  smmu;
+   int num_inst;
+   void __iomem*bases[NUM_SMMU_INSTANCES];
+};
+
+#define to_nsmmu(s)container_of(s, struct nvidia_smmu, smmu)
+
+#define nsmmu_page(smmu, inst, page) \
+   (((inst) ? to_nsmmu(smmu)->bases[(inst)] : smmu->base) + \
+   ((page) << smmu->pgshift))
+
+static u32 nsmmu_read_reg(struct arm_smmu_device *smmu,
+ int page, int offset)
+{
+   return readl_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg(struct arm_smmu_device *smmu,
+   int page, int offset, u32 val)
+{
+   int i;
+
+   for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
+   writel_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static u64 nsmmu_read_reg64(struct arm_smmu_device *smmu,
+   int page, int offset)
+{
+   return readq_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
+ int page, int offset, u64 val)
+{
+   int i;
+
+   for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
+   writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static const struct arm_smmu_impl nsmmu_impl = {
+   .read_reg = nsmmu_read_reg,
+   .write_reg = nsmmu_write_reg,
+   .read_reg64 = nsmmu_read_reg64,
+   .write_reg64 = nsmmu_write_reg64,
+};
+
+struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+   int i;
+   struct nvidia_smmu *nsmmu;
+   struct resource *res;
+   struct device *dev = smmu->dev;
+   struct platform_device *pdev = to_platform_device(smmu->dev);
+
+   nsmmu = devm_kzalloc(smmu->dev, sizeof(*nsmmu), GFP_KERNEL);
+   if (!nsmmu)
+   return ERR_PTR(-ENOMEM);
+
+   nsmmu->smmu = *smmu;
+   /* Instance 0 is ioremapped by arm-smmu.c */
+   nsmmu->num_inst = 1;
+
+   for (i = 1; i < NUM_SMMU_INSTANCES; i++) {
+   res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+   if (!res)
+   break;
+   nsmmu->bases[i] = devm_ioremap_resource(dev, res);
+   if (IS_ERR(nsmmu->bases[i]))
+   return (struct arm_smmu_device *)nsmmu->bases[i];
+   nsmmu->num

Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code

2019-08-30 Thread Christoph Hellwig
On Fri, Aug 30, 2019 at 10:29:18AM +0100, Russell King - ARM Linux admin wrote:
> On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote:
> > The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA
> > coherent remapping for a while.  Lift this flag to common code so
> > that we can use it generically.  We also check it in the only place
> > VM_USERMAP is directly check so that we can entirely replace that
> > flag as well (although I'm not even sure why we'd want to allow
> > remapping DMA mappings, but I'd rather not change behavior).
> 
> Good, because if you did change that behaviour, you'd break almost
> every ARM framebuffer and cripple ARM audio drivers.

How would that break them?  All the usual video and audio drivers that
use dma_alloc_* then use dma_mmap_* which never end up in the only place
that actually checks VM_USERMAP (remap_vmalloc_range_partial) as they
end up in the dma_map_ops mmap methods which contain what is effectively
open coded versions of that routine.  There are very few callers of
remap_vmalloc_range_partial / remap_vmalloc_range, and while a few of
those actually are in media drivers and the virtual frame buffer video
driver, none of these seems to be called on dma memory (which would
be a layering violation anyway).


Re: [PATCH v2 01/11] asm-generic: add dma_zone_size

2019-08-30 Thread Catalin Marinas
On Mon, Aug 26, 2019 at 03:46:52PM +0200, Nicolas Saenz Julienne wrote:
> On Mon, 2019-08-26 at 09:09 +0200, Christoph Hellwig wrote:
> > On Tue, Aug 20, 2019 at 04:58:09PM +0200, Nicolas Saenz Julienne wrote:
> > > Some architectures have platform specific DMA addressing limitations.
> > > This will allow for hardware description code to provide the constraints
> > > in a generic manner, so as for arch code to properly set up its memory
> > > zones and DMA mask.
> > 
> > I know this just spreads the arm code, but I still kinda hate it.
> 
> Rob's main concern was finding a way to pass the constraint from HW definition
> to arch without widening fdt's architecture specific function surface. I'd say
> it's fair to argue that having a generic mechanism makes sense as it'll now
> traverse multiple archs and subsystems.
> 
> I get adding globals like this is not very appealing, yet I went with it as it
> was the easier to integrate with arm's code. Any alternative suggestions?

In some discussion with Robin, since it's just RPi4 that we are aware of
having such requirement on arm64, he suggested that we have a permanent
ZONE_DMA on arm64 with a default size of 1GB. It should cover all arm64
SoCs we know of without breaking the single Image binary. The arch/arm
can use its current mach-* support.

I may like this more than the proposed early_init_dt_get_dma_zone_size()
here which checks for specific SoCs (my preferred way was to build the
mask from all buses described in DT but I hadn't realised the
complications).
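
On arm64 that would boil down to something like this (a sketch; the helper
name and placement are illustrative, not the final patch):

/* arch/arm64/mm/init.c */
static phys_addr_t __init max_zone_dma_phys(void)
{
	/* Clamp ZONE_DMA to the first 1GB of RAM. */
	return min(memblock_end_of_DRAM(),
		   memblock_start_of_DRAM() + SZ_1G);
}

with zone_sizes_init() then doing
max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys()).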

-- 
Catalin


Re: [PATCH] iommu/qcom_iommu: Use struct_size() helper

2019-08-30 Thread Joerg Roedel
On Thu, Aug 29, 2019 at 11:03:27PM -0500, Gustavo A. R. Silva wrote:
>  drivers/iommu/qcom_iommu.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Applied, thanks.


Re: [PATCH] Remove wrong default domain comments

2019-08-30 Thread Joerg Roedel
On Mon, Aug 26, 2019 at 05:48:21AM +0100, Tom Murphy wrote:
>  drivers/iommu/iommu.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Applied, thanks.


Re: [PATCH v1] iommu/vt-d: remove global page flush support

2019-08-30 Thread Joerg Roedel
On Mon, Aug 26, 2019 at 08:53:29AM -0700, Jacob Pan wrote:
>  drivers/iommu/intel-svm.c   | 36 +++-
>  include/linux/intel-iommu.h |  3 ---
>  2 files changed, 15 insertions(+), 24 deletions(-)

Applied, thanks.


Re: [PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices

2019-08-30 Thread Robin Murphy

On 30/08/2019 14:39, David Laight wrote:

From: Lu Baolu

Sent: 30 August 2019 08:17



The Intel VT-d hardware uses paging for DMA remapping.
The minimum mapped window is a page size. The device
drivers may map buffers not filling the whole IOMMU
window. This allows the device to access to possibly
unrelated memory and a malicious device could exploit
this to perform DMA attacks. To address this, the
Intel IOMMU driver will use bounce pages for those
buffers which don't fill whole IOMMU pages.


Won't this completely kill performance?


Yes it will.

Though hopefully by now we're all well aware that speed and security 
being inversely proportional is the universal truth of modern computing.



I'd expect to see something for dma_alloc_coherent() (etc)
that tries to give the driver page sized buffers.


Coherent DMA already works in PAGE_SIZE units under the covers (at least 
in the DMA API implementations relevant here) - that's not an issue. The 
problem is streaming DMA, where we have to give the device access to, 
say, some pre-existing 64-byte data packet, from right in the middle of 
who knows what else. Since we do not necessarily have control over the 
who knows what else, the only universally-practical way to isolate the 
DMA data is to copy it away to some safe sanitised page which we *do* 
control, and make the actual DMA accesses target that.
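
In miniature, the copy-and-isolate step is just this (a standalone sketch --
the real intel-iommu code hooks in beneath the DMA API rather than on top of
it):

static void *bounce_out(const void *buf, size_t size, gfp_t gfp)
{
	/* buf may be 64 bytes in the middle of a page full of unrelated
	 * data; copy it to pages that nothing else shares...
	 */
	void *safe = (void *)__get_free_pages(gfp, get_order(size));

	if (safe)
		memcpy(safe, buf, size);

	/* ...and expose only these pages through the IOMMU. */
	return safe;
}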



Either that or the driver could allocate page sized buffers
even though it only passes fragments of these buffers to
the dma functions (to avoid excessive cache invalidates).


Where, since we can't easily second-guess users' systems, "the driver" 
turns out to be every DMA-capable driver, every subsystem-level buffer 
manager, every userspace application which could possibly make use of 
some kind of zero-copy I/O call...


Compared to a single effectively-transparent implementation in a single 
place at the lowest level with a single switch for the user to turn it 
on or off depending on how security-critical their particular system is, 
I know which approach I'd rather review, maintain and rely on.


Robin.


Since you have to trust the driver, why not actually trust it?

David




Re: [PATCH 1/1] Revert "iommu/vt-d: Avoid duplicated pci dma alias consideration"

2019-08-30 Thread Joerg Roedel
On Mon, Aug 26, 2019 at 04:50:56PM +0800, Lu Baolu wrote:
>  drivers/iommu/intel-iommu.c | 55 +++--
>  1 file changed, 53 insertions(+), 2 deletions(-)

Applied to fixes branch, thanks.


Re: [PATCH] iommu/dma: fix for dereferencing before null checking

2019-08-30 Thread Joerg Roedel
On Sat, Aug 24, 2019 at 09:47:12AM +0800, Yunsheng Lin wrote:
>  drivers/iommu/dma-iommu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Applied, thanks.


Re: [PATCH v11 00/23] MT8183 IOMMU SUPPORT

2019-08-30 Thread Joerg Roedel
On Sat, Aug 24, 2019 at 11:01:45AM +0800, Yong Wu wrote:
> Change notes:
> v11:
>1) Adjust a bit code for mtk quirk in v7s.
>2) Collect ack from will and Matthias of the last patch.

Applied to arm/mediatek, thanks.


Re: [PATCH v8 6/7] iommu/vt-d: Add trace events for device dma map/unmap

2019-08-30 Thread Steven Rostedt
On Fri, 30 Aug 2019 15:17:17 +0800
Lu Baolu  wrote:

> This adds trace support for the Intel IOMMU driver. It
> also declares some events which could be used to trace
> the events when an IOVA is being mapped or unmapped in
> a domain.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Signed-off-by: Mika Westerberg 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Steven Rostedt (VMware) 
> ---
>  drivers/iommu/Makefile |  1 +
>  drivers/iommu/intel-trace.c| 14 +
>  include/trace/events/intel_iommu.h | 84 ++
>  3 files changed, 99 insertions(+)
>  create mode 100644 drivers/iommu/intel-trace.c
>  create mode 100644 include/trace/events/intel_iommu.h
> 
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index f13f36ae1af6..bfe27b2755bd 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -17,6 +17,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += dmar.o
>  obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
> +obj-$(CONFIG_INTEL_IOMMU) += intel-trace.o
>  obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += intel-iommu-debugfs.o
>  obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
>  obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
> diff --git a/drivers/iommu/intel-trace.c b/drivers/iommu/intel-trace.c
> new file mode 100644
> index ..bfb6a6e37a88
> --- /dev/null
> +++ b/drivers/iommu/intel-trace.c
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel IOMMU trace support
> + *
> + * Copyright (C) 2019 Intel Corporation
> + *
> + * Author: Lu Baolu 
> + */
> +
> +#include 
> +#include 
> +
> +#define CREATE_TRACE_POINTS
> +#include 
> diff --git a/include/trace/events/intel_iommu.h 
> b/include/trace/events/intel_iommu.h
> new file mode 100644
> index ..9c28e6cae86f
> --- /dev/null
> +++ b/include/trace/events/intel_iommu.h
> @@ -0,0 +1,84 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Intel IOMMU trace support
> + *
> + * Copyright (C) 2019 Intel Corporation
> + *
> + * Author: Lu Baolu 
> + */
> +#ifdef CONFIG_INTEL_IOMMU
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM intel_iommu
> +
> +#if !defined(_TRACE_INTEL_IOMMU_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_INTEL_IOMMU_H
> +
> +#include 
> +#include 
> +
> +DECLARE_EVENT_CLASS(dma_map,
> + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
> +  size_t size),
> +
> + TP_ARGS(dev, dev_addr, phys_addr, size),
> +
> + TP_STRUCT__entry(
> + __string(dev_name, dev_name(dev))
> + __field(dma_addr_t, dev_addr)
> + __field(phys_addr_t, phys_addr)
> + __field(size_t, size)
> + ),
> +
> + TP_fast_assign(
> + __assign_str(dev_name, dev_name(dev));
> + __entry->dev_addr = dev_addr;
> + __entry->phys_addr = phys_addr;
> + __entry->size = size;
> + ),
> +
> + TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu",
> +   __get_str(dev_name),
> +   (unsigned long long)__entry->dev_addr,
> +   (unsigned long long)__entry->phys_addr,
> +   __entry->size)
> +);
> +
> +DEFINE_EVENT(dma_map, bounce_map_single,
> + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
> +  size_t size),
> + TP_ARGS(dev, dev_addr, phys_addr, size)
> +);

Do you plan on adding more events to these classes? This patch has two
distinct DECLARE_EVENT_CLASS() calls, and each has one DEFINE_EVENT()
for them.

It's fine to do this, but I'm curious to why you did not use
the "TRACE_EVENT()" macro, which basically is just a single
DECLARE_EVENT_CLASS() followed by a single DEFINE_EVENT(). In other
words, you just open coded TRACE_EVENT().
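
That is, if no second event is expected, the bounce_map_single pair above
collapses to:

TRACE_EVENT(bounce_map_single,
	TP_PROTO(struct device *dev, dma_addr_t dev_addr,
		 phys_addr_t phys_addr, size_t size),
	TP_ARGS(dev, dev_addr, phys_addr, size),
	TP_STRUCT__entry(
		__string(dev_name, dev_name(dev))
		__field(dma_addr_t, dev_addr)
		__field(phys_addr_t, phys_addr)
		__field(size_t, size)
	),
	TP_fast_assign(
		__assign_str(dev_name, dev_name(dev));
		__entry->dev_addr = dev_addr;
		__entry->phys_addr = phys_addr;
		__entry->size = size;
	),
	TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu",
		  __get_str(dev_name),
		  (unsigned long long)__entry->dev_addr,
		  (unsigned long long)__entry->phys_addr,
		  __entry->size)
);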

-- Steve

> +
> +DECLARE_EVENT_CLASS(dma_unmap,
> + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
> +
> + TP_ARGS(dev, dev_addr, size),
> +
> + TP_STRUCT__entry(
> + __string(dev_name, dev_name(dev))
> + __field(dma_addr_t, dev_addr)
> + __field(size_t, size)
> + ),
> +
> + TP_fast_assign(
> + __assign_str(dev_name, dev_name(dev));
> + __entry->dev_addr = dev_addr;
> + __entry->size = size;
> + ),
> +
> + TP_printk("dev=%s dev_addr=0x%llx size=%zu",
> +   __get_str(dev_name),
> +   (unsigned long long)__entry->dev_addr,
> +   __entry->size)
> +);
> +
> +DEFINE_EVENT(dma_unmap, bounce_unmap_single,
> + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
> + TP_ARGS(dev, dev_addr, size)
> +);
> +
> +#endif /* _TRACE_INTEL_IOMMU_H */
> +
> +/* This part must be outside protection */
> +#include 
> +#endif /* CONFIG_INTEL_IOMMU */



Re: [PATCH v3] iommu: revisit iommu_insert_resv_region() implementation

2019-08-30 Thread Joerg Roedel
On Wed, Aug 21, 2019 at 02:09:40PM +0200, Eric Auger wrote:
>  drivers/iommu/iommu.c | 96 +--
>  1 file changed, 47 insertions(+), 49 deletions(-)

Applied, thanks.


Re: [PATCH] iommu/vt-d: Fix wrong analysis whether devices share the same bus

2019-08-30 Thread Joerg Roedel
On Tue, Aug 20, 2019 at 01:53:17AM -0700, Nadav Amit wrote:
>  drivers/iommu/intel_irq_remapping.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Applied, thanks.


RE: [PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices

2019-08-30 Thread David Laight
From: Lu Baolu
> Sent: 30 August 2019 08:17

> The Intel VT-d hardware uses paging for DMA remapping.
> The minimum mapped window is a page size. The device
> drivers may map buffers not filling the whole IOMMU
> window. This allows the device to access to possibly
> unrelated memory and a malicious device could exploit
> this to perform DMA attacks. To address this, the
> Intel IOMMU driver will use bounce pages for those
> buffers which don't fill whole IOMMU pages.

Won't this completely kill performance?

I'd expect to see something for dma_alloc_coherent() (etc)
that tries to give the driver page sized buffers.

Either that or the driver could allocate page sized buffers
even though it only passes fragments of these buffers to
the dma functions (to avoid excessive cache invalidates).

Since you have to trust the driver, why not actually trust it?

David




Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on

2019-08-30 Thread Joerg Roedel
On Fri, Aug 30, 2019 at 01:27:25PM +0100, Robin Murphy wrote:
> On 30/08/2019 11:49, Joerg Roedel wrote:
> > Looks good to me, but adding Robin for his opinion.
> 
> Sounds reasonable to me too - that should also be true for the majority of
> Arm systems that we know of. Will suggested that atomic_try_cmpxchg() might
> be relevant, but AFAICS that's backwards compared to what we want to do
> here, which I guess is more of an "atomic_unlikely_cmpxchg".
> 
> Acked-by: Robin Murphy 

Great, thanks for looking into it, Robin.

Applied now, thanks Eric.


Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on

2019-08-30 Thread Robin Murphy

On 30/08/2019 11:49, Joerg Roedel wrote:

Looks good to me, but adding Robin for his opinion.


Sounds reasonable to me too - that should also be true for the majority 
of Arm systems that we know of. Will suggested that atomic_try_cmpxchg() 
might be relevant, but AFAICS that's backwards compared to what we want 
to do here, which I guess is more of an "atomic_unlikely_cmpxchg".


Acked-by: Robin Murphy 

Cheers,
Robin.


On Wed, Aug 28, 2019 at 06:13:38AM -0700, Eric Dumazet wrote:

In commit 14bd9a607f90 ("iommu/iova: Separate atomic variables
to improve performance") Jinyu Qi identified that the atomic_cmpxchg()
in queue_iova() was causing a performance loss and moved critical fields
so that the false sharing would not impact them.

However, avoiding the false sharing in the first place seems easy.
We should attempt the atomic_cmpxchg() no more than 100 times
per second. Adding an atomic_read() will keep the cache
line mostly shared.

This false sharing came with commit 9a005a800ae8
("iommu/iova: Add flush timer").

Signed-off-by: Eric Dumazet 
Cc: Jinyu Qi 
Cc: Joerg Roedel 
---
  drivers/iommu/iova.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 
3e1a8a6755723a927a7942a7429ab7e6c19a0027..41c605b0058f9615c2dbdd83f1de2404a9b1d255
 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -577,7 +577,9 @@ void queue_iova(struct iova_domain *iovad,
  
  	spin_unlock_irqrestore(&fq->lock, flags);
  
-	if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0)

+   /* Avoid false sharing as much as possible. */
+   if (!atomic_read(&iovad->fq_timer_on) &&
+   !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1))
mod_timer(&iovad->fq_timer,
  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
  }
--
2.23.0.187.g17f5b7556c-goog


Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-08-30 Thread Mikko Perttunen

On 30.8.2019 1.47, Krishna Reddy wrote:

Add DT node for T194 SMMU to enable SMMU support.

Signed-off-by: Krishna Reddy 
---
  arch/arm64/boot/dts/nvidia/tegra194.dtsi | 75 
  1 file changed, 75 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d906958..ad509bb 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1401,6 +1401,81 @@
  0x8200 0x0  0x4000 0x1f 0x4000 0x0 
0xc000>; /* non-prefetchable memory (3GB) */
};
  
+	smmu: iommu@1200 {

+   compatible = "nvidia,smmu-v2";


Should we have a compatibility string like "nvidia,tegra194-smmu" so 
that we can have other chips with SMMUv2 that could be different?


Mikko


+   reg = <0 0x1200 0 0x80>,
+ <0 0x1100 0 0x80>,
+ <0 0x1000 0 0x80>;
+   interrupts = /* 65 interrupt specifiers, stripped by the list archive */;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <1>;
+   #iommu-cells = <1>;
+   };
+
sysram@4000 {
compatible = "nvidia,tegra194-sysram", "mmio-sram";
reg = <0x0 0x4000 0x0 0x5>;



Re: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2

2019-08-30 Thread Mikko Perttunen

On 30.8.2019 1.47, Krishna Reddy wrote:

Add binding doc for Nvidia's smmu-v2 implementation.

Signed-off-by: Krishna Reddy 
---
  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 3133f3b..0de3759 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -17,6 +17,7 @@ conditions.
  "arm,mmu-401"
  "arm,mmu-500"
  "cavium,smmu-v2"
+"nidia,smmu-v2"


nvidia

Mikko


  "qcom,smmu-v2"
  
depending on the particular implementation and/or the




Re: [GIT PULL] iommu/arm-smmu: Big batch of updates for 5.4

2019-08-30 Thread Will Deacon
On Fri, Aug 30, 2019 at 12:29:39PM +0200, Joerg Roedel wrote:
> On Wed, Aug 28, 2019 at 10:42:30PM +0100, Will Deacon wrote:
> > On Fri, Aug 23, 2019 at 03:54:40PM +0100, Will Deacon wrote:
> > > Please pull these ARM SMMU updates for 5.4. The branch is based on the
> > > for-joerg/batched-unmap branch that you pulled into iommu/core already
> > > because I didn't want to rebase everything onto -rc3. The pull request
> > > was generated against iommu/core.
> > 
> > Just a gentle nudge on this pull request, since it would be nice to have
> > it sit in -next for a bit before the merge window opens.
> > 
> > Please let me know if you need anything more from me.
> 
> It is pulled and pushed out now, thanks Will.

Cheers, Joerg and sorry for nagging. Next week is going to be crazy for
me, so I'm trying to get my 5.4 queue resolved early.

Will


Re: [PATCH 5/7] arm64: tegra: Add Memory controller DT node on T194

2019-08-30 Thread Thierry Reding
On Thu, Aug 29, 2019 at 03:47:05PM -0700, Krishna Reddy wrote:
> Add Memory controller DT node on T194 and enable it.
> This patch is a prerequisite for SMMU enable on T194.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi | 4 
>  arch/arm64/boot/dts/nvidia/tegra194.dtsi   | 7 +++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi 
> b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
> index 62e07e11..4b3441b 100644
> --- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
> +++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
> @@ -47,6 +47,10 @@
>   };
>   };
>  
> + memory-controller@2c0 {
> + status = "okay";
> + };
> +
>   serial@311 {
>   status = "okay";
>   };
> diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
> b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
> index adebbbf..d906958 100644
> --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
> +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
> @@ -6,6 +6,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  / {
>   compatible = "nvidia,tegra194";
> @@ -130,6 +131,12 @@
>   };
>   };
>  
> + memory-controller@2c0 {
> + compatible = "nvidia,tegra186-mc";

I think we need to make this "nvidia,tegra194-mc" and then enhance the
Tegra186 driver to match on that compatible string.

Nothing to worry about just yet and I can make that change when
applying.
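
On the driver side that's just one more match entry, roughly (a sketch; table
name as in the existing tegra186-mc driver is an assumption):

static const struct of_device_id tegra186_mc_of_match[] = {
	{ .compatible = "nvidia,tegra186-mc" },
	{ .compatible = "nvidia,tegra194-mc" },
	{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, tegra186_mc_of_match);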

Thierry

> + reg = <0x02c0 0xb>;
> + status = "disabled";
> + };
> +
>   uarta: serial@310 {
>   compatible = "nvidia,tegra194-uart", 
> "nvidia,tegra20-uart";
>   reg = <0x0310 0x40>;
> -- 
> 2.1.4
> 




Re: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks

2019-08-30 Thread Thierry Reding
On Thu, Aug 29, 2019 at 03:47:04PM -0700, Krishna Reddy wrote:
> Add global/context fault hooks to allow Nvidia SMMU implementation
> handle faults across multiple SMMUs.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 127 
> 
>  drivers/iommu/arm-smmu.c|   6 ++
>  drivers/iommu/arm-smmu.h|   4 ++
>  3 files changed, 137 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index a429b2c..b2a3c49 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -14,6 +14,10 @@
>  
>  #define NUM_SMMU_INSTANCES 3
>  
> +static irqreturn_t nsmmu_context_fault_inst(int irq,
> + struct arm_smmu_device *smmu,
> + int idx, int inst);

More of these signed integers that could be unsigned. Also why the need
to predeclare this? Can you not just put the definition of the function
up here?

> +
>  struct nvidia_smmu {
>   struct arm_smmu_device  smmu;
>   int num_inst;
> @@ -87,12 +91,135 @@ static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, 
> int page,
>   nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
>  }
>  
> +static irqreturn_t nsmmu_global_fault_inst(int irq,
> +struct arm_smmu_device *smmu,
> +int inst)
> +{
> + u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
> +
> + gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
> + gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR0);
> + gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR1);
> + gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR2);
> +
> + if (!gfsr)
> + return IRQ_NONE;
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unexpected global fault, this could be serious\n");
> + dev_err_ratelimited(smmu->dev,
> + "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 
> 0x%08x\n",
> + gfsr, gfsynr0, gfsynr1, gfsynr2);
> +
> + writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nsmmu_global_fault(int irq, struct arm_smmu_device *smmu)
> +{
> + int i;
> + irqreturn_t irq_ret = IRQ_NONE;
> +
> + /* Interrupt line is shared between global and context faults.
> +  * Check for both type of interrupts on either fault handlers.
> +  */
> + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) {
> + irq_ret = nsmmu_context_fault_inst(irq, smmu, 0, i);
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;
> + }
> +
> + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) {
> + irq_ret = nsmmu_global_fault_inst(irq, smmu, i);
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;
> + }
> +
> + return irq_ret;
> +}
> +
> +static irqreturn_t nsmmu_context_fault_bank(int irq,
> + struct arm_smmu_device *smmu,
> + int idx, int inst)
> +{
> + u32 fsr, fsynr, cbfrsynra;
> + unsigned long iova;
> +
> + fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
> + if (!(fsr & FSR_FAULT))
> + return IRQ_NONE;
> +
> + fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
> +   ARM_SMMU_CB_FSYNR0);
> + iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
> +  ARM_SMMU_CB_FAR);
> + cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) +
> +   ARM_SMMU_GR1_CBFRSYNRA(idx));
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
> cbfrsynra=0x%x, cb=%d\n",
> + fsr, iova, fsynr, cbfrsynra, idx);
> +
> + writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) +
> + ARM_SMMU_CB_FSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nsmmu_context_fault_inst(int irq,
> + struct arm_smmu_device *smmu,
> + int idx, int inst)
> +{
> + irqreturn_t irq_ret = IRQ_NONE;
> +
> + /* Interrupt line shared between global and all context faults.
> +  * Check for faults across all contexts.
> +  */
> + for (idx = 0; idx < smmu->num_context_banks; idx++) {
> + irq_ret = nsmmu_context_fault_bank(irq, smmu, idx, inst);
> +
> + if (irq_ret == IRQ_HANDLED)
> + break;
> + }
> +
> + return irq_ret;
> +}
> +
> +static irqret

Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-08-30 Thread Thierry Reding
On Thu, Aug 29, 2019 at 03:47:03PM -0700, Krishna Reddy wrote:
> tlb_sync hook allows the NVIDIA SMMU implementation to handle TLB sync
> across multiple SMMUs as necessary.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 32 
>  drivers/iommu/arm-smmu.c|  8 +---
>  drivers/iommu/arm-smmu.h|  4 
>  3 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index d93ceda..a429b2c 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -56,11 +56,43 @@ static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
>   writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
>  }
>  
> +static void nsmmu_tlb_sync_wait(struct arm_smmu_device *smmu, int page,
> + int sync, int status, int inst)
> +{
> + u32 reg;
> + unsigned int spin_cnt, delay;
> +
> + for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> + for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> + reg = readl_relaxed(
> +   nsmmu_page(smmu, inst, page) + status);
> + if (!(reg & sTLBGSTATUS_GSACTIVE))
> + return;
> + cpu_relax();
> + }
> + udelay(delay);
> + }
> + dev_err_ratelimited(smmu->dev,
> + "TLB sync timed out -- SMMU may be deadlocked\n");
> +}
> +
> +static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
> +int sync, int status)
> +{
> + int i;
> +
> + arm_smmu_writel(smmu, page, sync, 0);
> +
> + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
> + nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
> +}
> +
>  static const struct arm_smmu_impl nsmmu_impl = {
>   .read_reg = nsmmu_read_reg,
>   .write_reg = nsmmu_write_reg,
>   .read_reg64 = nsmmu_read_reg64,
>   .write_reg64 = nsmmu_write_reg64,
> + .tlb_sync = nsmmu_tlb_sync,
>  };
>  
>  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 46e1641..f5454e71 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -52,9 +52,6 @@
>   */
>  #define QCOM_DUMMY_VAL -1
>  
> -#define TLB_LOOP_TIMEOUT 1000000 /* 1s! */
> -#define TLB_SPIN_COUNT   10
> -
>  #define MSI_IOVA_BASE		0x8000000
>  #define MSI_IOVA_LENGTH		0x100000
>  
> @@ -244,6 +241,11 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
>   unsigned int spin_cnt, delay;
>   u32 reg;
>  
> + if (smmu->impl->tlb_sync) {
> + smmu->impl->tlb_sync(smmu, page, sync, status);
> + return;
> + }
> +

Wouldn't it work if you replaced all calls of __arm_smmu_tlb_sync() by
smmu->impl->tlb_sync() and assign __arm_smmu_tlb_sync() as default for
devices that don't need to override it? That makes this patch slightly
larger, but it saves us one level of indirection.
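Roughly like this (untested sketch; the default-impl name below is made up
for illustration):

	/* arm-smmu.c: keep the current sync code as the default impl */
	static const struct arm_smmu_impl arm_smmu_default_impl = {
		.tlb_sync = __arm_smmu_tlb_sync,
	};

	/* ...then every sync site unconditionally becomes: */
	smmu->impl->tlb_sync(smmu, page, sync, status);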

> +
>   arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL);
>   for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
>   for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index 9645bf1..d3217f1 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -207,6 +207,8 @@ enum arm_smmu_cbar_type {
>  /* Maximum number of context banks per SMMU */
>  #define ARM_SMMU_MAX_CBS 128
>  
> +#define TLB_LOOP_TIMEOUT 1000000 /* 1s! */
> +#define TLB_SPIN_COUNT   10
>  
>  /* Shared driver definitions */
>  enum arm_smmu_arch_version {
> @@ -336,6 +338,8 @@ struct arm_smmu_impl {
>   int (*cfg_probe)(struct arm_smmu_device *smmu);
>   int (*reset)(struct arm_smmu_device *smmu);
>   int (*init_context)(struct arm_smmu_domain *smmu_domain);
> + void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
> +  int status);

Can't page, sync and status all be unsigned?
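i.e., for illustration:

	void (*tlb_sync)(struct arm_smmu_device *smmu, unsigned int page,
			 unsigned int sync, unsigned int status);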

Thierry




Re: convert microblaze to the generic dma remap allocator

2019-08-30 Thread Michal Simek
Hi Christoph,

On 14. 08. 19 16:03, Christoph Hellwig wrote:
> Hi Michal,
> 
> can you take a look at this patch that moves microblaze over to the
> generic DMA remap allocator?  I've been trying to slowly get all
> architectures over to the generic code, and microblaze is one that
> seems very straightforward to convert.
> 

I took a look at this series and tested it on kc705 and I can't see any
issue.
Patches applied.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs






Re: [PATCH] iommu/amd: silence warnings under memory pressure

2019-08-30 Thread Joerg Roedel
On Wed, Aug 28, 2019 at 05:39:43PM -0400, Qian Cai wrote:
>  drivers/iommu/amd_iommu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Applied, thanks.



Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on

2019-08-30 Thread Joerg Roedel
Looks good to me, but adding Robin for his opinion.

On Wed, Aug 28, 2019 at 06:13:38AM -0700, Eric Dumazet wrote:
> In commit 14bd9a607f90 ("iommu/iova: Separate atomic variables
> to improve performance") Jinyu Qi identified that the atomic_cmpxchg()
> in queue_iova() was causing a performance loss and moved critical fields
> so that the false sharing would not impact them.
> 
> However, avoiding the false sharing in the first place seems easy.
> We should attempt the atomic_cmpxchg() no more than 100 times
> per second. Adding an atomic_read() will keep the cache
> line mostly shared.
> 
> This false sharing came with commit 9a005a800ae8
> ("iommu/iova: Add flush timer").
> 
> Signed-off-by: Eric Dumazet 
> Cc: Jinyu Qi 
> Cc: Joerg Roedel 
> ---
>  drivers/iommu/iova.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 3e1a8a6755723a927a7942a7429ab7e6c19a0027..41c605b0058f9615c2dbdd83f1de2404a9b1d255 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -577,7 +577,9 @@ void queue_iova(struct iova_domain *iovad,
>  
>   spin_unlock_irqrestore(&fq->lock, flags);
>  
> - if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0)
> + /* Avoid false sharing as much as possible. */
> + if (!atomic_read(&iovad->fq_timer_on) &&
> + !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1))
>   mod_timer(&iovad->fq_timer,
> jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
>  }
> -- 
> 2.23.0.187.g17f5b7556c-goog


Re: [PATCH v7 1/7] iommu/vt-d: Don't switch off swiotlb if use direct dma

2019-08-30 Thread Joerg Roedel
On Sat, Aug 24, 2019 at 10:17:30AM +0800, Lu Baolu wrote:
> If a system has any external port, through which an untrusted device
> might be connected, the external port itself should be marked as an
> untrusted device, and all devices beneath it just inherit this
> attribution.

Okay, makes sense.

> So during iommu driver initialization, we can easily know whether the
> system has (or potentially has) untrusted devices by iterating the
> device tree. I will add such check in the next version if no objections.
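(For reference: patch 5/7 of the bounce-page series below adds exactly such
a check, has_untrusted_dev(), which walks the PCI devices and tests
pdev->untrusted.)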

Sounds good, thanks Baolu.


Joerg


Re: [GIT PULL] iommu/arm-smmu: Big batch of updates for 5.4

2019-08-30 Thread Joerg Roedel
On Wed, Aug 28, 2019 at 10:42:30PM +0100, Will Deacon wrote:
> Hi Joerg,
> 
> On Fri, Aug 23, 2019 at 03:54:40PM +0100, Will Deacon wrote:
> > Please pull these ARM SMMU updates for 5.4. The branch is based on the
> > for-joerg/batched-unmap branch that you pulled into iommu/core already
> > because I didn't want to rebase everything onto -rc3. The pull request
> > was generated against iommu/core.
> 
> Just a gentle nudge on this pull request, since it would be nice to have
> it sit in -next for a bit before the merge window opens.
> 
> Please let me know if you need anything more from me.

It is pulled and pushed out now, thanks Will.



Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code

2019-08-30 Thread Russell King - ARM Linux admin
On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote:
> The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA
> coherent remapping for a while.  Lift this flag to common code so
> that we can use it generically.  We also check it in the only place
> VM_USERMAP is directly checked so that we can entirely replace that
> flag as well (although I'm not even sure why we'd want to allow
> remapping DMA mappings, but I'd rather not change behavior).

Good, because if you did change that behaviour, you'd break almost
every ARM framebuffer and cripple ARM audio drivers.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v8 3/7] swiotlb: Zero out bounce buffer for untrusted device

2019-08-30 Thread Christoph Hellwig
On Fri, Aug 30, 2019 at 03:17:14PM +0800, Lu Baolu wrote:
> +#include 

> + if (dev_is_untrusted(hwdev) && zero_size)
> + memset(zero_addr, 0, zero_size);

As said before swiotlb must not grow pci dependencies like this.
Please move the untrusted flag to struct device.


[PATCH v8 4/7] iommu/vt-d: Check whether device requires bounce buffer

2019-08-30 Thread Lu Baolu
This adds a helper to check whether a device needs to
use the bounce buffer. It also provides a boot-time
option to disable the bounce buffer, which users can
set to keep the IOMMU driver from using it, trading
isolation for performance.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Lu Baolu 
Tested-by: Xu Pengfei 
Tested-by: Mika Westerberg 
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +
 drivers/iommu/intel-iommu.c | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 4c1971960afa..f441a9cea5bb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1732,6 +1732,11 @@
Note that using this option lowers the security
provided by tboot because it makes the system
vulnerable to DMA attacks.
+   nobounce [Default off]
+   Disable bounce buffer for untrusted devices such as
+   the Thunderbolt devices. This will treat the untrusted
+   devices as the trusted ones, hence might expose security
+   risks of DMA attacks.
 
intel_idle.max_cstate=  [KNL,HW,ACPI,X86]
0   disables intel_idle and fall back on acpi_idle.
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c4e0e4a9ee9e..fee5dff16e95 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -362,6 +362,7 @@ static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
+static int intel_no_bounce;
 
 #define IDENTMAP_ALL   1
 #define IDENTMAP_GFX   2
@@ -375,6 +376,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
+#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d))
+
 /*
  * Iterate over elements in device_domain_list and call the specified
  * callback @fn against each element.
@@ -457,6 +460,9 @@ static int __init intel_iommu_setup(char *str)
printk(KERN_INFO
"Intel-IOMMU: not forcing on after tboot. This 
could expose security risk for tboot\n");
intel_iommu_tboot_noforce = 1;
+   } else if (!strncmp(str, "nobounce", 8)) {
+   pr_info("Intel-IOMMU: No bounce buffer. This could 
expose security risks of DMA attacks\n");
+   intel_no_bounce = 1;
}
 
str += strcspn(str, ",");
-- 
2.17.1
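For reference, with this applied the bounce behaviour could be switched off
from the kernel command line roughly as follows (illustrative; see the
kernel-parameters.txt hunk above for the authoritative syntax):

	intel_iommu=on,nobounce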



[PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices

2019-08-30 Thread Lu Baolu
The Intel VT-d hardware uses paging for DMA remapping.
The minimum mapped window is a page size. The device
drivers may map buffers not filling the whole IOMMU
window. This allows the device to access possibly
unrelated memory, and a malicious device could exploit
this to perform DMA attacks. To address this, the
Intel IOMMU driver will use bounce pages for those
buffers which don't fill a whole IOMMU page.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Lu Baolu 
Tested-by: Xu Pengfei 
Tested-by: Mika Westerberg 
---
 drivers/iommu/intel-iommu.c | 244 
 1 file changed, 244 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index eb2a13e39eca..2800c52eee89 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -41,9 +41,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #include "irq_remapping.h"
 #include "intel-pasid.h"
@@ -346,6 +348,8 @@ static int domain_detach_iommu(struct dmar_domain *domain,
 static bool device_is_rmrr_locked(struct device *dev);
 static int intel_iommu_attach_device(struct iommu_domain *domain,
 struct device *dev);
+static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
+   dma_addr_t iova);
 
 #ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON
 int dmar_disabled = 0;
@@ -3775,6 +3779,238 @@ static const struct dma_map_ops intel_dma_ops = {
.dma_supported = dma_direct_supported,
 };
 
+static void
+bounce_sync_single(struct device *dev, dma_addr_t addr, size_t size,
+  enum dma_data_direction dir, enum dma_sync_target target)
+{
+   struct dmar_domain *domain;
+   phys_addr_t tlb_addr;
+
+   domain = find_domain(dev);
+   if (WARN_ON(!domain))
+   return;
+
+   tlb_addr = intel_iommu_iova_to_phys(&domain->domain, addr);
+   if (is_swiotlb_buffer(tlb_addr))
+   swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
+}
+
+static dma_addr_t
+bounce_map_single(struct device *dev, phys_addr_t paddr, size_t size,
+ enum dma_data_direction dir, unsigned long attrs,
+ u64 dma_mask)
+{
+   size_t aligned_size = ALIGN(size, VTD_PAGE_SIZE);
+   struct dmar_domain *domain;
+   struct intel_iommu *iommu;
+   unsigned long iova_pfn;
+   unsigned long nrpages;
+   phys_addr_t tlb_addr;
+   int prot = 0;
+   int ret;
+
+   domain = find_domain(dev);
+   if (WARN_ON(dir == DMA_NONE || !domain))
+   return DMA_MAPPING_ERROR;
+
+   iommu = domain_get_iommu(domain);
+   if (WARN_ON(!iommu))
+   return DMA_MAPPING_ERROR;
+
+   nrpages = aligned_nrpages(0, size);
+   iova_pfn = intel_alloc_iova(dev, domain,
+   dma_to_mm_pfn(nrpages), dma_mask);
+   if (!iova_pfn)
+   return DMA_MAPPING_ERROR;
+
+   /*
+* Check if DMAR supports zero-length reads on write only
+* mappings.
+*/
+   if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL ||
+   !cap_zlr(iommu->cap))
+   prot |= DMA_PTE_READ;
+   if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
+   prot |= DMA_PTE_WRITE;
+
+   /*
+* If both the physical buffer start address and size are
+* page aligned, we don't need to use a bounce page.
+*/
+   if (!IS_ALIGNED(paddr | size, VTD_PAGE_SIZE)) {
+   tlb_addr = swiotlb_tbl_map_single(dev,
+   __phys_to_dma(dev, io_tlb_start),
+   paddr, size, aligned_size, dir, attrs);
+   if (tlb_addr == DMA_MAPPING_ERROR)
+   goto swiotlb_error;
+   } else {
+   tlb_addr = paddr;
+   }
+
+   ret = domain_pfn_mapping(domain, mm_to_dma_pfn(iova_pfn),
+tlb_addr >> VTD_PAGE_SHIFT, nrpages, prot);
+   if (ret)
+   goto mapping_error;
+
+   trace_bounce_map_single(dev, iova_pfn << PAGE_SHIFT, paddr, size);
+
+   return (phys_addr_t)iova_pfn << PAGE_SHIFT;
+
+mapping_error:
+   if (is_swiotlb_buffer(tlb_addr))
+   swiotlb_tbl_unmap_single(dev, tlb_addr, size,
+aligned_size, dir, attrs);
+swiotlb_error:
+   free_iova_fast(&domain->iovad, iova_pfn, dma_to_mm_pfn(nrpages));
+   dev_err(dev, "Device bounce map: %zx@%llx dir %d --- failed\n",
+   size, (unsigned long long)paddr, dir);
+
+   return DMA_MAPPING_ERROR;
+}
+
+static void
+bounce_unmap_single(struct device *dev, dma_addr_t dev_addr, size_t size,
+   enum dma_data_direction dir, unsigned long attrs)
+{
+   size_t aligned_size = ALIGN(size, VTD_PAGE_SIZE);
+   struct dmar_domain *domain;
+   phys_addr_t tlb_addr;
+

[PATCH v8 5/7] iommu/vt-d: Don't switch off swiotlb if bounce page is used

2019-08-30 Thread Lu Baolu
The bounce page implementation depends on swiotlb. Hence, don't
switch off swiotlb if the system has untrusted devices or could
potentially have untrusted devices hot-added.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Lu Baolu 
Reviewed-by: Christoph Hellwig 
---
 drivers/iommu/Kconfig   |  1 +
 drivers/iommu/intel-iommu.c | 32 +---
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index e15cdcd8cb3c..a4ddeade8ac4 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -182,6 +182,7 @@ config INTEL_IOMMU
select IOMMU_IOVA
select NEED_DMA_MAP_STATE
select DMAR_TABLE
+   select SWIOTLB
help
  DMA remapping (DMAR) devices support enables independent address
  translations for Direct Memory Access (DMA) from devices.
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index fee5dff16e95..eb2a13e39eca 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4575,22 +4575,20 @@ const struct attribute_group *intel_iommu_groups[] = {
NULL,
 };
 
-static int __init platform_optin_force_iommu(void)
+static inline bool has_untrusted_dev(void)
 {
struct pci_dev *pdev = NULL;
-   bool has_untrusted_dev = false;
 
-   if (!dmar_platform_optin() || no_platform_optin)
-   return 0;
+   for_each_pci_dev(pdev)
+   if (pdev->untrusted)
+   return true;
 
-   for_each_pci_dev(pdev) {
-   if (pdev->untrusted) {
-   has_untrusted_dev = true;
-   break;
-   }
-   }
+   return false;
+}
 
-   if (!has_untrusted_dev)
+static int __init platform_optin_force_iommu(void)
+{
+   if (!dmar_platform_optin() || no_platform_optin || !has_untrusted_dev())
return 0;
 
if (no_iommu || dmar_disabled)
@@ -4604,9 +4602,6 @@ static int __init platform_optin_force_iommu(void)
iommu_identity_mapping |= IDENTMAP_ALL;
 
dmar_disabled = 0;
-#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
-   swiotlb = 0;
-#endif
no_iommu = 0;
 
return 1;
@@ -4746,7 +4741,14 @@ int __init intel_iommu_init(void)
up_write(&dmar_global_lock);
 
 #if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
-   swiotlb = 0;
+   /*
+* If the system has no untrusted device or the user has decided
+* to disable the bounce page mechanisms, we don't need swiotlb.
+* Mark this and the pre-allocated bounce pages will be released
+* later.
+*/
+   if (!has_untrusted_dev() || intel_no_bounce)
+   swiotlb = 0;
 #endif
dma_ops = &intel_dma_ops;
 
-- 
2.17.1



[PATCH v8 2/7] swiotlb: Split size parameter to map/unmap APIs

2019-08-30 Thread Lu Baolu
This splits the size parameter to swiotlb_tbl_map_single() and
swiotlb_tbl_unmap_single() into a mapping_size and an alloc_size
parameter, where alloc_size is the mapping_size rounded up to the
IOMMU page size.
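For example (sketch, mirroring the VT-d usage in patch 7/7 of this series),
a sub-page buffer is mapped with its exact size but a page-aligned
allocation:

	size_t aligned_size = ALIGN(size, VTD_PAGE_SIZE);

	tlb_addr = swiotlb_tbl_map_single(dev, start_dma_addr, paddr,
					  size, aligned_size, dir, attrs);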

Suggested-by: Christoph Hellwig 
Signed-off-by: Lu Baolu 
Reviewed-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c |  8 
 include/linux/swiotlb.h   |  8 ++--
 kernel/dma/direct.c   |  2 +-
 kernel/dma/swiotlb.c  | 30 +++---
 4 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ae1df496bf38..adcabd9473eb 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -386,8 +386,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 */
trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
 
-   map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
-attrs);
+   map = swiotlb_tbl_map_single(dev, start_dma_addr, phys,
+size, size, dir, attrs);
if (map == (phys_addr_t)DMA_MAPPING_ERROR)
return DMA_MAPPING_ERROR;
 
@@ -397,7 +397,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 * Ensure that the address returned is DMA'ble
 */
if (unlikely(!dma_capable(dev, dev_addr, size))) {
-   swiotlb_tbl_unmap_single(dev, map, size, dir,
+   swiotlb_tbl_unmap_single(dev, map, size, size, dir,
attrs | DMA_ATTR_SKIP_CPU_SYNC);
return DMA_MAPPING_ERROR;
}
@@ -433,7 +433,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
 
/* NOTE: We use dev_addr here, not paddr! */
if (is_xen_swiotlb_buffer(dev_addr))
-   swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
+   swiotlb_tbl_unmap_single(hwdev, paddr, size, size, dir, attrs);
 }
 
 static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 361f62bb4a8e..cde3dc18e21a 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -46,13 +46,17 @@ enum dma_sync_target {
 
 extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
  dma_addr_t tbl_dma_addr,
- phys_addr_t phys, size_t size,
+ phys_addr_t phys,
+ size_t mapping_size,
+ size_t alloc_size,
  enum dma_data_direction dir,
  unsigned long attrs);
 
 extern void swiotlb_tbl_unmap_single(struct device *hwdev,
 phys_addr_t tlb_addr,
-size_t size, enum dma_data_direction dir,
+size_t mapping_size,
+size_t alloc_size,
+enum dma_data_direction dir,
 unsigned long attrs);
 
 extern void swiotlb_tbl_sync_single(struct device *hwdev,
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 706113c6bebc..8402b29c280f 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -305,7 +305,7 @@ void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
dma_direct_sync_single_for_cpu(dev, addr, size, dir);
 
if (unlikely(is_swiotlb_buffer(phys)))
-   swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
+   swiotlb_tbl_unmap_single(dev, phys, size, size, dir, attrs);
 }
 EXPORT_SYMBOL(dma_direct_unmap_page);
 
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 9de232229063..89066efa3840 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -444,7 +444,9 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
 
 phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
   dma_addr_t tbl_dma_addr,
-  phys_addr_t orig_addr, size_t size,
+  phys_addr_t orig_addr,
+  size_t mapping_size,
+  size_t alloc_size,
   enum dma_data_direction dir,
   unsigned long attrs)
 {
@@ -464,6 +466,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
pr_warn_once("%s is active and system is using DMA bounce 
buffers\n",
 sme_active() ? "SME" : "SEV");
 
+   if (WARN_ON(mapping_size > alloc_size))
+   return (phys_addr_t)DMA_MAPPING_ERROR;
+
mask = dma_get_seg_boundary(hwdev);
 

[PATCH v8 0/7] iommu: Bounce page for untrusted devices

2019-08-30 Thread Lu Baolu
The Thunderbolt vulnerabilities are public and have a nice
name as Thunderclap [1] [3] nowadays. This patch series aims
to mitigate those concerns.

An external PCI device is a PCI peripheral device connected
to the system through an external bus, such as Thunderbolt.
What makes it different is that it can't be trusted to the
same degree as the devices built into the system. Generally,
a trusted PCIe device will DMA into the designated buffers
and not overrun or otherwise write outside the specified
bounds. But it's different for an external device.

The minimum IOMMU mapping granularity is one page (4k), so
for DMA transfers smaller than that a malicious PCIe device
can access the whole page of memory even if it does not
belong to the driver in question. This opens a possibility
for DMA attack. For more information about DMA attacks
imposed by an untrusted PCI/PCIe device, please refer to [2].
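As a concrete illustration: with 4 KiB IOMMU pages, mapping a 512-byte
buffer for DMA still exposes the remaining 3584 bytes of that page to the
device, because the mapping can only be page-granular.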

This series implements bounce buffers for untrusted external
devices. The transfers should be limited in isolated pages
so the IOMMU window does not cover memory outside of what
the driver expects. Previously (v3 and before), we proposed
an optimisation to only copy the head and tail of the buffer
if it spans multiple pages, and directly map the ones in the
middle. Figure 1 gives a big picture about this solution.

swiotlb System
IOVA  bounce page   Memory
 .-.  .-..-.
 | |  | || |
 | |  | || |
buffer_start .-.  .-..-.
 | |->| |***>| |
 | |  | | swiotlb| |
 | |  | | mapping| |
 IOMMU Page  '-'  '-''-'
  Boundary   | | | |
 | | | |
 | | | |
 | |>| |
 | |IOMMU mapping| |
 | | | |
 IOMMU Page  .-. .-.
  Boundary   | | | |
 | | | |
 | |>| |
 | | IOMMU mapping   | |
 | | | |
 | | | |
 IOMMU Page  .-.  .-..-.
  Boundary   | |  | || |
 | |  | || |
 | |->| |***>| |
  buffer_end '-'  '-' swiotlb'-'
 | |  | | mapping| |
 | |  | || |
 '-'  '-''-'
  Figure 1: A big-picture view of the IOMMU bounce page approach

As Robin Murphy pointed out, this ties us to using strict mode for
TLB maintenance, which may not be an overall win depending on the
balance between invalidation bandwidth vs. memcpy bandwidth. If we
use standard SWIOTLB logic to always copy the whole thing, we should
be able to release the bounce pages via the flush queue to allow
'safe' lazy unmaps. So since v4 we start to use the standard swiotlb
logic.

swiotlb System
IOVA  bounce page   Memory
buffer_start .-.  .-..-.
 | |  | || |
 | |  | || |
 | |  | |.-.physical
 | |->| | -->| |_start  
 | |iommu | | swiotlb| |
 | | map  | |   map  | |
 IOMMU Page  .-.  .-.'-'
  Boundary   | |  | || |
 | |  | || |
 | |->| || |
 | |iommu | || |
 | | map  | || |
 | |  | || |
 IOMMU Page  .-.  .-..-.
  Boundary   | |  | || |
 | |->| || |
 | |iommu | || |
 | | map  | || |
 | |  

[PATCH v8 6/7] iommu/vt-d: Add trace events for device dma map/unmap

2019-08-30 Thread Lu Baolu
This adds trace support for the Intel IOMMU driver. It
also declares some events which can be used to trace an
IOVA being mapped or unmapped in a domain.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Mika Westerberg 
Signed-off-by: Lu Baolu 
Reviewed-by: Steven Rostedt (VMware) 
---
 drivers/iommu/Makefile |  1 +
 drivers/iommu/intel-trace.c| 14 +
 include/trace/events/intel_iommu.h | 84 ++
 3 files changed, 99 insertions(+)
 create mode 100644 drivers/iommu/intel-trace.c
 create mode 100644 include/trace/events/intel_iommu.h

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index f13f36ae1af6..bfe27b2755bd 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
+obj-$(CONFIG_INTEL_IOMMU) += intel-trace.o
 obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += intel-iommu-debugfs.o
 obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
 obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/drivers/iommu/intel-trace.c b/drivers/iommu/intel-trace.c
new file mode 100644
index ..bfb6a6e37a88
--- /dev/null
+++ b/drivers/iommu/intel-trace.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel IOMMU trace support
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author: Lu Baolu 
+ */
+
+#include 
+#include 
+
+#define CREATE_TRACE_POINTS
+#include 
diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h
new file mode 100644
index ..9c28e6cae86f
--- /dev/null
+++ b/include/trace/events/intel_iommu.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Intel IOMMU trace support
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author: Lu Baolu 
+ */
+#ifdef CONFIG_INTEL_IOMMU
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM intel_iommu
+
+#if !defined(_TRACE_INTEL_IOMMU_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_INTEL_IOMMU_H
+
+#include 
+#include 
+
+DECLARE_EVENT_CLASS(dma_map,
+   TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
+size_t size),
+
+   TP_ARGS(dev, dev_addr, phys_addr, size),
+
+   TP_STRUCT__entry(
+   __string(dev_name, dev_name(dev))
+   __field(dma_addr_t, dev_addr)
+   __field(phys_addr_t, phys_addr)
+   __field(size_t, size)
+   ),
+
+   TP_fast_assign(
+   __assign_str(dev_name, dev_name(dev));
+   __entry->dev_addr = dev_addr;
+   __entry->phys_addr = phys_addr;
+   __entry->size = size;
+   ),
+
+   TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+ __get_str(dev_name),
+ (unsigned long long)__entry->dev_addr,
+ (unsigned long long)__entry->phys_addr,
+ __entry->size)
+);
+
+DEFINE_EVENT(dma_map, bounce_map_single,
+   TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
+size_t size),
+   TP_ARGS(dev, dev_addr, phys_addr, size)
+);
+
+DECLARE_EVENT_CLASS(dma_unmap,
+   TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
+
+   TP_ARGS(dev, dev_addr, size),
+
+   TP_STRUCT__entry(
+   __string(dev_name, dev_name(dev))
+   __field(dma_addr_t, dev_addr)
+   __field(size_t, size)
+   ),
+
+   TP_fast_assign(
+   __assign_str(dev_name, dev_name(dev));
+   __entry->dev_addr = dev_addr;
+   __entry->size = size;
+   ),
+
+   TP_printk("dev=%s dev_addr=0x%llx size=%zu",
+ __get_str(dev_name),
+ (unsigned long long)__entry->dev_addr,
+ __entry->size)
+);
+
+DEFINE_EVENT(dma_unmap, bounce_unmap_single,
+   TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
+   TP_ARGS(dev, dev_addr, size)
+);
+
+#endif /* _TRACE_INTEL_IOMMU_H */
+
+/* This part must be outside protection */
+#include 
+#endif /* CONFIG_INTEL_IOMMU */
-- 
2.17.1
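For reference, once applied the new events can be exercised through tracefs
in the usual way (paths assume the standard mount point):

	# echo 1 > /sys/kernel/debug/tracing/events/intel_iommu/enable
	# cat /sys/kernel/debug/tracing/trace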



[PATCH v8 1/7] PCI: Add dev_is_untrusted helper

2019-08-30 Thread Lu Baolu
There are several places in the kernel where it is necessary to
check whether a device is an untrusted PCI device. Add a helper
to simplify the callers.

Signed-off-by: Lu Baolu 
Reviewed-by: Christoph Hellwig 
---
 include/linux/pci.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 82e4cd1b7ac3..6c107eb381ac 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1029,6 +1029,7 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
 void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
 #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
+#define dev_is_untrusted(d) ((dev_is_pci(d) ? to_pci_dev(d)->untrusted : false))
 
 /* Generic PCI functions exported to card drivers */
 
@@ -1768,6 +1769,7 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev) { return NULL; }
 
 #define dev_is_pci(d) (false)
 #define dev_is_pf(d) (false)
+#define dev_is_untrusted(d) (false)
 static inline bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags)
 { return false; }
 static inline int pci_irqd_intx_xlate(struct irq_domain *d,
-- 
2.17.1



[PATCH v8 3/7] swiotlb: Zero out bounce buffer for untrusted device

2019-08-30 Thread Lu Baolu
This is necessary to avoid exposing valid kernel data to any
malicious device.
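Concretely (illustrative numbers): a 100-byte DMA_TO_DEVICE mapping placed
in a 4 KiB bounce slot copies in the 100 valid bytes and zeroes the
remaining 3996, while a DMA_FROM_DEVICE mapping bounces nothing in, so the
whole slot is zeroed before the device can read it.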

Suggested-by: Christoph Hellwig 
Signed-off-by: Lu Baolu 
---
 kernel/dma/swiotlb.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 89066efa3840..04bea5a87462 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef CONFIG_DEBUG_FS
 #include 
 #endif
@@ -458,6 +459,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
unsigned long offset_slots;
unsigned long max_slots;
unsigned long tmp_io_tlb_used;
+   void *zero_addr;
+   size_t zero_size;
 
if (no_iotlb_memory)
panic("Can not allocate SWIOTLB buffer earlier and can't now 
provide you with the DMA bounce buffer");
@@ -565,9 +568,20 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 */
for (i = 0; i < nslots; i++)
io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
+
+   zero_addr = phys_to_virt(tlb_addr);
+   zero_size = alloc_size;
+
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-   (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+   (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) {
swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
+   zero_addr += mapping_size;
+   zero_size -= mapping_size;
+   }
+
+   /* Zero out the bounce buffer if the consumer is untrusted. */
+   if (dev_is_untrusted(hwdev) && zero_size)
+   memset(zero_addr, 0, zero_size);
 
return tlb_addr;
 }
-- 
2.17.1