RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
>> +if (smmu->impl->tlb_sync) {
>> +smmu->impl->tlb_sync(smmu, page, sync, status);
> What I'd hoped is that rather than needing a hook for this, you could just
> override smmu_domain->tlb_ops from .init_context to wire up the alternate
> .sync method directly. That would save this extra level of indirection.

Hi Robin, overriding the tlb_ops->tlb_sync function is not enough here. There are direct references to the arm_smmu_tlb_sync_context() and arm_smmu_tlb_sync_global() functions in arm-smmu.c. We can replace these direct references with tlb_ops->tlb_sync(), except in one function, arm_smmu_device_reset(): when arm_smmu_device_reset() gets called, the domains are not yet initialized and tlb_ops is not available. Should we add a new hook for arm_smmu_tlb_sync_global(), or make this the responsibility of the impl->reset() hook, as it gets called at the same place?

-KR
RE: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks
>> +static irqreturn_t nsmmu_context_fault_inst(int irq,
>> +					    struct arm_smmu_device *smmu,
>> +					    int idx, int inst);
> More of these signed integers that could be unsigned. Also why the need to
> predeclare this? Can you not just put the definition of the function up here?

The signed integers are based on the original function prototype from arm-smmu.c. inst can be updated to unsigned. The forward declaration is there because I was checking context faults from the global fault handler as well. This can be avoided by using the interrupt lines of all the instances across the global and context irq entries. Let me update.

> +	if (smmu->impl->global_fault)
> +		return smmu->impl->global_fault(irq, smmu);
> ... and here about the extra level of indirection.

Fixing in next version.

-KR
RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
> Wouldn't it work if you replaced all calls of __arm_smmu_tlb_sync() by
> smmu->impl->tlb_sync() and assign __arm_smmu_tlb_sync() as default for
> devices that don't need to override it? That makes this patch slightly larger,
> but it saves us one level of indirection.

The tlb_ops->tlb_sync function can be overridden directly in the arm-smmu-nvidia.c specific implementation, as pointed out by Robin. Will be updating this in the next patch.

>> +void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
>> +		 int status);
> Can't page, sync and status all be unsigned?

This is to be uniform with the original tlb_sync definition in arm-smmu.c. In any case, this hook is not necessary, as tlb_ops->tlb_sync can be overridden directly.

-KR
RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
>> +smmu: iommu@1200 {
>> +compatible = "nvidia,smmu-v2";
> Should we have a compatibility string like "nvidia,tegra194-smmu" so that we
> can have other chips with SMMUv2 that could be different?

As pointed out by Robin as well: since Nvidia hasn't modified the ARM MMU-500 implementation, we can make the compatible string chip-specific. Let me update.

-KR
RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
> The number of global interrupts has never been related to the number of
> context interrupts :/

Yeah, they are not related. I was trying to use the minimum number of irq entries in the DT node, as both descriptions would achieve the same functionality.

> Clearly you have one combined interrupt output per SMMU - describing those as
> one global interrupt and the first two context bank interrupts respectively
> makes far less sense than calling them 3 global interrupts, not least because
> the latter is strictly true.

Will update to 3 in the next patch to improve readability.

-KR
RE: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation
>> +ARM_SMMU_MATCH_DATA(nvidia_smmuv2, ARM_SMMU_V2, NVIDIA_SMMUV2);
> From the previous discussions, I got the impression that other than the
> 'novel' way they're integrated, the actual SMMU implementations were
> unmodified Arm MMU-500s. Is that the case, or have I misread something?

The ARM MMU-500 implementation is unmodified. It is the way they are integrated and used together (for interleaved accesses) that is different from a regular ARM MMU-500. I have added this to get the model number and to be able to differentiate the SMMU implementation in arm-smmu-impl.c.

-KR
RE: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2
>> +"nidia,smmu-v2"
>>  "qcom,smmu-v2"
> I agree with Mikko that the compatible must be at least SoC-specific, but
> potentially even instance-specific (e.g. "nvidia,tegra194-gpu-smmu")
> depending on how many of these parallel-SMMU configurations might be hiding
> in current and future SoCs.

I am correcting the spelling mistake pointed out by Mikko. The NVIDIA SMMUv2 implementation is used beyond the Tegra194 SoC. To be able to use the SMMU compatible string across multiple SoCs, the "nvidia,smmu-v2" compatible string was chosen. Are you suggesting we make it SoC-specific and add another one in the future?

-KR
RE: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
>> +for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
> It might make more sense to make this the innermost loop, i.e.:
>
> 	for (i = 0; i < nsmmu->num_inst; i++)
> 		reg &= readl_relaxed(nsmmu_page(smmu, i, page)...
>
> since polling the instances in parallel rather than in series seems like it
> might be a bit more efficient.

The sync register is programmed at the same time for both instances; only the status check is serialized. I can update it to check the status of both at the same time.

>> +if (smmu->impl->tlb_sync) {
>> +smmu->impl->tlb_sync(smmu, page, sync, status);
> What I'd hoped is that rather than needing a hook for this, you could just
> override smmu_domain->tlb_ops from .init_context to wire up the alternate
> .sync method directly. That would save this extra level of indirection.

With arm_smmu_domain now available in arm-smmu.h, arm-smmu-nvidia.c can directly update tlb_ops->tlb_sync and avoid the indirection. Will update in the next version.

-KR
Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
On 30/08/2019 18:25, Krishna Reddy wrote:
>>> +#global-interrupts = <1>;
>> Shouldn't that be 3?
>
> Interrupt line is shared between global and all context faults for each SMMU instance. Nvidia implementation checks for both Global and context faults on each interrupt to an SMMU instance. It can be either 1 or 3. If we make it 3, we need to add two more irq entries in node for context faults.

The number of global interrupts has never been related to the number of context interrupts :/

> In the future, we can update arm-smmu.c to support shared interrupt line between global and all context faults.

Clearly you have one combined interrupt output per SMMU - describing those as one global interrupt and the first two context bank interrupts respectively makes far less sense than calling them 3 global interrupts, not least because the latter is strictly true.

Yes, the binding prevents us from describing the context bank interrupts for more than one instance, but at that point the fact that it *is* the combined output saves us - because the driver is aware of this specific integration it knows it can just register the "secondary" global interrupts as "secondary" context interrupts too. If we had separate IRQ lines per context bank per instance, then we'd have a really big problem and might have to redefine the binding, but as it is it happens to work out pretty neatly.

Robin.
RE: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks
>> +if (smmu->impl->global_fault)
>> +return smmu->impl->global_fault(irq, smmu);
> Can't we just register impl->global_fault (if set) instead of
> arm_smmu_global_fault as the handler when we first set up the IRQs in
> arm_smmu_device_probe()?
> Ideally we'd do the same for the context banks as well, although we might need
> an additional hook from which to request the secondary IRQs that the main flow
> can't accommodate.

When I first implemented this patch, I think there were compile issues in accessing struct arm_smmu_domain from arm-smmu-nvidia.c, as it was part of arm-smmu.c. To avoid issues accessing arm_smmu_domain, I did this for the context fault and did the same for the global fault for uniformity. Now I see that it is part of arm-smmu.h. Let me update the code to register the implementation hooks directly.

Thanks
-KR
RE: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
>> +#global-interrupts = <1>;
> Shouldn't that be 3?

The interrupt line is shared between global and all context faults for each SMMU instance. The Nvidia implementation checks for both global and context faults on each interrupt to an SMMU instance. It can be either 1 or 3. If we make it 3, we need to add two more irq entries in the node for context faults. In the future, we can update arm-smmu.c to support a shared interrupt line between global and all context faults.

-KR
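For concreteness, the description Robin argues for (and which KR later agrees to) would look roughly like the fragment below. This is a hedged sketch only: the compatible string follows the SoC-specific naming discussed elsewhere in the thread, and the interrupt specifier values are placeholders, not the real Tegra194 numbers.

```dts
smmu: iommu@1200 {
	/* SoC-specific compatible, as requested in review */
	compatible = "nvidia,tegra194-smmu";
	/* One combined fault output per SMMU instance; calling all
	 * three "global" interrupts is strictly true for the combined
	 * outputs. GIC_SPI numbers below are placeholders. */
	interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>;
	#global-interrupts = <3>;
	#iommu-cells = <1>;
};
```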
Re: [PATCH v2 01/11] asm-generic: add dma_zone_size
On Fri, 2019-08-30 at 15:45 +0100, Catalin Marinas wrote:
> On Mon, Aug 26, 2019 at 03:46:52PM +0200, Nicolas Saenz Julienne wrote:
> > On Mon, 2019-08-26 at 09:09 +0200, Christoph Hellwig wrote:
> > > On Tue, Aug 20, 2019 at 04:58:09PM +0200, Nicolas Saenz Julienne wrote:
> > > > Some architectures have platform specific DMA addressing limitations.
> > > > This will allow for hardware description code to provide the constraints
> > > > in a generic manner, so as for arch code to properly setup it's memory
> > > > zones and DMA mask.
> > >
> > > I know this just spreads the arm code, but I still kinda hate it.
> >
> > Rob's main concern was finding a way to pass the constraint from HW definition
> > to arch without widening fdt's architecture specific function surface. I'd say
> > it's fair to argue that having a generic mechanism makes sense as it'll now
> > traverse multiple archs and subsystems.
> >
> > I get adding globals like this is not very appealing, yet I went with it as it
> > was the easier to integrate with arm's code. Any alternative suggestions?
>
> In some discussion with Robin, since it's just RPi4 that we are aware of
> having such requirement on arm64, he suggested that we have a permanent
> ZONE_DMA on arm64 with a default size of 1GB. It should cover all arm64
> SoCs we know of without breaking the single Image binary. The arch/arm
> can use its current mach-* support.
>
> I may like this more than the proposed early_init_dt_get_dma_zone_size()
> here which checks for specific SoCs (my preferred way was to build the
> mask from all buses described in DT but I hadn't realised the
> complications).

Hi Catalin, thanks for giving it a thought. I'll be happy to implement it that way. I agree it's a good compromise.

@Christoph, do you still want the patch where I create 'zone_dma_bits'? With a hardcoded ZONE_DMA it's not absolutely necessary. Though I remember you said it was a first step towards being able to initialize dma-direct's min_mask in meminit.

Regards,
Nicolas
Re: [PATCH] PCI: Move ATS declarations to linux/pci.h
On 30/08/2019 17:18, Christoph Hellwig wrote:
> On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:
> > Move ATS function prototypes from include/linux/pci-ats.h to
> > include/linux/pci.h so users only need to include :
>
> Why is that so important? Very few PCI(e) device drivers use ATS, so
> keeping it out of everyone's include hell doesn't seem all bad.

Although to be fair it seems that all the actual ATS stuff already moved out 4 years ago, so at the very least maybe it would warrant renaming to pci-pri-pasid.h :)

Robin.
Re: [PATCH] PCI: Move ATS declarations to linux/pci.h
On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:
> Move ATS function prototypes from include/linux/pci-ats.h to
> include/linux/pci.h so users only need to include :

Why is that so important? Very few PCI(e) device drivers use ATS, so keeping it out of everyone's include hell doesn't seem all bad.
Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
On 29/08/2019 23:47, Krishna Reddy wrote: Add DT node for T194 SMMU to enable SMMU support. Signed-off-by: Krishna Reddy --- arch/arm64/boot/dts/nvidia/tegra194.dtsi | 75 1 file changed, 75 insertions(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi index d906958..ad509bb 100644 --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi @@ -1401,6 +1401,81 @@ 0x8200 0x0 0x4000 0x1f 0x4000 0x0 0xc000>; /* non-prefetchable memory (3GB) */ }; + smmu: iommu@1200 { + compatible = "nvidia,smmu-v2"; + reg = <0 0x1200 0 0x80>, + <0 0x1100 0 0x80>, + <0 0x1000 0 0x80>; + interrupts = , +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +; + stream-match-mask = <0x7f80>; + #global-interrupts = <1>; Shouldn't that be 3? Robin. + #iommu-cells = <1>; + }; + sysram@4000 { compatible = "nvidia,tegra194-sysram", "mmio-sram"; reg = <0x0 0x4000 0x0 0x5>;
Re: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks
On 29/08/2019 23:47, Krishna Reddy wrote: Add global/context fault hooks to allow Nvidia SMMU implementation handle faults across multiple SMMUs. Signed-off-by: Krishna Reddy --- drivers/iommu/arm-smmu-nvidia.c | 127 drivers/iommu/arm-smmu.c| 6 ++ drivers/iommu/arm-smmu.h| 4 ++ 3 files changed, 137 insertions(+) diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c index a429b2c..b2a3c49 100644 --- a/drivers/iommu/arm-smmu-nvidia.c +++ b/drivers/iommu/arm-smmu-nvidia.c @@ -14,6 +14,10 @@ #define NUM_SMMU_INSTANCES 3 +static irqreturn_t nsmmu_context_fault_inst(int irq, + struct arm_smmu_device *smmu, + int idx, int inst); + struct nvidia_smmu { struct arm_smmu_device smmu; int num_inst; @@ -87,12 +91,135 @@ static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page, nsmmu_tlb_sync_wait(smmu, page, sync, status, i); } +static irqreturn_t nsmmu_global_fault_inst(int irq, + struct arm_smmu_device *smmu, + int inst) +{ + u32 gfsr, gfsynr0, gfsynr1, gfsynr2; + + gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR); + gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) + + ARM_SMMU_GR0_sGFSYNR0); + gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) + + ARM_SMMU_GR0_sGFSYNR1); + gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) + + ARM_SMMU_GR0_sGFSYNR2); + + if (!gfsr) + return IRQ_NONE; + + dev_err_ratelimited(smmu->dev, + "Unexpected global fault, this could be serious\n"); + dev_err_ratelimited(smmu->dev, + "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n", + gfsr, gfsynr0, gfsynr1, gfsynr2); + + writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR); + return IRQ_HANDLED; +} + +static irqreturn_t nsmmu_global_fault(int irq, struct arm_smmu_device *smmu) +{ + int i; + irqreturn_t irq_ret = IRQ_NONE; + + /* Interrupt line is shared between global and context faults. +* Check for both type of interrupts on either fault handlers. 
+*/ + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) { + irq_ret = nsmmu_context_fault_inst(irq, smmu, 0, i); + if (irq_ret == IRQ_HANDLED) + return irq_ret; + } + + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) { + irq_ret = nsmmu_global_fault_inst(irq, smmu, i); + if (irq_ret == IRQ_HANDLED) + return irq_ret; + } + + return irq_ret; +} + +static irqreturn_t nsmmu_context_fault_bank(int irq, + struct arm_smmu_device *smmu, + int idx, int inst) +{ + u32 fsr, fsynr, cbfrsynra; + unsigned long iova; + + fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR); + if (!(fsr & FSR_FAULT)) + return IRQ_NONE; + + fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) + + ARM_SMMU_CB_FSYNR0); + iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) + +ARM_SMMU_CB_FAR); + cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) + + ARM_SMMU_GR1_CBFRSYNRA(idx)); + + dev_err_ratelimited(smmu->dev, + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n", + fsr, iova, fsynr, cbfrsynra, idx); + + writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) + + ARM_SMMU_CB_FSR); + return IRQ_HANDLED; +} + +static irqreturn_t nsmmu_context_fault_inst(int irq, + struct arm_smmu_device *smmu, + int idx, int inst) +{ + irqreturn_t irq_ret = IRQ_NONE; + + /* Interrupt line shared between global and all context faults. +* Check for faults across all contexts. +*/ + for (idx = 0; idx < smmu->num_context_banks; idx++) { + irq_ret = nsmmu_context_fault_bank(irq, smmu, idx, inst); + + if (irq_ret == IRQ_HANDLED) + break; + } + + return irq_ret; +} + +static irqreturn_t nsmmu_context_fault(int irq, + struct arm_smmu_device *smmu, + int cbndx) +{ + int i; + irqreturn_t irq_ret = IRQ_NONE; + + /* Interrupt line is shared between global and context faul
Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
On 29/08/2019 23:47, Krishna Reddy wrote:
> tlb_sync hook allows nvidia smmu handle tlb sync across multiple SMMUs as necessary.
>
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 32 
>  drivers/iommu/arm-smmu.c        |  8 +---
>  drivers/iommu/arm-smmu.h        |  4 
>  3 files changed, 41 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index d93ceda..a429b2c 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -56,11 +56,43 @@ static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
>  		writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
>  }
>
> +static void nsmmu_tlb_sync_wait(struct arm_smmu_device *smmu, int page,
> +				int sync, int status, int inst)
> +{
> +	u32 reg;
> +	unsigned int spin_cnt, delay;
> +
> +	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> +		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> +			reg = readl_relaxed(
> +				nsmmu_page(smmu, inst, page) + status);
> +			if (!(reg & sTLBGSTATUS_GSACTIVE))
> +				return;
> +			cpu_relax();
> +		}
> +		udelay(delay);
> +	}
> +	dev_err_ratelimited(smmu->dev,
> +			    "TLB sync timed out -- SMMU may be deadlocked\n");
> +}
> +
> +static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
> +			   int sync, int status)
> +{
> +	int i;
> +
> +	arm_smmu_writel(smmu, page, sync, 0);
> +
> +	for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)

It might make more sense to make this the innermost loop, i.e.:

	for (i = 0; i < nsmmu->num_inst; i++)
		reg &= readl_relaxed(nsmmu_page(smmu, i, page)...

since polling the instances in parallel rather than in series seems like it might be a bit more efficient.

> +		nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
> +}
> +
>  static const struct arm_smmu_impl nsmmu_impl = {
>  	.read_reg = nsmmu_read_reg,
>  	.write_reg = nsmmu_write_reg,
>  	.read_reg64 = nsmmu_read_reg64,
>  	.write_reg64 = nsmmu_write_reg64,
> +	.tlb_sync = nsmmu_tlb_sync,
>  };
>
>  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 46e1641..f5454e71 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -52,9 +52,6 @@
>   */
>  #define QCOM_DUMMY_VAL -1
>
> -#define TLB_LOOP_TIMEOUT		100	/* 1s! */
> -#define TLB_SPIN_COUNT			10
> -
>  #define MSI_IOVA_BASE			0x800
>  #define MSI_IOVA_LENGTH			0x10
>
> @@ -244,6 +241,11 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
>  	unsigned int spin_cnt, delay;
>  	u32 reg;
>
> +	if (smmu->impl->tlb_sync) {
> +		smmu->impl->tlb_sync(smmu, page, sync, status);

What I'd hoped is that rather than needing a hook for this, you could just override smmu_domain->tlb_ops from .init_context to wire up the alternate .sync method directly. That would save this extra level of indirection.

Robin.

> +		return;
> +	}
> +
>  	arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL);
>  	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
>  		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index 9645bf1..d3217f1 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -207,6 +207,8 @@ enum arm_smmu_cbar_type {
>  /* Maximum number of context banks per SMMU */
>  #define ARM_SMMU_MAX_CBS		128
>
> +#define TLB_LOOP_TIMEOUT		100	/* 1s! */
> +#define TLB_SPIN_COUNT			10
>
>  /* Shared driver definitions */
>  enum arm_smmu_arch_version {
> @@ -336,6 +338,8 @@ struct arm_smmu_impl {
>  	int (*cfg_probe)(struct arm_smmu_device *smmu);
>  	int (*reset)(struct arm_smmu_device *smmu);
>  	int (*init_context)(struct arm_smmu_domain *smmu_domain);
> +	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
> +			 int status);
>  };
>
>  static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
Re: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2
On 29/08/2019 23:47, Krishna Reddy wrote:
> Add binding doc for Nvidia's smmu-v2 implementation.
>
> Signed-off-by: Krishna Reddy 
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 3133f3b..0de3759 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -17,6 +17,7 @@ conditions.
>  "arm,mmu-401"
>  "arm,mmu-500"
>  "cavium,smmu-v2"
> +"nidia,smmu-v2"
>  "qcom,smmu-v2"

I agree with Mikko that the compatible must be at least SoC-specific, but potentially even instance-specific (e.g. "nvidia,tegra194-gpu-smmu") depending on how many of these parallel-SMMU configurations might be hiding in current and future SoCs.

Robin.

> depending on the particular implementation and/or the
[PATCH] PCI: Move ATS declarations to linux/pci.h
Move ATS function prototypes from include/linux/pci-ats.h to include/linux/pci.h so users only need to include : Realted to PRI capability: pci_enable_pri() pci_disable_pri() pci_restore_pri_state() pci_reset_pri() Related to PASID capability: pci_enable_pasid() pci_disable_pasid() pci_restore_pasid_state() pci_pasid_features() pci_max_pasids() pci_prg_resp_pasid_required() No functional changes intended. Signed-off-by: Krzysztof Wilczynski --- drivers/iommu/amd_iommu.c | 1 - drivers/iommu/arm-smmu-v3.c | 1 - drivers/iommu/intel-iommu.c | 1 - drivers/iommu/intel-pasid.c | 1 - drivers/iommu/intel-svm.c | 1 - drivers/pci/ats.c | 1 - drivers/pci/pci.c | 1 - include/linux/pci-ats.h | 77 - include/linux/pci.h | 34 9 files changed, 34 insertions(+), 84 deletions(-) delete mode 100644 include/linux/pci-ats.h diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 04a9f8443344..d43913386915 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include #include diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 0ad6d34d1e96..3bd9455efc39 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -29,7 +29,6 @@ #include #include #include -#include #include #include diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 4658cda6f3d2..362845b5c88a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -35,7 +35,6 @@ #include #include #include -#include #include #include #include diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c index 040a445be300..f670315afa67 100644 --- a/drivers/iommu/intel-pasid.c +++ b/drivers/iommu/intel-pasid.c @@ -16,7 +16,6 @@ #include #include #include -#include #include #include "intel-pasid.h" diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 780de0caafe8..ee9dfc84f925 100644 --- a/drivers/iommu/intel-svm.c +++ 
b/drivers/iommu/intel-svm.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include #include diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index e18499243f84..3f5fb2d4a763 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -10,7 +10,6 @@ */ #include -#include #include #include diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index f20a3de57d21..c8f2a05e6b37 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -29,7 +29,6 @@ #include #include #include -#include #include #include #include diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h deleted file mode 100644 index 1ebb88e7c184.. --- a/include/linux/pci-ats.h +++ /dev/null @@ -1,77 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef LINUX_PCI_ATS_H -#define LINUX_PCI_ATS_H - -#include - -#ifdef CONFIG_PCI_PRI - -int pci_enable_pri(struct pci_dev *pdev, u32 reqs); -void pci_disable_pri(struct pci_dev *pdev); -void pci_restore_pri_state(struct pci_dev *pdev); -int pci_reset_pri(struct pci_dev *pdev); - -#else /* CONFIG_PCI_PRI */ - -static inline int pci_enable_pri(struct pci_dev *pdev, u32 reqs) -{ - return -ENODEV; -} - -static inline void pci_disable_pri(struct pci_dev *pdev) -{ -} - -static inline void pci_restore_pri_state(struct pci_dev *pdev) -{ -} - -static inline int pci_reset_pri(struct pci_dev *pdev) -{ - return -ENODEV; -} - -#endif /* CONFIG_PCI_PRI */ - -#ifdef CONFIG_PCI_PASID - -int pci_enable_pasid(struct pci_dev *pdev, int features); -void pci_disable_pasid(struct pci_dev *pdev); -void pci_restore_pasid_state(struct pci_dev *pdev); -int pci_pasid_features(struct pci_dev *pdev); -int pci_max_pasids(struct pci_dev *pdev); -int pci_prg_resp_pasid_required(struct pci_dev *pdev); - -#else /* CONFIG_PCI_PASID */ - -static inline int pci_enable_pasid(struct pci_dev *pdev, int features) -{ - return -EINVAL; -} - -static inline void pci_disable_pasid(struct pci_dev *pdev) -{ -} - -static inline void pci_restore_pasid_state(struct pci_dev *pdev) -{ -} - 
-static inline int pci_pasid_features(struct pci_dev *pdev) -{ - return -EINVAL; -} - -static inline int pci_max_pasids(struct pci_dev *pdev) -{ - return -EINVAL; -} - -static inline int pci_prg_resp_pasid_required(struct pci_dev *pdev) -{ - return 0; -} -#endif /* CONFIG_PCI_PASID */ - - -#endif /* LINUX_PCI_ATS_H*/ diff --git a/include/linux/pci.h b/include/linux/pci.h index 463486016290..8ac142801890 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2349,6 +2349,40 @@ static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev) void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type); #endif +#ifdef CONFIG_PCI_PRI +int pci_enable_pri(struct pci_dev *pdev, u32 reqs); +void pci_disabl
Re: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation
On 29/08/2019 23:47, Krishna Reddy wrote: Add Nvidia SMMUv2 implementation and model info. Signed-off-by: Krishna Reddy --- MAINTAINERS | 2 + drivers/iommu/Makefile | 2 +- drivers/iommu/arm-smmu-impl.c | 2 + drivers/iommu/arm-smmu-nvidia.c | 97 + drivers/iommu/arm-smmu.c| 2 + drivers/iommu/arm-smmu.h| 2 + 6 files changed, 106 insertions(+), 1 deletion(-) create mode 100644 drivers/iommu/arm-smmu-nvidia.c diff --git a/MAINTAINERS b/MAINTAINERS index 289fb06..b9d59e51 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -15785,9 +15785,11 @@ F: drivers/i2c/busses/i2c-tegra.c TEGRA IOMMU DRIVERS M:Thierry Reding +R: Krishna Reddy L:linux-te...@vger.kernel.org S:Supported F:drivers/iommu/tegra* +F: drivers/iommu/arm-smmu-nvidia.c TEGRA KBC DRIVER M:Laxman Dewangan diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index a2729aa..7f5489e 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -13,7 +13,7 @@ obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o -obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o +obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o obj-$(CONFIG_DMAR_TABLE) += dmar.o obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c index 5c87a38..e5e595f 100644 --- a/drivers/iommu/arm-smmu-impl.c +++ b/drivers/iommu/arm-smmu-impl.c @@ -162,6 +162,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) break; case CAVIUM_SMMUV2: return cavium_smmu_impl_init(smmu); + case NVIDIA_SMMUV2: + return nvidia_smmu_impl_init(smmu); default: break; } diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c new file mode 100644 index 000..d93ceda --- /dev/null +++ b/drivers/iommu/arm-smmu-nvidia.c @@ -0,0 +1,97 @@ +// 
SPDX-License-Identifier: GPL-2.0-only +// Nvidia ARM SMMU v2 implementation quirks +// Copyright (C) 2019 NVIDIA CORPORATION. All rights reserved. + +#define pr_fmt(fmt) "nvidia-smmu: " fmt + +#include +#include +#include +#include +#include + +#include "arm-smmu.h" + +#define NUM_SMMU_INSTANCES 3 + +struct nvidia_smmu { + struct arm_smmu_device smmu; + int num_inst; + void __iomem*bases[NUM_SMMU_INSTANCES]; +}; + +#define to_nsmmu(s)container_of(s, struct nvidia_smmu, smmu) + +#define nsmmu_page(smmu, inst, page) \ + (((inst) ? to_nsmmu(smmu)->bases[(inst)] : smmu->base) + \ + ((page) << smmu->pgshift)) + +static u32 nsmmu_read_reg(struct arm_smmu_device *smmu, + int page, int offset) +{ + return readl_relaxed(nsmmu_page(smmu, 0, page) + offset); +} + +static void nsmmu_write_reg(struct arm_smmu_device *smmu, + int page, int offset, u32 val) +{ + int i; + + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) + writel_relaxed(val, nsmmu_page(smmu, i, page) + offset); +} + +static u64 nsmmu_read_reg64(struct arm_smmu_device *smmu, + int page, int offset) +{ + return readq_relaxed(nsmmu_page(smmu, 0, page) + offset); +} + +static void nsmmu_write_reg64(struct arm_smmu_device *smmu, + int page, int offset, u64 val) +{ + int i; + + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) + writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset); +} + +static const struct arm_smmu_impl nsmmu_impl = { + .read_reg = nsmmu_read_reg, + .write_reg = nsmmu_write_reg, + .read_reg64 = nsmmu_read_reg64, + .write_reg64 = nsmmu_write_reg64, +}; + +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu) +{ + int i; + struct nvidia_smmu *nsmmu; + struct resource *res; + struct device *dev = smmu->dev; + struct platform_device *pdev = to_platform_device(smmu->dev); + + nsmmu = devm_kzalloc(smmu->dev, sizeof(*nsmmu), GFP_KERNEL); + if (!nsmmu) + return ERR_PTR(-ENOMEM); + + nsmmu->smmu = *smmu; + /* Instance 0 is ioremapped by arm-smmu.c */ + nsmmu->num_inst = 1; + + for (i = 1; 
i < NUM_SMMU_INSTANCES; i++) { + res = platform_get_resource(pdev, IORESOURCE_MEM, i); + if (!res) + break; + nsmmu->bases[i] = devm_ioremap_resource(dev, res); + if (IS_ERR(nsmmu->bases[i])) + return (struct arm_smmu_device *)nsmmu->bases[i]; + nsmmu->num
Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code
On Fri, Aug 30, 2019 at 10:29:18AM +0100, Russell King - ARM Linux admin wrote:
> On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote:
> > The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA
> > coherent remapping for a while. Lift this flag to common code so
> > that we can use it generically. We also check it in the only place
> > VM_USERMAP is directly checked so that we can entirely replace that
> > flag as well (although I'm not even sure why we'd want to allow
> > remapping DMA mappings, but I'd rather not change behavior).
>
> Good, because if you did change that behaviour, you'd break almost
> every ARM framebuffer and cripple ARM audio drivers.

How would that break them? All the usual video and audio drivers that use dma_alloc_* then use dma_mmap_*, which never ends up in the only place that actually checks VM_USERMAP (remap_vmalloc_range_partial), as they end up in the dma_map_ops mmap methods, which contain what are effectively open-coded versions of that routine.

There are very few callers of remap_vmalloc_range_partial / remap_vmalloc_range, and while a few of those actually are in media drivers and the virtual frame buffer video driver, none of these seems to be called on dma memory (which would be a layering violation anyway).
Re: [PATCH v2 01/11] asm-generic: add dma_zone_size
On Mon, Aug 26, 2019 at 03:46:52PM +0200, Nicolas Saenz Julienne wrote: > On Mon, 2019-08-26 at 09:09 +0200, Christoph Hellwig wrote: > > On Tue, Aug 20, 2019 at 04:58:09PM +0200, Nicolas Saenz Julienne wrote: > > > Some architectures have platform specific DMA addressing limitations. > > > This will allow for hardware description code to provide the constraints > > > in a generic manner, so as for arch code to properly set up its memory > > > zones and DMA mask. > > > > I know this just spreads the arm code, but I still kinda hate it. > > Rob's main concern was finding a way to pass the constraint from HW definition > to arch without widening fdt's architecture specific function surface. I'd say > it's fair to argue that having a generic mechanism makes sense as it'll now > traverse multiple archs and subsystems. > > I get adding globals like this is not very appealing, yet I went with it as it > was the easiest to integrate with arm's code. Any alternative suggestions? In some discussion with Robin, since it's just RPi4 that we are aware of having such a requirement on arm64, he suggested that we have a permanent ZONE_DMA on arm64 with a default size of 1GB. It should cover all arm64 SoCs we know of without breaking the single Image binary. The arch/arm can use its current mach-* support. I may like this more than the proposed early_init_dt_get_dma_zone_size() here which checks for specific SoCs (my preferred way was to build the mask from all buses described in DT but I hadn't realised the complications). -- Catalin
Re: [PATCH] iommu/qcom_iommu: Use struct_size() helper
On Thu, Aug 29, 2019 at 11:03:27PM -0500, Gustavo A. R. Silva wrote: > drivers/iommu/qcom_iommu.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) Applied, thanks.
Re: [PATCH] Remove wrong default domain comments
On Mon, Aug 26, 2019 at 05:48:21AM +0100, Tom Murphy wrote: > drivers/iommu/iommu.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) Applied, thanks.
Re: [PATCH v1] iommu/vt-d: remove global page flush support
On Mon, Aug 26, 2019 at 08:53:29AM -0700, Jacob Pan wrote: > drivers/iommu/intel-svm.c | 36 +++- > include/linux/intel-iommu.h | 3 --- > 2 files changed, 15 insertions(+), 24 deletions(-) Applied, thanks.
Re: [PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices
On 30/08/2019 14:39, David Laight wrote:
>> From: Lu Baolu
>> Sent: 30 August 2019 08:17
>>
>> The Intel VT-d hardware uses paging for DMA remapping. The minimum
>> mapped window is a page size. The device drivers may map buffers not
>> filling the whole IOMMU window. This allows the device to access to
>> possibly unrelated memory and a malicious device could exploit this
>> to perform DMA attacks. To address this, the Intel IOMMU driver will
>> use bounce pages for those buffers which don't fill whole IOMMU pages.
>
> Won't this completely kill performance?

Yes it will. Though hopefully by now we're all well aware that speed and security being inversely proportional is the universal truth of modern computing.

> I'd expect to see something for dma_alloc_coherent() (etc) that tries
> to give the driver page sized buffers.

Coherent DMA already works in PAGE_SIZE units under the covers (at least in the DMA API implementations relevant here) - that's not an issue. The problem is streaming DMA, where we have to give the device access to, say, some pre-existing 64-byte data packet, from right in the middle of who knows what else. Since we do not necessarily have control over the who knows what else, the only universally-practical way to isolate the DMA data is to copy it away to some safe sanitised page which we *do* control, and make the actual DMA accesses target that.

> Either that or the driver could allocate page sized buffers even though
> it only passes fragments of these buffers to the dma functions (to
> avoid excessive cache invalidates).

Where, since we can't easily second-guess users' systems, "the driver" turns out to be every DMA-capable driver, every subsystem-level buffer manager, every userspace application which could possibly make use of some kind of zero-copy I/O call... Compared to a single effectively-transparent implementation in a single place at the lowest level, with a single switch for the user to turn it on or off depending on how security-critical their particular system is, I know which approach I'd rather review, maintain and rely on.

> Since you have to trust the driver, why not actually trust it?

Robin.

> David
>
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
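Robin's point about streaming DMA can be illustrated with a minimal userspace sketch of the bounce idea. The names `bounce_map`/`bounce_unmap` and the fixed page size are illustrative assumptions, not the VT-d driver's API: the fragment is copied into a private page that the IOMMU would map for the device, so the surrounding memory is never exposed.

```c
#include <assert.h>
#include <string.h>

/* Illustrative bounce-page sketch: a 64-byte packet sits in the middle
 * of memory we must not expose wholesale, so the device is pointed at a
 * private bounce page instead. */
enum { PAGE_SZ = 4096 };

static char bounce_page[PAGE_SZ];

/* "Map" for the to-device direction: copy the fragment in; the device
 * would DMA against bounce_page, never the caller's surrounding memory. */
static char *bounce_map(const char *buf, size_t len)
{
	memset(bounce_page, 0, sizeof(bounce_page)); /* no stale data leaks */
	memcpy(bounce_page, buf, len);
	return bounce_page;
}

/* "Unmap" for the from-device direction: sync whatever the device wrote
 * back to the caller's real buffer. */
static void bounce_unmap(char *buf, size_t len)
{
	memcpy(buf, bounce_page, len);
}
```

The copy on map/unmap is exactly the cost David is asking about, and the isolation it buys is the trade-off Robin describes.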
Re: [PATCH 1/1] Revert "iommu/vt-d: Avoid duplicated pci dma alias consideration"
On Mon, Aug 26, 2019 at 04:50:56PM +0800, Lu Baolu wrote: > drivers/iommu/intel-iommu.c | 55 +++-- > 1 file changed, 53 insertions(+), 2 deletions(-) Applied to fixes branch, thanks.
Re: [PATCH] iommu/dma: fix for dereferencing before null checking
On Sat, Aug 24, 2019 at 09:47:12AM +0800, Yunsheng Lin wrote: > drivers/iommu/dma-iommu.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) Applied, thanks.
Re: [PATCH v11 00/23] MT8183 IOMMU SUPPORT
On Sat, Aug 24, 2019 at 11:01:45AM +0800, Yong Wu wrote: > Change notes: > v11: >1) Adjust a bit code for mtk quirk in v7s. >2) Collect ack from will and Matthias of the last patch. Applied to arm/mediatek, thanks.
Re: [PATCH v8 6/7] iommu/vt-d: Add trace events for device dma map/unmap
On Fri, 30 Aug 2019 15:17:17 +0800 Lu Baolu wrote: > This adds trace support for the Intel IOMMU driver. It > also declares some events which could be used to trace > the events when an IOVA is being mapped or unmapped in > a domain. > > Cc: Ashok Raj > Cc: Jacob Pan > Cc: Kevin Tian > Signed-off-by: Mika Westerberg > Signed-off-by: Lu Baolu > Reviewed-by: Steven Rostedt (VMware) > --- > drivers/iommu/Makefile | 1 + > drivers/iommu/intel-trace.c| 14 + > include/trace/events/intel_iommu.h | 84 ++ > 3 files changed, 99 insertions(+) > create mode 100644 drivers/iommu/intel-trace.c > create mode 100644 include/trace/events/intel_iommu.h > > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index f13f36ae1af6..bfe27b2755bd 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -17,6 +17,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o > obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o > obj-$(CONFIG_DMAR_TABLE) += dmar.o > obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o > +obj-$(CONFIG_INTEL_IOMMU) += intel-trace.o > obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += intel-iommu-debugfs.o > obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o > obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o > diff --git a/drivers/iommu/intel-trace.c b/drivers/iommu/intel-trace.c > new file mode 100644 > index ..bfb6a6e37a88 > --- /dev/null > +++ b/drivers/iommu/intel-trace.c > @@ -0,0 +1,14 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Intel IOMMU trace support > + * > + * Copyright (C) 2019 Intel Corporation > + * > + * Author: Lu Baolu > + */ > + > +#include > +#include > + > +#define CREATE_TRACE_POINTS > +#include > diff --git a/include/trace/events/intel_iommu.h > b/include/trace/events/intel_iommu.h > new file mode 100644 > index ..9c28e6cae86f > --- /dev/null > +++ b/include/trace/events/intel_iommu.h > @@ -0,0 +1,84 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Intel IOMMU trace support > + * > + * Copyright (C) 2019 Intel Corporation > + * > + * Author: Lu 
Baolu > + */ > +#ifdef CONFIG_INTEL_IOMMU > +#undef TRACE_SYSTEM > +#define TRACE_SYSTEM intel_iommu > + > +#if !defined(_TRACE_INTEL_IOMMU_H) || defined(TRACE_HEADER_MULTI_READ) > +#define _TRACE_INTEL_IOMMU_H > + > +#include > +#include > + > +DECLARE_EVENT_CLASS(dma_map, > + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr, > + size_t size), > + > + TP_ARGS(dev, dev_addr, phys_addr, size), > + > + TP_STRUCT__entry( > + __string(dev_name, dev_name(dev)) > + __field(dma_addr_t, dev_addr) > + __field(phys_addr_t, phys_addr) > + __field(size_t, size) > + ), > + > + TP_fast_assign( > + __assign_str(dev_name, dev_name(dev)); > + __entry->dev_addr = dev_addr; > + __entry->phys_addr = phys_addr; > + __entry->size = size; > + ), > + > + TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu", > + __get_str(dev_name), > + (unsigned long long)__entry->dev_addr, > + (unsigned long long)__entry->phys_addr, > + __entry->size) > +); > + > +DEFINE_EVENT(dma_map, bounce_map_single, > + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr, > + size_t size), > + TP_ARGS(dev, dev_addr, phys_addr, size) > +); Do you plan on adding more events to these classes? This patch has two distinct DECLARE_EVENT_CLASS() calls, and each has one DEFINE_EVENT() for them. It's fine to do this, but I'm curious to why you did not use the "TRACE_EVENT()" macro, which basically is just a single DECLARE_EVENT_CLASS() followed by a single DEFINE_EVENT(). In other words, you just open coded TRACE_EVENT(). 
-- Steve > + > +DECLARE_EVENT_CLASS(dma_unmap, > + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size), > + > + TP_ARGS(dev, dev_addr, size), > + > + TP_STRUCT__entry( > + __string(dev_name, dev_name(dev)) > + __field(dma_addr_t, dev_addr) > + __field(size_t, size) > + ), > + > + TP_fast_assign( > + __assign_str(dev_name, dev_name(dev)); > + __entry->dev_addr = dev_addr; > + __entry->size = size; > + ), > + > + TP_printk("dev=%s dev_addr=0x%llx size=%zu", > + __get_str(dev_name), > + (unsigned long long)__entry->dev_addr, > + __entry->size) > +); > + > +DEFINE_EVENT(dma_unmap, bounce_unmap_single, > + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size), > + TP_ARGS(dev, dev_addr, size) > +); > + > +#endif /* _TRACE_INTEL_IOMMU_H */ > + > +/* This part must be outside protection */ > +#include > +#endif /* CONFIG_INTEL_IOMMU */
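Steve's observation — that `TRACE_EVENT()` is just `DECLARE_EVENT_CLASS()` plus one `DEFINE_EVENT()` of the same name — can be mimicked with toy macros. This is a sketch only: the real kernel macros take proto/args/struct/assign/print parameters, all dropped here, and the string-recording "class" body is invented for illustration.

```c
#include <assert.h>
#include <string.h>

static char last_event[64]; /* records which tracepoint fired */

/* A "class" provides the shared body; an "event" instantiates it. */
#define DECLARE_EVENT_CLASS(cls) \
	static void trace_class_##cls(const char *ev) \
	{ strncpy(last_event, ev, sizeof(last_event) - 1); }

#define DEFINE_EVENT(cls, name) \
	static void trace_##name(void) { trace_class_##cls(#name); }

/* What the patch open-codes: one class, one event. */
DECLARE_EVENT_CLASS(dma_map)
DEFINE_EVENT(dma_map, bounce_map_single)

/* TRACE_EVENT() bundles the two steps under a single name. */
#define TRACE_EVENT(name) \
	DECLARE_EVENT_CLASS(name) \
	DEFINE_EVENT(name, name)

TRACE_EVENT(bounce_unmap_single)
```

Keeping the class separate, as the patch does, pays off only once a second event wants to share the class body — which is the question Steve is raising.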
Re: [PATCH v3] iommu: revisit iommu_insert_resv_region() implementation
On Wed, Aug 21, 2019 at 02:09:40PM +0200, Eric Auger wrote: > drivers/iommu/iommu.c | 96 +-- > 1 file changed, 47 insertions(+), 49 deletions(-) Applied, thanks.
Re: [PATCH] iommu/vt-d: Fix wrong analysis whether devices share the same bus
On Tue, Aug 20, 2019 at 01:53:17AM -0700, Nadav Amit wrote: > drivers/iommu/intel_irq_remapping.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) Applied, thanks.
RE: [PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices
From: Lu Baolu > Sent: 30 August 2019 08:17 > The Intel VT-d hardware uses paging for DMA remapping. > The minimum mapped window is a page size. The device > drivers may map buffers not filling the whole IOMMU > window. This allows the device to access to possibly > unrelated memory and a malicious device could exploit > this to perform DMA attacks. To address this, the > Intel IOMMU driver will use bounce pages for those > buffers which don't fill whole IOMMU pages. Won't this completely kill performance? I'd expect to see something for dma_alloc_coherent() (etc) that tries to give the driver page sized buffers. Either that or the driver could allocate page sized buffers even though it only passes fragments of these buffers to the dma functions (to avoid excessive cache invalidates). Since you have to trust the driver, why not actually trust it? David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on
On Fri, Aug 30, 2019 at 01:27:25PM +0100, Robin Murphy wrote: > On 30/08/2019 11:49, Joerg Roedel wrote: > > Looks good to me, but adding Robin for his opinion. > > Sounds reasonable to me too - that should also be true for the majority of > Arm systems that we know of. Will suggested that atomic_try_cmpxchg() might > be relevant, but AFAICS that's backwards compared to what we want to do > here, which I guess is more of an "atomic_unlikely_cmpxchg". > > Acked-by: Robin Murphy Great, thanks for looking into it, Robin. Applied now, thanks Eric.
Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on
On 30/08/2019 11:49, Joerg Roedel wrote: Looks good to me, but adding Robin for his opinion. Sounds reasonable to me too - that should also be true for the majority of Arm systems that we know of. Will suggested that atomic_try_cmpxchg() might be relevant, but AFAICS that's backwards compared to what we want to do here, which I guess is more of an "atomic_unlikely_cmpxchg". Acked-by: Robin Murphy Cheers, Robin. On Wed, Aug 28, 2019 at 06:13:38AM -0700, Eric Dumazet wrote: In commit 14bd9a607f90 ("iommu/iova: Separate atomic variables to improve performance") Jinyu Qi identified that the atomic_cmpxchg() in queue_iova() was causing a performance loss and moved critical fields so that the false sharing would not impact them. However, avoiding the false sharing in the first place seems easy. We should attempt the atomic_cmpxchg() no more than 100 times per second. Adding an atomic_read() will keep the cache line mostly shared. This false sharing came with commit 9a005a800ae8 ("iommu/iova: Add flush timer"). Signed-off-by: Eric Dumazet Cc: Jinyu Qi Cc: Joerg Roedel --- drivers/iommu/iova.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 3e1a8a6755723a927a7942a7429ab7e6c19a0027..41c605b0058f9615c2dbdd83f1de2404a9b1d255 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -577,7 +577,9 @@ void queue_iova(struct iova_domain *iovad, spin_unlock_irqrestore(&fq->lock, flags); - if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0) + /* Avoid false sharing as much as possible. */ + if (!atomic_read(&iovad->fq_timer_on) && + !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1)) mod_timer(&iovad->fq_timer, jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT)); } -- 2.23.0.187.g17f5b7556c-goog
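The pattern in Eric's patch generalizes: guard a cmpxchg with a plain read so that, once the flag is set, the hot path stops forcing exclusive ownership of the shared cache line. A minimal single-threaded sketch with C11 atomics, where `timer_armed` stands in for `mod_timer()` and the names are illustrative:

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int fq_timer_on;  /* 0 = timer idle, 1 = timer pending */
static int timer_armed;         /* counts mod_timer() stand-in calls */

static void queue_iova_tail(void)
{
	int expected = 0;

	/* Cheap shared read first: once the flag is 1, we never issue the
	 * exclusive-ownership cmpxchg, so the line stays in shared state
	 * across CPUs.  Only the 0 -> 1 winner arms the timer. */
	if (!atomic_load(&fq_timer_on) &&
	    atomic_compare_exchange_strong(&fq_timer_on, &expected, 1))
		timer_armed++;  /* stand-in for mod_timer() */
}
```

The short-circuit `&&` mirrors the `!atomic_read(...) && !atomic_cmpxchg(...)` shape of the patch: correctness is unchanged (the cmpxchg still decides the race), only the common-case traffic differs.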
Re: [PATCH 6/7] arm64: tegra: Add DT node for T194 SMMU
On 30.8.2019 1.47, Krishna Reddy wrote:

Add DT node for T194 SMMU to enable SMMU support.

Signed-off-by: Krishna Reddy
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 75
 1 file changed, 75 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d906958..ad509bb 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1401,6 +1401,81 @@
 		0x8200 0x0 0x4000 0x1f 0x4000 0x0 0xc000>; /* non-prefetchable memory (3GB) */
 	};

+	smmu: iommu@1200 {
+		compatible = "nvidia,smmu-v2";

Should we have a compatibility string like "nvidia,tegra194-smmu" so that we can have other chips with SMMUv2 that could be different?

Mikko

+		reg = <0 0x1200 0 0x80>,
+		      <0 0x1100 0 0x80>,
+		      <0 0x1000 0 0x80>;
+		interrupts = <...>; /* interrupt specifiers stripped in archiving */
+		stream-match-mask = <0x7f80>;
+		#global-interrupts = <1>;
+		#iommu-cells = <1>;
+	};
+
 	sysram@4000 {
 		compatible = "nvidia,tegra194-sysram", "mmio-sram";
 		reg = <0x0 0x4000 0x0 0x5>;
Re: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia,smmu-v2
On 30.8.2019 1.47, Krishna Reddy wrote:

Add binding doc for Nvidia's smmu-v2 implementation.

Signed-off-by: Krishna Reddy
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 3133f3b..0de3759 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -17,6 +17,7 @@ conditions.
 "arm,mmu-401"
 "arm,mmu-500"
 "cavium,smmu-v2"
+"nidia,smmu-v2"

nvidia

Mikko

 "qcom,smmu-v2"
 depending on the particular implementation and/or the
Re: [GIT PULL] iommu/arm-smmu: Big batch of updates for 5.4
On Fri, Aug 30, 2019 at 12:29:39PM +0200, Joerg Roedel wrote: > On Wed, Aug 28, 2019 at 10:42:30PM +0100, Will Deacon wrote: > > On Fri, Aug 23, 2019 at 03:54:40PM +0100, Will Deacon wrote: > > > Please pull these ARM SMMU updates for 5.4. The branch is based on the > > > for-joerg/batched-unmap branch that you pulled into iommu/core already > > > because I didn't want to rebase everything onto -rc3. The pull request > > > was generated against iommu/core. > > > > Just a gentle nudge on this pull request, since it would be nice to have > > it sit in -next for a bit before the merge window opens. > > > > Please let me know if you need anything more from me. > > It is pulled and pushed out now, thanks Will. Cheers, Joerg and sorry for nagging. Next week is going to be crazy for me, so I'm trying to get my 5.4 queue resolved early. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/7] arm64: tegra: Add Memory controller DT node on T194
On Thu, Aug 29, 2019 at 03:47:05PM -0700, Krishna Reddy wrote: > Add Memory controller DT node on T194 and enable it. > This patch is a prerequisite for SMMU enable on T194. > > Signed-off-by: Krishna Reddy > --- > arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi | 4 > arch/arm64/boot/dts/nvidia/tegra194.dtsi | 7 +++ > 2 files changed, 11 insertions(+) > > diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi > b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi > index 62e07e11..4b3441b 100644 > --- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi > +++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi > @@ -47,6 +47,10 @@ > }; > }; > > + memory-controller@2c0 { > + status = "okay"; > + }; > + > serial@311 { > status = "okay"; > }; > diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi > b/arch/arm64/boot/dts/nvidia/tegra194.dtsi > index adebbbf..d906958 100644 > --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi > +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > > / { > compatible = "nvidia,tegra194"; > @@ -130,6 +131,12 @@ > }; > }; > > + memory-controller@2c0 { > + compatible = "nvidia,tegra186-mc"; I think we need to make this "nvidia,tegra194-mc" and then enhance the Tegra186 driver to match on that compatible string. Nothing to worry about just yet and I can make that change when applying. Thierry > + reg = <0x02c0 0xb>; > + status = "disabled"; > + }; > + > uarta: serial@310 { > compatible = "nvidia,tegra194-uart", > "nvidia,tegra20-uart"; > reg = <0x0310 0x40>; > -- > 2.1.4 > signature.asc Description: PGP signature
Re: [PATCH 4/7] iommu/arm-smmu: Add global/context fault implementation hooks
On Thu, Aug 29, 2019 at 03:47:04PM -0700, Krishna Reddy wrote: > Add global/context fault hooks to allow Nvidia SMMU implementation > handle faults across multiple SMMUs. > > Signed-off-by: Krishna Reddy > --- > drivers/iommu/arm-smmu-nvidia.c | 127 > > drivers/iommu/arm-smmu.c| 6 ++ > drivers/iommu/arm-smmu.h| 4 ++ > 3 files changed, 137 insertions(+) > > diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c > index a429b2c..b2a3c49 100644 > --- a/drivers/iommu/arm-smmu-nvidia.c > +++ b/drivers/iommu/arm-smmu-nvidia.c > @@ -14,6 +14,10 @@ > > #define NUM_SMMU_INSTANCES 3 > > +static irqreturn_t nsmmu_context_fault_inst(int irq, > + struct arm_smmu_device *smmu, > + int idx, int inst); More of these signed integers that could be unsigned. Also why the need to predeclare this? Can you not just put the definition of the function up here? > + > struct nvidia_smmu { > struct arm_smmu_device smmu; > int num_inst; > @@ -87,12 +91,135 @@ static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, > int page, > nsmmu_tlb_sync_wait(smmu, page, sync, status, i); > } > > +static irqreturn_t nsmmu_global_fault_inst(int irq, > +struct arm_smmu_device *smmu, > +int inst) > +{ > + u32 gfsr, gfsynr0, gfsynr1, gfsynr2; > + > + gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR); > + gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) + > + ARM_SMMU_GR0_sGFSYNR0); > + gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) + > + ARM_SMMU_GR0_sGFSYNR1); > + gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) + > + ARM_SMMU_GR0_sGFSYNR2); > + > + if (!gfsr) > + return IRQ_NONE; > + > + dev_err_ratelimited(smmu->dev, > + "Unexpected global fault, this could be serious\n"); > + dev_err_ratelimited(smmu->dev, > + "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 > 0x%08x\n", > + gfsr, gfsynr0, gfsynr1, gfsynr2); > + > + writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR); > + return IRQ_HANDLED; > +} > + > +static irqreturn_t 
nsmmu_global_fault(int irq, struct arm_smmu_device *smmu) > +{ > + int i; > + irqreturn_t irq_ret = IRQ_NONE; > + > + /* Interrupt line is shared between global and context faults. > + * Check for both type of interrupts on either fault handlers. > + */ > + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) { > + irq_ret = nsmmu_context_fault_inst(irq, smmu, 0, i); > + if (irq_ret == IRQ_HANDLED) > + return irq_ret; > + } > + > + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) { > + irq_ret = nsmmu_global_fault_inst(irq, smmu, i); > + if (irq_ret == IRQ_HANDLED) > + return irq_ret; > + } > + > + return irq_ret; > +} > + > +static irqreturn_t nsmmu_context_fault_bank(int irq, > + struct arm_smmu_device *smmu, > + int idx, int inst) > +{ > + u32 fsr, fsynr, cbfrsynra; > + unsigned long iova; > + > + fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR); > + if (!(fsr & FSR_FAULT)) > + return IRQ_NONE; > + > + fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) + > + ARM_SMMU_CB_FSYNR0); > + iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) + > + ARM_SMMU_CB_FAR); > + cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) + > + ARM_SMMU_GR1_CBFRSYNRA(idx)); > + > + dev_err_ratelimited(smmu->dev, > + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, > cbfrsynra=0x%x, cb=%d\n", > + fsr, iova, fsynr, cbfrsynra, idx); > + > + writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) + > + ARM_SMMU_CB_FSR); > + return IRQ_HANDLED; > +} > + > +static irqreturn_t nsmmu_context_fault_inst(int irq, > + struct arm_smmu_device *smmu, > + int idx, int inst) > +{ > + irqreturn_t irq_ret = IRQ_NONE; > + > + /* Interrupt line shared between global and all context faults. > + * Check for faults across all contexts. > + */ > + for (idx = 0; idx < smmu->num_context_banks; idx++) { > + irq_ret = nsmmu_context_fault_bank(irq, smmu, idx, inst); > + > + if (irq_ret == IRQ_HANDLED) > + break; > + } > + > + return irq_ret; > +} > + > +static irqret
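The dispatch pattern in nsmmu_global_fault() — probe each possible source on a shared interrupt line and stop at the first that claims it — can be sketched in userspace C. The handler table and return values are illustrative stand-ins for the kernel's irqreturn_t machinery and GFSR reads:

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's irqreturn_t values. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

#define NUM_SMMU_INSTANCES 3

static int fault_pending[NUM_SMMU_INSTANCES]; /* stand-in for GFSR state */

static enum irqreturn global_fault_inst(int inst)
{
	/* A real handler would read, report and clear sGFSR here. */
	if (!fault_pending[inst])
		return IRQ_NONE;
	fault_pending[inst] = 0;
	return IRQ_HANDLED;
}

/* Shared-line dispatch: poll every instance, claim on the first hit,
 * report IRQ_NONE if nobody owned the interrupt. */
static enum irqreturn global_fault(void)
{
	int i;

	for (i = 0; i < NUM_SMMU_INSTANCES; i++)
		if (global_fault_inst(i) == IRQ_HANDLED)
			return IRQ_HANDLED;
	return IRQ_NONE;
}
```

This also shows why the patch checks context faults from the global handler: on a shared line, every plausible source must be polled before returning IRQ_NONE.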
Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
On Thu, Aug 29, 2019 at 03:47:03PM -0700, Krishna Reddy wrote: > tlb_sync hook allows nvidia smmu handle tlb sync > across multiple SMMUs as necessary. > > Signed-off-by: Krishna Reddy > --- > drivers/iommu/arm-smmu-nvidia.c | 32 > drivers/iommu/arm-smmu.c| 8 +--- > drivers/iommu/arm-smmu.h| 4 > 3 files changed, 41 insertions(+), 3 deletions(-) > > diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c > index d93ceda..a429b2c 100644 > --- a/drivers/iommu/arm-smmu-nvidia.c > +++ b/drivers/iommu/arm-smmu-nvidia.c > @@ -56,11 +56,43 @@ static void nsmmu_write_reg64(struct arm_smmu_device > *smmu, > writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset); > } > > +static void nsmmu_tlb_sync_wait(struct arm_smmu_device *smmu, int page, > + int sync, int status, int inst) > +{ > + u32 reg; > + unsigned int spin_cnt, delay; > + > + for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) { > + for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { > + reg = readl_relaxed( > + nsmmu_page(smmu, inst, page) + status); > + if (!(reg & sTLBGSTATUS_GSACTIVE)) > + return; > + cpu_relax(); > + } > + udelay(delay); > + } > + dev_err_ratelimited(smmu->dev, > + "TLB sync timed out -- SMMU may be deadlocked\n"); > +} > + > +static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page, > +int sync, int status) > +{ > + int i; > + > + arm_smmu_writel(smmu, page, sync, 0); > + > + for (i = 0; i < to_nsmmu(smmu)->num_inst; i++) > + nsmmu_tlb_sync_wait(smmu, page, sync, status, i); > +} > + > static const struct arm_smmu_impl nsmmu_impl = { > .read_reg = nsmmu_read_reg, > .write_reg = nsmmu_write_reg, > .read_reg64 = nsmmu_read_reg64, > .write_reg64 = nsmmu_write_reg64, > + .tlb_sync = nsmmu_tlb_sync, > }; > > struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu) > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 46e1641..f5454e71 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ 
-52,9 +52,6 @@ > */ > #define QCOM_DUMMY_VAL -1 > > -#define TLB_LOOP_TIMEOUT 100 /* 1s! */ > -#define TLB_SPIN_COUNT 10 > - > #define MSI_IOVA_BASE0x800 > #define MSI_IOVA_LENGTH 0x10 > > @@ -244,6 +241,11 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device > *smmu, int page, > unsigned int spin_cnt, delay; > u32 reg; > > + if (smmu->impl->tlb_sync) { > + smmu->impl->tlb_sync(smmu, page, sync, status); > + return; > + } > + Wouldn't it work if you replaced all calls of __arm_smmu_tlb_sync() by smmu->impl->tlb_sync() and assign __arm_smmu_tlb_sync() as default for devices that don't need to override it? That makes this patch slightly larger, but it saves us one level of indirection. > + > arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL); > for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) { > for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h > index 9645bf1..d3217f1 100644 > --- a/drivers/iommu/arm-smmu.h > +++ b/drivers/iommu/arm-smmu.h > @@ -207,6 +207,8 @@ enum arm_smmu_cbar_type { > /* Maximum number of context banks per SMMU */ > #define ARM_SMMU_MAX_CBS 128 > > +#define TLB_LOOP_TIMEOUT 100 /* 1s! */ > +#define TLB_SPIN_COUNT 10 > > /* Shared driver definitions */ > enum arm_smmu_arch_version { > @@ -336,6 +338,8 @@ struct arm_smmu_impl { > int (*cfg_probe)(struct arm_smmu_device *smmu); > int (*reset)(struct arm_smmu_device *smmu); > int (*init_context)(struct arm_smmu_domain *smmu_domain); > + void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync, > + int status); Can't page, sync and status all be unsigned? Thierry signature.asc Description: PGP signature
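The wait loop under discussion, shared in shape between __arm_smmu_tlb_sync() and the proposed nsmmu_tlb_sync_wait(), follows a common two-level pattern: spin a short bounded loop, then back off with exponentially growing delays until a timeout. A hedged userspace sketch, with the status register poll stubbed out and udelay() elided:

```c
#include <assert.h>

#define TLB_SPIN_COUNT   10
#define TLB_LOOP_TIMEOUT 1000000 /* 1s in us, as in arm-smmu.c */

static long polls_until_idle; /* test knob standing in for sTLBGSTATUS */
static long poll_count;

static int tlb_still_active(void)
{
	/* Stub for readl_relaxed(status) & sTLBGSTATUS_GSACTIVE. */
	return ++poll_count < polls_until_idle;
}

static int tlb_sync_wait(void)
{
	unsigned int spin_cnt, delay;

	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--)
			if (!tlb_still_active())
				return 0;       /* sync completed */
		/* udelay(delay) would sleep here before retrying */
	}
	return -1; /* timed out -- SMMU may be deadlocked */
}
```

The inner spin handles the common fast case without sleeping; the doubling delay keeps a slow sync from hammering the bus, while the cap bounds the total wait.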
Re: convert microblaze to the generic dma remap allocator
Hi Christoph, On 14. 08. 19 16:03, Christoph Hellwig wrote: > Hi Michal, > > can you take a look at this patch that moves microblaze over to the > generic DMA remap allocator? I've been trying to slowly get all > architectures over to the generic code, and microblaze is one that > seems very straightforward to convert. > I took a look at this series and tested it on kc705 and I can't see any issue. Patches applied. Thanks, Michal -- Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91 w: www.monstr.eu p: +42-0-721842854 Maintainer of Linux kernel - Xilinx Microblaze Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs
Re: [PATCH] iommu/amd: silence warnings under memory pressure
On Wed, Aug 28, 2019 at 05:39:43PM -0400, Qian Cai wrote: > drivers/iommu/amd_iommu.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) Applied, thanks.
Re: [PATCH] iommu/iova: avoid false sharing on fq_timer_on
Looks good to me, but adding Robin for his opinion. On Wed, Aug 28, 2019 at 06:13:38AM -0700, Eric Dumazet wrote: > In commit 14bd9a607f90 ("iommu/iova: Separate atomic variables > to improve performance") Jinyu Qi identified that the atomic_cmpxchg() > in queue_iova() was causing a performance loss and moved critical fields > so that the false sharing would not impact them. > > However, avoiding the false sharing in the first place seems easy. > We should attempt the atomic_cmpxchg() no more than 100 times > per second. Adding an atomic_read() will keep the cache > line mostly shared. > > This false sharing came with commit 9a005a800ae8 > ("iommu/iova: Add flush timer"). > > Signed-off-by: Eric Dumazet > Cc: Jinyu Qi > Cc: Joerg Roedel > --- > drivers/iommu/iova.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > index > 3e1a8a6755723a927a7942a7429ab7e6c19a0027..41c605b0058f9615c2dbdd83f1de2404a9b1d255 > 100644 > --- a/drivers/iommu/iova.c > +++ b/drivers/iommu/iova.c > @@ -577,7 +577,9 @@ void queue_iova(struct iova_domain *iovad, > > spin_unlock_irqrestore(&fq->lock, flags); > > - if (atomic_cmpxchg(&iovad->fq_timer_on, 0, 1) == 0) > + /* Avoid false sharing as much as possible. */ > + if (!atomic_read(&iovad->fq_timer_on) && > + !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1)) > mod_timer(&iovad->fq_timer, > jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT)); > } > -- > 2.23.0.187.g17f5b7556c-goog
Re: [PATCH v7 1/7] iommu/vt-d: Don't switch off swiotlb if use direct dma
On Sat, Aug 24, 2019 at 10:17:30AM +0800, Lu Baolu wrote: > If a system has any external port, through which an untrusted device > might be connected, the external port itself should be marked as an > untrusted device, and all devices beneath it just inherit this > attribution. Okay, makes sense. > So during iommu driver initialization, we can easily know whether the > system has (or potentially has) untrusted devices by iterating the > device tree. I will add such check in the next version if no objections. Sounds good, thanks Baolu. Joerg
Re: [GIT PULL] iommu/arm-smmu: Big batch of updates for 5.4
On Wed, Aug 28, 2019 at 10:42:30PM +0100, Will Deacon wrote: > Hi Joerg, > > On Fri, Aug 23, 2019 at 03:54:40PM +0100, Will Deacon wrote: > > Please pull these ARM SMMU updates for 5.4. The branch is based on the > > for-joerg/batched-unmap branch that you pulled into iommu/core already > > because I didn't want to rebase everything onto -rc3. The pull request > > was generated against iommu/core. > > Just a gentle nudge on this pull request, since it would be nice to have > it sit in -next for a bit before the merge window opens. > > Please let me know if you need anything more from me. It is pulled and pushed out now, thanks Will.
Re: [PATCH 1/4] vmalloc: lift the arm flag for coherent mappings to common code
On Fri, Aug 30, 2019 at 08:29:21AM +0200, Christoph Hellwig wrote: > The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA > coherent remapping for a while. Lift this flag to common code so > that we can use it generically. We also check it in the only place > VM_USERMAP is directly checked so that we can entirely replace that > flag as well (although I'm not even sure why we'd want to allow > remapping DMA mappings, but I'd rather not change behavior). Good, because if you did change that behaviour, you'd break almost every ARM framebuffer and cripple ARM audio drivers. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH v8 3/7] swiotlb: Zero out bounce buffer for untrusted device
On Fri, Aug 30, 2019 at 03:17:14PM +0800, Lu Baolu wrote: > +#include > + if (dev_is_untrusted(hwdev) && zero_size) > + memset(zero_addr, 0, zero_size); As said before swiotlb must not grow pci dependencies like this. Please move the untrusted flag to struct device.
[PATCH v8 4/7] iommu/vt-d: Check whether device requires bounce buffer
This adds a helper to check whether a device needs to use a bounce buffer. It also provides a boot time option to disable the bounce buffer. Users can use this to prevent the iommu driver from using the bounce buffer for performance gain. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Signed-off-by: Lu Baolu Tested-by: Xu Pengfei Tested-by: Mika Westerberg --- Documentation/admin-guide/kernel-parameters.txt | 5 + drivers/iommu/intel-iommu.c | 6 ++ 2 files changed, 11 insertions(+) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 4c1971960afa..f441a9cea5bb 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1732,6 +1732,11 @@ Note that using this option lowers the security provided by tboot because it makes the system vulnerable to DMA attacks. + nobounce [Default off] + Disable bounce buffer for untrusted devices such as + the Thunderbolt devices. This will treat the untrusted + devices as the trusted ones, hence might expose security + risks of DMA attacks. intel_idle.max_cstate= [KNL,HW,ACPI,X86] 0 disables intel_idle and fall back on acpi_idle. diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index c4e0e4a9ee9e..fee5dff16e95 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -362,6 +362,7 @@ static int dmar_forcedac; static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; +static int intel_no_bounce; #define IDENTMAP_ALL 1 #define IDENTMAP_GFX 2 @@ -375,6 +376,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped); static DEFINE_SPINLOCK(device_domain_lock); static LIST_HEAD(device_domain_list); +#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d)) + /* * Iterate over elements in device_domain_list and call the specified * callback @fn against each element.
@@ -457,6 +460,9 @@ static int __init intel_iommu_setup(char *str) printk(KERN_INFO "Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n"); intel_iommu_tboot_noforce = 1; + } else if (!strncmp(str, "nobounce", 8)) { + pr_info("Intel-IOMMU: No bounce buffer. This could expose security risks of DMA attacks\n"); + intel_no_bounce = 1; } str += strcspn(str, ","); -- 2.17.1
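To make the interaction between the `nobounce` option and the new `device_needs_bounce()` macro concrete, here is a minimal user-space sketch of the same decision. The `fake_dev` struct and `parse_iommu_option()` are hypothetical stand-ins for `struct device` and the `intel_iommu_setup()` branch above, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the kernel's "static int intel_no_bounce" module state. */
static int intel_no_bounce;

/* Hypothetical stand-in for struct device + the untrusted flag. */
struct fake_dev { bool untrusted; };

/* Mirrors the strncmp(str, "nobounce", 8) branch in intel_iommu_setup(). */
static void parse_iommu_option(const char *str)
{
	if (!strncmp(str, "nobounce", 8))
		intel_no_bounce = 1;
}

/* Same logic as the device_needs_bounce() macro in the patch:
 * bounce only when the device is untrusted AND nobounce was not given. */
static bool device_needs_bounce(const struct fake_dev *d)
{
	return !intel_no_bounce && d->untrusted;
}
```

With `nobounce` on the command line, even Thunderbolt-attached devices are treated as trusted, which is exactly the trade-off the documentation paragraph warns about.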
[PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices
The Intel VT-d hardware uses paging for DMA remapping. The minimum mapped window is a page size. The device drivers may map buffers not filling the whole IOMMU window. This allows the device to access possibly unrelated memory and a malicious device could exploit this to perform DMA attacks. To address this, the Intel IOMMU driver will use bounce pages for those buffers which don't fill whole IOMMU pages. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Signed-off-by: Lu Baolu Tested-by: Xu Pengfei Tested-by: Mika Westerberg --- drivers/iommu/intel-iommu.c | 244 1 file changed, 244 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index eb2a13e39eca..2800c52eee89 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -41,9 +41,11 @@ #include #include #include +#include #include #include #include +#include #include "irq_remapping.h" #include "intel-pasid.h" @@ -346,6 +348,8 @@ static int domain_detach_iommu(struct dmar_domain *domain, static bool device_is_rmrr_locked(struct device *dev); static int intel_iommu_attach_device(struct iommu_domain *domain, struct device *dev); +static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, + dma_addr_t iova); #ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON int dmar_disabled = 0; @@ -3775,6 +3779,238 @@ static const struct dma_map_ops intel_dma_ops = { .dma_supported = dma_direct_supported, }; +static void +bounce_sync_single(struct device *dev, dma_addr_t addr, size_t size, + enum dma_data_direction dir, enum dma_sync_target target) +{ + struct dmar_domain *domain; + phys_addr_t tlb_addr; + + domain = find_domain(dev); + if (WARN_ON(!domain)) + return; + + tlb_addr = intel_iommu_iova_to_phys(&domain->domain, addr); + if (is_swiotlb_buffer(tlb_addr)) + swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target); +} + +static dma_addr_t +bounce_map_single(struct device *dev, phys_addr_t paddr, size_t size, + enum dma_data_direction dir, unsigned long attrs, + u64 
dma_mask) +{ + size_t aligned_size = ALIGN(size, VTD_PAGE_SIZE); + struct dmar_domain *domain; + struct intel_iommu *iommu; + unsigned long iova_pfn; + unsigned long nrpages; + phys_addr_t tlb_addr; + int prot = 0; + int ret; + + domain = find_domain(dev); + if (WARN_ON(dir == DMA_NONE || !domain)) + return DMA_MAPPING_ERROR; + + iommu = domain_get_iommu(domain); + if (WARN_ON(!iommu)) + return DMA_MAPPING_ERROR; + + nrpages = aligned_nrpages(0, size); + iova_pfn = intel_alloc_iova(dev, domain, + dma_to_mm_pfn(nrpages), dma_mask); + if (!iova_pfn) + return DMA_MAPPING_ERROR; + + /* +* Check if DMAR supports zero-length reads on write only +* mappings.. +*/ + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL || + !cap_zlr(iommu->cap)) + prot |= DMA_PTE_READ; + if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) + prot |= DMA_PTE_WRITE; + + /* +* If both the physical buffer start address and size are +* page aligned, we don't need to use a bounce page. +*/ + if (!IS_ALIGNED(paddr | size, VTD_PAGE_SIZE)) { + tlb_addr = swiotlb_tbl_map_single(dev, + __phys_to_dma(dev, io_tlb_start), + paddr, size, aligned_size, dir, attrs); + if (tlb_addr == DMA_MAPPING_ERROR) + goto swiotlb_error; + } else { + tlb_addr = paddr; + } + + ret = domain_pfn_mapping(domain, mm_to_dma_pfn(iova_pfn), +tlb_addr >> VTD_PAGE_SHIFT, nrpages, prot); + if (ret) + goto mapping_error; + + trace_bounce_map_single(dev, iova_pfn << PAGE_SHIFT, paddr, size); + + return (phys_addr_t)iova_pfn << PAGE_SHIFT; + +mapping_error: + if (is_swiotlb_buffer(tlb_addr)) + swiotlb_tbl_unmap_single(dev, tlb_addr, size, +aligned_size, dir, attrs); +swiotlb_error: + free_iova_fast(&domain->iovad, iova_pfn, dma_to_mm_pfn(nrpages)); + dev_err(dev, "Device bounce map: %zx@%llx dir %d --- failed\n", + size, (unsigned long long)paddr, dir); + + return DMA_MAPPING_ERROR; +} + +static void +bounce_unmap_single(struct device *dev, dma_addr_t dev_addr, size_t size, + enum dma_data_direction dir, unsigned long attrs) +{ + 
size_t aligned_size = ALIGN(size, VTD_PAGE_SIZE); + struct dmar_domain *domain; + phys_addr_t tlb_addr; +
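The patch body is truncated in this archive, but the key decision in `bounce_map_single()` above — use a swiotlb bounce slot only when the buffer's start address or size is not page aligned — can be sketched in isolation. `VTD_PAGE_SIZE` and `IS_ALIGNED` are re-defined locally for a user-space build:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space sketch of the bounce decision in bounce_map_single():
 * only buffers whose start address and size are NOT both page aligned
 * go through a swiotlb bounce slot; aligned ones are mapped directly. */
#define VTD_PAGE_SIZE 4096u
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

static bool needs_bounce_page(uint64_t paddr, uint64_t size)
{
	/* (paddr | size) is aligned iff both paddr and size are aligned,
	 * which is exactly the check the patch performs. */
	return !IS_ALIGNED(paddr | size, VTD_PAGE_SIZE);
}
```

This is why fully page-aligned DMA buffers pay no memcpy cost under this series: they take the `tlb_addr = paddr` path and are IOMMU-mapped in place.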
[PATCH v8 5/7] iommu/vt-d: Don't switch off swiotlb if bounce page is used
The bounce page implementation depends on swiotlb. Hence, don't switch off swiotlb if the system has untrusted devices or could potentially be hot-added with any untrusted devices. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Signed-off-by: Lu Baolu Reviewed-by: Christoph Hellwig --- drivers/iommu/Kconfig | 1 + drivers/iommu/intel-iommu.c | 32 +--- 2 files changed, 18 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index e15cdcd8cb3c..a4ddeade8ac4 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -182,6 +182,7 @@ config INTEL_IOMMU select IOMMU_IOVA select NEED_DMA_MAP_STATE select DMAR_TABLE + select SWIOTLB help DMA remapping (DMAR) devices support enables independent address translations for Direct Memory Access (DMA) from devices. diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index fee5dff16e95..eb2a13e39eca 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4575,22 +4575,20 @@ const struct attribute_group *intel_iommu_groups[] = { NULL, }; -static int __init platform_optin_force_iommu(void) +static inline bool has_untrusted_dev(void) { struct pci_dev *pdev = NULL; - bool has_untrusted_dev = false; - if (!dmar_platform_optin() || no_platform_optin) - return 0; + for_each_pci_dev(pdev) + if (pdev->untrusted) + return true; - for_each_pci_dev(pdev) { - if (pdev->untrusted) { - has_untrusted_dev = true; - break; - } - } + return false; +} - if (!has_untrusted_dev) +static int __init platform_optin_force_iommu(void) +{ + if (!dmar_platform_optin() || no_platform_optin || !has_untrusted_dev()) return 0; if (no_iommu || dmar_disabled) @@ -4604,9 +4602,6 @@ static int __init platform_optin_force_iommu(void) iommu_identity_mapping |= IDENTMAP_ALL; dmar_disabled = 0; -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB) - swiotlb = 0; -#endif no_iommu = 0; return 1; @@ -4746,7 +4741,14 @@ int __init intel_iommu_init(void) up_write(&dmar_global_lock); #if 
defined(CONFIG_X86) && defined(CONFIG_SWIOTLB) - swiotlb = 0; + /* +* If the system has no untrusted device or the user has decided +* to disable the bounce page mechanisms, we don't need swiotlb. +* Mark this and the pre-allocated bounce pages will be released +* later. +*/ + if (!has_untrusted_dev() || intel_no_bounce) + swiotlb = 0; #endif dma_ops = &intel_dma_ops; -- 2.17.1
[PATCH v8 2/7] swiotlb: Split size parameter to map/unmap APIs
This splits the size parameter to swiotlb_tbl_map_single() and swiotlb_tbl_unmap_single() into a mapping_size and an alloc_size parameter, where the latter is rounded up to the iommu page size. Suggested-by: Christoph Hellwig Signed-off-by: Lu Baolu Reviewed-by: Christoph Hellwig --- drivers/xen/swiotlb-xen.c | 8 include/linux/swiotlb.h | 8 ++-- kernel/dma/direct.c | 2 +- kernel/dma/swiotlb.c | 30 +++--- 4 files changed, 30 insertions(+), 18 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index ae1df496bf38..adcabd9473eb 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -386,8 +386,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, */ trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force); - map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir, -attrs); + map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, +size, size, dir, attrs); if (map == (phys_addr_t)DMA_MAPPING_ERROR) return DMA_MAPPING_ERROR; @@ -397,7 +397,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, * Ensure that the address returned is DMA'ble */ if (unlikely(!dma_capable(dev, dev_addr, size))) { - swiotlb_tbl_unmap_single(dev, map, size, dir, + swiotlb_tbl_unmap_single(dev, map, size, size, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC); return DMA_MAPPING_ERROR; } @@ -433,7 +433,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, /* NOTE: We use dev_addr here, not paddr! 
*/ if (is_xen_swiotlb_buffer(dev_addr)) - swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs); + swiotlb_tbl_unmap_single(hwdev, paddr, size, size, dir, attrs); } static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 361f62bb4a8e..cde3dc18e21a 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -46,13 +46,17 @@ enum dma_sync_target { extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, dma_addr_t tbl_dma_addr, - phys_addr_t phys, size_t size, + phys_addr_t phys, + size_t mapping_size, + size_t alloc_size, enum dma_data_direction dir, unsigned long attrs); extern void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, -size_t size, enum dma_data_direction dir, +size_t mapping_size, +size_t alloc_size, +enum dma_data_direction dir, unsigned long attrs); extern void swiotlb_tbl_sync_single(struct device *hwdev, diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 706113c6bebc..8402b29c280f 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -305,7 +305,7 @@ void dma_direct_unmap_page(struct device *dev, dma_addr_t addr, dma_direct_sync_single_for_cpu(dev, addr, size, dir); if (unlikely(is_swiotlb_buffer(phys))) - swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs); + swiotlb_tbl_unmap_single(dev, phys, size, size, dir, attrs); } EXPORT_SYMBOL(dma_direct_unmap_page); diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 9de232229063..89066efa3840 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -444,7 +444,9 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr, phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, dma_addr_t tbl_dma_addr, - phys_addr_t orig_addr, size_t size, + phys_addr_t orig_addr, + size_t mapping_size, + size_t alloc_size, enum dma_data_direction dir, unsigned long attrs) { @@ -464,6 +466,9 @@ phys_addr_t swiotlb_tbl_map_single(struct 
device *hwdev, pr_warn_once("%s is active and system is using DMA bounce buffers\n", sme_active() ? "SME" : "SEV"); + if (WARN_ON(mapping_size > alloc_size)) + return (phys_addr_t)DMA_MAPPING_ERROR; + mask = dma_get_seg_boundary(hwdev);
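As a rough illustration of the new two-size contract (a user-space sketch, not kernel code; `ALIGN_UP` and `sizes_are_valid` are local helpers, not the kernel macros): callers such as the VT-d bounce code pass the actual DMA length as mapping_size and the page-rounded slot size as alloc_size, and the new WARN_ON rejects mapping_size > alloc_size:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the mapping_size / alloc_size relationship introduced above:
 * mapping_size is the length the caller actually DMAs and bounces,
 * alloc_size is the bounce slot rounded up to the IOMMU page size. */
#define IOMMU_PAGE_SIZE 4096u
#define ALIGN_UP(x, a) ((((uint64_t)(x)) + (a) - 1) & ~(uint64_t)((a) - 1))

static int sizes_are_valid(uint64_t mapping_size, uint64_t alloc_size)
{
	/* Mirrors the WARN_ON(mapping_size > alloc_size) check in the patch. */
	return mapping_size <= alloc_size;
}
```

Legacy callers (xen-swiotlb, dma-direct) simply pass the same value twice, as the diff above shows, so their behaviour is unchanged.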
[PATCH v8 0/7] iommu: Bounce page for untrusted devices
The Thunderbolt vulnerabilities are public and nowadays have a nice name: Thunderclap [1] [3]. This patch series aims to mitigate those concerns. An external PCI device is a PCI peripheral device connected to the system through an external bus, such as Thunderbolt. What makes it different is that it can't be trusted to the same degree as the devices built into the system. Generally, a trusted PCIe device will DMA into the designated buffers and not overrun or otherwise write outside the specified bounds. But it's different for an external device. The minimum IOMMU mapping granularity is one page (4k), so for DMA transfers smaller than that a malicious PCIe device can access the whole page of memory even if it does not belong to the driver in question. This opens a possibility for DMA attack. For more information about DMA attacks imposed by an untrusted PCI/PCIe device, please refer to [2]. This implements a bounce buffer for the untrusted external devices. The transfers should be limited to isolated pages so the IOMMU window does not cover memory outside of what the driver expects. Previously (v3 and before), we proposed an optimisation to only copy the head and tail of the buffer if it spans multiple pages, and directly map the ones in the middle. Figure 1 gives a big picture of this solution.

[Figure 1: a big view of the iommu bounce page scheme. The ASCII diagram is garbled in this archive; it showed the head and tail of a buffer (buffer_start, buffer_end) redirected through swiotlb bounce pages across the IOMMU page boundaries, while the whole pages in the middle are IOMMU-mapped directly to system memory.]

As Robin Murphy pointed out, this ties us to using strict mode for TLB maintenance, which may not be an overall win depending on the balance between invalidation bandwidth vs. memcpy bandwidth. If we use standard SWIOTLB logic to always copy the whole thing, we should be able to release the bounce pages via the flush queue to allow 'safe' lazy unmaps. So since v4 we start to use the standard swiotlb logic.

[Figure 2: the scheme used since v4. The ASCII diagram is garbled and truncated in this archive; it showed the whole buffer, from buffer_start to buffer_end, bounced through swiotlb and then IOMMU-mapped at page granularity.]
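The page-granularity exposure the cover letter describes can be quantified with a small sketch (`exposed_bytes()` is a hypothetical helper, assuming a 4K IOMMU page, not part of this series):

```c
#include <assert.h>
#include <stdint.h>

#define IOMMU_PAGE 4096ull

/* The IOMMU maps whole 4K pages, so a sub-page DMA buffer leaves the
 * rest of its page(s) reachable by the device. This computes that slack:
 * mapped-window size minus the bytes the driver actually asked for. */
static uint64_t exposed_bytes(uint64_t paddr, uint64_t size)
{
	uint64_t start = paddr & ~(IOMMU_PAGE - 1);               /* round down */
	uint64_t end = (paddr + size + IOMMU_PAGE - 1) & ~(IOMMU_PAGE - 1);
	return (end - start) - size;          /* possibly unrelated bytes */
}
```

A fully aligned 4K buffer exposes nothing, while a 100-byte network header in the middle of a page exposes the other 3996 bytes — which is precisely what the bounce pages prevent.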
[PATCH v8 6/7] iommu/vt-d: Add trace events for device dma map/unmap
This adds trace support for the Intel IOMMU driver. It also declares some events which could be used to trace the events when an IOVA is being mapped or unmapped in a domain. Cc: Ashok Raj Cc: Jacob Pan Cc: Kevin Tian Signed-off-by: Mika Westerberg Signed-off-by: Lu Baolu Reviewed-by: Steven Rostedt (VMware) --- drivers/iommu/Makefile | 1 + drivers/iommu/intel-trace.c| 14 + include/trace/events/intel_iommu.h | 84 ++ 3 files changed, 99 insertions(+) create mode 100644 drivers/iommu/intel-trace.c create mode 100644 include/trace/events/intel_iommu.h diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index f13f36ae1af6..bfe27b2755bd 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -17,6 +17,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o obj-$(CONFIG_DMAR_TABLE) += dmar.o obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o +obj-$(CONFIG_INTEL_IOMMU) += intel-trace.o obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += intel-iommu-debugfs.o obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o diff --git a/drivers/iommu/intel-trace.c b/drivers/iommu/intel-trace.c new file mode 100644 index ..bfb6a6e37a88 --- /dev/null +++ b/drivers/iommu/intel-trace.c @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Intel IOMMU trace support + * + * Copyright (C) 2019 Intel Corporation + * + * Author: Lu Baolu + */ + +#include +#include + +#define CREATE_TRACE_POINTS +#include diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h new file mode 100644 index ..9c28e6cae86f --- /dev/null +++ b/include/trace/events/intel_iommu.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Intel IOMMU trace support + * + * Copyright (C) 2019 Intel Corporation + * + * Author: Lu Baolu + */ +#ifdef CONFIG_INTEL_IOMMU +#undef TRACE_SYSTEM +#define TRACE_SYSTEM intel_iommu + +#if !defined(_TRACE_INTEL_IOMMU_H) || defined(TRACE_HEADER_MULTI_READ) +#define 
_TRACE_INTEL_IOMMU_H + +#include +#include + +DECLARE_EVENT_CLASS(dma_map, + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr, +size_t size), + + TP_ARGS(dev, dev_addr, phys_addr, size), + + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + __field(dma_addr_t, dev_addr) + __field(phys_addr_t, phys_addr) + __field(size_t, size) + ), + + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + __entry->dev_addr = dev_addr; + __entry->phys_addr = phys_addr; + __entry->size = size; + ), + + TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu", + __get_str(dev_name), + (unsigned long long)__entry->dev_addr, + (unsigned long long)__entry->phys_addr, + __entry->size) +); + +DEFINE_EVENT(dma_map, bounce_map_single, + TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr, +size_t size), + TP_ARGS(dev, dev_addr, phys_addr, size) +); + +DECLARE_EVENT_CLASS(dma_unmap, + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size), + + TP_ARGS(dev, dev_addr, size), + + TP_STRUCT__entry( + __string(dev_name, dev_name(dev)) + __field(dma_addr_t, dev_addr) + __field(size_t, size) + ), + + TP_fast_assign( + __assign_str(dev_name, dev_name(dev)); + __entry->dev_addr = dev_addr; + __entry->size = size; + ), + + TP_printk("dev=%s dev_addr=0x%llx size=%zu", + __get_str(dev_name), + (unsigned long long)__entry->dev_addr, + __entry->size) +); + +DEFINE_EVENT(dma_unmap, bounce_unmap_single, + TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size), + TP_ARGS(dev, dev_addr, size) +); + +#endif /* _TRACE_INTEL_IOMMU_H */ + +/* This part must be outside protection */ +#include +#endif /* CONFIG_INTEL_IOMMU */ -- 2.17.1
[PATCH v8 1/7] PCI: Add dev_is_untrusted helper
There are several places in the kernel where it is necessary to check whether a device is a pci untrusted device. Add a helper to simplify the callers. Signed-off-by: Lu Baolu Reviewed-by: Christoph Hellwig --- include/linux/pci.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/pci.h b/include/linux/pci.h index 82e4cd1b7ac3..6c107eb381ac 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1029,6 +1029,7 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type); void pci_sort_breadthfirst(void); #define dev_is_pci(d) ((d)->bus == &pci_bus_type) #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false)) +#define dev_is_untrusted(d) ((dev_is_pci(d) ? to_pci_dev(d)->untrusted : false)) /* Generic PCI functions exported to card drivers */ @@ -1768,6 +1769,7 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev) { return NULL; } #define dev_is_pci(d) (false) #define dev_is_pf(d) (false) +#define dev_is_untrusted(d) (false) static inline bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags) { return false; } static inline int pci_irqd_intx_xlate(struct irq_domain *d, -- 2.17.1
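The helper is a one-liner, but the bus check matters: for non-PCI devices it must evaluate to false rather than dereference a `pci_dev`. Here is a user-space sketch with mock types (the cast in `to_pci_dev` stands in for the kernel's `container_of`, valid here only because `dev` is the first member; none of these mock structs are kernel definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Mock stand-ins for the kernel types involved in dev_is_untrusted(). */
struct bus_type { int id; };
static struct bus_type pci_bus_type;

struct device { struct bus_type *bus; };
struct pci_dev { struct device dev; bool untrusted; };

/* Same shape as the macros in the patch: check the bus first, then
 * look at the PCI-specific untrusted flag. */
#define dev_is_pci(d) ((d)->bus == &pci_bus_type)
#define to_pci_dev(d) ((struct pci_dev *)(d))	/* dev is the first member */
#define dev_is_untrusted(d) (dev_is_pci(d) ? to_pci_dev(d)->untrusted : false)
```

Callers like swiotlb can then use `dev_is_untrusted(hwdev)` without caring whether the device is PCI at all, which is the point of hoisting the helper into pci.h.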
[PATCH v8 3/7] swiotlb: Zero out bounce buffer for untrusted device
This is necessary to avoid exposing valid kernel data to any malicious device. Suggested-by: Christoph Hellwig Signed-off-by: Lu Baolu --- kernel/dma/swiotlb.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 89066efa3840..04bea5a87462 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -35,6 +35,7 @@ #include #include #include +#include #ifdef CONFIG_DEBUG_FS #include #endif @@ -458,6 +459,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, unsigned long offset_slots; unsigned long max_slots; unsigned long tmp_io_tlb_used; + void *zero_addr; + size_t zero_size; if (no_iotlb_memory) panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer"); @@ -565,9 +568,20 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, */ for (i = 0; i < nslots; i++) io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); + + zero_addr = phys_to_virt(tlb_addr); + zero_size = alloc_size; + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && - (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) + (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) { swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE); + zero_addr += mapping_size; + zero_size -= mapping_size; + } + + /* Zero out the bounce buffer if the consumer is untrusted. */ + if (dev_is_untrusted(hwdev) && zero_size) + memset(zero_addr, 0, zero_size); return tlb_addr; } -- 2.17.1
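The zero_addr/zero_size arithmetic above can be sketched stand-alone. This is a user-space mock: `zero_bounce_slack()` and the `dir` enum are hypothetical stand-ins for the swiotlb internals, and it returns the byte count zeroed so the behaviour is easy to check:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum dir { TO_DEVICE, FROM_DEVICE };

/* Sketch of the zeroing logic in the patch: for TO_DEVICE the bounce
 * copy already fills the first mapping_size bytes, so only the slack up
 * to alloc_size needs zeroing; for FROM_DEVICE the whole slot does.
 * Nothing is zeroed for trusted devices. Returns bytes zeroed. */
static size_t zero_bounce_slack(uint8_t *slot, size_t mapping_size,
				size_t alloc_size, enum dir d, int untrusted)
{
	uint8_t *zero_addr = slot;
	size_t zero_size = alloc_size;

	if (d == TO_DEVICE) {		/* bounce copy covers mapping_size */
		zero_addr += mapping_size;
		zero_size -= mapping_size;
	}
	if (untrusted && zero_size)
		memset(zero_addr, 0, zero_size);
	return untrusted ? zero_size : 0;
}
```

The design choice here is to zero only the slack rather than the whole slot on TO_DEVICE, avoiding a redundant memset over bytes the bounce copy is about to overwrite anyway.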