intel-iommu: Is this a bug?
Hi, I am analyzing network performance with intel-iommu enabled. And found that running with iommu=pt, for every dma map/unmap it executes this code: /* * At boot time, we don't yet know if devices will be 64-bit capable. * Assume that they will — if they turn out not to be, then we can * take them out of the 1:1 domain later. */ if (!startup) { /* * If the device's dma_mask is less than the system's memory * size then this is not a candidate for identity mapping. */ u64 dma_mask = *dev->dma_mask; if (dev->coherent_dma_mask && dev->coherent_dma_mask < dma_mask) dma_mask = dev->coherent_dma_mask; return dma_mask >= dma_get_required_mask(dev); } Do we really need this check for every dma/unmap? Considering it should be only during startup, shouldn't it be, diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 582fd01..3c8f14e 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2929,7 +2929,7 @@ static int iommu_should_identity_map(struct device *dev, int startup) * Assume that they will — if they turn out not to be, then we can * take them out of the 1:1 domain later. */ - if (!startup) { + if (startup) { /* * If the device's dma_mask is less than the system's memory * size then this is not a candidate for identity mapping. Thanks. -Tushar ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/5] Add debugfs info for the AMD IOMMU
On 03/15/2018 08:58 AM, Joerg Roedel wrote: On Wed, Mar 14, 2018 at 06:04:44PM -0500, Gary R Hook wrote: Gary R Hook (5): iommu/amd - Add debugfs support iommu/amd - Add a 'verbose' switch for IOMMU debugfs iommu/amd - Add a README variable for the IOMMU debugfs iommu/amd - Expose the active IOMMU device table entries iommu/amd - Add a debugfs entry to specify a IOMMU device table entry Same problem here as with the Intel patches, I don't think it's a good idea to reveal internal iommu data structures to user-space in this way. I'm afraid I'm not following the line of thought here. AFAIK, if we restrict access of any debugfs info to root, then only root can see any of that information. If root is compromised, the whole system is compromised. Which seems to me to make all of debugfs suspect from a security perspective. Therefore, I'm likely not understanding the security concerns as they relate specifically to the IOMMU. As for stability, you mention on the Intel IOMMU thread issues surrounding kABI, which I appreciate. But I'm exposing well-documented device structures in my patches, not kernel structures. There's nothing there that is going to change underneath me without my knowledge, if ever (I vote for never). And I'm likely ignorant about policy nuances surrounding the kernel ABI. I've debugged iommu issues for around 10 years now and never had the need for an interface that reveals those internals. How exactly are you planning to use this information? Well, I've already had to dig into the DTEs and page tables for a couple of reasons (both debug and development), and it made life easier (for me) to make the live data available this way. Then I thought that others might be interested as well. Admittedly, I could be wrong. Finally, I'm no guru, and likely am unaware of other techniques in which I should develop skills. I'm always open to input. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/ipmmu-vmsa: Document R-Car M3-N IPMMU DT bindings
On Tue, Mar 20, 2018 at 03:48:11PM +0900, Magnus Damm wrote: > From: Magnus Damm> > Update the IPMMU DT binding documentation to include the r8a77965 compat > string for the IPMMU devices included in the R-Car M3-N SoC. > > Signed-off-by: Magnus Damm > --- > > Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt |1 + > 1 file changed, 1 insertion(+) Reviewed-by: Rob Herring ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] swiotlb: swiotlb_{alloc,free}_buffer should depend on CONFIG_DMA_DIRECT_OPS
On Sat, Mar 24, 2018 at 04:05:45PM -0400, Konrad Rzeszutek Wilk wrote: > > > > Otherwise we might get unused symbol warnings for configs that built > > > > swiotlb.c only for use by xen-swiotlb.c and that don't otherwise select > > > > CONFIG_DMA_DIRECT_OPS, which is possible on arm. > > > > > > > > Fixes: 16e73adbca76 ("dma/swiotlb: Remove > > > > swiotlb_{alloc,free}_coherent()") > > > > Reported-by: Stephen Rothwell> > > > Signed-off-by: Christoph Hellwig > > > > > > > > > Alternatively could we set the Kconfig to slect DMA_DIRECT_OPS? > > > > > > IFF we build swiotlb.c only for xen-swiotlb we don't need DMA_DIRECT_OPS. > > I don't think there is ever an case where you want a Xen specific build. arm never uses swiotlb directly, so any swiotlb.c build on arm is purely for xen-swiotlb.c ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Is adding offset to dma address is valid operation?
On 26/03/18 13:31, Harsh Jain wrote: On 26-03-2018 17:15, Robin Murphy wrote: Hi Harsh, On 26/03/18 10:55, Harsh Jain wrote: Hi, Can we add offset to dma address received from IOMMU in scatter/Gather list before sending it to H/W. Address to HW = sg_dma_address(sg) + offset, where 0 < offset < sg_dma_len(sg). sg_dma_address() should already account for sg->offset (i.e. the start of the buffer within the page. If there's a DMA API implementation failing to respect that then that's a bug. I need this operation to make sure our H/W does not receives entry of size greater than 2K. If however it's just that you want to fragment a single SG entry into multiple commaneds/descriptors/etc. internally to your driver, then I can't see that being an issue as long as you don't visibly modify the scatterlist itself. If there's a genuine length and/or alignment limitation in the hardware, though, it would be better to set up dev->dma_parms appropriately to describe them, such that the subsystem responsible for generating the scatterlist respects the appropriate constraints in the first place (and if it doesn't, then that's another bug to fix). Can dma_parms handle 2K size limit ? Because dma_map_sg() returns <= passed nents. But in my case nents will increase. The point is that dma_map_sg() should still never increase nents, because the guy *creating* the scatterlist is also expected to respect dma_parms where it exists, such that nents would be appropriate to start with. That said, it does look like there are some places (e.g. the block layer) which may not cope properly with limits less than PAGE_SIZE, so I guess the reliability of dma_parms might depend on how much you want to go spelunking in core code :/ Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 7/7] iommu/arm-smmu-v3: Support 52-bit virtual address
Stage 1 input addresses are effectively 64-bit in SMMUv3 anyway, so really all that's involved is letting io-pgtable know the appropriate upper bound for T0SZ. Signed-off-by: Robin Murphy--- v3: Fix IDR5.VAX being a 2-bit field, and slight reorganisation of the stage 1 IAS logic. Since constraining to VA_BITS turns out to be an SMMUv3-specific behaviour, and would introduce an undesirable arm64 dependency to io-pgtable, I ended up leaving it here. drivers/iommu/arm-smmu-v3.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index e0d46661c153..b0e23d6bd1ef 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -92,6 +92,8 @@ #define IDR5_OAS_44_BIT4 #define IDR5_OAS_48_BIT5 #define IDR5_OAS_52_BIT6 +#define IDR5_VAX GENMASK(11, 10) +#define IDR5_VAX_52_BIT1 #define ARM_SMMU_CR0 0x20 #define CR0_CMDQEN (1 << 3) @@ -551,6 +553,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_STALLS (1 << 11) #define ARM_SMMU_FEAT_HYP (1 << 12) #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13) +#define ARM_SMMU_FEAT_VAX (1 << 14) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -1591,7 +1594,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: - ias = VA_BITS; + ias = (smmu->features & ARM_SMMU_FEAT_VAX) ? 52 : 48; + ias = min_t(unsigned long, ias, VA_BITS); oas = smmu->ias; fmt = ARM_64_LPAE_S1; finalise_stage_fn = arm_smmu_domain_finalise_s1; @@ -2634,6 +2638,10 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR5_GRAN4K) smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G; + /* Input address size */ + if (FIELD_GET(IDR5_VAX, reg) == IDR5_VAX_52_BIT) + smmu->features |= ARM_SMMU_FEAT_VAX; + /* Output address size */ switch (FIELD_GET(IDR5_OAS, reg)) { case IDR5_OAS_32_BIT: -- 2.16.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 3/7] iommu/arm-smmu-v3: Clean up table definitions
As with registers, use GENMASK and the bitfield accessors consistently for table fields, to save some lines and ease maintenance a little. This also catches a subtle off-by-one wherein bit 5 of CD.T0SZ was missing. Signed-off-by: Robin Murphy--- v3: New drivers/iommu/arm-smmu-v3.c | 147 +--- 1 file changed, 58 insertions(+), 89 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 40a19ce03f99..b1dc7d7cbbb5 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -207,54 +207,46 @@ #define STRTAB_SPLIT 8 #define STRTAB_L1_DESC_DWORDS 1 -#define STRTAB_L1_DESC_SPAN_SHIFT 0 -#define STRTAB_L1_DESC_SPAN_MASK 0x1fUL +#define STRTAB_L1_DESC_SPANGENMASK_ULL(4, 0) #define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(47, 6) #define STRTAB_STE_DWORDS 8 #define STRTAB_STE_0_V (1UL << 0) -#define STRTAB_STE_0_CFG_SHIFT 1 -#define STRTAB_STE_0_CFG_MASK 0x7UL -#define STRTAB_STE_0_CFG_ABORT (0UL << STRTAB_STE_0_CFG_SHIFT) -#define STRTAB_STE_0_CFG_BYPASS(4UL << STRTAB_STE_0_CFG_SHIFT) -#define STRTAB_STE_0_CFG_S1_TRANS (5UL << STRTAB_STE_0_CFG_SHIFT) -#define STRTAB_STE_0_CFG_S2_TRANS (6UL << STRTAB_STE_0_CFG_SHIFT) +#define STRTAB_STE_0_CFG GENMASK_ULL(3, 1) +#define STRTAB_STE_0_CFG_ABORT 0 +#define STRTAB_STE_0_CFG_BYPASS4 +#define STRTAB_STE_0_CFG_S1_TRANS 5 +#define STRTAB_STE_0_CFG_S2_TRANS 6 -#define STRTAB_STE_0_S1FMT_SHIFT 4 -#define STRTAB_STE_0_S1FMT_LINEAR (0UL << STRTAB_STE_0_S1FMT_SHIFT) +#define STRTAB_STE_0_S1FMT GENMASK_ULL(5, 4) +#define STRTAB_STE_0_S1FMT_LINEAR 0 #define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(47, 6) -#define STRTAB_STE_0_S1CDMAX_SHIFT 59 -#define STRTAB_STE_0_S1CDMAX_MASK 0x1fUL +#define STRTAB_STE_0_S1CDMAX GENMASK_ULL(63, 59) #define STRTAB_STE_1_S1C_CACHE_NC 0UL #define STRTAB_STE_1_S1C_CACHE_WBRA1UL #define STRTAB_STE_1_S1C_CACHE_WT 2UL #define STRTAB_STE_1_S1C_CACHE_WB 3UL -#define STRTAB_STE_1_S1C_SH_NSH0UL -#define STRTAB_STE_1_S1C_SH_OSH2UL -#define STRTAB_STE_1_S1C_SH_ISH3UL -#define STRTAB_STE_1_S1CIR_SHIFT 2 -#define STRTAB_STE_1_S1COR_SHIFT 4 -#define STRTAB_STE_1_S1CSH_SHIFT 6 +#define STRTAB_STE_1_S1CIR GENMASK_ULL(3, 2) +#define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4) +#define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6) #define STRTAB_STE_1_S1STALLD (1UL << 27) +#define STRTAB_STE_1_EATS GENMASK_ULL(29, 28) #define STRTAB_STE_1_EATS_ABT 0UL #define STRTAB_STE_1_EATS_TRANS1UL #define STRTAB_STE_1_EATS_S1CHK2UL -#define STRTAB_STE_1_EATS_SHIFT28 +#define STRTAB_STE_1_STRW GENMASK_ULL(31, 30) #define STRTAB_STE_1_STRW_NSEL10UL #define STRTAB_STE_1_STRW_EL2 2UL -#define STRTAB_STE_1_STRW_SHIFT30 +#define STRTAB_STE_1_SHCFG GENMASK_ULL(45, 44) #define STRTAB_STE_1_SHCFG_INCOMING1UL -#define STRTAB_STE_1_SHCFG_SHIFT 44 -#define STRTAB_STE_2_S2VMID_SHIFT 0 -#define STRTAB_STE_2_S2VMID_MASK 0xUL -#define STRTAB_STE_2_VTCR_SHIFT32 -#define STRTAB_STE_2_VTCR_MASK 0x7UL +#define STRTAB_STE_2_S2VMIDGENMASK_ULL(15, 0) +#define STRTAB_STE_2_VTCR GENMASK_ULL(50, 32) #define STRTAB_STE_2_S2AA64(1UL << 51) #define STRTAB_STE_2_S2ENDI(1UL << 52) #define STRTAB_STE_2_S2PTW (1UL << 54) @@ -264,56 +256,41 @@ /* Context descriptor (stage-1 only) */ #define CTXDESC_CD_DWORDS 8 -#define CTXDESC_CD_0_TCR_T0SZ_SHIFT0 -#define ARM64_TCR_T0SZ_SHIFT 0 -#define ARM64_TCR_T0SZ_MASK0x1fUL -#define CTXDESC_CD_0_TCR_TG0_SHIFT 6 -#define ARM64_TCR_TG0_SHIFT14 -#define ARM64_TCR_TG0_MASK 0x3UL -#define CTXDESC_CD_0_TCR_IRGN0_SHIFT 8 -#define ARM64_TCR_IRGN0_SHIFT 8 -#define ARM64_TCR_IRGN0_MASK 0x3UL -#define CTXDESC_CD_0_TCR_ORGN0_SHIFT 10 -#define ARM64_TCR_ORGN0_SHIFT 10 -#define ARM64_TCR_ORGN0_MASK 0x3UL -#define CTXDESC_CD_0_TCR_SH0_SHIFT 12 -#define ARM64_TCR_SH0_SHIFT12 -#define ARM64_TCR_SH0_MASK 0x3UL -#define CTXDESC_CD_0_TCR_EPD0_SHIFT14 -#define ARM64_TCR_EPD0_SHIFT 7 -#define ARM64_TCR_EPD0_MASK0x1UL -#define CTXDESC_CD_0_TCR_EPD1_SHIFT30 -#define ARM64_TCR_EPD1_SHIFT 23 -#define ARM64_TCR_EPD1_MASK0x1UL +#define CTXDESC_CD_0_TCR_T0SZ GENMASK_ULL(5, 0) +#define ARM64_TCR_T0SZ GENMASK_ULL(5, 0) +#define CTXDESC_CD_0_TCR_TG0
[PATCH v3 6/7] iommu/arm-smmu-v3: Support 52-bit physical address
Implement SMMUv3.1 support for 52-bit physical addresses. Since a 52-bit OAS implies 64KB translation granule support, permitting level 1 block entries there is simple, and the rest is just extending address fields. Tested-by: Nate WattersonSigned-off-by: Robin Murphy --- v3: No change drivers/iommu/arm-smmu-v3.c | 35 --- 1 file changed, 20 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 4e0ffda217f6..e0d46661c153 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -91,6 +91,7 @@ #define IDR5_OAS_42_BIT3 #define IDR5_OAS_44_BIT4 #define IDR5_OAS_48_BIT5 +#define IDR5_OAS_52_BIT6 #define ARM_SMMU_CR0 0x20 #define CR0_CMDQEN (1 << 3) @@ -147,7 +148,7 @@ #define ARM_SMMU_STRTAB_BASE 0x80 #define STRTAB_BASE_RA (1UL << 62) -#define STRTAB_BASE_ADDR_MASK GENMASK_ULL(47, 6) +#define STRTAB_BASE_ADDR_MASK GENMASK_ULL(51, 6) #define ARM_SMMU_STRTAB_BASE_CFG 0x88 #define STRTAB_BASE_CFG_FMTGENMASK(17, 16) @@ -175,7 +176,7 @@ #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc /* Common MSI config fields */ -#define MSI_CFG0_ADDR_MASK GENMASK_ULL(47, 2) +#define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2) #define MSI_CFG2_SHGENMASK(5, 4) #define MSI_CFG2_MEMATTR GENMASK(3, 0) @@ -194,7 +195,7 @@ Q_IDX(q, p) * (q)->ent_dwords) #define Q_BASE_RWA (1UL << 62) -#define Q_BASE_ADDR_MASK GENMASK_ULL(47, 5) +#define Q_BASE_ADDR_MASK GENMASK_ULL(51, 5) #define Q_BASE_LOG2SIZEGENMASK(4, 0) /* @@ -209,7 +210,7 @@ #define STRTAB_L1_DESC_DWORDS 1 #define STRTAB_L1_DESC_SPANGENMASK_ULL(4, 0) -#define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(47, 6) +#define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 6) #define STRTAB_STE_DWORDS 8 #define STRTAB_STE_0_V (1UL << 0) @@ -221,7 +222,7 @@ #define STRTAB_STE_0_S1FMT GENMASK_ULL(5, 4) #define STRTAB_STE_0_S1FMT_LINEAR 0 -#define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(47, 6) +#define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(51, 6) #define STRTAB_STE_0_S1CDMAX GENMASK_ULL(63, 59) #define STRTAB_STE_1_S1C_CACHE_NC 0UL @@ -253,7 +254,7 @@ #define STRTAB_STE_2_S2PTW (1UL << 54) #define STRTAB_STE_2_S2R (1UL << 58) -#define STRTAB_STE_3_S2TTB_MASKGENMASK_ULL(47, 4) +#define STRTAB_STE_3_S2TTB_MASKGENMASK_ULL(51, 4) /* Context descriptor (stage-1 only) */ #define CTXDESC_CD_DWORDS 8 @@ -287,7 +288,7 @@ #define CTXDESC_CD_0_ASET (1UL << 47) #define CTXDESC_CD_0_ASID GENMASK_ULL(63, 48) -#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(47, 4) +#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(51, 4) /* Convert between AArch64 (CPU) TCR format and SMMU CD format */ #define ARM_SMMU_TCR2CD(tcr, fld) FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \ @@ -317,7 +318,7 @@ #define CMDQ_TLBI_0_ASID GENMASK_ULL(63, 48) #define CMDQ_TLBI_1_LEAF (1UL << 0) #define CMDQ_TLBI_1_VA_MASKGENMASK_ULL(63, 12) -#define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(47, 12) +#define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(51, 12) #define CMDQ_PRI_0_SSIDGENMASK_ULL(31, 12) #define CMDQ_PRI_0_SID GENMASK_ULL(63, 32) @@ -331,7 +332,7 @@ #define CMDQ_SYNC_0_MSHGENMASK_ULL(23, 22) #define CMDQ_SYNC_0_MSIATTRGENMASK_ULL(27, 24) #define CMDQ_SYNC_0_MSIDATAGENMASK_ULL(63, 32) -#define CMDQ_SYNC_1_MSIADDR_MASK GENMASK_ULL(47, 2) +#define CMDQ_SYNC_1_MSIADDR_MASK GENMASK_ULL(51, 2) /* Event queue */ #define EVTQ_ENT_DWORDS4 @@ -1622,7 +1623,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) return -ENOMEM; domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; - domain->geometry.aperture_end = (1UL << ias) - 1; + domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1; domain->geometry.force_aperture = true; ret = finalise_stage_fn(smmu_domain, _cfg); @@ -2633,11 +2634,6 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR5_GRAN4K) smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G; - if (arm_smmu_ops.pgsize_bitmap == -1UL) - arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap; - else - arm_smmu_ops.pgsize_bitmap |=
[PATCH v3 5/7] iommu/io-pgtable-arm: Support 52-bit physical address
Bring io-pgtable-arm in line with the ARMv8.2-LPA feature allowing 52-bit physical addresses when using the 64KB translation granule. This will be supported by SMMUv3.1. Tested-by: Nate WattersonSigned-off-by: Robin Murphy --- v3: 64K-granule-specific logic in iopte_to_paddr() makes it much more symmetrical with paddr_to_iopte() and generates shorter code. I haven't implemented explicit input masking in paddr_to_iopte() since I realised that if we did ever get an out-of-range address there, it would only make the difference between silently mapping the wrong address and silently mapping a slightly different wrong address. I also went to add a comment about io_pgtable_alloc() updating the cfg, only to find that it's been documented as such from the start. drivers/iommu/io-pgtable-arm.c | 67 ++ 1 file changed, 49 insertions(+), 18 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 51e5c43caed1..a5be4c92c5c8 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -21,6 +21,7 @@ #define pr_fmt(fmt)"arm-lpae io-pgtable: " fmt #include +#include #include #include #include @@ -32,7 +33,7 @@ #include "io-pgtable.h" -#define ARM_LPAE_MAX_ADDR_BITS 48 +#define ARM_LPAE_MAX_ADDR_BITS 52 #define ARM_LPAE_S2_MAX_CONCAT_PAGES 16 #define ARM_LPAE_MAX_LEVELS4 @@ -86,6 +87,8 @@ #define ARM_LPAE_PTE_TYPE_TABLE3 #define ARM_LPAE_PTE_TYPE_PAGE 3 +#define ARM_LPAE_PTE_ADDR_MASK GENMASK_ULL(47,12) + #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63) #define ARM_LPAE_PTE_XN(((arm_lpae_iopte)3) << 53) #define ARM_LPAE_PTE_AF(((arm_lpae_iopte)1) << 10) @@ -159,6 +162,7 @@ #define ARM_LPAE_TCR_PS_42_BIT 0x3ULL #define ARM_LPAE_TCR_PS_44_BIT 0x4ULL #define ARM_LPAE_TCR_PS_48_BIT 0x5ULL +#define ARM_LPAE_TCR_PS_52_BIT 0x6ULL #define ARM_LPAE_MAIR_ATTR_SHIFT(n)((n) << 3) #define ARM_LPAE_MAIR_ATTR_MASK0xff @@ -170,9 +174,7 @@ #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2 /* IOPTE accessors */ -#define iopte_deref(pte,d) \ - (__va((pte) & ((1ULL << ARM_LPAE_MAX_ADDR_BITS) - 1)\ - & ~(ARM_LPAE_GRANULE(d) - 1ULL))) +#define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d)) #define iopte_type(pte,l) \ (((pte) >> ARM_LPAE_PTE_TYPE_SHIFT) & ARM_LPAE_PTE_TYPE_MASK) @@ -184,12 +186,6 @@ (iopte_type(pte,l) == ARM_LPAE_PTE_TYPE_PAGE) : \ (iopte_type(pte,l) == ARM_LPAE_PTE_TYPE_BLOCK)) -#define iopte_to_pfn(pte,d)\ - (((pte) & ((1ULL << ARM_LPAE_MAX_ADDR_BITS) - 1)) >> (d)->pg_shift) - -#define pfn_to_iopte(pfn,d)\ - (((pfn) << (d)->pg_shift) & ((1ULL << ARM_LPAE_MAX_ADDR_BITS) - 1)) - struct arm_lpae_io_pgtable { struct io_pgtable iop; @@ -203,6 +199,27 @@ struct arm_lpae_io_pgtable { typedef u64 arm_lpae_iopte; +static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr, +struct arm_lpae_io_pgtable *data) +{ + arm_lpae_iopte pte = paddr; + + /* Of the bits which overlap, either 51:48 or 15:12 are always RES0 */ + return (pte | (pte >> (48 - 12))) & ARM_LPAE_PTE_ADDR_MASK; +} + +static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte, + struct arm_lpae_io_pgtable *data) +{ + phys_addr_t paddr = pte & ARM_LPAE_PTE_ADDR_MASK; + + if (data->pg_shift < 16) + return paddr; + + /* Rotate the packed high-order bits back to the top */ + return (paddr | (paddr << (48 - 12))) & (ARM_LPAE_PTE_ADDR_MASK << 4); +} + static bool selftest_running = false; static dma_addr_t __arm_lpae_dma_addr(void *pages) @@ -287,7 +304,7 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, pte |= ARM_LPAE_PTE_TYPE_BLOCK; pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS; - pte |= pfn_to_iopte(paddr >> data->pg_shift, data); + pte |= paddr_to_iopte(paddr, data); __arm_lpae_set_pte(ptep, pte, >iop.cfg); } @@ -528,7 +545,7 @@ static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data, if (size == split_sz) unmap_idx = ARM_LPAE_LVL_IDX(iova, lvl, data); - blk_paddr = iopte_to_pfn(blk_pte, data) << data->pg_shift; + blk_paddr = iopte_to_paddr(blk_pte, data); pte = iopte_prot(blk_pte); for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) { @@ -652,12 +669,13 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, found_translation: iova &=
[PATCH v3 4/7] iommu/arm-smmu-v3: Clean up queue definitions
As with registers and tables, use GENMASK and the bitfield accessors consistently for queue fields, to save some lines and ease maintenance a little. This now leaves everything in a nice state where all named field definitions expect to be used with bitfield accessors (although since single-bit fields can still be used directly we leave some of those uses as-is to avoid unnecessary churn), while the few remaining *_MASK definitions apply exclusively to in-place values. Signed-off-by: Robin Murphy--- v3: New drivers/iommu/arm-smmu-v3.c | 126 +++- 1 file changed, 54 insertions(+), 72 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index b1dc7d7cbbb5..4e0ffda217f6 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -184,6 +184,7 @@ #define ARM_SMMU_SH_OSH2 #define ARM_SMMU_SH_ISH3 #define ARM_SMMU_MEMATTR_DEVICE_nGnRE 0x1 +#define ARM_SMMU_MEMATTR_OIWB 0xf #define Q_IDX(q, p)((p) & ((1 << (q)->max_n_shift) - 1)) #define Q_WRP(q, p)((p) & (1 << (q)->max_n_shift)) @@ -301,64 +302,49 @@ #define CMDQ_ERR_CERROR_ILL_IDX1 #define CMDQ_ERR_CERROR_ABT_IDX2 -#define CMDQ_0_OP_SHIFT0 -#define CMDQ_0_OP_MASK 0xffUL +#define CMDQ_0_OP GENMASK_ULL(7, 0) #define CMDQ_0_SSV (1UL << 11) -#define CMDQ_PREFETCH_0_SID_SHIFT 32 -#define CMDQ_PREFETCH_1_SIZE_SHIFT 0 +#define CMDQ_PREFETCH_0_SIDGENMASK_ULL(63, 32) +#define CMDQ_PREFETCH_1_SIZE GENMASK_ULL(4, 0) #define CMDQ_PREFETCH_1_ADDR_MASK GENMASK_ULL(63, 12) -#define CMDQ_CFGI_0_SID_SHIFT 32 -#define CMDQ_CFGI_0_SID_MASK 0xUL +#define CMDQ_CFGI_0_SIDGENMASK_ULL(63, 32) #define CMDQ_CFGI_1_LEAF (1UL << 0) -#define CMDQ_CFGI_1_RANGE_SHIFT0 -#define CMDQ_CFGI_1_RANGE_MASK 0x1fUL +#define CMDQ_CFGI_1_RANGE GENMASK_ULL(4, 0) -#define CMDQ_TLBI_0_VMID_SHIFT 32 -#define CMDQ_TLBI_0_ASID_SHIFT 48 +#define CMDQ_TLBI_0_VMID GENMASK_ULL(47, 32) +#define CMDQ_TLBI_0_ASID GENMASK_ULL(63, 48) #define CMDQ_TLBI_1_LEAF (1UL << 0) #define CMDQ_TLBI_1_VA_MASKGENMASK_ULL(63, 12) #define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(47, 12) -#define CMDQ_PRI_0_SSID_SHIFT 12 -#define CMDQ_PRI_0_SSID_MASK 0xfUL -#define CMDQ_PRI_0_SID_SHIFT 32 -#define CMDQ_PRI_0_SID_MASK0xUL -#define CMDQ_PRI_1_GRPID_SHIFT 0 -#define CMDQ_PRI_1_GRPID_MASK 0x1ffUL -#define CMDQ_PRI_1_RESP_SHIFT 12 -#define CMDQ_PRI_1_RESP_DENY (0UL << CMDQ_PRI_1_RESP_SHIFT) -#define CMDQ_PRI_1_RESP_FAIL (1UL << CMDQ_PRI_1_RESP_SHIFT) -#define CMDQ_PRI_1_RESP_SUCC (2UL << CMDQ_PRI_1_RESP_SHIFT) +#define CMDQ_PRI_0_SSIDGENMASK_ULL(31, 12) +#define CMDQ_PRI_0_SID GENMASK_ULL(63, 32) +#define CMDQ_PRI_1_GRPID GENMASK_ULL(8, 0) +#define CMDQ_PRI_1_RESPGENMASK_ULL(13, 12) -#define CMDQ_SYNC_0_CS_SHIFT 12 -#define CMDQ_SYNC_0_CS_NONE(0UL << CMDQ_SYNC_0_CS_SHIFT) -#define CMDQ_SYNC_0_CS_IRQ (1UL << CMDQ_SYNC_0_CS_SHIFT) -#define CMDQ_SYNC_0_CS_SEV (2UL << CMDQ_SYNC_0_CS_SHIFT) -#define CMDQ_SYNC_0_MSH_SHIFT 22 -#define CMDQ_SYNC_0_MSH_ISH(3UL << CMDQ_SYNC_0_MSH_SHIFT) -#define CMDQ_SYNC_0_MSIATTR_SHIFT 24 -#define CMDQ_SYNC_0_MSIATTR_OIWB (0xfUL << CMDQ_SYNC_0_MSIATTR_SHIFT) -#define CMDQ_SYNC_0_MSIDATA_SHIFT 32 -#define CMDQ_SYNC_0_MSIDATA_MASK 0xUL +#define CMDQ_SYNC_0_CS GENMASK_ULL(13, 12) +#define CMDQ_SYNC_0_CS_NONE0 +#define CMDQ_SYNC_0_CS_IRQ 1 +#define CMDQ_SYNC_0_CS_SEV 2 +#define CMDQ_SYNC_0_MSHGENMASK_ULL(23, 22) +#define CMDQ_SYNC_0_MSIATTRGENMASK_ULL(27, 24) +#define CMDQ_SYNC_0_MSIDATAGENMASK_ULL(63, 32) #define CMDQ_SYNC_1_MSIADDR_MASK GENMASK_ULL(47, 2) /* Event queue */ #define EVTQ_ENT_DWORDS4 #define EVTQ_MAX_SZ_SHIFT 7 -#define EVTQ_0_ID_SHIFT0 -#define EVTQ_0_ID_MASK 0xffUL +#define EVTQ_0_ID GENMASK_ULL(7, 0) /* PRI queue */ #define PRIQ_ENT_DWORDS2 #define PRIQ_MAX_SZ_SHIFT 8 -#define PRIQ_0_SID_SHIFT 0 -#define PRIQ_0_SID_MASK0xUL -#define PRIQ_0_SSID_SHIFT 32 -#define PRIQ_0_SSID_MASK 0xfUL +#define PRIQ_0_SID
[PATCH v3 0/7] SMMUv3 52-bit address support
Hi all, Here's a v3 with most of the additional cleanup as discussed, which makes the diffstat look pretty horrific, but I think it does work out for the better in the end. I did write a further patch converting "(1 << x)" into "BIT(x)", but comes out as a +70/-70 diff of pure churn which somehow manages to make the codegen in arm_smmu_device_probe() worse, so it got git reset --hard into oblivion. Robin. Robin Murphy (7): iommu/arm-smmu-v3: Clean up address masking iommu/arm-smmu-v3: Clean up register definitions iommu/arm-smmu-v3: Clean up table definitions iommu/arm-smmu-v3: Clean up queue definitions iommu/io-pgtable-arm: Support 52-bit physical address iommu/arm-smmu-v3: Support 52-bit physical address iommu/arm-smmu-v3: Support 52-bit virtual address drivers/iommu/arm-smmu-v3.c| 523 ++--- drivers/iommu/io-pgtable-arm.c | 67 -- 2 files changed, 277 insertions(+), 313 deletions(-) -- 2.16.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 2/7] iommu/arm-smmu-v3: Clean up register definitions
The FIELD_{GET,PREP} accessors provided by linux/bitfield.h allow us to define multi-bit register fields solely in terms of their bit positions via GENMASK(), without needing explicit *_SHIFT and *_MASK definitions. As well as the immediate reduction in lines of code, this avoids the awkwardness of values sometimes being pre-shifted and sometimes not, which means we can factor out some common values like memory attributes. Furthermore, it also makes it trivial to verify the definitions against the architecture spec, on which note let's also fix up a few field names to properly match the current release (IHI0070B). Signed-off-by: Robin Murphy--- v3: New drivers/iommu/arm-smmu-v3.c | 174 1 file changed, 77 insertions(+), 97 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index ac437aedc598..40a19ce03f99 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -22,6 +22,7 @@ #include #include +#include #include #include #include @@ -44,18 +45,15 @@ /* MMIO registers */ #define ARM_SMMU_IDR0 0x0 -#define IDR0_ST_LVL_SHIFT 27 -#define IDR0_ST_LVL_MASK 0x3 -#define IDR0_ST_LVL_2LVL (1 << IDR0_ST_LVL_SHIFT) -#define IDR0_STALL_MODEL_SHIFT 24 -#define IDR0_STALL_MODEL_MASK 0x3 -#define IDR0_STALL_MODEL_STALL (0 << IDR0_STALL_MODEL_SHIFT) -#define IDR0_STALL_MODEL_FORCE (2 << IDR0_STALL_MODEL_SHIFT) -#define IDR0_TTENDIAN_SHIFT21 -#define IDR0_TTENDIAN_MASK 0x3 -#define IDR0_TTENDIAN_LE (2 << IDR0_TTENDIAN_SHIFT) -#define IDR0_TTENDIAN_BE (3 << IDR0_TTENDIAN_SHIFT) -#define IDR0_TTENDIAN_MIXED(0 << IDR0_TTENDIAN_SHIFT) +#define IDR0_ST_LVLGENMASK(28, 27) +#define IDR0_ST_LVL_2LVL 1 +#define IDR0_STALL_MODEL GENMASK(25, 24) +#define IDR0_STALL_MODEL_STALL 0 +#define IDR0_STALL_MODEL_FORCE 2 +#define IDR0_TTENDIAN GENMASK(22, 21) +#define IDR0_TTENDIAN_MIXED0 +#define IDR0_TTENDIAN_LE 2 +#define IDR0_TTENDIAN_BE 3 #define IDR0_CD2L (1 << 19) #define IDR0_VMID16(1 << 18) #define IDR0_PRI (1 << 16) @@ -65,10 +63,9 @@ #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) #define IDR0_COHACC(1 << 4) -#define IDR0_TTF_SHIFT 2 -#define IDR0_TTF_MASK 0x3 -#define IDR0_TTF_AARCH64 (2 << IDR0_TTF_SHIFT) -#define IDR0_TTF_AARCH32_64(3 << IDR0_TTF_SHIFT) +#define IDR0_TTF GENMASK(3, 2) +#define IDR0_TTF_AARCH64 2 +#define IDR0_TTF_AARCH32_643 #define IDR0_S1P (1 << 1) #define IDR0_S2P (1 << 0) @@ -76,31 +73,24 @@ #define IDR1_TABLES_PRESET (1 << 30) #define IDR1_QUEUES_PRESET (1 << 29) #define IDR1_REL (1 << 28) -#define IDR1_CMDQ_SHIFT21 -#define IDR1_CMDQ_MASK 0x1f -#define IDR1_EVTQ_SHIFT16 -#define IDR1_EVTQ_MASK 0x1f -#define IDR1_PRIQ_SHIFT11 -#define IDR1_PRIQ_MASK 0x1f -#define IDR1_SSID_SHIFT6 -#define IDR1_SSID_MASK 0x1f -#define IDR1_SID_SHIFT 0 -#define IDR1_SID_MASK 0x3f +#define IDR1_CMDQS GENMASK(25, 21) +#define IDR1_EVTQS GENMASK(20, 16) +#define IDR1_PRIQS GENMASK(15, 11) +#define IDR1_SSIDSIZE GENMASK(10, 6) +#define IDR1_SIDSIZE GENMASK(5, 0) #define ARM_SMMU_IDR5 0x14 -#define IDR5_STALL_MAX_SHIFT 16 -#define IDR5_STALL_MAX_MASK0x +#define IDR5_STALL_MAX GENMASK(31, 16) #define IDR5_GRAN64K (1 << 6) #define IDR5_GRAN16K (1 << 5) #define IDR5_GRAN4K(1 << 4) -#define IDR5_OAS_SHIFT 0 -#define IDR5_OAS_MASK 0x7 -#define IDR5_OAS_32_BIT(0 << IDR5_OAS_SHIFT) -#define IDR5_OAS_36_BIT(1 << IDR5_OAS_SHIFT) -#define IDR5_OAS_40_BIT(2 << IDR5_OAS_SHIFT) -#define IDR5_OAS_42_BIT(3 << IDR5_OAS_SHIFT) -#define IDR5_OAS_44_BIT(4 << IDR5_OAS_SHIFT) -#define IDR5_OAS_48_BIT(5 << IDR5_OAS_SHIFT) +#define IDR5_OAS GENMASK(2, 0) +#define IDR5_OAS_32_BIT0 +#define IDR5_OAS_36_BIT1 +#define IDR5_OAS_40_BIT2
[PATCH v3 1/7] iommu/arm-smmu-v3: Clean up address masking
Before trying to add the SMMUv3.1 support for 52-bit addresses, make things bearable by cleaning up the various address mask definitions to use GENMASK_ULL() consistently. The fact that doing so reveals (and fixes) a latent off-by-one in Q_BASE_ADDR_MASK only goes to show what a jolly good idea it is... Tested-by: Nate WattersonSigned-off-by: Robin Murphy --- v3: No change drivers/iommu/arm-smmu-v3.c | 53 ++--- 1 file changed, 21 insertions(+), 32 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 3f2f1fc68b52..ac437aedc598 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -22,6 +22,7 @@ #include #include +#include #include #include #include @@ -158,8 +159,7 @@ #define ARM_SMMU_STRTAB_BASE 0x80 #define STRTAB_BASE_RA (1UL << 62) -#define STRTAB_BASE_ADDR_SHIFT 6 -#define STRTAB_BASE_ADDR_MASK 0x3ffUL +#define STRTAB_BASE_ADDR_MASK GENMASK_ULL(47, 6) #define ARM_SMMU_STRTAB_BASE_CFG 0x88 #define STRTAB_BASE_CFG_LOG2SIZE_SHIFT 0 @@ -190,8 +190,7 @@ #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc /* Common MSI config fields */ -#define MSI_CFG0_ADDR_SHIFT2 -#define MSI_CFG0_ADDR_MASK 0x3fffUL +#define MSI_CFG0_ADDR_MASK GENMASK_ULL(47, 2) #define MSI_CFG2_SH_SHIFT 4 #define MSI_CFG2_SH_NSH(0UL << MSI_CFG2_SH_SHIFT) #define MSI_CFG2_SH_OSH(2UL << MSI_CFG2_SH_SHIFT) @@ -207,8 +206,7 @@ Q_IDX(q, p) * (q)->ent_dwords) #define Q_BASE_RWA (1UL << 62) -#define Q_BASE_ADDR_SHIFT 5 -#define Q_BASE_ADDR_MASK 0xfffUL +#define Q_BASE_ADDR_MASK GENMASK_ULL(47, 5) #define Q_BASE_LOG2SIZE_SHIFT 0 #define Q_BASE_LOG2SIZE_MASK 0x1fUL @@ -225,8 +223,7 @@ #define STRTAB_L1_DESC_DWORDS 1 #define STRTAB_L1_DESC_SPAN_SHIFT 0 #define STRTAB_L1_DESC_SPAN_MASK 0x1fUL -#define STRTAB_L1_DESC_L2PTR_SHIFT 6 -#define STRTAB_L1_DESC_L2PTR_MASK 0x3ffUL +#define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(47, 6) #define STRTAB_STE_DWORDS 8 #define STRTAB_STE_0_V (1UL << 0) @@ -239,8 +236,7 @@ #define STRTAB_STE_0_S1FMT_SHIFT 4 #define STRTAB_STE_0_S1FMT_LINEAR (0UL << STRTAB_STE_0_S1FMT_SHIFT) -#define STRTAB_STE_0_S1CTXPTR_SHIFT6 -#define STRTAB_STE_0_S1CTXPTR_MASK 0x3ffUL +#define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(47, 6) #define STRTAB_STE_0_S1CDMAX_SHIFT 59 #define STRTAB_STE_0_S1CDMAX_MASK 0x1fUL @@ -278,8 +274,7 @@ #define STRTAB_STE_2_S2PTW (1UL << 54) #define STRTAB_STE_2_S2R (1UL << 58) -#define STRTAB_STE_3_S2TTB_SHIFT 4 -#define STRTAB_STE_3_S2TTB_MASK0xfffUL +#define STRTAB_STE_3_S2TTB_MASKGENMASK_ULL(47, 4) /* Context descriptor (stage-1 only) */ #define CTXDESC_CD_DWORDS 8 @@ -325,8 +320,7 @@ #define CTXDESC_CD_0_ASID_SHIFT48 #define CTXDESC_CD_0_ASID_MASK 0xUL -#define CTXDESC_CD_1_TTB0_SHIFT4 -#define CTXDESC_CD_1_TTB0_MASK 0xfffUL +#define CTXDESC_CD_1_TTB0_MASK GENMASK_ULL(47, 4) #define CTXDESC_CD_3_MAIR_SHIFT0 @@ -351,7 +345,7 @@ #define CMDQ_PREFETCH_0_SID_SHIFT 32 #define CMDQ_PREFETCH_1_SIZE_SHIFT 0 -#define CMDQ_PREFETCH_1_ADDR_MASK ~0xfffUL +#define CMDQ_PREFETCH_1_ADDR_MASK GENMASK_ULL(63, 12) #define CMDQ_CFGI_0_SID_SHIFT 32 #define CMDQ_CFGI_0_SID_MASK 0xUL @@ -362,8 +356,8 @@ #define CMDQ_TLBI_0_VMID_SHIFT 32 #define CMDQ_TLBI_0_ASID_SHIFT 48 #define CMDQ_TLBI_1_LEAF (1UL << 0) -#define CMDQ_TLBI_1_VA_MASK~0xfffUL -#define CMDQ_TLBI_1_IPA_MASK 0xf000UL +#define CMDQ_TLBI_1_VA_MASKGENMASK_ULL(63, 12) +#define CMDQ_TLBI_1_IPA_MASK GENMASK_ULL(47, 12) #define CMDQ_PRI_0_SSID_SHIFT 12 #define CMDQ_PRI_0_SSID_MASK 0xfUL @@ -386,8 +380,7 @@ #define CMDQ_SYNC_0_MSIATTR_OIWB (0xfUL << CMDQ_SYNC_0_MSIATTR_SHIFT) #define CMDQ_SYNC_0_MSIDATA_SHIFT 32 #define CMDQ_SYNC_0_MSIDATA_MASK 0xUL -#define CMDQ_SYNC_1_MSIADDR_SHIFT 0 -#define CMDQ_SYNC_1_MSIADDR_MASK 0xcUL +#define CMDQ_SYNC_1_MSIADDR_MASK GENMASK_ULL(47, 2) /* Event queue */ #define EVTQ_ENT_DWORDS4 @@ -413,8 +406,7 @@ #define PRIQ_1_PRG_IDX_SHIFT 0 #define PRIQ_1_PRG_IDX_MASK0x1ffUL -#define PRIQ_1_ADDR_SHIFT 12 -#define PRIQ_1_ADDR_MASK 0xfUL +#define
Re: Is adding offset to dma address is valid operation?
On 26-03-2018 17:15, Robin Murphy wrote: > Hi Harsh, > > On 26/03/18 10:55, Harsh Jain wrote: >> Hi, >> >> Can we add offset to dma address received from IOMMU in scatter/Gather list >> before sending it to H/W. >> >> Address to HW = sg_dma_address(sg) + offset, where 0 < offset < >> sg_dma_len(sg). > > sg_dma_address() should already account for sg->offset (i.e. the start of the > buffer within the page. If there's a DMA API implementation failing to > respect that then that's a bug. > >> I need this operation to make sure our H/W does not receives entry of size >> greater than 2K. > > If however it's just that you want to fragment a single SG entry into > multiple commaneds/descriptors/etc. internally to your driver, then I can't > see that being an issue as long as you don't visibly modify the scatterlist > itself. If there's a genuine length and/or alignment limitation in the > hardware, though, it would be better to set up dev->dma_parms appropriately > to describe them, such that the subsystem responsible for generating the > scatterlist respects the appropriate constraints in the first place (and if > it doesn't, then that's another bug to fix). Can dma_parms handle 2K size limit ? Because dma_map_sg() returns <= passed nents. But in my case nents will increase. > > Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Is adding offset to dma address is valid operation?
On 26-03-2018 17:15, Robin Murphy wrote: > Hi Harsh, > > On 26/03/18 10:55, Harsh Jain wrote: >> Hi, >> >> Can we add offset to dma address received from IOMMU in scatter/Gather list >> before sending it to H/W. >> >> Address to HW = sg_dma_address(sg) + offset, where 0 < offset < >> sg_dma_len(sg). > > sg_dma_address() should already account for sg->offset (i.e. the start of the > buffer within the page. If there's a DMA API implementation failing to > respect that then that's a bug. Thanks. Here "offset" means size of last dma entry I had mapped to H/W structure. eg if dma length is 9K offset will be 0, 2K, 4K,6K,8K. > >> I need this operation to make sure our H/W does not receives entry of size >> greater than 2K. > > If however it's just that you want to fragment a single SG entry into > multiple commaneds/descriptors/etc. internally to your driver, then I can't > see that being an issue as long as you don't visibly modify the scatterlist > itself. If there's a genuine length and/or alignment limitation in the > hardware, though, it would be better to set up dev->dma_parms appropriately > to describe them, such that the subsystem responsible for generating the > scatterlist respects the appropriate constraints in the first place (and if > it doesn't, then that's another bug to fix). > Yes, I wants to fragment single entry into multiple such that it is less than 2K. eg. if sg->length == 5k , I will split into 3 entries addr1 = sg_dma_address(sg), len =2K addr2 = sg_dma_address(sg) + 2K, len = 2K addr3 = sg_dma_address(sg) +4K, len = 1K > Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Is adding offset to dma address is valid operation?
Hi Harsh, On 26/03/18 10:55, Harsh Jain wrote: Hi, Can we add offset to dma address received from IOMMU in scatter/Gather list before sending it to H/W. Address to HW = sg_dma_address(sg) + offset, where 0 < offset < sg_dma_len(sg). sg_dma_address() should already account for sg->offset (i.e. the start of the buffer within the page. If there's a DMA API implementation failing to respect that then that's a bug. I need this operation to make sure our H/W does not receives entry of size greater than 2K. If however it's just that you want to fragment a single SG entry into multiple commaneds/descriptors/etc. internally to your driver, then I can't see that being an issue as long as you don't visibly modify the scatterlist itself. If there's a genuine length and/or alignment limitation in the hardware, though, it would be better to set up dev->dma_parms appropriately to describe them, such that the subsystem responsible for generating the scatterlist respects the appropriate constraints in the first place (and if it doesn't, then that's another bug to fix). Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Is adding offset to dma address is valid operation?
Hi, Can we add offset to dma address received from IOMMU in scatter/Gather list before sending it to H/W. Address to HW = sg_dma_address(sg) + offset, where 0 < offset < sg_dma_len(sg). I need this operation to make sure our H/W does not receives entry of size greater than 2K. Regards Harsh Jain ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 11/14] iommu/rockchip: Use OF_IOMMU to attach devices automatically
Hi Daniel, Thanks for your reply. On 03/26/2018 02:31 PM, Daniel Kurtz wrote: >+struct rk_iommudata { >+ struct rk_iommu *iommu; >+}; Why do we need this struct? Can't we just assign a pointer to struct rk_iommu directly to dev->archdata.iommu? hmmm, i was trying to add more device related data in patch[13]: struct rk_iommudata { + struct device_link *link; /* runtime PM link from IOMMU to master */ struct rk_iommu *iommu; }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 11/14] iommu/rockchip: Use OF_IOMMU to attach devices automatically
On Fri, Mar 23, 2018 at 1:40 AM Jeffy Chenwrote: > Converts the rockchip-iommu driver to use the OF_IOMMU infrastructure, > which allows attaching master devices to their IOMMUs automatically > according to DT properties. > Signed-off-by: Jeffy Chen > Reviewed-by: Robin Murphy > --- > Changes in v8: None > Changes in v7: None > Changes in v6: None > Changes in v5: None > Changes in v4: None > Changes in v3: > Add struct rk_iommudata. > Squash iommu/rockchip: Use iommu_group_get_for_dev() for add_device > Changes in v2: None > drivers/iommu/rockchip-iommu.c | 135 - > 1 file changed, 40 insertions(+), 95 deletions(-) > diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c > index 7970d21b9858..bd8580b897e9 100644 > --- a/drivers/iommu/rockchip-iommu.c > +++ b/drivers/iommu/rockchip-iommu.c > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -104,6 +105,10 @@ struct rk_iommu { > struct iommu_domain *domain; /* domain to which iommu is attached */ > }; > +struct rk_iommudata { > + struct rk_iommu *iommu; > +}; Why do we need this struct? Can't we just assign a pointer to struct rk_iommu directly to dev->archdata.iommu? > + > static struct device *dma_dev; > static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, > @@ -807,18 +812,9 @@ static size_t rk_iommu_unmap(struct iommu_domain *domain, unsigned long _iova, > static struct rk_iommu *rk_iommu_from_dev(struct device *dev) > { > - struct iommu_group *group; > - struct device *iommu_dev; > - struct rk_iommu *rk_iommu; > + struct rk_iommudata *data = dev->archdata.iommu; > - group = iommu_group_get(dev); > - if (!group) > - return NULL; > - iommu_dev = iommu_group_get_iommudata(group); > - rk_iommu = dev_get_drvdata(iommu_dev); > - iommu_group_put(group); > - > - return rk_iommu; > + return data ? data->iommu : NULL; > } > static int rk_iommu_attach_device(struct iommu_domain *domain, > @@ -989,110 +985,53 @@ static void rk_iommu_domain_free(struct iommu_domain *domain) > iommu_put_dma_cookie(_domain->domain); > } > -static bool rk_iommu_is_dev_iommu_master(struct device *dev) > -{ > - struct device_node *np = dev->of_node; > - int ret; > - > - /* > -* An iommu master has an iommus property containing a list of phandles > -* to iommu nodes, each with an #iommu-cells property with value 0. > -*/ > - ret = of_count_phandle_with_args(np, "iommus", "#iommu-cells"); > - return (ret > 0); > -} > - > -static int rk_iommu_group_set_iommudata(struct iommu_group *group, > - struct device *dev) > +static int rk_iommu_add_device(struct device *dev) > { > - struct device_node *np = dev->of_node; > - struct platform_device *pd; > - int ret; > - struct of_phandle_args args; > + struct iommu_group *group; > + struct rk_iommu *iommu; > - /* > -* An iommu master has an iommus property containing a list of phandles > -* to iommu nodes, each with an #iommu-cells property with value 0. > -*/ > - ret = of_parse_phandle_with_args(np, "iommus", "#iommu-cells", 0, > -); > - if (ret) { > - dev_err(dev, "of_parse_phandle_with_args(%pOF) => %d\n", > - np, ret); > - return ret; > - } > - if (args.args_count != 0) { > - dev_err(dev, "incorrect number of iommu params found for %pOF (found %d, expected 0)\n", > - args.np, args.args_count); > - return -EINVAL; > - } > + iommu = rk_iommu_from_dev(dev); > + if (!iommu) > + return -ENODEV; > - pd = of_find_device_by_node(args.np); > - of_node_put(args.np); > - if (!pd) { > - dev_err(dev, "iommu %pOF not found\n", args.np); > - return -EPROBE_DEFER; > - } > + group = iommu_group_get_for_dev(dev); > + if (IS_ERR(group)) > + return PTR_ERR(group); > + iommu_group_put(group); > - /* TODO(djkurtz): handle multiple slave iommus for a single master */ > - iommu_group_set_iommudata(group, >dev, NULL); > + iommu_device_link(>iommu, dev); > return 0; > } > -static int rk_iommu_add_device(struct device *dev) > +static void rk_iommu_remove_device(struct device *dev) > { > - struct iommu_group *group; > struct rk_iommu *iommu; > - int ret; > - > - if (!rk_iommu_is_dev_iommu_master(dev)) > - return -ENODEV; > - > - group = iommu_group_get(dev); > - if (!group) {