[PATCH] iommu/arm-smmu-qcom: Fix mask extraction for bootloader programmed SMRs
When extracting the mask for an SMR that was programmed by the
bootloader, the SMR's valid bit is also extracted and is treated as
part of the mask, which is not correct. Consider the scenario where
an SMMU master whose context is determined by a bootloader
programmed SMR is removed (omitting parts of device/driver core):

->iommu_release_device()
 -> arm_smmu_release_device()
  -> arm_smmu_master_free_smes()
   -> arm_smmu_free_sme() /* Assume that the SME is now free */
   -> arm_smmu_write_sme()
    -> arm_smmu_write_smr() /* Construct SMR value using mask and SID */

Since the valid bit was considered as part of the mask, the SMR will
be programmed as valid.

Fix the SMR mask extraction step for bootloader programmed SMRs by
masking out the valid bit when we know that we're already working
with a valid SMR.

Fixes: 07a7f2caaa5a ("iommu/arm-smmu-qcom: Read back stream mappings")
Signed-off-by: Isaac J. Manjarres
Cc: sta...@vger.kernel.org
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index bcda170..abb1d2f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -206,6 +206,8 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
 		smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));

 		if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
+			/* Ignore valid bit for SMR mask extraction. */
+			smr &= ~ARM_SMMU_SMR_VALID;
 			smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
 			smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
 			smmu->smrs[i].valid = true;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
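For context on why FIELD_GET() picks up the valid bit here: in the SMR
register layout used by the driver, the 16-bit mask field shares its
top bit with the valid bit. A short illustration, with the field
definitions as found in arm-smmu.h around the time of this patch
(treat the exact bit positions as an assumption):

	#define ARM_SMMU_SMR_VALID	BIT(31)
	#define ARM_SMMU_SMR_MASK	GENMASK(31, 16)	/* overlaps VALID at bit 31 */
	#define ARM_SMMU_SMR_ID		GENMASK(15, 0)

	/*
	 * For a valid, bootloader-programmed SMR, bit 31 is set, and
	 * FIELD_GET() shifts the masked field down by 16, so the valid
	 * bit lands in bit 15 of the extracted mask:
	 *
	 *   mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);  // BIT(15) is set
	 *
	 * arm_smmu_write_smr() later reassembles the SMR from the id and
	 * mask, so the stale bit marks the supposedly freed SMR as valid
	 * again. Clearing ARM_SMMU_SMR_VALID before the extraction, as
	 * this patch does, yields the true mask.
	 */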
[PATCH v2 0/5] Optimize iommu_map_sg() performance
The iommu_map_sg() code currently iterates through the given
scatter-gather list, and in the worst case, invokes iommu_map() for
each element in the scatter-gather list, which calls into the IOMMU
driver through an indirect call. For an IOMMU driver that uses a
format supported by the io-pgtable code, the IOMMU driver will then
call into the io-pgtable code to map the chunk.

Jumping between the IOMMU core code, the IOMMU driver, and the
io-pgtable code and back for each element in a scatter-gather list
is not efficient.

Instead, add a map_sg() hook in both the IOMMU driver ops and the
io-pgtable ops. iommu_map_sg() can then call into the IOMMU driver's
map_sg() hook with the entire scatter-gather list, which can call
into the io-pgtable map_sg() hook, which can process the entire
scatter-gather list, significantly reducing the number of indirect
calls, and jumps between these layers, boosting performance.

On a system that uses the ARM SMMU driver, and the ARM LPAE format,
the current implementation of iommu_map_sg() yields the following
latencies for mapping scatter-gather lists of various sizes. These
latencies are calculated by repeating the mapping operation 10 times:

	size	iommu_map_sg latency
	  4K	    0.624 us
	 64K	    9.468 us
	  1M	  122.557 us
	  2M	  239.807 us
	 12M	 1435.979 us
	 24M	 2884.968 us
	 32M	 3832.979 us

On the same system, the proposed modifications yield the following
results:

	size	iommu_map_sg latency
	  4K	    3.645 us
	 64K	    4.198 us
	  1M	   11.010 us
	  2M	   17.125 us
	 12M	   82.416 us
	 24M	  158.677 us
	 32M	  210.468 us

The procedure for collecting the iommu_map_sg latencies is the same
in both experiments. Clearly, reducing the jumps between the
different layers in the IOMMU code offers a significant performance
boost in iommu_map_sg() latency.

Changes since v1:

-Fixed an off by one error in arm_[lpae/v7s]_map_by_pgsize when
 checking if the IOVA and physical address ranges being mapped are
 within the appropriate limits.
-Added Sai Prakash Ranjan's "Tested-by" tag.

Thanks,
Isaac

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 ++++++
 drivers/iommu/io-pgtable-arm-v7s.c    | 90 +++++++++++++++++++++++++++
 drivers/iommu/io-pgtable-arm.c        | 86 +++++++++++++++++++++
 include/linux/io-pgtable.h            |  6 +++
 include/linux/iommu.h                 | 13 +++++
 drivers/iommu/iommu.c                 | 25 ++++--
 6 files changed, 234 insertions(+), 5 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
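To make the structural change concrete, here is a small userspace
model of the two dispatch strategies (purely illustrative; the struct
and function names below are stand-ins, not kernel APIs):

	#include <stddef.h>

	struct chunk { unsigned long phys; size_t len; };

	struct ops {
		int (*map)(unsigned long iova, unsigned long phys, size_t len);
		int (*map_sg)(unsigned long iova, const struct chunk *sg,
			      unsigned int nents, size_t *mapped);
	};

	/* Current scheme: one indirect call (layer crossing) per element. */
	static int map_sg_per_element(const struct ops *ops, unsigned long iova,
				      const struct chunk *sg, unsigned int nents)
	{
		size_t mapped = 0;

		for (unsigned int i = 0; i < nents; i++) {
			int ret = ops->map(iova + mapped, sg[i].phys, sg[i].len);

			if (ret)
				return ret;
			mapped += sg[i].len;
		}
		return 0;
	}

	/* Proposed scheme: one indirect call hands off the whole list. */
	static int map_sg_batched(const struct ops *ops, unsigned long iova,
				  const struct chunk *sg, unsigned int nents)
	{
		size_t mapped = 0;

		return ops->map_sg(iova, sg, nents, &mapped);
	}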
[PATCH v2 1/5] iommu/io-pgtable: Introduce map_sg() as a page table op
While mapping a scatter-gather list, iommu_map_sg() calls into the
IOMMU driver through an indirect call, which can call into the
io-pgtable code through another indirect call. This sequence of
going through the IOMMU core code, the IOMMU driver, and finally the
io-pgtable code, occurs for every element in the scatter-gather
list, in the worst case, which is not optimal.

Introduce a map_sg callback in the io-pgtable ops so that IOMMU
drivers can invoke it with the complete scatter-gather list, so that
it can be processed within the io-pgtable code entirely, reducing
the number of indirect calls, and boosting overall iommu_map_sg()
performance.

Signed-off-by: Isaac J. Manjarres
Tested-by: Sai Prakash Ranjan
---
 include/linux/io-pgtable.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ea727eb..6d0e731 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -147,6 +147,9 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:          Map a physically contiguous memory region.
+ * @map_sg:       Map a scatter-gather list of physically contiguous memory
+ *                chunks. The mapped pointer argument is used to store how
+ *                many bytes are mapped.
  * @unmap:        Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
  *
@@ -156,6 +159,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
 	int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+	int (*map_sg)(struct io_pgtable_ops *ops, unsigned long iova,
+		      struct scatterlist *sg, unsigned int nents, int prot,
+		      gfp_t gfp, size_t *mapped);
 	size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
 			size_t size, struct iommu_iotlb_gather *gather);
 	phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
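A minimal sketch of how an IOMMU driver would invoke the new op once
it is wired up (this mirrors what patch 5/5 does for arm-smmu; the
GFP_KERNEL flag here is illustrative):

	size_t mapped = 0;
	int ret;

	/* ops is the domain's struct io_pgtable_ops; one call covers the list */
	ret = ops->map_sg(ops, iova, sg, nents, prot, GFP_KERNEL, &mapped);
	if (ret)
		/* mapped reports how many bytes succeeded before the failure */
		pr_err("map_sg failed after mapping %zu bytes\n", mapped);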
[PATCH v2 5/5] iommu/arm-smmu: Hook up map_sg()
Now that everything is in place for iommu_map_sg() to defer mapping
a scatter-gather list to the io-pgtable layer, implement the
map_sg() callback in the SMMU driver, so that iommu_map_sg() can
invoke it with the entire scatter-gather list that will be mapped.

Signed-off-by: Isaac J. Manjarres
Tested-by: Sai Prakash Ranjan
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..52acc68 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1208,6 +1208,24 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 	return ret;
 }

+static int arm_smmu_map_sg(struct iommu_domain *domain, unsigned long iova,
+			   struct scatterlist *sg, unsigned int nents, int prot,
+			   gfp_t gfp, size_t *mapped)
+{
+	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	int ret;
+
+	if (!ops)
+		return -ENODEV;
+
+	arm_smmu_rpm_get(smmu);
+	ret = ops->map_sg(ops, iova, sg, nents, prot, gfp, mapped);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
+}
+
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size, struct iommu_iotlb_gather *gather)
 {
@@ -1624,6 +1642,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.domain_free		= arm_smmu_domain_free,
 	.attach_dev		= arm_smmu_attach_dev,
 	.map			= arm_smmu_map,
+	.map_sg			= arm_smmu_map_sg,
 	.unmap			= arm_smmu_unmap,
 	.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
 	.iotlb_sync		= arm_smmu_iotlb_sync,
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 3/5] iommu/io-pgtable-arm-v7s: Hook up map_sg()
Implement the map_sg io-pgtable op for the ARMv7s io-pgtable code,
so that IOMMU drivers can call it when they need to map a
scatter-gather list.

Signed-off-by: Isaac J. Manjarres
Tested-by: Sai Prakash Ranjan
---
 drivers/iommu/io-pgtable-arm-v7s.c | 90 ++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..8665dab 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -545,6 +545,95 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
 	return ret;
 }

+static int arm_v7s_map_by_pgsize(struct io_pgtable_ops *ops,
+				 unsigned long iova, phys_addr_t paddr,
+				 size_t size, int prot, gfp_t gfp,
+				 size_t *mapped)
+{
+	struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable *iop = &data->iop;
+	struct io_pgtable_cfg *cfg = &iop->cfg;
+	unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+	int ret;
+	size_t pgsize;
+
+	if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n",
+		       iova, &paddr, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	if (WARN_ON((iova + size - 1) >= (1ULL << cfg->ias) ||
+		    (paddr + size - 1) >= (1ULL << cfg->oas)))
+		return -ERANGE;
+
+	while (size) {
+		pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+		ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1,
+				    data->pgd, gfp);
+
+		if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+			io_pgtable_tlb_flush_walk(&data->iop, iova, size,
+						  ARM_V7S_BLOCK_SIZE(2));
+		} else {
+			wmb();
+		}
+
+		if (ret)
+			return ret;
+
+		iova += pgsize;
+		paddr += pgsize;
+		*mapped += pgsize;
+		size -= pgsize;
+	}
+
+	return 0;
+}
+
+static int arm_v7s_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+			  struct scatterlist *sg, unsigned int nents,
+			  int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+	size_t len = 0;
+	unsigned int i = 0;
+	int ret;
+	phys_addr_t start;
+
+	*mapped = 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return 0;
+
+	while (i <= nents) {
+		phys_addr_t s_phys = sg_phys(sg);
+
+		if (len && s_phys != start + len) {
+			ret = arm_v7s_map_by_pgsize(ops, iova + *mapped, start,
+						    len, iommu_prot, gfp,
+						    mapped);
+
+			if (ret)
+				return ret;
+
+			len = 0;
+		}
+
+		if (len) {
+			len += sg->length;
+		} else {
+			len = sg->length;
+			start = s_phys;
+		}
+
+		if (++i < nents)
+			sg = sg_next(sg);
+	}
+
+	return 0;
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
 	struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -783,6 +872,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,

 	data->iop.ops = (struct io_pgtable_ops) {
 		.map		= arm_v7s_map,
+		.map_sg		= arm_v7s_map_sg,
 		.unmap		= arm_v7s_unmap,
 		.iova_to_phys	= arm_v7s_iova_to_phys,
 	};
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
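The arm_v7s_map_sg() loop above batches physically contiguous
scatterlist entries into a single run before handing it to
arm_v7s_map_by_pgsize(). A worked trace of the merge logic, using
hypothetical addresses:

	/*
	 * sg[0]: phys 0x1000, len 0x1000
	 * sg[1]: phys 0x2000, len 0x1000  <- contiguous with sg[0]: merged
	 * sg[2]: phys 0x8000, len 0x1000  <- gap: flush the accumulated run
	 *
	 * Resulting calls:
	 *   arm_v7s_map_by_pgsize(ops, iova,          0x1000, 0x2000, ...);
	 *   arm_v7s_map_by_pgsize(ops, iova + 0x2000, 0x8000, 0x1000, ...);
	 *
	 * The loop deliberately runs to i == nents (one past the last
	 * element) so that the final accumulated run is flushed on the
	 * extra pass: sg_phys() of the last element can no longer equal
	 * start + len, so the "flush" branch is taken.
	 */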
[PATCH v2 4/5] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a callback, the default
implementation of iterating through the scatter-gather list, while
calling iommu_map() will be used.

Signed-off-by: Isaac J. Manjarres
Tested-by: Sai Prakash Ranjan
---
 drivers/iommu/iommu.c | 13 +++++++++++++
 include/linux/iommu.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0da0687..46acd5c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2535,11 +2535,24 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			     struct scatterlist *sg, unsigned int nents, int prot,
 			     gfp_t gfp)
 {
+	const struct iommu_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
 	int ret;

+	if (ops->map_sg) {
+		ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+		if (ops->iotlb_sync_map)
+			ops->iotlb_sync_map(domain);
+
+		if (ret)
+			goto out_err;
+
+		return mapped;
+	}
+
 	while (i <= nents) {
 		phys_addr_t s_phys = sg_phys(sg);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0e40a38..bac7681 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *          an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
 	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+	int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+		      struct scatterlist *sg, unsigned int nents, int prot,
+		      gfp_t gfp, size_t *mapped);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 			size_t size, struct iommu_iotlb_gather *iotlb_gather);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
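From the caller's perspective nothing changes; whether the driver
provides map_sg() is invisible behind iommu_map_sg(). A hedged sketch
of a hypothetical consumer (the sgt and size variables are
assumptions, not from the patch):

	/* sgt is a struct sg_table the caller has already built;
	 * size is the total length it expects to map. */
	size_t mapped = iommu_map_sg(domain, iova, sgt->sgl, sgt->orig_nents,
				     IOMMU_READ | IOMMU_WRITE);

	if (mapped < size) {
		/* iommu_map_sg() returns bytes mapped; short means failure */
		iommu_unmap(domain, iova, mapped);
		return -ENOMEM;
	}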
[PATCH v2 2/5] iommu/io-pgtable-arm: Hook up map_sg()
Implement the map_sg io-pgtable op for the ARM LPAE io-pgtable code,
so that IOMMU drivers can call it when they need to map a
scatter-gather list.

Signed-off-by: Isaac J. Manjarres
Tested-by: Sai Prakash Ranjan
---
 drivers/iommu/io-pgtable-arm.c | 86 ++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c          | 12 +++---
 include/linux/iommu.h          |  8 ++++
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..0c11529 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -473,6 +473,91 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	return ret;
 }

+static int arm_lpae_map_by_pgsize(struct io_pgtable_ops *ops,
+				  unsigned long iova, phys_addr_t paddr,
+				  size_t size, int iommu_prot, gfp_t gfp,
+				  size_t *mapped)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int ret, lvl = data->start_level;
+	arm_lpae_iopte prot = arm_lpae_prot_to_pte(data, iommu_prot);
+	unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+	long iaext = (s64)(iova + size - 1) >> cfg->ias;
+	size_t pgsize;
+
+	if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n",
+		       iova, &paddr, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || (paddr + size - 1) >> cfg->oas))
+		return -ERANGE;
+
+	while (size) {
+		pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+		ret = __arm_lpae_map(data, iova, paddr, pgsize, prot, lvl, ptep,
+				     gfp);
+		if (ret)
+			return ret;
+
+		iova += pgsize;
+		paddr += pgsize;
+		*mapped += pgsize;
+		size -= pgsize;
+	}
+
+	return 0;
+}
+
+static int arm_lpae_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+			   struct scatterlist *sg, unsigned int nents,
+			   int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+
+	size_t len = 0;
+	unsigned int i = 0;
+	int ret;
+	phys_addr_t start;
+
+	*mapped = 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return 0;
+
+	while (i <= nents) {
+		phys_addr_t s_phys = sg_phys(sg);
+
+		if (len && s_phys != start + len) {
+			ret = arm_lpae_map_by_pgsize(ops, iova + *mapped, start,
+						     len, iommu_prot, gfp,
+						     mapped);
+
+			if (ret)
+				return ret;
+
+			len = 0;
+		}
+
+		if (len) {
+			len += sg->length;
+		} else {
+			len = sg->length;
+			start = s_phys;
+		}
+
+		if (++i < nents)
+			sg = sg_next(sg);
+	}
+
+	return 0;
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
 				    arm_lpae_iopte *ptep)
 {
@@ -750,6 +835,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)

 	data->iop.ops = (struct io_pgtable_ops) {
 		.map		= arm_lpae_map,
+		.map_sg		= arm_lpae_map_sg,
 		.unmap		= arm_lpae_unmap,
 		.iova_to_phys	= arm_lpae_iova_to_phys,
 	};
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda..0da0687 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2346,8 +2346,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);

-static size_t iommu_pgsize(struct iommu_domain *domain,
-			   unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(unsigned long pgsize_bitmap, unsigned long addr_merge,
+		    size_t size)
 {
 	unsigned int pgsize_idx;
 	size_t pgsize;
@@ -2366,7 +2366,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 	pgsize = (1UL << (pgsize_idx + 1)) - 1;

 	/* throw away page sizes not supported by the hardware */
-	pgsize &= domain->pgsize_bitmap;
+	pgsize &= pgsize_bitmap;
[PATCH 0/5] Optimize iommu_map_sg() performance
The iommu_map_sg() code currently iterates through the given
scatter-gather list, and in the worst case, invokes iommu_map() for
each element in the scatter-gather list, which calls into the IOMMU
driver through an indirect call. For an IOMMU driver that uses a
format supported by the io-pgtable code, the IOMMU driver will then
call into the io-pgtable code to map the chunk.

Jumping between the IOMMU core code, the IOMMU driver, and the
io-pgtable code and back for each element in a scatter-gather list
is not efficient.

Instead, add a map_sg() hook in both the IOMMU driver ops and the
io-pgtable ops. iommu_map_sg() can then call into the IOMMU driver's
map_sg() hook with the entire scatter-gather list, which can call
into the io-pgtable map_sg() hook, which can process the entire
scatter-gather list, significantly reducing the number of indirect
calls, and jumps between these layers, boosting performance.

On a system that uses the ARM SMMU driver, and the ARM LPAE format,
the current implementation of iommu_map_sg() yields the following
latencies for mapping scatter-gather lists of various sizes. These
latencies are calculated by repeating the mapping operation 10 times:

	size	iommu_map_sg latency
	  4K	    0.624 us
	 64K	    9.468 us
	  1M	  122.557 us
	  2M	  239.807 us
	 12M	 1435.979 us
	 24M	 2884.968 us
	 32M	 3832.979 us

On the same system, the proposed modifications yield the following
results:

	size	iommu_map_sg latency
	  4K	    3.645 us
	 64K	    4.198 us
	  1M	   11.010 us
	  2M	   17.125 us
	 12M	   82.416 us
	 24M	  158.677 us
	 32M	  210.468 us

The procedure for collecting the iommu_map_sg latencies is the same
in both experiments. Clearly, reducing the jumps between the
different layers in the IOMMU code offers a significant performance
boost in iommu_map_sg() latency.

Thanks,
Isaac

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 ++++++
 drivers/iommu/io-pgtable-arm-v7s.c    | 90 +++++++++++++++++++++++++++
 drivers/iommu/io-pgtable-arm.c        | 86 +++++++++++++++++++++
 drivers/iommu/iommu.c                 | 25 ++++--
 include/linux/io-pgtable.h            |  6 +++
 include/linux/iommu.h                 | 13 +++++
 6 files changed, 234 insertions(+), 5 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 5/5] iommu/arm-smmu: Hook up map_sg()
Now that everything is in place for iommu_map_sg() to defer mapping
a scatter-gather list to the io-pgtable layer, implement the
map_sg() callback in the SMMU driver, so that iommu_map_sg() can
invoke it with the entire scatter-gather list that will be mapped.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..52acc68 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1208,6 +1208,24 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 	return ret;
 }

+static int arm_smmu_map_sg(struct iommu_domain *domain, unsigned long iova,
+			   struct scatterlist *sg, unsigned int nents, int prot,
+			   gfp_t gfp, size_t *mapped)
+{
+	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	int ret;
+
+	if (!ops)
+		return -ENODEV;
+
+	arm_smmu_rpm_get(smmu);
+	ret = ops->map_sg(ops, iova, sg, nents, prot, gfp, mapped);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
+}
+
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size, struct iommu_iotlb_gather *gather)
 {
@@ -1624,6 +1642,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.domain_free		= arm_smmu_domain_free,
 	.attach_dev		= arm_smmu_attach_dev,
 	.map			= arm_smmu_map,
+	.map_sg			= arm_smmu_map_sg,
 	.unmap			= arm_smmu_unmap,
 	.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
 	.iotlb_sync		= arm_smmu_iotlb_sync,
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 3/5] iommu/io-pgtable-arm-v7s: Hook up map_sg()
Implement the map_sg io-pgtable op for the ARMv7s io-pgtable code,
so that IOMMU drivers can call it when they need to map a
scatter-gather list.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable-arm-v7s.c | 90 ++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..40d96d2 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -545,6 +545,95 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
 	return ret;
 }

+static int arm_v7s_map_by_pgsize(struct io_pgtable_ops *ops,
+				 unsigned long iova, phys_addr_t paddr,
+				 size_t size, int prot, gfp_t gfp,
+				 size_t *mapped)
+{
+	struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable *iop = &data->iop;
+	struct io_pgtable_cfg *cfg = &iop->cfg;
+	unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+	int ret;
+	size_t pgsize;
+
+	if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n",
+		       iova, &paddr, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	if (WARN_ON((iova + size) >= (1ULL << cfg->ias) ||
+		    (paddr + size) >= (1ULL << cfg->oas)))
+		return -ERANGE;
+
+	while (size) {
+		pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+		ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1,
+				    data->pgd, gfp);
+
+		if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+			io_pgtable_tlb_flush_walk(&data->iop, iova, size,
+						  ARM_V7S_BLOCK_SIZE(2));
+		} else {
+			wmb();
+		}
+
+		if (ret)
+			return ret;
+
+		iova += pgsize;
+		paddr += pgsize;
+		*mapped += pgsize;
+		size -= pgsize;
+	}
+
+	return 0;
+}
+
+static int arm_v7s_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+			  struct scatterlist *sg, unsigned int nents,
+			  int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+	size_t len = 0;
+	unsigned int i = 0;
+	int ret;
+	phys_addr_t start;
+
+	*mapped = 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return 0;
+
+	while (i <= nents) {
+		phys_addr_t s_phys = sg_phys(sg);
+
+		if (len && s_phys != start + len) {
+			ret = arm_v7s_map_by_pgsize(ops, iova + *mapped, start,
+						    len, iommu_prot, gfp,
+						    mapped);
+
+			if (ret)
+				return ret;
+
+			len = 0;
+		}
+
+		if (len) {
+			len += sg->length;
+		} else {
+			len = sg->length;
+			start = s_phys;
+		}
+
+		if (++i < nents)
+			sg = sg_next(sg);
+	}
+
+	return 0;
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
 	struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -783,6 +872,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,

 	data->iop.ops = (struct io_pgtable_ops) {
 		.map		= arm_v7s_map,
+		.map_sg		= arm_v7s_map_sg,
 		.unmap		= arm_v7s_unmap,
 		.iova_to_phys	= arm_v7s_iova_to_phys,
 	};
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 4/5] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a callback, the default
implementation of iterating through the scatter-gather list, while
calling iommu_map() will be used.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/iommu.c | 13 +++++++++++++
 include/linux/iommu.h |  5 +++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0da0687..46acd5c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2535,11 +2535,24 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			     struct scatterlist *sg, unsigned int nents, int prot,
 			     gfp_t gfp)
 {
+	const struct iommu_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
 	int ret;

+	if (ops->map_sg) {
+		ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+		if (ops->iotlb_sync_map)
+			ops->iotlb_sync_map(domain);
+
+		if (ret)
+			goto out_err;
+
+		return mapped;
+	}
+
 	while (i <= nents) {
 		phys_addr_t s_phys = sg_phys(sg);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0e40a38..bac7681 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *          an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
 	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
 	int (*map)(struct iommu_domain *domain, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+	int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+		      struct scatterlist *sg, unsigned int nents, int prot,
+		      gfp_t gfp, size_t *mapped);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 			size_t size, struct iommu_iotlb_gather *iotlb_gather);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 2/5] iommu/io-pgtable-arm: Hook up map_sg()
Implement the map_sg io-pgtable op for the ARM LPAE io-pgtable code,
so that IOMMU drivers can call it when they need to map a
scatter-gather list.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable-arm.c | 86 ++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c          | 12 +++---
 include/linux/iommu.h          |  8 ++++
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..9c17d9d 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -473,6 +473,91 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	return ret;
 }

+static int arm_lpae_map_by_pgsize(struct io_pgtable_ops *ops,
+				  unsigned long iova, phys_addr_t paddr,
+				  size_t size, int iommu_prot, gfp_t gfp,
+				  size_t *mapped)
+{
+	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+	arm_lpae_iopte *ptep = data->pgd;
+	int ret, lvl = data->start_level;
+	arm_lpae_iopte prot = arm_lpae_prot_to_pte(data, iommu_prot);
+	unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+	long iaext = (s64)(iova + size) >> cfg->ias;
+	size_t pgsize;
+
+	if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n",
+		       iova, &paddr, size, min_pagesz);
+		return -EINVAL;
+	}
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || (paddr + size) >> cfg->oas))
+		return -ERANGE;
+
+	while (size) {
+		pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+		ret = __arm_lpae_map(data, iova, paddr, pgsize, prot, lvl, ptep,
+				     gfp);
+		if (ret)
+			return ret;
+
+		iova += pgsize;
+		paddr += pgsize;
+		*mapped += pgsize;
+		size -= pgsize;
+	}
+
+	return 0;
+}
+
+static int arm_lpae_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+			   struct scatterlist *sg, unsigned int nents,
+			   int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+
+	size_t len = 0;
+	unsigned int i = 0;
+	int ret;
+	phys_addr_t start;
+
+	*mapped = 0;
+
+	/* If no access, then nothing to do */
+	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+		return 0;
+
+	while (i <= nents) {
+		phys_addr_t s_phys = sg_phys(sg);
+
+		if (len && s_phys != start + len) {
+			ret = arm_lpae_map_by_pgsize(ops, iova + *mapped, start,
+						     len, iommu_prot, gfp,
+						     mapped);
+
+			if (ret)
+				return ret;
+
+			len = 0;
+		}
+
+		if (len) {
+			len += sg->length;
+		} else {
+			len = sg->length;
+			start = s_phys;
+		}
+
+		if (++i < nents)
+			sg = sg_next(sg);
+	}
+
+	return 0;
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
 				    arm_lpae_iopte *ptep)
 {
@@ -750,6 +835,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)

 	data->iop.ops = (struct io_pgtable_ops) {
 		.map		= arm_lpae_map,
+		.map_sg		= arm_lpae_map_sg,
 		.unmap		= arm_lpae_unmap,
 		.iova_to_phys	= arm_lpae_iova_to_phys,
 	};
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda..0da0687 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2346,8 +2346,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);

-static size_t iommu_pgsize(struct iommu_domain *domain,
-			   unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(unsigned long pgsize_bitmap, unsigned long addr_merge,
+		    size_t size)
 {
 	unsigned int pgsize_idx;
 	size_t pgsize;
@@ -2366,7 +2366,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 	pgsize = (1UL << (pgsize_idx + 1)) - 1;

 	/* throw away page sizes not supported by the hardware */
-	pgsize &= domain->pgsize_bitmap;
+	pgsize &= pgsize_bitmap;

 	/* make sure we're still sane
[PATCH 1/5] iommu/io-pgtable: Introduce map_sg() as a page table op
While mapping a scatter-gather list, iommu_map_sg() calls into the
IOMMU driver through an indirect call, which can call into the
io-pgtable code through another indirect call. This sequence of
going through the IOMMU core code, the IOMMU driver, and finally the
io-pgtable code, occurs for every element in the scatter-gather
list, in the worst case, which is not optimal.

Introduce a map_sg callback in the io-pgtable ops so that IOMMU
drivers can invoke it with the complete scatter-gather list, so that
it can be processed within the io-pgtable code entirely, reducing
the number of indirect calls, and boosting overall iommu_map_sg()
performance.

Signed-off-by: Isaac J. Manjarres
---
 include/linux/io-pgtable.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ea727eb..6d0e731 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -147,6 +147,9 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:          Map a physically contiguous memory region.
+ * @map_sg:       Map a scatter-gather list of physically contiguous memory
+ *                chunks. The mapped pointer argument is used to store how
+ *                many bytes are mapped.
  * @unmap:        Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
  *
@@ -156,6 +159,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
 	int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
 		   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+	int (*map_sg)(struct io_pgtable_ops *ops, unsigned long iova,
+		      struct scatterlist *sg, unsigned int nents, int prot,
+		      gfp_t gfp, size_t *mapped);
 	size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
 			size_t size, struct iommu_iotlb_gather *gather);
 	phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 3/7] iommu/arm-smmu: Add dependency on io-pgtable format modules
The SMMU driver depends on the availability of the ARM LPAE and
ARM V7S io-pgtable format code to work properly. In preparation for
having the io-pgtable formats as modules, add a "pre" dependency
with MODULE_SOFTDEP() to ensure that the io-pgtable format modules
are loaded before loading the ARM SMMU driver module.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..a72649f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2351,3 +2351,4 @@ MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
 MODULE_AUTHOR("Will Deacon ");
 MODULE_ALIAS("platform:arm-smmu");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm io-pgtable-arm-v7s");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 2/7] iommu/io-pgtable: Add refcounting for io-pgtable format modules
In preparation for modularizing io-pgtable formats, add support for
reference counting the io-pgtable format modules to ensure that the
modules are not unloaded while they are in use.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable-arm-v7s.c |  1 +
 drivers/iommu/io-pgtable-arm.c     |  5 +++++
 drivers/iommu/io-pgtable.c         | 12 ++++++++++--
 include/linux/io-pgtable.h         |  2 ++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 89aad2f..a5cb755a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -840,6 +840,7 @@ static struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
 	.fmt	= ARM_V7S,
 	.alloc	= arm_v7s_alloc_pgtable,
 	.free	= arm_v7s_free_pgtable,
+	.owner	= THIS_MODULE,
 };

 #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ff0ea2f..e8b1e34 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -1049,26 +1049,31 @@ static struct io_pgtable_init_fns io_pgtable_arm_lpae_init_fns[] = {
 		.fmt	= ARM_32_LPAE_S1,
 		.alloc	= arm_32_lpae_alloc_pgtable_s1,
 		.free	= arm_lpae_free_pgtable,
+		.owner	= THIS_MODULE,
 	},
 	{
 		.fmt	= ARM_32_LPAE_S2,
 		.alloc	= arm_32_lpae_alloc_pgtable_s2,
 		.free	= arm_lpae_free_pgtable,
+		.owner	= THIS_MODULE,
 	},
 	{
 		.fmt	= ARM_64_LPAE_S1,
 		.alloc	= arm_64_lpae_alloc_pgtable_s1,
 		.free	= arm_lpae_free_pgtable,
+		.owner	= THIS_MODULE,
 	},
 	{
 		.fmt	= ARM_64_LPAE_S2,
 		.alloc	= arm_64_lpae_alloc_pgtable_s2,
 		.free	= arm_lpae_free_pgtable,
+		.owner	= THIS_MODULE,
 	},
 	{
 		.fmt	= ARM_MALI_LPAE,
 		.alloc	= arm_mali_lpae_alloc_pgtable,
 		.free	= arm_lpae_free_pgtable,
+		.owner	= THIS_MODULE,
 	},
 };
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 2c6eb2e..cc83542 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include
 #include
 #include
@@ -52,9 +53,14 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
 	if (!fns)
 		return NULL;

+	if (!try_module_get(fns->owner))
+		return NULL;
+
 	iop = fns->alloc(cfg, cookie);
-	if (!iop)
+	if (!iop) {
+		module_put(fns->owner);
 		return NULL;
+	}

 	iop->fmt	= fmt;
 	iop->cookie	= cookie;
@@ -79,8 +85,10 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
 	iop = io_pgtable_ops_to_pgtable(ops);
 	io_pgtable_tlb_flush_all(iop);
 	fns = io_pgtable_get_init_fns(iop->fmt);
-	if (fns)
+	if (fns) {
 		fns->free(iop);
+		module_put(fns->owner);
+	}
 }
 EXPORT_SYMBOL_GPL(free_io_pgtable_ops);
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 45b367ce..a03b262 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -169,11 +169,13 @@ struct io_pgtable_ops {
  * @fmt:   The page table format.
  * @alloc: Allocate a set of page tables described by cfg.
  * @free:  Free the page tables associated with iop.
+ * @owner: Driver module providing these ops.
  */
 struct io_pgtable_init_fns {
 	enum io_pgtable_fmt fmt;
 	struct io_pgtable *(*alloc)(struct io_pgtable_cfg *cfg, void *cookie);
 	void (*free)(struct io_pgtable *iop);
+	struct module *owner;
 };

 /**
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
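The pattern here is the standard owner-field idiom: the registry pins
the provider module for the lifetime of each allocation. A generic
sketch of the same idiom, with hypothetical names (struct provider
and friends are not from the patch):

	#include <linux/module.h>

	struct provider {
		struct module *owner;	/* set to THIS_MODULE by the provider */
		void *(*alloc)(void);
		void (*free)(void *obj);
	};

	static void *provider_alloc(struct provider *p)
	{
		void *obj;

		/* Fail the allocation rather than let the module vanish. */
		if (!try_module_get(p->owner))
			return NULL;

		obj = p->alloc();
		if (!obj)
			module_put(p->owner);	/* drop the ref on failure */
		return obj;
	}

	static void provider_free(struct provider *p, void *obj)
	{
		p->free(obj);
		module_put(p->owner);	/* last user gone; unload allowed */
	}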
[PATCH v2 6/7] drm/panfrost: Add dependency on io-pgtable-arm format module
The Panfrost DRM driver depends on the availability of the ARM LPAE
io-pgtable format code to work properly. In preparation for having
the io-pgtable formats as modules, add a "pre" dependency with
MODULE_SOFTDEP() to ensure that the io-pgtable-arm format module is
loaded before loading the Panfrost DRM driver module.

Signed-off-by: Isaac J. Manjarres
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 83a461b..7294622 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -704,3 +704,4 @@ module_platform_driver(panfrost_driver);
 MODULE_AUTHOR("Panfrost Project Developers");
 MODULE_DESCRIPTION("Panfrost DRM Driver");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 1/7] iommu/io-pgtable: Introduce dynamic io-pgtable fmt registration
The io-pgtable code constructs an array of init functions for each
page table format at compile time. This is not ideal, as this
increases the footprint of the io-pgtable code, as well as prevents
io-pgtable formats from being built as kernel modules.

In preparation for modularizing the io-pgtable formats, switch to a
dynamic registration scheme, where each io-pgtable format can
register their init functions with the io-pgtable code at boot or
module insertion time.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable-arm-v7s.c | 34 ++++++++-
 drivers/iommu/io-pgtable-arm.c     | 90 ++++++++++++++--------
 drivers/iommu/io-pgtable.c         | 94 +++++++++++++++-------
 include/linux/io-pgtable.h         | 51 +++++++++++--
 4 files changed, 209 insertions(+), 60 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..89aad2f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include
 #include
 #include
@@ -835,7 +836,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	return NULL;
 }

-struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
+static struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
+	.fmt	= ARM_V7S,
 	.alloc	= arm_v7s_alloc_pgtable,
 	.free	= arm_v7s_free_pgtable,
 };
@@ -982,5 +984,33 @@ static int __init arm_v7s_do_selftests(void)
 	pr_info("self test ok\n");
 	return 0;
 }
-subsys_initcall(arm_v7s_do_selftests);
+#else
+static int arm_v7s_do_selftests(void)
+{
+	return 0;
+}
 #endif
+
+static int __init arm_v7s_init(void)
+{
+	int ret;
+
+	ret = io_pgtable_ops_register(&io_pgtable_arm_v7s_init_fns);
+	if (ret < 0) {
+		pr_err("Failed to register ARM V7S format\n");
+		return ret;
+	}
+
+	ret = arm_v7s_do_selftests();
+	if (ret < 0)
+		io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
+
+	return ret;
+}
+core_initcall(arm_v7s_init);
+
+static void __exit arm_v7s_exit(void)
+{
+	io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
+}
+module_exit(arm_v7s_exit);
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..ff0ea2f 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include
 #include
 #include
@@ -1043,29 +1044,32 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 	return NULL;
 }

-struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
-	.alloc	= arm_64_lpae_alloc_pgtable_s1,
-	.free	= arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns = {
-	.alloc	= arm_64_lpae_alloc_pgtable_s2,
-	.free	= arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s1_init_fns = {
-	.alloc	= arm_32_lpae_alloc_pgtable_s1,
-	.free	= arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s2_init_fns = {
-	.alloc	= arm_32_lpae_alloc_pgtable_s2,
-	.free	= arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
-	.alloc	= arm_mali_lpae_alloc_pgtable,
-	.free	= arm_lpae_free_pgtable,
+static struct io_pgtable_init_fns io_pgtable_arm_lpae_init_fns[] = {
+	{
+		.fmt	= ARM_32_LPAE_S1,
+		.alloc	= arm_32_lpae_alloc_pgtable_s1,
+		.free	= arm_lpae_free_pgtable,
+	},
+	{
+		.fmt	= ARM_32_LPAE_S2,
+		.alloc	= arm_32_lpae_alloc_pgtable_s2,
+		.free	= arm_lpae_free_pgtable,
+	},
+	{
+		.fmt	= ARM_64_LPAE_S1,
+		.alloc	= arm_64_lpae_alloc_pgtable_s1,
+		.free	= arm_lpae_free_pgtable,
+	},
+	{
+		.fmt	= ARM_64_LPAE_S2,
+		.alloc	= arm_64_lpae_alloc_pgtable_s2,
+		.free	= arm_lpae_free_pgtable,
+	},
+	{
+		.fmt	= ARM_MALI_LPAE,
+		.alloc	= arm_mali_lpae_alloc_pgtable,
+		.free	= arm_lpae_free_pgtable,
+	},
 };

 #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
@@ -1250,5 +1254,43 @@ static int __init arm_lpae_do_selftests(void)
 	pr_info("selftest: completed with %d PASS %d FAIL\n", pass, fail);
 	return fail ? -EFAULT : 0;
 }
-subsys_initcall(arm_lpae_do_selftests);
+#else
+static int __init arm_lpae_do_selftests(void)
+{
+	return 0;
+}
 #endif
+
+static int __init arm_lpae_init(void)
+{
+	int ret, i;
+
+	for (i = 0; i < ARRAY_SIZE(io_pgtable_arm_lpae_init_fns); i++) {
+		ret = io_pgtable_ops_register(&io_pgtable_arm_lpae_init_fns[i]);
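With the registration API in place, a format provider, including in
principle an out-of-tree one, wires itself up the same way. This is
a hedged sketch with hypothetical names (the my_fmt_* identifiers are
not from the patch; only io_pgtable_ops_register/unregister and
struct io_pgtable_init_fns come from this series):

	#include <linux/io-pgtable.h>
	#include <linux/module.h>

	static struct io_pgtable *my_fmt_alloc(struct io_pgtable_cfg *cfg,
					       void *cookie)
	{
		return NULL;	/* real allocation logic goes here */
	}

	static void my_fmt_free(struct io_pgtable *iop)
	{
	}

	static struct io_pgtable_init_fns my_fmt_init_fns = {
		.fmt	= ARM_64_LPAE_S1, /* a real provider uses its own format ID */
		.alloc	= my_fmt_alloc,
		.free	= my_fmt_free,
	};

	static int __init my_fmt_init(void)
	{
		/* Hand our init functions to the io-pgtable core at load time. */
		return io_pgtable_ops_register(&my_fmt_init_fns);
	}
	module_init(my_fmt_init);

	static void __exit my_fmt_exit(void)
	{
		io_pgtable_ops_unregister(&my_fmt_init_fns);
	}
	module_exit(my_fmt_exit);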
[PATCH v2 4/7] iommu/arm-smmu-v3: Add dependency on io-pgtable-arm format module
The SMMUv3 driver depends on the availability of the ARM LPAE
io-pgtable format code to work properly. In preparation for having
the io-pgtable formats as modules, add a "pre" dependency with
MODULE_SOFTDEP() to ensure that the io-pgtable-arm format module is
loaded before loading the ARM SMMUv3 driver module.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ca7415..c498ac8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3650,3 +3650,4 @@ MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
 MODULE_AUTHOR("Will Deacon ");
 MODULE_ALIAS("platform:arm-smmu-v3");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 7/7] iommu/io-pgtable-arm: Allow building modular io-pgtable fmts
Now that everything is in place for modular io-pgtable formats,
allow the ARM LPAE and ARMV7S io-pgtable formats to be built as
modules.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/Kconfig              | 4 ++--
 drivers/iommu/io-pgtable-arm-v7s.c | 2 ++
 drivers/iommu/io-pgtable-arm.c     | 2 ++
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f..7e4f44f 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -30,7 +30,7 @@ config IOMMU_IO_PGTABLE
 	bool

 config IOMMU_IO_PGTABLE_LPAE
-	bool "ARMv7/v8 Long Descriptor Format"
+	tristate "ARMv7/v8 Long Descriptor Format"
 	select IOMMU_IO_PGTABLE
 	depends on ARM || ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
 	help
@@ -49,7 +49,7 @@ config IOMMU_IO_PGTABLE_LPAE_SELFTEST
 	  If unsure, say N here.

 config IOMMU_IO_PGTABLE_ARMV7S
-	bool "ARMv7/v8 Short Descriptor Format"
+	tristate "ARMv7/v8 Short Descriptor Format"
 	select IOMMU_IO_PGTABLE
 	depends on ARM || ARM64 || COMPILE_TEST
 	help
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index a5cb755a..9d9f08f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -1015,3 +1015,5 @@ static void __exit arm_v7s_exit(void)
 	io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
 }
 module_exit(arm_v7s_exit);
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index e8b1e34..e0de4ad 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -1299,3 +1299,5 @@ static void __exit arm_lpae_exit(void)
 	io_pgtable_ops_unregister(&io_pgtable_arm_lpae_init_fns[i]);
 }
 module_exit(arm_lpae_exit);
+
+MODULE_LICENSE("GPL v2");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[RFC PATCH v2 0/7] iommu: Permit modular builds of io-pgtable drivers
The goal of the Generic Kernel Image (GKI) effort is to have a
common kernel image that works across multiple Android devices. This
involves generating a kernel image that has core features integrated
into it, while SoC specific functionality can be added to the kernel
for the device as a module.

Along with modularizing IOMMU drivers, this also means building the
io-pgtable code as modules, which allows for SoC vendors to only
include the io-pgtable implementations that they use. For example,
GKI for arm64 must include support for both the IOMMU ARM LPAE/V7S
formats at the moment. Having the code for both formats as modules
allows SoC vendors to only provide the page table format that they
use, along with their IOMMU driver.

Main changes since v1:

1) Retain io-pgtable.c as part of the core kernel

The patches are split into 4 parts:

1) Modularizing io-pgtable-arm[-v7s].c, while leaving the
   io-pgtable.c code as part of the core kernel, requires removing
   the references to the ARM LPAE and ARM V7S io-pgtable init
   functions, and using a dynamic method for formats to register
   their io-pgtable init functions.

   The reason for defining an io_pgtable_init_fns_node structure is
   to not have the data structures used to store the init functions
   seep into the io-pgtable fmt drivers. Doing so allows for
   changing the internal data structure used to keep track of the
   init functions, without impacting the client data structures.

2) Taking references to the io-pgtable format drivers to ensure that
   they cannot be unloaded while in use.

3) Adding pre MODULE_SOFTDEP() dependencies to drivers in the kernel
   that are tristate, and invoke [alloc/free]_io_pgtable_ops(). This
   makes it so that the io-pgtable format drivers are loaded before
   the driver that needs them.

4) Changing the Kconfig options for the ARM LPAE and ARM V7S formats
   to tristate.

Thanks in advance for the feedback,

Isaac J. Manjarres

Isaac J. Manjarres (7):
  iommu/io-pgtable: Introduce dynamic io-pgtable fmt registration
  iommu/io-pgtable: Add refcounting for io-pgtable format modules
  iommu/arm-smmu: Add dependency on io-pgtable format modules
  iommu/arm-smmu-v3: Add dependency on io-pgtable-arm format module
  drm/msm: Add dependency on io-pgtable-arm format module
  drm/panfrost: Add dependency on io-pgtable-arm format module
  iommu/io-pgtable-arm: Allow building modular io-pgtable fmts

 drivers/gpu/drm/msm/msm_drv.c               |   1 +
 drivers/gpu/drm/panfrost/panfrost_drv.c     |   1 +
 drivers/iommu/Kconfig                       |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   1 +
 drivers/iommu/io-pgtable-arm-v7s.c          |  37 +++++-
 drivers/iommu/io-pgtable-arm.c              |  97 +++++++-----
 drivers/iommu/io-pgtable.c                  | 104 ++++++++---
 include/linux/io-pgtable.h                  |  53 +++++-
 9 files changed, 236 insertions(+), 63 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 5/7] drm/msm: Add dependency on io-pgtable-arm format module
The MSM DRM driver depends on the availability of the ARM LPAE
io-pgtable format code to work properly. In preparation for having
the io-pgtable formats as modules, add a "pre" dependency with
MODULE_SOFTDEP() to ensure that the io-pgtable-arm format module is
loaded before loading the MSM DRM driver module.

Signed-off-by: Isaac J. Manjarres
---
 drivers/gpu/drm/msm/msm_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 535a026..8be3506 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1369,3 +1369,4 @@ module_exit(msm_drm_unregister);

 MODULE_AUTHOR("Rob Clark
[PATCH 3/3] iommu/io-pgtable: Allow building as a module
Now that all of the required symbols have been exported, and the
io-pgtable code can correctly refer to the io-pgtable init functions
when their source files are built as modules, allow the io-pgtable
code to be built as a module. The expectation is that the io-pgtable
core code, along with the descriptor formats (either or both of ARM
LPAE and ARMv7S), can be built as modules.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f..d7de6db 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -27,10 +27,10 @@ menu "Generic IOMMU Pagetable Support"

 # Selected by the actual pagetable implementations
 config IOMMU_IO_PGTABLE
-	bool
+	tristate

 config IOMMU_IO_PGTABLE_LPAE
-	bool "ARMv7/v8 Long Descriptor Format"
+	tristate "ARMv7/v8 Long Descriptor Format"
 	select IOMMU_IO_PGTABLE
 	depends on ARM || ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
 	help
@@ -49,7 +49,7 @@ config IOMMU_IO_PGTABLE_LPAE_SELFTEST
 	  If unsure, say N here.

 config IOMMU_IO_PGTABLE_ARMV7S
-	bool "ARMv7/v8 Short Descriptor Format"
+	tristate "ARMv7/v8 Short Descriptor Format"
 	select IOMMU_IO_PGTABLE
 	depends on ARM || ARM64 || COMPILE_TEST
 	help
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 2/3] iommu/io-pgtable: Prepare for modularization
The io-pgtable source file uses the #ifdef preprocessor macro to
construct the io_pgtable_init_table structure. However, #ifdef
evaluates to true only if the config it is testing is set to y; a
config set to m defines a different preprocessor symbol. This is not
ideal when the configs that the io-pgtable code checks for can be m,
so use IS_ENABLED() instead, which covers both cases.

Also, add the GPL v2 module license to the io-pgtable source file.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 94394c8..867ecb4 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -10,18 +10,19 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include

 static const struct io_pgtable_init_fns *
 io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
-#ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE
+#if IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_LPAE)
 	[ARM_32_LPAE_S1] = &io_pgtable_arm_32_lpae_s1_init_fns,
 	[ARM_32_LPAE_S2] = &io_pgtable_arm_32_lpae_s2_init_fns,
 	[ARM_64_LPAE_S1] = &io_pgtable_arm_64_lpae_s1_init_fns,
 	[ARM_64_LPAE_S2] = &io_pgtable_arm_64_lpae_s2_init_fns,
 	[ARM_MALI_LPAE] = &io_pgtable_arm_mali_lpae_init_fns,
 #endif
-#ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S
+#if IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S)
 	[ARM_V7S] = &io_pgtable_arm_v7s_init_fns,
 #endif
 };
@@ -68,3 +69,5 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
 	io_pgtable_init_table[iop->fmt]->free(iop);
 }
 EXPORT_SYMBOL_GPL(free_io_pgtable_ops);
+
+MODULE_LICENSE("GPL v2");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
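The distinction matters because a tristate option set to m defines
CONFIG_FOO_MODULE for the preprocessor rather than CONFIG_FOO. A
small illustration (CONFIG_FOO is a stand-in name, not a real
option):

	/* With CONFIG_FOO=m in .config, the build defines CONFIG_FOO_MODULE: */

	#ifdef CONFIG_FOO
	/* compiled only when CONFIG_FOO=y -- silently skipped for =m */
	#endif

	#if IS_ENABLED(CONFIG_FOO)
	/* compiled for both CONFIG_FOO=y and CONFIG_FOO=m */
	#endif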
[RFC PATCH 0/3] iommu: Permit modular builds of io-pgtable drivers
The goal of the Generic Kernel Image (GKI) effort is to have a
common image that works across multiple Android devices. This
involves generating a kernel image that has core features integrated
into it, while SoC specific functionality can be added to the kernel
for the device as a module.

Along with modularizing IOMMU drivers, this also means building the
io-pgtable code as modules, which allows for SoC vendors to only
include the io-pgtable implementations that they use. For example,
GKI for arm64 must include support for both the IOMMU ARM LPAE/V7S
formats at the moment. Having the code for both formats as modules
allows SoC vendors to only provide the page table format that they
use, along with their IOMMU driver.

Modularizing both io-pgtable.c, as well as the io-pgtable-arm[-v7s].c
files, works out rather nicely, as the main interface that clients
use to interact with the page tables is already exported (i.e.
alloc_io_pgtable_ops and free_io_pgtable_ops). It also makes it so
that neither the io-pgtable-arm[-v7s] modules nor the io-pgtable
modules can be unloaded without unloading the IOMMU driver, which
can only happen when there aren't any references to the IOMMU driver
module.

Thanks in advance for the feedback,

Isaac J. Manjarres

Isaac J. Manjarres (3):
  iommu/io-pgtable-arm: Prepare for modularization
  iommu/io-pgtable: Prepare for modularization
  iommu/io-pgtable: Allow building as a module

 drivers/iommu/Kconfig              | 6 +++---
 drivers/iommu/io-pgtable-arm-v7s.c | 4 ++++
 drivers/iommu/io-pgtable-arm.c     | 8 ++++++++
 drivers/iommu/io-pgtable.c         | 7 +++++--
 4 files changed, 20 insertions(+), 5 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH 1/3] iommu/io-pgtable-arm: Prepare for modularization
The io-pgtable-arm and io-pgtable-arm-v7s source files will be
compiled as separate modules, along with the io-pgtable source.
Export the symbols for the io-pgtable init function structures for
the io-pgtable module to use.

Signed-off-by: Isaac J. Manjarres
---
 drivers/iommu/io-pgtable-arm-v7s.c | 4 ++++
 drivers/iommu/io-pgtable-arm.c     | 8 ++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..f062c1c 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include
 #include
 #include
@@ -839,6 +840,7 @@ struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
 	.alloc	= arm_v7s_alloc_pgtable,
 	.free	= arm_v7s_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_v7s_init_fns);

 #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
@@ -984,3 +986,5 @@ static int __init arm_v7s_do_selftests(void)
 }
 subsys_initcall(arm_v7s_do_selftests);
 #endif
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..2623d57 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <linux/module.h>
 #include
 #include
 #include
@@ -1047,26 +1048,31 @@ struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
 	.alloc	= arm_64_lpae_alloc_pgtable_s1,
 	.free	= arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_64_lpae_s1_init_fns);

 struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns = {
 	.alloc	= arm_64_lpae_alloc_pgtable_s2,
 	.free	= arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_64_lpae_s2_init_fns);

 struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s1_init_fns = {
 	.alloc	= arm_32_lpae_alloc_pgtable_s1,
 	.free	= arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_32_lpae_s1_init_fns);

 struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s2_init_fns = {
 	.alloc	= arm_32_lpae_alloc_pgtable_s2,
 	.free	= arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_32_lpae_s2_init_fns);

 struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
 	.alloc	= arm_mali_lpae_alloc_pgtable,
 	.free	= arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_mali_lpae_init_fns);

 #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
@@ -1252,3 +1258,5 @@ static int __init arm_lpae_do_selftests(void)
 }
 subsys_initcall(arm_lpae_do_selftests);
 #endif
+
+MODULE_LICENSE("GPL v2");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[RFC PATCH] fork: Free per-cpu cached vmalloc'ed thread stacks with vfree_atomic()
The per-cpu cached vmalloc'ed stacks are currently freed in the CPU hotplug teardown path by the free_vm_stack_cache() callback, which invokes vfree(), which may result in purging the list of lazily freed vmap areas. Purging all of the lazily freed vmap areas can take a long time when the list of vmap areas is large.

This is problematic, because free_vm_stack_cache() is invoked before the offline CPU's timers are migrated. A long purge therefore delays timer migration in the CPU hotplug teardown path, and timer callbacks may be invoked long after their timers have expired.

For example, on a system that has only one online CPU (CPU 1) that is running a heavy workload, and another CPU that is being offlined, the online CPU will invoke free_vm_stack_cache() to free the cached vmalloc'ed stacks for the CPU being offlined. When there are 2702 vmap areas that total 13498 pages, free_vm_stack_cache() takes over 2 seconds to execute:

[001]   399.335808: cpuhp_enter: cpu: 0005 target: 0 step: 67 (free_vm_stack_cache)

/* The first vmap area to be freed */
[001]   399.337157: __purge_vmap_area_lazy: [0:2702] 0xffc033da8000 - 0xffc033dad000 (5 : 13498)

/* After two seconds */
[001]   401.528010: __purge_vmap_area_lazy: [1563:2702] 0xffc02fe1 - 0xffc02fe15000 (5 : 5765)

Instead of freeing the per-cpu cached vmalloc'ed stacks synchronously with respect to the CPU hotplug teardown state machine, free them asynchronously to help move the CPU hotplug teardown state machine along quickly.

Signed-off-by: Isaac J. Manjarres
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 4d32190..68346a0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -202,7 +202,7 @@ static int free_vm_stack_cache(unsigned int cpu)
                 if (!vm_stack)
                         continue;
 
-                vfree(vm_stack->addr);
+                vfree_atomic(vm_stack->addr);
                 cached_vm_stacks[i] = NULL;
         }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
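[Editorial note: vfree_atomic() exists precisely for callers that cannot afford vfree()'s potential purge work in the current context. Below is a simplified sketch of the deferred-free pattern it relies on; the names are illustrative and not the exact mm/vmalloc.c internals.]

        #include <linux/llist.h>
        #include <linux/vmalloc.h>
        #include <linux/workqueue.h>

        /*
         * Illustrative stand-ins for the per-cpu deferred-free state.
         * deferred_work must be set up once with
         * INIT_WORK(&deferred_work, deferred_free_work).
         */
        static struct llist_head deferred_list;
        static struct work_struct deferred_work;

        static void deferred_free_work(struct work_struct *work)
        {
                struct llist_node *node, *tmp;

                /* Workqueue context: the expensive free/purge can run here. */
                llist_for_each_safe(node, tmp, llist_del_all(&deferred_list))
                        vfree((void *)node);
        }

        /*
         * Safe from contexts that must not spend long in vfree(): reuse the
         * start of the freed area as a lock-free list node and let the
         * workqueue do the real work later.
         */
        static void my_vfree_deferred(void *addr)
        {
                if (addr && llist_add((struct llist_node *)addr, &deferred_list))
                        schedule_work(&deferred_work); /* first entry arms the work */
        }

The key property for the hotplug path above is that the caller returns immediately; the purge cost moves off the CPU hotplug teardown state machine entirely.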
[PATCH] mm/usercopy: Use memory range to be accessed for wraparound check
Currently, when checking to see if accessing n bytes starting at address "ptr" will cause a wraparound in the memory addresses, the check in check_bogus_address() adds an extra byte, which is incorrect, as the range of addresses that will be accessed is [ptr, ptr + (n - 1)]. This can lead to incorrectly detecting a wraparound in the memory address when trying to read 4 KB from memory that is mapped to the last possible page in the virtual address space, when in fact, accessing that range of memory would not cause a wraparound to occur.

Use the memory range that will actually be accessed when considering if accessing a certain amount of bytes will cause the memory address to wrap around.

Fixes: f5509cc18daa ("mm: Hardened usercopy")
Co-developed-by: Prasad Sodagudi
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
Cc: sta...@vger.kernel.org
Reviewed-by: William Kucharski
Acked-by: Kees Cook
---
 mm/usercopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 2a09796..98e92486 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(const unsigned long ptr, unsigned long n,
                                        bool to_user)
 {
         /* Reject if object wraps past end of memory. */
-        if (ptr + n < ptr)
+        if (ptr + (n - 1) < ptr)
                 usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
 
         /* Reject if NULL or ZERO-allocation. */

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
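[Editorial note: to make the off-by-one concrete, here is a small stand-alone demonstration of both checks at the very top of the address space. This is plain userspace C, not kernel code.]

        #include <stdio.h>
        #include <limits.h>

        #define PAGE_SIZE 4096UL

        int main(void)
        {
                /*
                 * An object occupying exactly the last page of the address
                 * space: the bytes accessed are [ptr, ptr + (n - 1)], i.e.
                 * [ptr, ULONG_MAX], which does not wrap.
                 */
                unsigned long ptr = ULONG_MAX - PAGE_SIZE + 1;
                unsigned long n = PAGE_SIZE;

                /* Old check: ptr + n wraps to 0, falsely reporting a wrap. */
                printf("old check flags a wrap: %d\n", ptr + n < ptr);

                /* Fixed check: ptr + (n - 1) == ULONG_MAX, so no wrap. */
                printf("new check flags a wrap: %d\n", ptr + (n - 1) < ptr);

                return 0;
        }

Unsigned overflow is well defined in C, so the comparison itself is portable; the old check simply compared against one byte past the last byte actually accessed.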
[PATCH] mm/usercopy: Use memory range to be accessed for wraparound check
Currently, when checking to see if accessing n bytes starting at address "ptr" will cause a wraparound in the memory addresses, the check in check_bogus_address() adds an extra byte, which is incorrect, as the range of addresses that will be accessed is [ptr, ptr + (n - 1)]. This can lead to incorrectly detecting a wraparound in the memory address when trying to read 4 KB from memory that is mapped to the last possible page in the virtual address space, when in fact, accessing that range of memory would not cause a wraparound to occur.

Use the memory range that will actually be accessed when considering if accessing a certain amount of bytes will cause the memory address to wrap around.

Change-Id: I2563a5988e41122727ede17180f365e999b953e6
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Co-Developed-by: Prasad Sodagudi
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
Cc: sta...@vger.kernel.org
---
 mm/usercopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 852eb4e..0293645 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -151,7 +151,7 @@ static inline void check_bogus_address(const unsigned long ptr, unsigned long n,
                                        bool to_user)
 {
         /* Reject if object wraps past end of memory. */
-        if (ptr + n < ptr)
+        if (ptr + (n - 1) < ptr)
                 usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
 
         /* Reject if NULL or ZERO-allocation. */

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH] stop_machine: Atomically queue and wake stopper threads
From: Prasad Sodagudi

When cpu_stop_queue_work() releases the lock for the stopper thread that was queued into its wake queue, preemption is enabled, which leads to the following deadlock:

CPU0                              CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)              stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)       cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                  -spins with preemption disabled,
                                   waiting for migration/0's lock
                                   to be released-

-adds work items for migration/0
 and queues migration/0 to its
 wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
 and __set_cpus_allowed_ptr
 has changed the thread's
 cpus allowed mask to CPU1 only-

                                  -acquires migration/0 and migration/1's
                                   locks-

                                  -adds work for migration/0 but does not
                                   add migration/0 to wake_q, since it is
                                   already in a wake_q-

                                  -adds work for migration/1 and adds
                                   migration/1 to its wake_q-

                                  -releases migration/0 and migration/1's
                                   locks, wakes migration/1, and enables
                                   preemption-

                                  -since migration/1 is requested to run,
                                   migration/1 begins to run and waits on
                                   migration/0, but migration/0 will never
                                   be able to run, since the thread that
                                   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing the work for the stopper thread and adding the stopper thread to the wake queue, to ensure that queueing the work and waking the stopper thread happen atomically.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 34b6652..067cb83 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -81,6 +81,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
         unsigned long flags;
         bool enabled;
 
+        preempt_disable();
         raw_spin_lock_irqsave(&stopper->lock, flags);
         enabled = stopper->enabled;
         if (enabled)
@@ -90,6 +91,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
         raw_spin_unlock_irqrestore(&stopper->lock, flags);
 
         wake_up_q(&wakeq);
+        preempt_enable();
 
         return enabled;
 }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
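[Editorial note: two properties of the wake_q API drive this deadlock: a task can sit on only one wake queue at a time, and wake_up_q() wakes tasks in the order they were added. A minimal sketch of the API follows; the task pointers are placeholders for the migration/N stopper kthreads.]

        #include <linux/sched/wake_q.h>

        static void queue_then_wake(struct task_struct *task1,
                                    struct task_struct *task2)
        {
                DEFINE_WAKE_Q(wakeq);

                /*
                 * wake_q_add() is a no-op for a task already on some wake
                 * queue, which is why CPU1 above cannot re-queue migration/0.
                 */
                wake_q_add(&wakeq, task1);
                wake_q_add(&wakeq, task2);

                /*
                 * Wakes in FIFO order; a woken stopper may preempt this
                 * thread immediately unless preemption is disabled around
                 * the call.
                 */
                wake_up_q(&wakeq);
        }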
[tip:sched/core] stop_machine: Disable preemption after queueing stopper threads
Commit-ID:  2610e88946632afb78aa58e61f11368ac4c0af7b
Gitweb:     https://git.kernel.org/tip/2610e88946632afb78aa58e61f11368ac4c0af7b
Author:     Isaac J. Manjarres
AuthorDate: Tue, 17 Jul 2018 12:35:29 -0700
Committer:  Ingo Molnar
CommitDate: Wed, 25 Jul 2018 11:25:08 +0200

stop_machine: Disable preemption after queueing stopper threads

This commit:

  9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")

does not fully address the race condition that can occur as follows:

On one CPU, call it CPU 3, thread 1 invokes cpu_stop_queue_two_works(2, 3,...), and the execution is such that thread 1 queues the works for migration/2 and migration/3, and is preempted after releasing the locks for migration/2 and migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running, and it invokes cpu_stop_queue_two_works(1, 2,...), such that thread 2 queues the works for migration/1 and migration/2. Meanwhile, on CPU 3, thread 1 resumes execution, and wakes migration/2 and migration/3. This means that when CPU 2 releases the locks for migration/1 and migration/2, but before it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will execute the first work item successfully, since migration/3 was woken up by CPU 3, but when it goes to execute the second work item, it disables preemption, calls multi_cpu_stop(), and thus, CPU 2 will wait forever for migration/1, which should have been woken up by thread 2. However, migration/1 cannot be woken up by thread 2, since it is a kworker, so it is affine to CPU 2, but CPU 2 is running migration/2 with preemption disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads to ensure that the operation of queueing the works and waking the stopper threads is atomic.

Co-Developed-by: Prasad Sodagudi
Co-Developed-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Pavankumar Kondeti
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: bige...@linutronix.de
Cc: gre...@linuxfoundation.org
Cc: m...@codeblueprint.co.uk
Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isa...@codeaurora.org
Signed-off-by: Ingo Molnar
---
 kernel/stop_machine.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523dae6e2..e190d1ef3a23 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ retry:
         err = 0;
         __cpu_stop_queue_work(stopper1, work1, &wakeq);
         __cpu_stop_queue_work(stopper2, work2, &wakeq);
+        /*
+         * The waking up of stopper threads has to happen
+         * in the same scheduling context as the queueing.
+         * Otherwise, there is a possibility of one of the
+         * above stoppers being woken up by another CPU,
+         * and preempting us. This will cause us to not
+         * wake up the other stopper forever.
+         */
+        preempt_disable();
 unlock:
         raw_spin_unlock(&stopper2->lock);
         raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ unlock:
         }
 
         if (!err) {
-                preempt_disable();
                 wake_up_q(&wakeq);
                 preempt_enable();
         }
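[Editorial note: the stall happens inside the multi_cpu_stop() rendezvous that the message refers to. Below is a greatly simplified, self-contained sketch of that state machine; the types and names are illustrative, not the actual kernel/stop_machine.c definitions.]

        #include <linux/atomic.h>
        #include <linux/processor.h>

        enum stop_state { STOP_NONE, STOP_PREPARE, STOP_RUN, STOP_EXIT };

        struct stop_data {
                enum stop_state state;          /* advanced by the last acker */
                atomic_t        thread_ack;
                unsigned int    num_threads;
        };

        /*
         * Each stopper spins, with preemption disabled, until every
         * participant acks the current state. If one stopper is never
         * woken, the acks never complete and the rest spin here forever.
         */
        static void stop_rendezvous(struct stop_data *d)
        {
                enum stop_state curstate = STOP_NONE;

                do {
                        cpu_relax();
                        if (READ_ONCE(d->state) != curstate) {
                                curstate = READ_ONCE(d->state);
                                /* ... per-state work: disable IRQs, run fn ... */
                                if (atomic_dec_and_test(&d->thread_ack)) {
                                        /* Last CPU to ack advances the state. */
                                        atomic_set(&d->thread_ack, d->num_threads);
                                        WRITE_ONCE(d->state, curstate + 1);
                                }
                        }
                } while (curstate != STOP_EXIT);
        }

In the real multi_cpu_stop() the states are MULTI_STOP_PREPARE, MULTI_STOP_DISABLE_IRQ, MULTI_STOP_RUN and MULTI_STOP_EXIT, but the structure of the ack loop is the same.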
[PATCH] stop_machine: Disable preemption after queueing stopper threads
This commit:

  9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")

does not fully address the race condition that can occur as follows:

On one CPU, call it CPU 3, thread 1 invokes cpu_stop_queue_two_works(2, 3,...), and the execution is such that thread 1 queues the works for migration/2 and migration/3, and is preempted after releasing the locks for migration/2 and migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running, and it invokes cpu_stop_queue_two_works(1, 2,...), such that thread 2 queues the works for migration/1 and migration/2. Meanwhile, on CPU 3, thread 1 resumes execution, and wakes migration/2 and migration/3. This means that when CPU 2 releases the locks for migration/1 and migration/2, but before it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will execute the first work item successfully, since migration/3 was woken up by CPU 3, but when it goes to execute the second work item, it disables preemption, calls multi_cpu_stop(), and thus, CPU 2 will wait forever for migration/1, which should have been woken up by thread 2. However, migration/1 cannot be woken up by thread 2, since it is a kworker, so it is affine to CPU 2, but CPU 2 is running migration/2 with preemption disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads to ensure that the operation of queueing the works and waking the stopper threads is atomic.

Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
Co-Developed-by: Prasad Sodagudi
Co-Developed-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Pavankumar Kondeti
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523d..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
         err = 0;
         __cpu_stop_queue_work(stopper1, work1, &wakeq);
         __cpu_stop_queue_work(stopper2, work2, &wakeq);
+        /*
+         * The waking up of stopper threads has to happen
+         * in the same scheduling context as the queueing.
+         * Otherwise, there is a possibility of one of the
+         * above stoppers being woken up by another CPU,
+         * and preempting us. This will cause us to not
+         * wake up the other stopper forever.
+         */
+        preempt_disable();
 unlock:
         raw_spin_unlock(&stopper2->lock);
         raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
         }
 
         if (!err) {
-                preempt_disable();
                 wake_up_q(&wakeq);
                 preempt_enable();
         }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v4] stop_machine: Disable preemption after queueing stopper threads
After cpu_stop_queue_two_works() queues the cpu_stop works for the stopper threads, it releases the locks held for both threads, which re-enables preemption and allows the following race condition to occur:

On one CPU, call it CPU 3, thread 1 invokes cpu_stop_queue_two_works(2, 3,...), and the execution is such that thread 1 queues the works for migration/2 and migration/3, and is preempted after releasing the locks for migration/2 and migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running, and it invokes cpu_stop_queue_two_works(1, 2,...), such that thread 2 queues the works for migration/1 and migration/2. Meanwhile, on CPU 3, thread 1 resumes execution, and wakes migration/2 and migration/3. This means that when CPU 2 releases the locks for migration/1 and migration/2, but before it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will execute the first work item successfully, since migration/3 was woken up by CPU 3, but when it goes to execute the second work item, it disables preemption, calls multi_cpu_stop(), and thus, CPU 2 will wait forever for migration/1, which should have been woken up by thread 2. However, migration/1 cannot be woken up by thread 2, since it is a kworker, so it is affine to CPU 2, but CPU 2 is running migration/2 with preemption disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads to ensure that the operation of queueing the works and waking the stopper threads is atomic.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi
Co-Developed-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Pavankumar Kondeti
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
         err = 0;
         __cpu_stop_queue_work(stopper1, work1, &wakeq);
         __cpu_stop_queue_work(stopper2, work2, &wakeq);
+        /*
+         * The waking up of stopper threads has to happen
+         * in the same scheduling context as the queueing.
+         * Otherwise, there is a possibility of one of the
+         * above stoppers being woken up by another CPU,
+         * and preempting us. This will cause us to not
+         * wake up the other stopper forever.
+         */
+        preempt_disable();
 unlock:
         raw_spin_unlock(&stopper2->lock);
         raw_spin_unlock_irq(&stopper1->lock);
@@ -270,7 +279,10 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
                 goto retry;
         }
 
-        wake_up_q(&wakeq);
+        if (!err) {
+                wake_up_q(&wakeq);
+                preempt_enable();
+        }
 
         return err;
 }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
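[Editorial note: the essence of the v4 fix is where the non-preemptible region begins relative to the unlock. A schematic sketch follows; stopper_lock and queue_both_works() are placeholders, not the real stop_machine code.]

        #include <linux/sched/wake_q.h>
        #include <linux/spinlock.h>

        static DEFINE_RAW_SPINLOCK(stopper_lock);       /* illustrative */

        static void queue_both_works(struct wake_q_head *wakeq)
        {
                /* placeholder: queue both works and wake_q_add() both stoppers */
        }

        /*
         * Broken ordering: a preemption window opens between the unlock and
         * the wakeup, so a stopper woken by another CPU can preempt us here
         * and the second stopper is then never woken.
         */
        static void queue_then_wake_broken(void)
        {
                DEFINE_WAKE_Q(wakeq);

                raw_spin_lock_irq(&stopper_lock);
                queue_both_works(&wakeq);
                raw_spin_unlock_irq(&stopper_lock);
                /* <-- may be preempted here, before wake_up_q() runs */
                wake_up_q(&wakeq);
        }

        /*
         * Fixed ordering: preemption is disabled while the lock is still
         * held, so queueing and waking happen in one scheduling context.
         */
        static void queue_then_wake_fixed(void)
        {
                DEFINE_WAKE_Q(wakeq);

                raw_spin_lock_irq(&stopper_lock);
                queue_both_works(&wakeq);
                preempt_disable();              /* before the lock is dropped */
                raw_spin_unlock_irq(&stopper_lock);
                wake_up_q(&wakeq);
                preempt_enable();
        }

The design point is that preempt_disable() must be taken before the locks are released, since the unlock itself is the first moment a woken stopper could otherwise preempt the waker.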
[tip:sched/urgent] stop_machine: Disable preemption when waking two stopper threads
Commit-ID:  9fb8d5dc4b649dd190e1af4ead670753e71bf907
Gitweb:     https://git.kernel.org/tip/9fb8d5dc4b649dd190e1af4ead670753e71bf907
Author:     Isaac J. Manjarres
AuthorDate: Tue, 3 Jul 2018 15:02:14 -0700
Committer:  Thomas Gleixner
CommitDate: Sun, 15 Jul 2018 12:12:45 +0200

stop_machine: Disable preemption when waking two stopper threads

When cpu_stop_queue_two_works() begins to wake the stopper threads, it does so without preemption disabled, which leads to the following race condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source CPU, and cpu2 as the destination CPU. When adding the stopper threads to the wake queue used in this function, the source CPU stopper thread is added first, and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the threads are woken up in the order that they are queued in, so the source CPU's stopper thread is woken up first, and it preempts the thread running on the source CPU. The stopper thread will then execute on the source CPU, disable preemption, begin executing multi_cpu_stop(), and wait for an ack from the destination CPU's stopper thread, with preemption still disabled.

Since the worker thread that woke up the stopper thread on the source CPU is affine to the source CPU, and preemption is disabled on the source CPU, that thread will never run to dequeue the destination CPU's stopper thread from the wake queue, and thus, the destination CPU's stopper thread will never run, causing the source CPU's stopper thread to wait forever, and stall.

Disable preemption when waking the stopper threads in cpu_stop_queue_two_works().

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi
Signed-off-by: Prasad Sodagudi
Co-Developed-by: Pavankumar Kondeti
Signed-off-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Thomas Gleixner
Cc: pet...@infradead.org
Cc: m...@codeblueprint.co.uk
Cc: bige...@linutronix.de
Cc: gre...@linuxfoundation.org
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/1530655334-4601-1-git-send-email-isa...@codeaurora.org
---
 kernel/stop_machine.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a2c238..1ff523dae6e2 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ unlock:
                 goto retry;
         }
 
-        wake_up_q(&wakeq);
+        if (!err) {
+                preempt_disable();
+                wake_up_q(&wakeq);
+                preempt_enable();
+        }
 
         return err;
 }
[PATCH v3] stop_machine: Disable preemption when waking two stopper threads
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does so without preemption disabled, which leads to the following race condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source CPU, and cpu2 as the destination CPU. When adding the stopper threads to the wake queue used in this function, the source CPU stopper thread is added first, and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the threads are woken up in the order that they are queued in, so the source CPU's stopper thread is woken up first, and it preempts the thread running on the source CPU. The stopper thread will then execute on the source CPU, disable preemption, begin executing multi_cpu_stop(), and wait for an ack from the destination CPU's stopper thread, with preemption still disabled.

Since the worker thread that woke up the stopper thread on the source CPU is affine to the source CPU, and preemption is disabled on the source CPU, that thread will never run to dequeue the destination CPU's stopper thread from the wake queue, and thus, the destination CPU's stopper thread will never run, causing the source CPU's stopper thread to wait forever, and stall.

Disable preemption when waking the stopper threads in cpu_stop_queue_two_works().

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi
Co-Developed-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Prasad Sodagudi
Signed-off-by: Pavankumar Kondeti
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..1ff523d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
                 goto retry;
         }
 
-        wake_up_q(&wakeq);
+        if (!err) {
+                preempt_disable();
+                wake_up_q(&wakeq);
+                preempt_enable();
+        }
 
         return err;
 }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2] stop_machine: Disable preemption when waking two stopper threads
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does so without preemption disabled, which leads to the following race condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source CPU, and cpu2 as the destination CPU. When adding the stopper threads to the wake queue used in this function, the source CPU stopper thread is added first, and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the threads are woken up in the order that they are queued in, so the source CPU's stopper thread is woken up first, and it preempts the thread running on the source CPU. The stopper thread will then execute on the source CPU, disable preemption, begin executing multi_cpu_stop(), and wait for an ack from the destination CPU's stopper thread, with preemption still disabled.

Since the worker thread that woke up the stopper thread on the source CPU is affine to the source CPU, and preemption is disabled on the source CPU, that thread will never run to dequeue the destination CPU's stopper thread from the wake queue, and thus, the destination CPU's stopper thread will never run, causing the source CPU's stopper thread to wait forever, and stall.

Disable preemption when waking the stopper threads in cpu_stop_queue_two_works() to ensure that the worker thread that is waking up the stopper threads isn't preempted by the source CPU's stopper thread and permanently scheduled out, leaving the remaining stopper thread asleep in the wake queue.

Co-developed-by: Pavankumar Kondeti
Signed-off-by: Prasad Sodagudi
Signed-off-by: Pavankumar Kondeti
Signed-off-by: Isaac J. Manjarres
---
 kernel/stop_machine.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..1ff523d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
                 goto retry;
         }
 
-        wake_up_q(&wakeq);
+        if (!err) {
+                preempt_disable();
+                wake_up_q(&wakeq);
+                preempt_enable();
+        }
 
         return err;
 }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH] stop_machine: Remove cpu swap from stop_two_cpus
When invoking migrate_swap(), stop_two_cpus() swaps the source and destination CPU IDs if the destination CPU ID is greater than the source CPU ID. This leads to the following race condition:

The source CPU invokes migrate_swap() and sets itself as the source CPU, and sets the destination CPU to another CPU, such that the CPU ID of the destination CPU is greater than that of the source CPU, and invokes stop_two_cpus(cpu1=destination CPU, cpu2=source CPU,...). Now, stop_two_cpus() sees that the destination CPU ID is greater than the source CPU ID, and performs the swap, so that cpu1=source CPU, and cpu2=destination CPU.

The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source CPU, and cpu2 as the destination CPU. When adding the stopper threads to the wake queue used in this function, the source CPU stopper thread is added first, and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the threads are woken up in the order that they are queued in, so the source CPU's stopper thread is woken up first, and it preempts the thread running on the source CPU. The stopper thread will then execute on the source CPU, disable preemption, begin executing multi_cpu_stop(), and wait for an ack from the destination CPU's stopper thread, with preemption still disabled. Since the worker thread that woke up the stopper thread on the source CPU is affine to the source CPU, and preemption is disabled on the source CPU, that thread will never run to dequeue the destination CPU's stopper thread from the wake queue, and thus, the destination CPU's stopper thread will never run, causing the source CPU's stopper thread to wait forever, and stall.

Remove the CPU ID swap in stop_two_cpus() so that the source CPU's stopper thread is added to the wake queue last and therefore woken up last, ensuring that all other threads that it depends on are woken up before it runs.

Co-developed-by: Prasad Sodagudi
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
---
 kernel/stop_machine.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..d10d633 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -307,8 +307,6 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
         cpu_stop_init_done(&done, 2);
         set_state(&msdata, MULTI_STOP_PREPARE);
 
-        if (cpu1 > cpu2)
-                swap(cpu1, cpu2);
         if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
                 return -ENOENT;

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
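[Editorial note: this fix leans on wake_up_q() waking tasks strictly in wake_q_add() order. A tiny sketch of that dependency follows; the two task pointers are placeholders for the stopper threads, and which one is added first is exactly what the removed swap() used to change.]

        #include <linux/sched/wake_q.h>

        static void wake_in_dependency_order(struct task_struct *dest_stopper,
                                             struct task_struct *src_stopper)
        {
                DEFINE_WAKE_Q(wakeq);

                /* Wake the thread we depend on first ... */
                wake_q_add(&wakeq, dest_stopper);
                /* ... and the thread that may preempt the waker last. */
                wake_q_add(&wakeq, src_stopper);

                wake_up_q(&wakeq);      /* FIFO: dest_stopper is woken first */
        }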