[PATCH] iommu/arm-smmu-qcom: Fix mask extraction for bootloader programmed SMRs

2021-01-25 Thread Isaac J. Manjarres
When extracting the mask for an SMR that was programmed by the
bootloader, the SMR's valid bit is also extracted and treated
as part of the mask, which is not correct. Consider the scenario
where an SMMU master whose context is determined by a
bootloader-programmed SMR is removed (omitting parts of the
device/driver core):

->iommu_release_device()
 -> arm_smmu_release_device()
  -> arm_smmu_master_free_smes()
   -> arm_smmu_free_sme() /* Assume that the SME is now free */
   -> arm_smmu_write_sme()
    -> arm_smmu_write_smr() /* Construct SMR value using mask and SID */

Since the valid bit was considered as part of the mask, the SMR will
be programmed as valid.

Fix the SMR mask extraction step for bootloader programmed SMRs
by masking out the valid bit when we know that we're already
working with a valid SMR.
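
For reference, the corruption occurs because the SMR MASK field
definition overlaps the valid bit (bit 31 doubles as the top mask bit
when EXIDS is in use). A minimal sketch of the failure mode, assuming
the field definitions from arm-smmu.h:

/*
 * Sketch only; assumed definitions from arm-smmu.h:
 *   ARM_SMMU_SMR_VALID = BIT(31)
 *   ARM_SMMU_SMR_MASK  = GENMASK(31, 16)
 *   ARM_SMMU_SMR_ID    = GENMASK(15, 0)
 */
u32 smr = FIELD_PREP(ARM_SMMU_SMR_ID, 0x40) |
          FIELD_PREP(ARM_SMMU_SMR_MASK, 0x7f0) |
          ARM_SMMU_SMR_VALID;

/* VALID (bit 31) lands in bit 15 of the extracted mask: */
u16 bad_mask  = FIELD_GET(ARM_SMMU_SMR_MASK, smr); /* 0x87f0 */

/* Clearing VALID first, as this patch does, yields the real mask: */
u16 good_mask = FIELD_GET(ARM_SMMU_SMR_MASK,
                          smr & ~ARM_SMMU_SMR_VALID); /* 0x07f0 */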

Fixes: 07a7f2caaa5a ("iommu/arm-smmu-qcom: Read back stream mappings")
Signed-off-by: Isaac J. Manjarres 
Cc: stable@vger.kernel.org
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index bcda170..abb1d2f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -206,6 +206,8 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
 
if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
+   /* Ignore valid bit for SMR mask extraction. */
+   smr &= ~ARM_SMMU_SMR_VALID;
smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
smmu->smrs[i].valid = true;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 0/5] Optimize iommu_map_sg() performance

2021-01-11 Thread Isaac J. Manjarres
The iommu_map_sg() code currently iterates through the given
scatter-gather list, and in the worst case, invokes iommu_map()
for each element in the scatter-gather list, which calls into
the IOMMU driver through an indirect call. For an IOMMU driver
that uses a format supported by the io-pgtable code, the IOMMU
driver will then call into the io-pgtable code to map the chunk.

Jumping between the IOMMU core code, the IOMMU driver, and the
io-pgtable code and back for each element in a scatter-gather list
is not efficient.

Instead, add a map_sg() hook in both the IOMMU driver ops and the
io-pgtable ops. iommu_map_sg() can then call into the IOMMU driver's
map_sg() hook with the entire scatter-gather list, which can call
into the io-pgtable map_sg() hook, which can process the entire
scatter-gather list, significantly reducing the number of indirect
calls and jumps between these layers, boosting performance.
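
The resulting call flow, sketched here for the ARM SMMU and LPAE case,
involves one indirect call per layer for the whole list, instead of one
per scatter-gather element:

->iommu_map_sg() /* IOMMU core */
 -> ops->map_sg() /* IOMMU driver, e.g. arm_smmu_map_sg() */
  -> io_pgtable_ops->map_sg() /* io-pgtable, e.g. arm_lpae_map_sg() */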

On a system that uses the ARM SMMU driver, and the ARM LPAE format,
the current implementation of iommu_map_sg() yields the following
latencies for mapping scatter-gather lists of various sizes. These
latencies are calculated by repeating the mapping operation 10 times:

size    iommu_map_sg latency
  4K               0.624 us
 64K               9.468 us
  1M             122.557 us
  2M             239.807 us
 12M            1435.979 us
 24M            2884.968 us
 32M            3832.979 us

On the same system, the proposed modifications yield the following
results:

size    iommu_map_sg latency
  4K               3.645 us
 64K               4.198 us
  1M              11.010 us
  2M              17.125 us
 12M              82.416 us
 24M             158.677 us
 32M             210.468 us

The procedure for collecting the iommu_map_sg latencies is
the same in both experiments. Clearly, reducing the jumps
between the different layers in the IOMMU code offers a
significant performance boost in iommu_map_sg() latency.
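
The series does not include the benchmark itself; the following is a
minimal sketch of the kind of timing loop described above, where
domain, sgt, and iova are illustrative placeholders:

/* Hedged sketch; domain, sgt, and iova are assumed to be set up. */
u64 total_ns = 0;
int i;

for (i = 0; i < 10; i++) {
	ktime_t t0 = ktime_get();
	size_t mapped = iommu_map_sg(domain, iova, sgt->sgl,
				     sgt->orig_nents,
				     IOMMU_READ | IOMMU_WRITE);

	total_ns += ktime_to_ns(ktime_sub(ktime_get(), t0));
	iommu_unmap(domain, iova, mapped); /* reset for the next run */
}
pr_info("iommu_map_sg avg latency: %llu ns\n", total_ns / 10);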

Changes since v1:

-Fixed an off-by-one error in arm_[lpae/v7s]_map_by_pgsize
when checking if the IOVA and physical address ranges being
mapped are within the appropriate limits.
-Added Sai Prakash Ranjan's "Tested-by" tag.

Thanks,
Isaac

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 
 drivers/iommu/io-pgtable-arm-v7s.c| 90 +++
 drivers/iommu/io-pgtable-arm.c| 86 +
 drivers/iommu/iommu.c | 25 --
 include/linux/io-pgtable.h|  6 +++
 include/linux/iommu.h | 13 +
 6 files changed, 234 insertions(+), 5 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 1/5] iommu/io-pgtable: Introduce map_sg() as a page table op

2021-01-11 Thread Isaac J. Manjarres
While mapping a scatter-gather list, iommu_map_sg() calls
into the IOMMU driver through an indirect call, which can
call into the io-pgtable code through another indirect call.

This sequence of going through the IOMMU core code, the IOMMU
driver, and finally the io-pgtable code occurs, in the worst
case, for every element in the scatter-gather list, which is
not optimal.

Introduce a map_sg callback in the io-pgtable ops so that
IOMMU drivers can invoke it with the complete scatter-gather
list, allowing it to be processed entirely within the
io-pgtable code, reducing the number of indirect calls and
boosting overall iommu_map_sg() performance.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
---
 include/linux/io-pgtable.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ea727eb..6d0e731 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -147,6 +147,9 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:  Map a physically contiguous memory region.
+ * @map_sg:   Map a scatter-gather list of physically contiguous memory
+ *            chunks. The mapped pointer argument is used to store how
+ *            many bytes are mapped.
 * @unmap:    Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
  *
@@ -156,6 +159,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
size_t size, struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 5/5] iommu/arm-smmu: Hook up map_sg()

2021-01-11 Thread Isaac J. Manjarres
Now that everything is in place for iommu_map_sg() to defer
mapping a scatter-gather list to the io-pgtable layer, implement
the map_sg() callback in the SMMU driver, so that iommu_map_sg()
can invoke it with the entire scatter-gather list that will be
mapped.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..52acc68 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1208,6 +1208,24 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
+static int arm_smmu_map_sg(struct iommu_domain *domain, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents, int prot,
+  gfp_t gfp, size_t *mapped)
+{
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+   struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+   int ret;
+
+   if (!ops)
+   return -ENODEV;
+
+   arm_smmu_rpm_get(smmu);
+   ret = ops->map_sg(ops, iova, sg, nents, prot, gfp, mapped);
+   arm_smmu_rpm_put(smmu);
+
+   return ret;
+}
+
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *gather)
 {
@@ -1624,6 +1642,7 @@ static struct iommu_ops arm_smmu_ops = {
.domain_free= arm_smmu_domain_free,
.attach_dev = arm_smmu_attach_dev,
.map= arm_smmu_map,
+   .map_sg = arm_smmu_map_sg,
.unmap  = arm_smmu_unmap,
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 3/5] iommu/io-pgtable-arm-v7s: Hook up map_sg()

2021-01-11 Thread Isaac J. Manjarres
Implement the map_sg io-pgtable op for the ARMv7s io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 90 ++
 1 file changed, 90 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..8665dab 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -545,6 +545,95 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_v7s_map_by_pgsize(struct io_pgtable_ops *ops,
+unsigned long iova, phys_addr_t paddr,
+size_t size, int prot, gfp_t gfp,
+size_t *mapped)
+{
+   struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable *iop = &data->iop;
+   struct io_pgtable_cfg *cfg = &iop->cfg;
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   int ret;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 
0x%x\n",
+  iova, , size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (WARN_ON((iova + size - 1) >= (1ULL << cfg->ias) ||
+   (paddr + size - 1) >= (1ULL << cfg->oas)))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1,
+   data->pgd, gfp);
+
+   if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+   io_pgtable_tlb_flush_walk(&data->iop, iova, size,
+ ARM_V7S_BLOCK_SIZE(2));
+   } else {
+   wmb();
+   }
+
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_v7s_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents,
+ int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_v7s_map_by_pgsize(ops, iova + *mapped, start,
+   len, iommu_prot, gfp,
+   mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -783,6 +872,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_v7s_map,
+   .map_sg = arm_v7s_map_sg,
.unmap  = arm_v7s_unmap,
.iova_to_phys   = arm_v7s_iova_to_phys,
};
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 4/5] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers

2021-01-11 Thread Isaac J. Manjarres
Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a map_sg() callback, the
default implementation of iterating through the scatter-gather
list while calling iommu_map() is used.
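
Either way, the caller-facing contract of iommu_map_sg() is unchanged;
a brief example of the calling convention (identifiers illustrative):

size_t mapped = iommu_map_sg(domain, iova, sgt->sgl, sgt->orig_nents,
			     IOMMU_READ | IOMMU_WRITE);
if (!mapped) /* __iommu_map_sg() unwinds partial mappings, returns 0 */
	return -ENOMEM;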

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
---
 drivers/iommu/iommu.c | 13 +
 include/linux/iommu.h |  5 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0da0687..46acd5c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2535,11 +2535,24 @@ static size_t __iommu_map_sg(struct iommu_domain 
*domain, unsigned long iova,
 struct scatterlist *sg, unsigned int nents, int 
prot,
 gfp_t gfp)
 {
+   const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
phys_addr_t start;
unsigned int i = 0;
int ret;
 
+   if (ops->map_sg) {
+   ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
+
+   if (ret)
+   goto out_err;
+
+   return mapped;
+   }
+
while (i <= nents) {
phys_addr_t s_phys = sg_phys(sg);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0e40a38..bac7681 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *  an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 2/5] iommu/io-pgtable-arm: Hook up map_sg()

2021-01-11 Thread Isaac J. Manjarres
Implement the map_sg io-pgtable op for the ARM LPAE io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
Tested-by: Sai Prakash Ranjan 
---
 drivers/iommu/io-pgtable-arm.c | 86 ++
 drivers/iommu/iommu.c  | 12 +++---
 include/linux/iommu.h  |  8 
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..0c11529 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -473,6 +473,91 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_lpae_map_by_pgsize(struct io_pgtable_ops *ops,
+ unsigned long iova, phys_addr_t paddr,
+ size_t size, int iommu_prot, gfp_t gfp,
+ size_t *mapped)
+{
+   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   arm_lpae_iopte *ptep = data->pgd;
+   int ret, lvl = data->start_level;
+   arm_lpae_iopte prot = arm_lpae_prot_to_pte(data, iommu_prot);
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   long iaext = (s64)(iova + size - 1) >> cfg->ias;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 
0x%x\n",
+  iova, , size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   iaext = ~iaext;
+   if (WARN_ON(iaext || (paddr + size - 1) >> cfg->oas))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_lpae_map(data, iova, paddr, pgsize, prot, lvl, ptep,
+gfp);
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_lpae_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents,
+  int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_lpae_map_by_pgsize(ops, iova + *mapped, start,
+len, iommu_prot, gfp,
+mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
arm_lpae_iopte *ptep)
 {
@@ -750,6 +835,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_lpae_map,
+   .map_sg = arm_lpae_map_sg,
.unmap  = arm_lpae_unmap,
.iova_to_phys   = arm_lpae_iova_to_phys,
};
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda..0da0687 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2346,8 +2346,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain 
*domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
-static size_t iommu_pgsize(struct iommu_domain *domain,
-  unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(unsigned long pgsize_bitmap, unsigned long addr_merge,
+   size_t size)
 {
unsigned int pgsize_idx;
size_t pgsize;
@@ -2366,7 +2366,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
pgsize = (1UL << (pgsize_idx + 1)) - 1;
 
/* throw away page sizes not supported by the hardware */
-   pgsize &= domain->pgsize_bitmap;
+   pgsize &= pgsize_bitmap;

[PATCH 0/5] Optimize iommu_map_sg() performance

2021-01-08 Thread Isaac J. Manjarres
The iommu_map_sg() code currently iterates through the given
scatter-gather list, and in the worst case, invokes iommu_map()
for each element in the scatter-gather list, which calls into
the IOMMU driver through an indirect call. For an IOMMU driver
that uses a format supported by the io-pgtable code, the IOMMU
driver will then call into the io-pgtable code to map the chunk.

Jumping between the IOMMU core code, the IOMMU driver, and the
io-pgtable code and back for each element in a scatter-gather list
is not efficient.

Instead, add a map_sg() hook in both the IOMMU driver ops and the
io-pgtable ops. iommu_map_sg() can then call into the IOMMU driver's
map_sg() hook with the entire scatter-gather list, which can call
into the io-pgtable map_sg() hook, which can process the entire
scatter-gather list, significantly reducing the number of indirect
calls and jumps between these layers, boosting performance.

On a system that uses the ARM SMMU driver, and the ARM LPAE format,
the current implementation of iommu_map_sg() yields the following
latencies for mapping scatter-gather lists of various sizes. These
latencies are calculated by repeating the mapping operation 10 times:

size    iommu_map_sg latency
  4K               0.624 us
 64K               9.468 us
  1M             122.557 us
  2M             239.807 us
 12M            1435.979 us
 24M            2884.968 us
 32M            3832.979 us

On the same system, the proposed modifications yield the following
results:

size    iommu_map_sg latency
  4K               3.645 us
 64K               4.198 us
  1M              11.010 us
  2M              17.125 us
 12M              82.416 us
 24M             158.677 us
 32M             210.468 us

The procedure for collecting the iommu_map_sg latencies is
the same in both experiments. Clearly, reducing the jumps
between the different layers in the IOMMU code offers a
significant performance boost in iommu_map_sg() latency.

Thanks,
Isaac

Isaac J. Manjarres (5):
  iommu/io-pgtable: Introduce map_sg() as a page table op
  iommu/io-pgtable-arm: Hook up map_sg()
  iommu/io-pgtable-arm-v7s: Hook up map_sg()
  iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers
  iommu/arm-smmu: Hook up map_sg()

 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 
 drivers/iommu/io-pgtable-arm-v7s.c| 90 +++
 drivers/iommu/io-pgtable-arm.c| 86 +
 drivers/iommu/iommu.c | 25 --
 include/linux/io-pgtable.h|  6 +++
 include/linux/iommu.h | 13 +
 6 files changed, 234 insertions(+), 5 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 5/5] iommu/arm-smmu: Hook up map_sg()

2021-01-08 Thread Isaac J. Manjarres
Now that everything is in place for iommu_map_sg() to defer
mapping a scatter-gather list to the io-pgtable layer, implement
the map_sg() callback in the SMMU driver, so that iommu_map_sg()
can invoke it with the entire scatter-gather list that will be
mapped.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..52acc68 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1208,6 +1208,24 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
+static int arm_smmu_map_sg(struct iommu_domain *domain, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents, int prot,
+  gfp_t gfp, size_t *mapped)
+{
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+   struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+   int ret;
+
+   if (!ops)
+   return -ENODEV;
+
+   arm_smmu_rpm_get(smmu);
+   ret = ops->map_sg(ops, iova, sg, nents, prot, gfp, mapped);
+   arm_smmu_rpm_put(smmu);
+
+   return ret;
+}
+
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *gather)
 {
@@ -1624,6 +1642,7 @@ static struct iommu_ops arm_smmu_ops = {
.domain_free= arm_smmu_domain_free,
.attach_dev = arm_smmu_attach_dev,
.map= arm_smmu_map,
+   .map_sg = arm_smmu_map_sg,
.unmap  = arm_smmu_unmap,
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 3/5] iommu/io-pgtable-arm-v7s: Hook up map_sg()

2021-01-08 Thread Isaac J. Manjarres
Implement the map_sg io-pgtable op for the ARMv7s io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 90 ++
 1 file changed, 90 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..40d96d2 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -545,6 +545,95 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_v7s_map_by_pgsize(struct io_pgtable_ops *ops,
+unsigned long iova, phys_addr_t paddr,
+size_t size, int prot, gfp_t gfp,
+size_t *mapped)
+{
+   struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable *iop = &data->iop;
+   struct io_pgtable_cfg *cfg = &iop->cfg;
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   int ret;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 
0x%x\n",
+  iova, , size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (WARN_ON((iova + size) >= (1ULL << cfg->ias) ||
+   (paddr + size) >= (1ULL << cfg->oas)))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1,
+   data->pgd, gfp);
+
+   if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+   io_pgtable_tlb_flush_walk(&data->iop, iova, size,
+ ARM_V7S_BLOCK_SIZE(2));
+   } else {
+   wmb();
+   }
+
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_v7s_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents,
+ int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_v7s_map_by_pgsize(ops, iova + *mapped, start,
+   len, iommu_prot, gfp,
+   mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -783,6 +872,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_v7s_map,
+   .map_sg = arm_v7s_map_sg,
.unmap  = arm_v7s_unmap,
.iova_to_phys   = arm_v7s_iova_to_phys,
};
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 4/5] iommu: Introduce map_sg() as an IOMMU op for IOMMU drivers

2021-01-08 Thread Isaac J. Manjarres
Add support for IOMMU drivers to have their own map_sg() callbacks.
This completes the path for having iommu_map_sg() invoke an IOMMU
driver's map_sg() callback, which can then invoke the io-pgtable
map_sg() callback with the entire scatter-gather list, so that it
can be processed entirely in the io-pgtable layer.

For IOMMU drivers that do not provide a map_sg() callback, the
default implementation of iterating through the scatter-gather
list while calling iommu_map() is used.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/iommu.c | 13 +
 include/linux/iommu.h |  5 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0da0687..46acd5c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2535,11 +2535,24 @@ static size_t __iommu_map_sg(struct iommu_domain 
*domain, unsigned long iova,
 struct scatterlist *sg, unsigned int nents, int 
prot,
 gfp_t gfp)
 {
+   const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
phys_addr_t start;
unsigned int i = 0;
int ret;
 
+   if (ops->map_sg) {
+   ret = ops->map_sg(domain, iova, sg, nents, prot, gfp, &mapped);
+
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
+
+   if (ret)
+   goto out_err;
+
+   return mapped;
+   }
+
while (i <= nents) {
phys_addr_t s_phys = sg_phys(sg);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0e40a38..bac7681 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -192,6 +192,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_sg: map a scatter-gather list of physically contiguous chunks to
+ *  an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
@@ -243,6 +245,9 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 2/5] iommu/io-pgtable-arm: Hook up map_sg()

2021-01-08 Thread Isaac J. Manjarres
Implement the map_sg io-pgtable op for the ARM LPAE io-pgtable
code, so that IOMMU drivers can call it when they need to map
a scatter-gather list.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable-arm.c | 86 ++
 drivers/iommu/iommu.c  | 12 +++---
 include/linux/iommu.h  |  8 
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..9c17d9d 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -473,6 +473,91 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_lpae_map_by_pgsize(struct io_pgtable_ops *ops,
+ unsigned long iova, phys_addr_t paddr,
+ size_t size, int iommu_prot, gfp_t gfp,
+ size_t *mapped)
+{
+   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   arm_lpae_iopte *ptep = data->pgd;
+   int ret, lvl = data->start_level;
+   arm_lpae_iopte prot = arm_lpae_prot_to_pte(data, iommu_prot);
+   unsigned int min_pagesz = 1 << __ffs(cfg->pgsize_bitmap);
+   long iaext = (s64)(iova + size) >> cfg->ias;
+   size_t pgsize;
+
+   if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) {
+   pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 
0x%x\n",
+  iova, , size, min_pagesz);
+   return -EINVAL;
+   }
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   iaext = ~iaext;
+   if (WARN_ON(iaext || (paddr + size) >> cfg->oas))
+   return -ERANGE;
+
+   while (size) {
+   pgsize = iommu_pgsize(cfg->pgsize_bitmap, iova | paddr, size);
+   ret = __arm_lpae_map(data, iova, paddr, pgsize, prot, lvl, ptep,
+gfp);
+   if (ret)
+   return ret;
+
+   iova += pgsize;
+   paddr += pgsize;
+   *mapped += pgsize;
+   size -= pgsize;
+   }
+
+   return 0;
+}
+
+static int arm_lpae_map_sg(struct io_pgtable_ops *ops, unsigned long iova,
+  struct scatterlist *sg, unsigned int nents,
+  int iommu_prot, gfp_t gfp, size_t *mapped)
+{
+
+   size_t len = 0;
+   unsigned int i = 0;
+   int ret;
+   phys_addr_t start;
+
+   *mapped = 0;
+
+   /* If no access, then nothing to do */
+   if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
+   return 0;
+
+   while (i <= nents) {
+   phys_addr_t s_phys = sg_phys(sg);
+
+   if (len && s_phys != start + len) {
+   ret = arm_lpae_map_by_pgsize(ops, iova + *mapped, start,
+len, iommu_prot, gfp,
+mapped);
+
+   if (ret)
+   return ret;
+
+   len = 0;
+   }
+
+   if (len) {
+   len += sg->length;
+   } else {
+   len = sg->length;
+   start = s_phys;
+   }
+
+   if (++i < nents)
+   sg = sg_next(sg);
+   }
+
+   return 0;
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
arm_lpae_iopte *ptep)
 {
@@ -750,6 +835,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_lpae_map,
+   .map_sg = arm_lpae_map_sg,
.unmap  = arm_lpae_unmap,
.iova_to_phys   = arm_lpae_iova_to_phys,
};
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda..0da0687 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2346,8 +2346,8 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain 
*domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
-static size_t iommu_pgsize(struct iommu_domain *domain,
-  unsigned long addr_merge, size_t size)
+size_t iommu_pgsize(unsigned long pgsize_bitmap, unsigned long addr_merge,
+   size_t size)
 {
unsigned int pgsize_idx;
size_t pgsize;
@@ -2366,7 +2366,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
pgsize = (1UL << (pgsize_idx + 1)) - 1;
 
/* throw away page sizes not supported by the hardware */
-   pgsize &= domain->pgsize_bitmap;
+   pgsize &= pgsize_bitmap;
 
/* make sure we're still sane 

[PATCH 1/5] iommu/io-pgtable: Introduce map_sg() as a page table op

2021-01-08 Thread Isaac J. Manjarres
While mapping a scatter-gather list, iommu_map_sg() calls
into the IOMMU driver through an indirect call, which can
call into the io-pgtable code through another indirect call.

This sequence of going through the IOMMU core code, the IOMMU
driver, and finally the io-pgtable code occurs, in the worst
case, for every element in the scatter-gather list, which is
not optimal.

Introduce a map_sg callback in the io-pgtable ops so that
IOMMU drivers can invoke it with the complete scatter-gather
list, allowing it to be processed entirely within the
io-pgtable code, reducing the number of indirect calls and
boosting overall iommu_map_sg() performance.

Signed-off-by: Isaac J. Manjarres 
---
 include/linux/io-pgtable.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ea727eb..6d0e731 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -147,6 +147,9 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:  Map a physically contiguous memory region.
+ * @map_sg:   Map a scatter-gather list of physically contiguous memory
+ *            chunks. The mapped pointer argument is used to store how
+ *            many bytes are mapped.
 * @unmap:    Unmap a physically contiguous memory region.
  * @iova_to_phys: Translate iova to physical address.
  *
@@ -156,6 +159,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_sg)(struct io_pgtable_ops *ops, unsigned long iova,
+ struct scatterlist *sg, unsigned int nents, int prot,
+ gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
size_t size, struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 3/7] iommu/arm-smmu: Add dependency on io-pgtable format modules

2020-12-21 Thread Isaac J. Manjarres
The SMMU driver depends on the availability of the ARM LPAE and
ARM V7S io-pgtable format code to work properly. In preparation
for having the io-pgtable formats as modules, add a "pre"
dependency with MODULE_SOFTDEP() to ensure that the io-pgtable
format modules are loaded before loading the ARM SMMU driver module.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfd..a72649f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2351,3 +2351,4 @@ MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU 
implementations");
 MODULE_AUTHOR("Will Deacon ");
 MODULE_ALIAS("platform:arm-smmu");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm io-pgtable-arm-v7s");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 2/7] iommu/io-pgtable: Add refcounting for io-pgtable format modules

2020-12-21 Thread Isaac J. Manjarres
In preparation for modularizing io-pgtable formats, add support
for reference counting the io-pgtable format modules to ensure
that the modules are not unloaded while they are in use.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable-arm-v7s.c |  1 +
 drivers/iommu/io-pgtable-arm.c |  5 +
 drivers/iommu/io-pgtable.c | 12 ++--
 include/linux/io-pgtable.h |  2 ++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 89aad2f..a5cb755a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -840,6 +840,7 @@ static struct io_pgtable_init_fns 
io_pgtable_arm_v7s_init_fns = {
.fmt= ARM_V7S,
.alloc  = arm_v7s_alloc_pgtable,
.free   = arm_v7s_free_pgtable,
+   .owner  = THIS_MODULE,
 };
 
 #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ff0ea2f..e8b1e34 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -1049,26 +1049,31 @@ static struct io_pgtable_init_fns 
io_pgtable_arm_lpae_init_fns[] = {
.fmt= ARM_32_LPAE_S1,
.alloc  = arm_32_lpae_alloc_pgtable_s1,
.free   = arm_lpae_free_pgtable,
+   .owner  = THIS_MODULE,
},
{
.fmt= ARM_32_LPAE_S2,
.alloc  = arm_32_lpae_alloc_pgtable_s2,
.free   = arm_lpae_free_pgtable,
+   .owner  = THIS_MODULE,
},
{
.fmt= ARM_64_LPAE_S1,
.alloc  = arm_64_lpae_alloc_pgtable_s1,
.free   = arm_lpae_free_pgtable,
+   .owner  = THIS_MODULE,
},
{
.fmt= ARM_64_LPAE_S2,
.alloc  = arm_64_lpae_alloc_pgtable_s2,
.free   = arm_lpae_free_pgtable,
+   .owner  = THIS_MODULE,
},
{
.fmt= ARM_MALI_LPAE,
.alloc  = arm_mali_lpae_alloc_pgtable,
.free   = arm_lpae_free_pgtable,
+   .owner  = THIS_MODULE,
},
 };
 
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 2c6eb2e..cc83542 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include <linux/module.h>
 #include 
 #include 
 #include 
@@ -52,9 +53,14 @@ struct io_pgtable_ops *alloc_io_pgtable_ops(enum 
io_pgtable_fmt fmt,
if (!fns)
return NULL;
 
+   if (!try_module_get(fns->owner))
+   return NULL;
+
iop = fns->alloc(cfg, cookie);
-   if (!iop)
+   if (!iop) {
+   module_put(fns->owner);
return NULL;
+   }
 
iop->fmt= fmt;
iop->cookie = cookie;
@@ -79,8 +85,10 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
iop = io_pgtable_ops_to_pgtable(ops);
io_pgtable_tlb_flush_all(iop);
fns = io_pgtable_get_init_fns(iop->fmt);
-   if (fns)
+   if (fns) {
fns->free(iop);
+   module_put(fns->owner);
+   }
 }
 EXPORT_SYMBOL_GPL(free_io_pgtable_ops);
 
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 45b367ce..a03b262 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -169,11 +169,13 @@ struct io_pgtable_ops {
  * @fmt:   The page table format.
  * @alloc: Allocate a set of page tables described by cfg.
  * @free:  Free the page tables associated with iop.
+ * @owner: Driver module providing these ops.
  */
 struct io_pgtable_init_fns {
enum io_pgtable_fmt fmt;
struct io_pgtable *(*alloc)(struct io_pgtable_cfg *cfg, void *cookie);
void (*free)(struct io_pgtable *iop);
+   struct module *owner;
 };
 
 /**
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 6/7] drm/panfrost: Add dependency on io-pgtable-arm format module

2020-12-21 Thread Isaac J. Manjarres
The Panfrost DRM driver depends on the availability of the ARM LPAE
io-pgtable format code to work properly. In preparation for having the
io-pgtable formats as modules, add a "pre" dependency with
MODULE_SOFTDEP() to ensure that the io-pgtable-arm format module is loaded
before loading the Panfrost DRM driver module.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 83a461b..7294622 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -704,3 +704,4 @@ module_platform_driver(panfrost_driver);
 MODULE_AUTHOR("Panfrost Project Developers");
 MODULE_DESCRIPTION("Panfrost DRM Driver");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 1/7] iommu/io-pgtable: Introduce dynamic io-pgtable fmt registration

2020-12-21 Thread Isaac J. Manjarres
The io-pgtable code constructs an array of init functions for each
page table format at compile time. This is not ideal, as it
increases the footprint of the io-pgtable code and prevents
io-pgtable formats from being built as kernel modules.

In preparation for modularizing the io-pgtable formats, switch to a
dynamic registration scheme, where each io-pgtable format can register
its init functions with the io-pgtable code at boot or module
insertion time.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 34 +-
 drivers/iommu/io-pgtable-arm.c | 90 ++--
 drivers/iommu/io-pgtable.c | 94 --
 include/linux/io-pgtable.h | 51 +
 4 files changed, 209 insertions(+), 60 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..89aad2f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include <linux/module.h>
 #include 
 #include 
 #include 
@@ -835,7 +836,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
return NULL;
 }
 
-struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
+static struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
+   .fmt= ARM_V7S,
.alloc  = arm_v7s_alloc_pgtable,
.free   = arm_v7s_free_pgtable,
 };
@@ -982,5 +984,33 @@ static int __init arm_v7s_do_selftests(void)
pr_info("self test ok\n");
return 0;
 }
-subsys_initcall(arm_v7s_do_selftests);
+#else
+static int arm_v7s_do_selftests(void)
+{
+   return 0;
+}
 #endif
+
+static int __init arm_v7s_init(void)
+{
+   int ret;
+
+   ret = io_pgtable_ops_register(&io_pgtable_arm_v7s_init_fns);
+   if (ret < 0) {
+   pr_err("Failed to register ARM V7S format\n");
+   return ret;
+   }
+
+   ret = arm_v7s_do_selftests();
+   if (ret < 0)
+   io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
+
+   return ret;
+}
+core_initcall(arm_v7s_init);
+
+static void __exit arm_v7s_exit(void)
+{
+   io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
+}
+module_exit(arm_v7s_exit);
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..ff0ea2f 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include <linux/module.h>
 #include 
 #include 
 #include 
@@ -1043,29 +1044,32 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, 
void *cookie)
return NULL;
 }
 
-struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
-   .alloc  = arm_64_lpae_alloc_pgtable_s1,
-   .free   = arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns = {
-   .alloc  = arm_64_lpae_alloc_pgtable_s2,
-   .free   = arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s1_init_fns = {
-   .alloc  = arm_32_lpae_alloc_pgtable_s1,
-   .free   = arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s2_init_fns = {
-   .alloc  = arm_32_lpae_alloc_pgtable_s2,
-   .free   = arm_lpae_free_pgtable,
-};
-
-struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
-   .alloc  = arm_mali_lpae_alloc_pgtable,
-   .free   = arm_lpae_free_pgtable,
+static struct io_pgtable_init_fns io_pgtable_arm_lpae_init_fns[] = {
+   {
+   .fmt= ARM_32_LPAE_S1,
+   .alloc  = arm_32_lpae_alloc_pgtable_s1,
+   .free   = arm_lpae_free_pgtable,
+   },
+   {
+   .fmt= ARM_32_LPAE_S2,
+   .alloc  = arm_32_lpae_alloc_pgtable_s2,
+   .free   = arm_lpae_free_pgtable,
+   },
+   {
+   .fmt= ARM_64_LPAE_S1,
+   .alloc  = arm_64_lpae_alloc_pgtable_s1,
+   .free   = arm_lpae_free_pgtable,
+   },
+   {
+   .fmt= ARM_64_LPAE_S2,
+   .alloc  = arm_64_lpae_alloc_pgtable_s2,
+   .free   = arm_lpae_free_pgtable,
+   },
+   {
+   .fmt= ARM_MALI_LPAE,
+   .alloc  = arm_mali_lpae_alloc_pgtable,
+   .free   = arm_lpae_free_pgtable,
+   },
 };
 
 #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
@@ -1250,5 +1254,43 @@ static int __init arm_lpae_do_selftests(void)
pr_info("selftest: completed with %d PASS %d FAIL\n", pass, fail);
return fail ? -EFAULT : 0;
 }
-subsys_initcall(arm_lpae_do_selftests);
+#else
+static int __init arm_lpae_do_selftests(void)
+{
+   return 0;
+}
 #endif
+
+static int __init arm_lpae_init(void)
+{
+   int ret, i;
+
+   for (i = 0; i < ARRAY_SIZE(io_pgtable_arm_lpae_init_fns); i++) {
+   ret = io_pgtable_ops_register(&io_pgtable_arm_lpae_init_fns[i]);

[PATCH v2 4/7] iommu/arm-smmu-v3: Add dependency on io-pgtable-arm format module

2020-12-21 Thread Isaac J. Manjarres
The SMMUv3 driver depends on the availability of the ARM LPAE io-pgtable
format code to work properly. In preparation for having the io-pgtable
formats as modules, add a "pre" dependency with MODULE_SOFTDEP() to
ensure that the io-pgtable-arm format module is loaded before loading
the ARM SMMUv3 driver module.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ca7415..c498ac8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3650,3 +3650,4 @@ MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 
implementations");
 MODULE_AUTHOR("Will Deacon ");
 MODULE_ALIAS("platform:arm-smmu-v3");
 MODULE_LICENSE("GPL v2");
+MODULE_SOFTDEP("pre: io-pgtable-arm");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 7/7] iommu/io-pgtable-arm: Allow building modular io-pgtable fmts

2020-12-21 Thread Isaac J. Manjarres
Now that everything is in place for modular io-pgtable formats,
allow the ARM LPAE and ARMV7S io-pgtable formats to be built
as modules.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/Kconfig  | 4 ++--
 drivers/iommu/io-pgtable-arm-v7s.c | 2 ++
 drivers/iommu/io-pgtable-arm.c | 2 ++
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f..7e4f44f 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -30,7 +30,7 @@ config IOMMU_IO_PGTABLE
bool
 
 config IOMMU_IO_PGTABLE_LPAE
-   bool "ARMv7/v8 Long Descriptor Format"
+   tristate "ARMv7/v8 Long Descriptor Format"
select IOMMU_IO_PGTABLE
depends on ARM || ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
help
@@ -49,7 +49,7 @@ config IOMMU_IO_PGTABLE_LPAE_SELFTEST
  If unsure, say N here.
 
 config IOMMU_IO_PGTABLE_ARMV7S
-   bool "ARMv7/v8 Short Descriptor Format"
+   tristate "ARMv7/v8 Short Descriptor Format"
select IOMMU_IO_PGTABLE
depends on ARM || ARM64 || COMPILE_TEST
help
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index a5cb755a..9d9f08f 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -1015,3 +1015,5 @@ static void __exit arm_v7s_exit(void)
io_pgtable_ops_unregister(&io_pgtable_arm_v7s_init_fns);
 }
 module_exit(arm_v7s_exit);
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index e8b1e34..e0de4ad 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -1299,3 +1299,5 @@ static void __exit arm_lpae_exit(void)
io_pgtable_ops_unregister(&io_pgtable_arm_lpae_init_fns[i]);
 }
 module_exit(arm_lpae_exit);
+
+MODULE_LICENSE("GPL v2");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC PATCH v2 0/7] iommu: Permit modular builds of io-pgtable drivers

2020-12-21 Thread Isaac J. Manjarres
The goal of the Generic Kernel Image (GKI) effort is to have a common
kernel image that works across multiple Android devices. This involves
generating a kernel image that has core features integrated into it,
while SoC specific functionality can be added to the kernel for the
device as a module.

Along with modularizing IOMMU drivers, this also means building the
io-pgtable code as modules, which allows for SoC vendors to only include
the io-pgtable implementations that they use. For example, GKI for arm64
must include support for both the IOMMU ARM LPAE/V7S formats at the
moment. Having the code for both formats as modules allows SoC vendors
to only provide the page table format that they use, along with their
IOMMU driver.

Main changes since v1:

1) Retain io-pgtable.c as part of the core kernel

The patches are split into 4 parts:

1) Modularizing io-pgtable-arm[-v7s].c, while leaving the io-pgtable.c
code as part of the core kernel, requires removing the references to
the ARM LPAE and ARM V7S io-pgtable init functions, and using a
dynamic method for formats to register their io-pgtable init functions.

The reason for defining an io_pgtable_init_fns_node structure is to
keep the data structures used to store the init functions from seeping
into the io-pgtable format drivers. Doing so allows the internal data
structure used to keep track of the init functions to change without
impacting the client data structures (see the sketch after this list).

2) Taking references to the io-pgtable format drivers to ensure that they
cannot be unloaded while in use.

3) Adding pre MODULE_SOFTDEP() dependencies to drivers in the kernel
that are tristate and invoke [alloc/free]_io_pgtable_ops(). This makes
it so that the io-pgtable format drivers are loaded before the drivers
that need them.

4) Changing the Kconfig options for the ARM LPAE and ARM V7S formats to tristate.
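
As a rough sketch of the separation described in part 1 above (the node
type name comes from this cover letter; the list and locking details
below are assumptions, not necessarily what the patches implement):

/* Internal to io-pgtable.c; never exposed to format drivers. */
struct io_pgtable_init_fns_node {
	struct io_pgtable_init_fns *fns;
	struct list_head node;
};

static LIST_HEAD(io_pgtable_init_fns_list);
static DEFINE_SPINLOCK(io_pgtable_lock);

int io_pgtable_ops_register(struct io_pgtable_init_fns *fns)
{
	struct io_pgtable_init_fns_node *n = kzalloc(sizeof(*n), GFP_KERNEL);

	if (!n)
		return -ENOMEM;

	n->fns = fns;
	spin_lock(&io_pgtable_lock);
	list_add_tail(&n->node, &io_pgtable_init_fns_list);
	spin_unlock(&io_pgtable_lock);
	return 0;
}

io_pgtable_ops_unregister() would walk the list under the same lock,
unlink the matching node, and free it.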

Thanks in advance for the feedback,

Isaac J. Manjarres

Isaac J. Manjarres (7):
  iommu/io-pgtable: Introduce dynamic io-pgtable fmt registration
  iommu/io-pgtable: Add refcounting for io-pgtable format modules
  iommu/arm-smmu: Add dependency on io-pgtable format modules
  iommu/arm-smmu-v3: Add dependency on io-pgtable-arm format module
  drm/msm: Add dependency on io-pgtable-arm format module
  drm/panfrost: Add dependency on io-pgtable-arm format module
  iommu/io-pgtable-arm: Allow building modular io-pgtable fmts

 drivers/gpu/drm/msm/msm_drv.c   |   1 +
 drivers/gpu/drm/panfrost/panfrost_drv.c |   1 +
 drivers/iommu/Kconfig   |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c   |   1 +
 drivers/iommu/io-pgtable-arm-v7s.c  |  37 +-
 drivers/iommu/io-pgtable-arm.c  |  97 +++---
 drivers/iommu/io-pgtable.c  | 104 +++-
 include/linux/io-pgtable.h  |  53 +-
 9 files changed, 236 insertions(+), 63 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2 5/7] drm/msm: Add dependency on io-pgtable-arm format module

2020-12-21 Thread Isaac J. Manjarres
The MSM DRM driver depends on the availability of the ARM LPAE io-pgtable
format code to work properly. In preparation for having the io-pgtable
formats as modules, add a "pre" dependency with MODULE_SOFTDEP() to
ensure that the io-pgtable-arm format module is loaded before loading
the MSM DRM driver module.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/gpu/drm/msm/msm_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 535a026..8be3506 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1369,3 +1369,4 @@ module_exit(msm_drm_unregister);
 MODULE_AUTHOR("Rob Clark 

[PATCH 3/3] iommu/io-pgtable: Allow building as a module

2020-12-18 Thread Isaac J. Manjarres
Now that all of the required symbols have been exported,
and the io-pgtable code can correctly refer to the
io-pgtable init functions when their source files are built
as modules, allow the io-pgtable code to be built as a module. The
expectation is that the io-pgtable core code, along with
the descriptor formats (ARM LPAE and/or ARMV7S), can be
built as modules.

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f..d7de6db 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -27,10 +27,10 @@ menu "Generic IOMMU Pagetable Support"
 
 # Selected by the actual pagetable implementations
 config IOMMU_IO_PGTABLE
-   bool
+   tristate
 
 config IOMMU_IO_PGTABLE_LPAE
-   bool "ARMv7/v8 Long Descriptor Format"
+   tristate "ARMv7/v8 Long Descriptor Format"
select IOMMU_IO_PGTABLE
depends on ARM || ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
help
@@ -49,7 +49,7 @@ config IOMMU_IO_PGTABLE_LPAE_SELFTEST
  If unsure, say N here.
 
 config IOMMU_IO_PGTABLE_ARMV7S
-   bool "ARMv7/v8 Short Descriptor Format"
+   tristate "ARMv7/v8 Short Descriptor Format"
select IOMMU_IO_PGTABLE
depends on ARM || ARM64 || COMPILE_TEST
help
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 2/3] iommu/io-pgtable: Prepare for modularization

2020-12-18 Thread Isaac J. Manjarres
The io-pgtable source file uses the #ifdef preprocessor directive
to construct the io_pgtable_init_table structure. However,
#ifdef evaluates to true only if the config it is testing
is set to y. This is not ideal when the configs that the
io-pgtable code checks for can be m, so use IS_ENABLED()
instead, which is true for both y and m.
Also, add the GPL v2 module license to the io-pgtable source file.
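
Concretely, when a config symbol is set to m, the preprocessor defines
CONFIG_FOO_MODULE rather than CONFIG_FOO, so:

#ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE		/* true only for =y */
	/* these entries silently vanish when the format is =m */
#endif

#if IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_LPAE)	/* true for =y and =m */
	/* entries kept for both built-in and modular builds */
#endif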

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 94394c8..867ecb4 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -10,18 +10,19 @@
 #include 
 #include 
 #include 
+#include <linux/module.h>
 #include 
 
 static const struct io_pgtable_init_fns *
 io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
-#ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE
+#if IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_LPAE)
[ARM_32_LPAE_S1] = &io_pgtable_arm_32_lpae_s1_init_fns,
[ARM_32_LPAE_S2] = &io_pgtable_arm_32_lpae_s2_init_fns,
[ARM_64_LPAE_S1] = &io_pgtable_arm_64_lpae_s1_init_fns,
[ARM_64_LPAE_S2] = &io_pgtable_arm_64_lpae_s2_init_fns,
[ARM_MALI_LPAE] = &io_pgtable_arm_mali_lpae_init_fns,
 #endif
-#ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S
+#if IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S)
[ARM_V7S] = &io_pgtable_arm_v7s_init_fns,
 #endif
 };
@@ -68,3 +69,5 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
io_pgtable_init_table[iop->fmt]->free(iop);
 }
 EXPORT_SYMBOL_GPL(free_io_pgtable_ops);
+
+MODULE_LICENSE("GPL v2");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC PATCH 0/3] iommu: Permit modular builds of io-pgtable drivers

2020-12-18 Thread Isaac J. Manjarres
The goal of the Generic Kernel Image (GKI) effort is to have a common
image that works across multiple Android devices. This involves generating
a kernel image that has core features integrated into it, while SoC specific
functionality can be added to the kernel for the device as a module.

Along with modularizing IOMMU drivers, this also means building the io-pgtable
code as modules, which allows for SoC vendors to only include the io-pgtable
implementations that they use. For example, GKI for arm64 must include
support for both the IOMMU ARM LPAE/V7S formats at the moment. Having the code
for both formats as modules allows SoC vendors to only provide the page table
format that they use, along with their IOMMU driver.

Modularizing both io-pgtable.c, as well as the io-pgtable-arm[-v7s].c files,
works out rather nicely, as the main interface that clients use to interact
with the page tables is already exported (i.e. alloc_io_pgtable_ops and
free_io_pgtable_ops). It also makes it so that neither the io-pgtable-arm[-v7s]
modules nor the io-pgtable module can be unloaded without unloading the IOMMU
driver, which can only happen when there aren't any references to the IOMMU
driver module.
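
For reference, a hedged sketch of that exported interface as an IOMMU
driver uses it (the cfg values and flush ops here are illustrative):

struct io_pgtable_cfg cfg = {
	.pgsize_bitmap	= SZ_4K | SZ_2M | SZ_1G, /* illustrative */
	.ias		= 48,
	.oas		= 48,
	.coherent_walk	= true,
	.tlb		= &my_flush_ops, /* hypothetical iommu_flush_ops */
	.iommu_dev	= dev,
};
struct io_pgtable_ops *ops;

ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
if (!ops)
	return -ENOMEM;
/* ... ops->map() / ops->unmap() as needed ... */
free_io_pgtable_ops(ops);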

Thanks in advance for the feedback,

Isaac J. Manjarres

Isaac J. Manjarres (3):
  iommu/io-pgtable-arm: Prepare for modularization
  iommu/io-pgtable: Prepare for modularization
  iommu/io-pgtable: Allow building as a module

 drivers/iommu/Kconfig  | 6 +++---
 drivers/iommu/io-pgtable-arm-v7s.c | 4 
 drivers/iommu/io-pgtable-arm.c | 8 
 drivers/iommu/io-pgtable.c | 7 +--
 4 files changed, 20 insertions(+), 5 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH 1/3] iommu/io-pgtable-arm: Prepare for modularization

2020-12-18 Thread Isaac J. Manjarres
The io-pgtable-arm and io-pgtable-arm-v7s source files will
be compiled as separate modules, along with the io-pgtable
source. Export the symbols for the io-pgtable init function
structures for the io-pgtable module to use.
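
To illustrate why the exports are needed (a sketch, not part of this
patch): with io-pgtable.c itself built as a module, its init table
references symbols defined in the format modules, and the module loader
can only resolve such cross-module references if they are exported:

/* In io-pgtable.c (one module): an extern reference ... */
extern struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns;

/* ... and in io-pgtable-arm-v7s.c (another module), the definition
 * must be exported, or loading the first module fails at insmod
 * time with "Unknown symbol". */
EXPORT_SYMBOL_GPL(io_pgtable_arm_v7s_init_fns);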

Signed-off-by: Isaac J. Manjarres 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 4 
 drivers/iommu/io-pgtable-arm.c | 8 
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1d92ac9..f062c1c 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -28,6 +28,7 @@
 #include <linux/iopoll.h>
 #include <linux/kernel.h>
 #include <linux/kmemleak.h>
+#include <linux/module.h>
 #include <linux/sizes.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
@@ -839,6 +840,7 @@ struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns = {
.alloc  = arm_v7s_alloc_pgtable,
.free   = arm_v7s_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_v7s_init_fns);
 
 #ifdef CONFIG_IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
 
@@ -984,3 +986,5 @@ static int __init arm_v7s_do_selftests(void)
 }
 subsys_initcall(arm_v7s_do_selftests);
 #endif
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..2623d57 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -13,6 +13,7 @@
 #include <linux/bitops.h>
 #include <linux/io-pgtable.h>
 #include <linux/kernel.h>
+#include <linux/module.h>
 #include <linux/sizes.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -1047,26 +1048,31 @@ struct io_pgtable_init_fns 
io_pgtable_arm_64_lpae_s1_init_fns = {
.alloc  = arm_64_lpae_alloc_pgtable_s1,
.free   = arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_64_lpae_s1_init_fns);
 
 struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns = {
.alloc  = arm_64_lpae_alloc_pgtable_s2,
.free   = arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_64_lpae_s2_init_fns);
 
 struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s1_init_fns = {
.alloc  = arm_32_lpae_alloc_pgtable_s1,
.free   = arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_32_lpae_s1_init_fns);
 
 struct io_pgtable_init_fns io_pgtable_arm_32_lpae_s2_init_fns = {
.alloc  = arm_32_lpae_alloc_pgtable_s2,
.free   = arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_32_lpae_s2_init_fns);
 
 struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
.alloc  = arm_mali_lpae_alloc_pgtable,
.free   = arm_lpae_free_pgtable,
 };
+EXPORT_SYMBOL_GPL(io_pgtable_arm_mali_lpae_init_fns);
 
 #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
 
@@ -1252,3 +1258,5 @@ static int __init arm_lpae_do_selftests(void)
 }
 subsys_initcall(arm_lpae_do_selftests);
 #endif
+
+MODULE_LICENSE("GPL v2");
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC PATCH] fork: Free per-cpu cached vmalloc'ed thread stacks with

2020-09-04 Thread Isaac J. Manjarres
The per-cpu cached vmalloc'ed stacks are currently freed in the
CPU hotplug teardown path by the free_vm_stack_cache() callback,
which invokes vfree(), which may result in purging the list of
lazily freed vmap areas.

Purging all of the lazily freed vmap areas can take a long time
when the list of vmap areas is large. This is problematic, as
free_vm_stack_cache() is invoked prior to the offline CPU's timers
being migrated: it delays timer migration in the CPU hotplug
teardown path, so timer callbacks are invoked long after their
timers have expired.

For example, on a system that has only one online CPU (CPU 1) that is
running a heavy workload, and another CPU that is being offlined,
the online CPU will invoke free_vm_stack_cache() to free the cached
vmalloc'ed stacks for the CPU being offlined. When there are 2702
vmap areas that total to 13498 pages, free_vm_stack_cache() takes
over 2 seconds to execute:

[001]   399.335808: cpuhp_enter: cpu: 0005 target:   0 step:  67 
(free_vm_stack_cache)

/* The first vmap area to be freed */
[001]   399.337157: __purge_vmap_area_lazy: [0:2702] 0xffc033da8000 - 
0xffc033dad000 (5 : 13498)

/* After two seconds */
[001]   401.528010: __purge_vmap_area_lazy: [1563:2702] 0xffc02fe1 - 
0xffc02fe15000 (5 : 5765)

Instead of freeing the per-cpu cached vmalloc'ed stacks synchronously
with respect to the CPU hotplug teardown state machine, free them
asynchronously to help move along the CPU hotplug teardown state machine
quickly.
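
For context, the deferred-free machinery behind vfree_atomic(),
simplified from mm/vmalloc.c (details vary by kernel version):

#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/workqueue.h>

struct vfree_deferred {
	struct llist_head list;
	struct work_struct wq;
};
static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);

static void free_work(struct work_struct *w)
{
	struct vfree_deferred *p = container_of(w, struct vfree_deferred, wq);
	struct llist_node *t, *llnode;

	/* The potentially slow purge runs here, in workqueue context. */
	llist_for_each_safe(llnode, t, llist_del_all(&p->list))
		__vunmap((void *)llnode, 1);	/* internal vmalloc helper */
}

void vfree_atomic(const void *addr)
{
	struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);

	/* Only queue the address and kick the worker; never purge here,
	 * so the CPU hotplug teardown path is not held up. */
	if (addr && llist_add((struct llist_node *)addr, &p->list))
		schedule_work(&p->wq);
}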

Signed-off-by: Isaac J. Manjarres 
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 4d32190..68346a0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -202,7 +202,7 @@ static int free_vm_stack_cache(unsigned int cpu)
if (!vm_stack)
continue;
 
-   vfree(vm_stack->addr);
+   vfree_atomic(vm_stack->addr);
cached_vm_stacks[i] = NULL;
}
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] mm/usercopy: Use memory range to be accessed for wraparound check

2019-07-30 Thread Isaac J. Manjarres
Currently, when checking to see if accessing n bytes starting at
address "ptr" will cause a wraparound in the memory addresses,
the check in check_bogus_address() adds an extra byte, which is
incorrect, as the range of addresses that will be accessed is
[ptr, ptr + (n - 1)].

This can lead to incorrectly detecting a wraparound in the
memory address when, for example, trying to read 4 KB from memory
that is mapped to the last possible page in the virtual address
space, when in fact accessing that range of memory would not
cause a wraparound to occur.

Use the memory range that will actually be accessed when
considering if accessing a certain amount of bytes will cause
the memory address to wrap around.
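
A worked example of the off-by-one, as a userspace sketch (the address
is hypothetical, for a 64-bit virtual address space):

#include <stdio.h>

int main(void)
{
	unsigned long ptr = 0xfffffffffffff000UL; /* start of last 4 KB page */
	unsigned long n   = 0x1000UL;             /* read 4096 bytes */

	/* Old check: ptr + n wraps to 0, so a legal access is rejected. */
	printf("ptr + n < ptr       -> %d (false positive)\n", ptr + n < ptr);

	/* Fixed check: the last byte accessed is ptr + (n - 1), no wrap. */
	printf("ptr + (n - 1) < ptr -> %d (correct)\n", ptr + (n - 1) < ptr);
	return 0;
}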

Fixes: f5509cc18daa ("mm: Hardened usercopy")
Co-developed-by: Prasad Sodagudi 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Isaac J. Manjarres 
Cc: sta...@vger.kernel.org
Reviewed-by: William Kucharski 
Acked-by: Kees Cook 
---
 mm/usercopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 2a09796..98e92486 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -147,7 +147,7 @@ static inline void check_bogus_address(const unsigned long 
ptr, unsigned long n,
   bool to_user)
 {
/* Reject if object wraps past end of memory. */
-   if (ptr + n < ptr)
+   if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
 
/* Reject if NULL or ZERO-allocation. */
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] mm/usercopy: Use memory range to be accessed for wraparound check

2018-11-13 Thread Isaac J. Manjarres
Currently, when checking to see if accessing n bytes starting at
address "ptr" will cause a wraparound in the memory addresses,
the check in check_bogus_address() adds an extra byte, which is
incorrect, as the range of addresses that will be accessed is
[ptr, ptr + (n - 1)].

This can lead to incorrectly detecting a wraparound in the
memory address when, for example, trying to read 4 KB from memory
that is mapped to the last possible page in the virtual address
space, when in fact accessing that range of memory would not
cause a wraparound to occur.

Use the memory range that will actually be accessed when
considering if accessing a certain amount of bytes will cause
the memory address to wrap around.

Change-Id: I2563a5988e41122727ede17180f365e999b953e6
Fixes: f5509cc18daa ("mm: Hardened usercopy")
Co-Developed-by: Prasad Sodagudi 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Isaac J. Manjarres 
Cc: sta...@vger.kernel.org
---
 mm/usercopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 852eb4e..0293645 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -151,7 +151,7 @@ static inline void check_bogus_address(const unsigned long 
ptr, unsigned long n,
   bool to_user)
 {
/* Reject if object wraps past end of memory. */
-   if (ptr + n < ptr)
+   if (ptr + (n - 1) < ptr)
usercopy_abort("wrapped address", NULL, to_user, 0, ptr + n);
 
/* Reject if NULL or ZERO-allocation. */
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] stop_machine: Atomically queue and wake stopper threads

2018-08-03 Thread Isaac J. Manjarres
From: Prasad Sodagudi 

When cpu_stop_queue_work() releases the lock for the stopper
thread that was queued into its wake queue, preemption is
enabled, which leads to the following deadlock:

CPU0  CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)  stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)   cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
  -spins with preemption disabled,
   waiting for migration/0's lock to be
   released-

-adds work items for migration/0
and queues migration/0 to its
wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
and __set_cpus_allowed_ptr
has changed the thread's
cpu allowed mask to CPU1 only-

  -acquires migration/0 and migration/1's
   locks-

  -adds work for migration/0 but does not
   add migration/0 to wake_q, since it is
   already in a wake_q-

  -adds work for migration/1 and adds
   migration/1 to its wake_q-

  -releases migration/0 and migration/1's
   locks, wakes migration/1, and enables
   preemption-

  -since migration/1 is requested to run,
   migration/1 begins to run and waits on
   migration/0, but migration/0 will never
   be able to run, since the thread that
   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing the work
for the stopper thread and adding the stopper thread to the wake
queue, to ensure that queueing the work and waking the stopper
thread happen atomically.
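
For reference, this is cpu_stop_queue_work() with the change applied
(reconstructed from kernel/stop_machine.c; the queueing and the wakeup
now sit in a single non-preemptible region):

static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
{
	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
	DEFINE_WAKE_Q(wakeq);
	unsigned long flags;
	bool enabled;

	preempt_disable();
	raw_spin_lock_irqsave(&stopper->lock, flags);
	enabled = stopper->enabled;
	if (enabled)
		__cpu_stop_queue_work(stopper, work, &wakeq);
	else if (work->done)
		cpu_stop_signal_done(work->done);
	raw_spin_unlock_irqrestore(&stopper->lock, flags);

	wake_up_q(&wakeq);
	preempt_enable();

	return enabled;
}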

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. 
active_balance() deadlock")
Co-Developed-by: Isaac J. Manjarres 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Isaac J. Manjarres 
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 34b6652..067cb83 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -81,6 +81,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct 
cpu_stop_work *work)
unsigned long flags;
bool enabled;
 
+   preempt_disable();
raw_spin_lock_irqsave(&stopper->lock, flags);
enabled = stopper->enabled;
if (enabled)
@@ -90,6 +91,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct 
cpu_stop_work *work)
raw_spin_unlock_irqrestore(&stopper->lock, flags);
 
wake_up_q(&wakeq);
+   preempt_enable();
 
return enabled;
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[tip:sched/core] stop_machine: Disable preemption after queueing stopper threads

2018-07-25 Thread tip-bot for Isaac J. Manjarres
Commit-ID:  2610e88946632afb78aa58e61f11368ac4c0af7b
Gitweb: https://git.kernel.org/tip/2610e88946632afb78aa58e61f11368ac4c0af7b
Author: Isaac J. Manjarres 
AuthorDate: Tue, 17 Jul 2018 12:35:29 -0700
Committer:  Ingo Molnar 
CommitDate: Wed, 25 Jul 2018 11:25:08 +0200

stop_machine: Disable preemption after queueing stopper threads

This commit:

  9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper 
threads")

does not fully address the race condition that can occur
as follows:

On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However, migration/1 cannot be
woken up by thread 2: thread 2 is a kworker, and therefore affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.

Co-Developed-by: Prasad Sodagudi 
Co-Developed-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Pavankumar Kondeti 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: bige...@linutronix.de
Cc: gre...@linuxfoundation.org
Cc: m...@codeblueprint.co.uk
Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper 
threads")
Link: 
http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isa...@codeaurora.org
Signed-off-by: Ingo Molnar 
---
 kernel/stop_machine.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523dae6e2..e190d1ef3a23 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ retry:
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+   /*
+* The waking up of stopper threads has to happen
+* in the same scheduling context as the queueing.
+* Otherwise, there is a possibility of one of the
+* above stoppers being woken up by another CPU,
+* and preempting us. This will cause us to not
+* wake up the other stopper forever.
+*/
+   preempt_disable();
 unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ unlock:
}
 
if (!err) {
-   preempt_disable();
wake_up_q(&wakeq);
preempt_enable();
}


[PATCH] stop_machine: Disable preemption after queueing stopper threads

2018-07-17 Thread Isaac J. Manjarres
This commit:

9fb8d5dc4b64 ("stop_machine, Disable preemption when
waking two stopper threads")

does not fully address the race condition that can occur
as follows:

On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However, migration/1 cannot be
woken up by thread 2: thread 2 is a kworker, and therefore affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.

Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper 
threads")
Co-Developed-by: Prasad Sodagudi 
Co-Developed-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Pavankumar Kondeti 
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523d..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+   /*
+* The waking up of stopper threads has to happen
+* in the same scheduling context as the queueing.
+* Otherwise, there is a possibility of one of the
+* above stoppers being woken up by another CPU,
+* and preempting us. This will cause us to not
+* wake up the other stopper forever.
+*/
+   preempt_disable();
 unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
}
 
if (!err) {
-   preempt_disable();
wake_up_q(&wakeq);
preempt_enable();
}
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v4] stop_machine: Disable preemption after queueing stopper threads

2018-07-17 Thread Isaac J. Manjarres
After cpu_stop_queue_two_works() queues the cpu_stop works
for the stopper threads, it releases the locks held for
both threads, which enables preemption and allows the
following race condition to occur:

On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However migration/1 cannot be
woken up by thread 2, since it is a kworker, so it is affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. 
active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi 
Co-Developed-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Pavankumar Kondeti 
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+   /*
+* The waking up of stopper threads has to happen
+* in the same scheduling context as the queueing.
+* Otherwise, there is a possibility of one of the
+* above stoppers being woken up by another CPU,
+* and preempting us. This will cause us to not
+* wake up the other stopper forever.
+*/
+   preempt_disable();
 unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -270,7 +279,10 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
goto retry;
}
 
-   wake_up_q(&wakeq);
+   if (!err) {
+   wake_up_q(&wakeq);
+   preempt_enable();
+   }
 
return err;
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[tip:sched/urgent] stop_machine: Disable preemption when waking two stopper threads

2018-07-15 Thread tip-bot for Isaac J. Manjarres
Commit-ID:  9fb8d5dc4b649dd190e1af4ead670753e71bf907
Gitweb: https://git.kernel.org/tip/9fb8d5dc4b649dd190e1af4ead670753e71bf907
Author: Isaac J. Manjarres 
AuthorDate: Tue, 3 Jul 2018 15:02:14 -0700
Committer:  Thomas Gleixner 
CommitDate: Sun, 15 Jul 2018 12:12:45 +0200

stop_machine: Disable preemption when waking two stopper threads

When cpu_stop_queue_two_works() begins to wake the stopper threads, it does
so without preemption disabled, which leads to the following race
condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source
CPU, and cpu2 as the destination CPU. When adding the stopper threads to
the wake queue used in this function, the source CPU stopper thread is
added first, and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the threads are
woken up in the order that they are queued in, so the source CPU's stopper
thread is woken up first, and it preempts the thread running on the source
CPU.

The stopper thread will then execute on the source CPU, disable preemption,
and begin executing multi_cpu_stop(), and wait for an ack from the
destination CPU's stopper thread, with preemption still disabled. Since the
worker thread that woke up the stopper thread on the source CPU is affine
to the source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper thread from
the wake queue, and thus, the destination CPU's stopper thread will never
run, causing the source CPU's stopper thread to wait forever, and stall.

Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works().
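
A sketch of the wake_q pattern at play here (simplified from
cpu_stop_queue_two_works(); locking and error paths omitted).
wake_up_q() wakes tasks in queueing order, which is why the source
CPU's stopper can preempt the caller before the destination CPU's
stopper is woken:

static void queue_and_wake_two(struct cpu_stopper *stopper1,
			       struct cpu_stopper *stopper2)
{
	DEFINE_WAKE_Q(wakeq);

	wake_q_add(&wakeq, stopper1->thread); /* source stopper: woken first */
	wake_q_add(&wakeq, stopper2->thread); /* destination: woken last */

	/*
	 * Without this, the first wakeup can preempt the caller (the
	 * source stopper then spins in multi_cpu_stop() with preemption
	 * off), and stopper2 is never dequeued from the wake queue.
	 */
	preempt_disable();
	wake_up_q(&wakeq);
	preempt_enable();
}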

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. 
active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi 
Signed-off-by: Prasad Sodagudi 
Co-Developed-by: Pavankumar Kondeti 
Signed-off-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Thomas Gleixner 
Cc: pet...@infradead.org
Cc: m...@codeblueprint.co.uk
Cc: bige...@linutronix.de
Cc: gre...@linuxfoundation.org
Cc: sta...@vger.kernel.org
Link: 
https://lkml.kernel.org/r/1530655334-4601-1-git-send-email-isa...@codeaurora.org

---
 kernel/stop_machine.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a2c238..1ff523dae6e2 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ unlock:
goto retry;
}
 
-   wake_up_q(&wakeq);
+   if (!err) {
+   preempt_disable();
+   wake_up_q(&wakeq);
+   preempt_enable();
+   }
 
return err;
 }


[PATCH v3] stop_machine: Disable preemption when waking two stopper threads

2018-07-03 Thread Isaac J. Manjarres
When cpu_stop_queue_two_works() begins to wake the stopper
threads, it does so without preemption disabled, which leads
to the following race condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1
as the source CPU, and cpu2 as the destination CPU. When
adding the stopper threads to the wake queue used in this
function, the source CPU stopper thread is added first,
and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the
threads are woken up in the order that they are queued in,
so the source CPU's stopper thread is woken up first, and
it preempts the thread running on the source CPU.

The stopper thread will then execute on the source CPU,
disable preemption, and begin executing multi_cpu_stop(),
and wait for an ack from the destination CPU's stopper thread,
with preemption still disabled. Since the worker thread that
woke up the stopper thread on the source CPU is affine to the
source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper
thread from the wake queue, and thus, the destination CPU's
stopper thread will never run, causing the source CPU's stopper
thread to wait forever, and stall.

Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works().

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. 
active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi 
Co-Developed-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Pavankumar Kondeti 
Cc: sta...@vger.kernel.org
---
 kernel/stop_machine.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..1ff523d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
goto retry;
}
 
-   wake_up_q(&wakeq);
+   if (!err) {
+   preempt_disable();
+   wake_up_q(&wakeq);
+   preempt_enable();
+   }
 
return err;
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v2] stop_machine: Disable preemption when waking two stopper threads

2018-06-29 Thread Isaac J. Manjarres
When cpu_stop_queue_two_works() begins to wake the stopper
threads, it does so without preemption disabled, which leads
to the following race condition:

The source CPU calls cpu_stop_queue_two_works(), with cpu1
as the source CPU, and cpu2 as the destination CPU. When
adding the stopper threads to the wake queue used in this
function, the source CPU stopper thread is added first,
and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the
threads are woken up in the order that they are queued in,
so the source CPU's stopper thread is woken up first, and
it preempts the thread running on the source CPU.

The stopper thread will then execute on the source CPU,
disable preemption, and begin executing multi_cpu_stop(),
and wait for an ack from the destination CPU's stopper thread,
with preemption still disabled. Since the worker thread that
woke up the stopper thread on the source CPU is affine to the
source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper
thread from the wake queue, and thus, the destination CPU's
stopper thread will never run, causing the source CPU's stopper
thread to wait forever, and stall.

Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works() to ensure that the worker thread
that is waking up the stopper threads isn't preempted
by the source CPU's stopper thread, and permanently
scheduled out, leaving the remaining stopper thread asleep
in the wake queue.

Co-developed-by: Pavankumar Kondeti 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Pavankumar Kondeti 
Signed-off-by: Isaac J. Manjarres 
---
 kernel/stop_machine.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..1ff523d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -270,7 +270,11 @@ static int cpu_stop_queue_two_works(int cpu1, struct 
cpu_stop_work *work1,
goto retry;
}
 
-   wake_up_q(&wakeq);
+   if (!err) {
+   preempt_disable();
+   wake_up_q(&wakeq);
+   preempt_enable();
+   }
 
return err;
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH] stop_machine: Remove cpu swap from stop_two_cpus

2018-06-26 Thread Isaac J. Manjarres
When invoking migrate_swap(), stop_two_cpus() swaps the
source and destination CPU IDs if the destination CPU
ID is greater than the source CPU ID. This leads to the
following race condition:

The source CPU invokes migrate_swap and sets itself as
the source CPU, and sets the destination CPU to another
CPU, such that the CPU ID of the destination CPU is
greater than that of the source CPU ID, and invokes
stop_two_cpus(cpu1=destination CPU, cpu2=source CPU,...)
Now, stop_two_cpus sees that the destination CPU ID is
greater than the source CPU ID, and performs the swap, so
that cpu1=source CPU, and cpu2=destination CPU.

The source CPU calls cpu_stop_queue_two_works(), with cpu1
as the source CPU, and cpu2 as the destination CPU. When
adding the stopper threads to the wake queue used in this
function, the source CPU stopper thread is added first,
and the destination CPU stopper thread is added last.

When wake_up_q() is invoked to wake the stopper threads, the
threads are woken up in the order that they are queued in,
so the source CPU's stopper thread is woken up first, and
it preempts the thread running on the source CPU.

The stopper thread will then execute on the source CPU,
disable preemption, and begin executing multi_cpu_stop()
and wait for an ack from the destination CPU's stopper thread,
with preemption still disabled. Since the worker thread that
woke up the stopper thread on the source CPU is affine to the
source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper
thread from the wake queue, and thus, the destination CPU's
stopper thread will never run, causing the source CPU's stopper
thread to wait forever, and stall.

Remove CPU ID swapping in stop_two_cpus() so that the
source CPU's stopper thread is added to the wake queue last,
so that the source CPU's stopper thread is woken up last,
ensuring that all other threads that it depends on are woken
up before it runs.

Co-developed-by: Prasad Sodagudi 
Signed-off-by: Prasad Sodagudi 
Signed-off-by: Isaac J. Manjarres 
---
 kernel/stop_machine.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..d10d633 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -307,8 +307,6 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, 
cpu_stop_fn_t fn, void *
cpu_stop_init_done(&done, 2);
set_state(&msdata, MULTI_STOP_PREPARE);
 
-   if (cpu1 > cpu2)
-   swap(cpu1, cpu2);
if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
return -ENOENT;
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


