Re: [PATCH 2/2] iommu: arm-smmu-v3: Report domain nesting info reuqired for stage1

2021-03-03 Thread Vivek Gautam
Hi Eric,

On Fri, Feb 12, 2021 at 11:44 PM Auger Eric  wrote:
>
> Hi Vivek,
>
> On 2/12/21 11:58 AM, Vivek Gautam wrote:
> > Update nested domain information required for stage1 page table.
>
> s/reuqired/required in the commit title

Oh! my bad.

> >
> > Signed-off-by: Vivek Gautam 
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 ++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index c11dd3940583..728018921fae 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -2555,6 +2555,7 @@ static int arm_smmu_domain_nesting_info(struct 
> > arm_smmu_domain *smmu_domain,
> >   void *data)
> >  {
> >   struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
> > + struct arm_smmu_device *smmu = smmu_domain->smmu;
> >   unsigned int size;
> >
> >   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > @@ -2571,9 +2572,20 @@ static int arm_smmu_domain_nesting_info(struct 
> > arm_smmu_domain *smmu_domain,
> >   return 0;
> >   }
> >
> > - /* report an empty iommu_nesting_info for now */
> > - memset(info, 0x0, size);
> > + /* Update the nesting info as required for stage1 page tables */
> > + info->addr_width = smmu->ias;
> > + info->format = IOMMU_PASID_FORMAT_ARM_SMMU_V3;
> > + info->features = IOMMU_NESTING_FEAT_BIND_PGTBL |
> I understood IOMMU_NESTING_FEAT_BIND_PGTBL advertises the requirement to
> bind tables per PASID, ie. passing iommu_gpasid_bind_data.
> In ARM case I guess you plan to use attach/detach_pasid_table API with
> iommu_pasid_table_config struct. So I understood we should add a new
> feature here.

Right, the idea is to let vfio know that we support pasid table binding, and
I thought we could use the same flag. But clearly that's not the case.
Will add a new feature.

> > +  IOMMU_NESTING_FEAT_PAGE_RESP |
> > +  IOMMU_NESTING_FEAT_CACHE_INVLD;
> > + info->pasid_bits = smmu->ssid_bits;
> > + info->vendor.smmuv3.asid_bits = smmu->asid_bits;
> > + info->vendor.smmuv3.pgtbl_fmt = ARM_64_LPAE_S1;
> > + memset(&info->padding, 0x0, 12);
> > + memset(&info->vendor.smmuv3.padding, 0x0, 9);
> > +
> >   info->argsz = size;
> > +
> spurious new line

Sure, will remove it.

Best regards
Vivek

> >   return 0;
> >  }
> >
> >
>


[PATCH 2/2] iommu: arm-smmu-v3: Report domain nesting info reuqired for stage1

2021-02-12 Thread Vivek Gautam
Update nested domain information required for stage1 page table.

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c11dd3940583..728018921fae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2555,6 +2555,7 @@ static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
void *data)
 {
struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
unsigned int size;
 
if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
@@ -2571,9 +2572,20 @@ static int arm_smmu_domain_nesting_info(struct arm_smmu_domain *smmu_domain,
return 0;
}
 
-   /* report an empty iommu_nesting_info for now */
-   memset(info, 0x0, size);
+   /* Update the nesting info as required for stage1 page tables */
+   info->addr_width = smmu->ias;
+   info->format = IOMMU_PASID_FORMAT_ARM_SMMU_V3;
+   info->features = IOMMU_NESTING_FEAT_BIND_PGTBL |
+IOMMU_NESTING_FEAT_PAGE_RESP |
+IOMMU_NESTING_FEAT_CACHE_INVLD;
+   info->pasid_bits = smmu->ssid_bits;
+   info->vendor.smmuv3.asid_bits = smmu->asid_bits;
+   info->vendor.smmuv3.pgtbl_fmt = ARM_64_LPAE_S1;
+   memset(&info->padding, 0x0, 12);
+   memset(&info->vendor.smmuv3.padding, 0x0, 9);
+
info->argsz = size;
+
return 0;
 }
 
-- 
2.17.1



[PATCH 0/2] Domain nesting info for arm-smmu

2021-02-12 Thread Vivek Gautam
These two patches add nesting information for Arm SMMUv3
and are based on the domain nesting info patches by Yi [1,2,3].

Based on the discussion in the thread [4], I am sending these out as
I have been using them in my tree [5] for nested translation based
on virtio-iommu on Arm reference platforms.

Thanks & regards
Vivek

[1] https://lore.kernel.org/kvm/1599734733-6431-2-git-send-email-yi.l@intel.com/
[2] https://lore.kernel.org/kvm/1599734733-6431-3-git-send-email-yi.l@intel.com/
[3] https://lore.kernel.org/kvm/1599734733-6431-4-git-send-email-yi.l@intel.com/
[4] https://lore.kernel.org/kvm/306e7dd2-9eb2-0ca3-6a93-7c9aa0821...@arm.com/
[5] https://github.com/vivek-arm/linux/tree/5.11-rc3-nested-pgtbl-arm-smmuv3-virtio-iommu

Vivek Gautam (2):
  iommu: Report domain nesting info for arm-smmu-v3
  iommu: arm-smmu-v3: Report domain nesting info reuqired for stage1

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 +--
 include/uapi/linux/iommu.h  | 31 +
 2 files changed, 39 insertions(+), 8 deletions(-)

-- 
2.17.1



[PATCH 1/2] iommu: Report domain nesting info for arm-smmu-v3

2021-02-12 Thread Vivek Gautam
Add a vendor specific structure for domain nesting info for
arm smmu-v3, and necessary info fields required to populate
stage1 page tables.

Signed-off-by: Vivek Gautam 
---
 include/uapi/linux/iommu.h | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 4d3d988fa353..5f059bcf7720 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -323,7 +323,8 @@ struct iommu_gpasid_bind_data {
 #define IOMMU_GPASID_BIND_VERSION_1    1
    __u32 version;
 #define IOMMU_PASID_FORMAT_INTEL_VTD   1
-#define IOMMU_PASID_FORMAT_LAST        2
+#define IOMMU_PASID_FORMAT_ARM_SMMU_V3 2
+#define IOMMU_PASID_FORMAT_LAST        3
    __u32 format;
    __u32 addr_width;
 #define IOMMU_SVA_GPASID_VAL   (1 << 0) /* guest PASID valid */
@@ -409,6 +410,21 @@ struct iommu_nesting_info_vtd {
__u64   ecap_reg;
 };
 
+/*
+ * struct iommu_nesting_info_arm_smmuv3 - Arm SMMU-v3 nesting info.
+ */
+struct iommu_nesting_info_arm_smmuv3 {
+   __u32   flags;
+   __u16   asid_bits;
+
+   /* Arm LPAE page table format as per kernel */
+#define ARM_PGTBL_32_LPAE_S1   (0x0)
+#define ARM_PGTBL_64_LPAE_S1   (0x2)
+   __u8    pgtbl_fmt;
+
+   __u8    padding[9];
+};
+
 /*
  * struct iommu_nesting_info - Information for nesting-capable IOMMU.
  *userspace should check it before using
@@ -445,11 +461,13 @@ struct iommu_nesting_info_vtd {
  * +---+--+
  *
  * data struct types defined for @format:
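- * +================================+=====================================+
- * | @format                        | data struct                         |
- * +================================+=====================================+
- * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
- * +--------------------------------+-------------------------------------+
+ * +================================+======================================+
+ * | @format                        | data struct                          |
+ * +================================+======================================+
+ * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd        |
+ * +--------------------------------+--------------------------------------+
+ * | IOMMU_PASID_FORMAT_ARM_SMMU_V3 | struct iommu_nesting_info_arm_smmuv3 |
+ * +--------------------------------+--------------------------------------+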
  *
  */
 struct iommu_nesting_info {
@@ -466,6 +484,7 @@ struct iommu_nesting_info {
/* Vendor specific data */
union {
struct iommu_nesting_info_vtd vtd;
+   struct iommu_nesting_info_arm_smmuv3 smmuv3;
} vendor;
 };
 
-- 
2.17.1



[PATCH RFC v1 08/15] iommu: Add asid_bits to arm smmu-v3 stage1 table info

2021-01-15 Thread Vivek Gautam
asid_bits data is required to prepare stage-1 tables for arm-smmu-v3.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/iommu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 082d758dd016..96abbfc7c643 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -357,7 +357,7 @@ struct iommu_pasid_smmuv3 {
    __u32   version;
    __u8    s1fmt;
    __u8    s1dss;
-   __u8    padding[2];
+   __u16   asid_bits;
 };
 
 /**
-- 
2.17.1



[PATCH RFC v1 14/15] iommu/virtio: Add support for Arm LPAE page table format

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

When PASID isn't supported, we can still register one set of tables.
Add support for registering an Arm LPAE based page table.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Clean-ups to add right tcr definitions and accommodate
parent patches]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c  | 131 +-
 include/uapi/linux/virtio_iommu.h |  30 +++
 2 files changed, 139 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index b5222da1dc74..9cc3d35125e9 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -135,6 +135,13 @@ struct viommu_event {
 #define to_viommu_domain(domain)   \
container_of(domain, struct viommu_domain, domain)
 
+#define VIRTIO_FIELD_PREP(_mask, _shift, _val) \
+   ({  \
+   (((_val) << VIRTIO_IOMMU_PGTF_ARM_ ## _shift) & \
+(VIRTIO_IOMMU_PGTF_ARM_ ## _mask <<\
+ VIRTIO_IOMMU_PGTF_ARM_ ## _shift));   \
+   })
+
 static int viommu_get_req_errno(void *buf, size_t len)
 {
struct virtio_iommu_req_tail *tail = buf + len - sizeof(*tail);
@@ -897,6 +904,76 @@ static int viommu_simple_attach(struct viommu_domain 
*vdomain,
return ret;
 }
 
+static int viommu_config_arm_pgt(struct viommu_endpoint *vdev,
+struct io_pgtable_cfg *cfg,
+struct virtio_iommu_req_attach_pgt_arm *req,
+u64 *asid)
+{
+   int id;
+   struct virtio_iommu_probe_table_format *pgtf = (void *)vdev->pgtf;
+   typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
+   u64 __tcr;
+
+   if (pgtf->asid_bits != 8 && pgtf->asid_bits != 16)
+   return -EINVAL;
+
+   id = ida_simple_get(&asid_ida, 1, 1 << pgtf->asid_bits, GFP_KERNEL);
+   if (id < 0)
+   return -ENOMEM;
+
+   __tcr = VIRTIO_FIELD_PREP(T0SZ_MASK, T0SZ_SHIFT, tcr->tsz) |
+   VIRTIO_FIELD_PREP(IRGN0_MASK, IRGN0_SHIFT, tcr->irgn) |
+   VIRTIO_FIELD_PREP(ORGN0_MASK, ORGN0_SHIFT, tcr->orgn) |
+   VIRTIO_FIELD_PREP(SH0_MASK, SH0_SHIFT, tcr->sh) |
+   VIRTIO_FIELD_PREP(TG0_MASK, TG0_SHIFT, tcr->tg) |
+   VIRTIO_IOMMU_PGTF_ARM_EPD1 | VIRTIO_IOMMU_PGTF_ARM_HPD0 |
+   VIRTIO_IOMMU_PGTF_ARM_HPD1;
+
+   req->format = cpu_to_le16(VIRTIO_IOMMU_FOMRAT_PGTF_ARM_LPAE);
+   req->ttbr   = cpu_to_le64(cfg->arm_lpae_s1_cfg.ttbr);
+   req->tcr    = cpu_to_le64(__tcr);
+   req->mair   = cpu_to_le64(cfg->arm_lpae_s1_cfg.mair);
+   req->asid   = cpu_to_le16(id);
+
+   *asid = id;
+   return 0;
+}
+
+static int viommu_attach_pgtable(struct viommu_endpoint *vdev,
+struct viommu_domain *vdomain,
+enum io_pgtable_fmt fmt,
+struct io_pgtable_cfg *cfg,
+u64 *asid)
+{
+   int ret;
+   int i, eid;
+
+   struct virtio_iommu_req_attach_table req = {
+   .head.type  = VIRTIO_IOMMU_T_ATTACH_TABLE,
+   .domain = cpu_to_le32(vdomain->id),
+   };
+
+   switch (fmt) {
+   case ARM_64_LPAE_S1:
+   ret = viommu_config_arm_pgt(vdev, cfg, (void *)&req, asid);
+   if (ret)
+   return ret;
+   break;
+   default:
+   WARN_ON(1);
+   return -EINVAL;
+   }
+
+   vdev_for_each_id(i, eid, vdev) {
+   req.endpoint = cpu_to_le32(eid);
+   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
+   if (ret)
+   return ret;
+   }
+
+   return 0;
+}
+
 static int viommu_teardown_pgtable(struct viommu_domain *vdomain)
 {
struct iommu_vendor_psdtable_cfg *pst_cfg;
@@ -972,32 +1049,42 @@ static int viommu_setup_pgtable(struct viommu_endpoint 
*vdev,
if (!ops)
return -ENOMEM;
 
-   pst_cfg = &tbl->cfg;
-   cfgi = &pst_cfg->vendor.cfg;
-   id = ida_simple_get(&asid_ida, 1, 1 << desc->asid_bits, GFP_KERNEL);
-   if (id < 0) {
-   ret = id;
-   goto err_free_pgtable;
-   }
+   if (!tbl) {
+   /* No PASID support, send attach_table */
+   ret = viommu_attach_pgtable(vdev, vdomain, fmt, &cfg,

[PATCH RFC v1 13/15] iommu/virtio: Attach Arm PASID tables when available

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

When the ARM PASID table format is reported in a probe, send an attach
request and install the page tables for iommu_map/iommu_unmap use.
Architecture-specific components are already abstracted to libraries. We
just need to pass config bits around and set up an alternative mechanism to
the mapping tree.

We reuse the convention already adopted by other IOMMU architectures (ARM
SMMU and AMD IOMMU), that entry 0 in the PASID table is reserved for
non-PASID traffic. Bind the PASID table, and set up entry 0 to be modified
with iommu_map/unmap.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Bunch of refactoring and clean-ups to use iommu-pasid-table APIs,
creating iommu_pasid_table, and configuring based on reported
pasid format. Couple of additional methods have also been created
to configure vendor specific pasid configuration]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 314 +++
 1 file changed, 314 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 004ea94e3731..b5222da1dc74 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include "iommu-pasid-table.h"
 
 #define MSI_IOVA_BASE          0x8000000
 #define MSI_IOVA_LENGTH        0x100000
@@ -33,6 +34,9 @@
 #define VIOMMU_EVENT_VQ        1
 #define VIOMMU_NR_VQS          2
 
+/* Some architectures need an Address Space ID for each page table */
+static DEFINE_IDA(asid_ida);
+
 struct viommu_dev {
struct iommu_device iommu;
struct device   *dev;
@@ -55,6 +59,7 @@ struct viommu_dev {
u32 probe_size;
 
    bool    has_map:1;
+   bool    has_table:1;
 };
 
 struct viommu_mapping {
@@ -76,6 +81,7 @@ struct viommu_domain {
    struct mutex    mutex; /* protects viommu pointer */
    unsigned int    id;
    u32             map_flags;
+   struct iommu_pasid_table    *pasid_tbl;
 
    /* Default address space when a table is bound */
    struct viommu_mm    mm;
@@ -891,6 +897,285 @@ static int viommu_simple_attach(struct viommu_domain 
*vdomain,
return ret;
 }
 
+static int viommu_teardown_pgtable(struct viommu_domain *vdomain)
+{
+   struct iommu_vendor_psdtable_cfg *pst_cfg;
+   struct arm_smmu_cfg_info *cfgi;
+   u32 asid;
+
+   if (!vdomain->mm.ops)
+   return 0;
+
+   free_io_pgtable_ops(vdomain->mm.ops);
+   vdomain->mm.ops = NULL;
+
+   if (vdomain->pasid_tbl) {
+   pst_cfg = &vdomain->pasid_tbl->cfg;
+   cfgi = &pst_cfg->vendor.cfg;
+   asid = cfgi->s1_cfg->cd.asid;
+
+   iommu_psdtable_write(vdomain->pasid_tbl, pst_cfg, 0, NULL);
+   ida_simple_remove(&asid_ida, asid);
+   }
+
+   return 0;
+}
+
+static int viommu_setup_pgtable(struct viommu_endpoint *vdev,
+   struct viommu_domain *vdomain)
+{
+   int ret, id;
+   u32 asid;
+   enum io_pgtable_fmt fmt;
+   struct io_pgtable_ops *ops = NULL;
+   struct viommu_dev *viommu = vdev->viommu;
+   struct virtio_iommu_probe_table_format *desc = vdev->pgtf;
+   struct iommu_pasid_table *tbl = vdomain->pasid_tbl;
+   struct iommu_vendor_psdtable_cfg *pst_cfg;
+   struct arm_smmu_cfg_info *cfgi;
+   struct io_pgtable_cfg cfg = {
+   .iommu_dev  = viommu->dev->parent,
+   .tlb= &viommu_flush_ops,
+   .pgsize_bitmap  = vdev->pgsize_mask ? vdev->pgsize_mask :
+ vdomain->domain.pgsize_bitmap,
+   .ias= (vdev->input_end ? ilog2(vdev->input_end) :
+             ilog2(vdomain->domain.geometry.aperture_end)) + 1,
+   .oas= vdev->output_bits,
+   };
+
+   if (!desc)
+   return -EINVAL;
+
+   if (!vdev->output_bits)
+   return -ENODEV;
+
+   switch (le16_to_cpu(desc->format)) {
+   case VIRTIO_IOMMU_FOMRAT_PGTF_ARM_LPAE:
+   fmt = ARM_64_LPAE_S1;
+   break;
+   default:
+   dev_err(vdev->dev, "unsupported page table format 0x%x\n",
+   le16_to_cpu(desc->format));
+   return -EINVAL;
+   }
+
+   if (vdomain->mm.ops) {
+   /

[PATCH RFC v1 15/15] iommu/virtio: Update fault type and reason info for viommu fault

2021-01-15 Thread Vivek Gautam
Fault type information distinguishes a page request fault from
an unrecoverable fault, and further additions to fault reasons
and the related PASID information can help in handling faults
efficiently.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c  | 27 +--
 include/uapi/linux/virtio_iommu.h | 13 -
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 9cc3d35125e9..10ef9e98214a 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -652,9 +652,16 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
char *reason_str;
 
u8 reason   = fault->reason;
+   u16 type= fault->flt_type;
u32 flags   = le32_to_cpu(fault->flags);
u32 endpoint= le32_to_cpu(fault->endpoint);
u64 address = le64_to_cpu(fault->address);
+   u32 pasid   = le32_to_cpu(fault->pasid);
+
+   if (type == VIRTIO_IOMMU_FAULT_F_PAGE_REQ) {
+   dev_info(viommu->dev, "Page request fault - unhandled\n");
+   return 0;
+   }
 
switch (reason) {
case VIRTIO_IOMMU_FAULT_R_DOMAIN:
@@ -663,6 +670,21 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
case VIRTIO_IOMMU_FAULT_R_MAPPING:
reason_str = "page";
break;
+   case VIRTIO_IOMMU_FAULT_R_WALK_EABT:
+   reason_str = "page walk external abort";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_PTE_FETCH:
+   reason_str = "pte fetch";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_PERMISSION:
+   reason_str = "permission";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_ACCESS:
+   reason_str = "access";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_OOR_ADDRESS:
+   reason_str = "output address";
+   break;
case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
default:
reason_str = "unknown";
@@ -671,8 +693,9 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
 
/* TODO: find EP by ID and report_iommu_fault */
if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
-   dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
-   reason_str, endpoint, address,
+   dev_err_ratelimited(viommu->dev,
+   "%s fault from EP %u PASID %u at %#llx [%s%s%s]\n",
+   reason_str, endpoint, pasid, address,
    flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
    flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
    flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
diff --git a/include/uapi/linux/virtio_iommu.h b/include/uapi/linux/virtio_iommu.h
index 608c8d642e1f..a537d82777f7 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -290,19 +290,30 @@ struct virtio_iommu_req_invalidate {
 #define VIRTIO_IOMMU_FAULT_R_UNKNOWN       0
 #define VIRTIO_IOMMU_FAULT_R_DOMAIN        1
 #define VIRTIO_IOMMU_FAULT_R_MAPPING       2
+#define VIRTIO_IOMMU_FAULT_R_WALK_EABT     3
+#define VIRTIO_IOMMU_FAULT_R_PTE_FETCH     4
+#define VIRTIO_IOMMU_FAULT_R_PERMISSION    5
+#define VIRTIO_IOMMU_FAULT_R_ACCESS        6
+#define VIRTIO_IOMMU_FAULT_R_OOR_ADDRESS   7
 
 #define VIRTIO_IOMMU_FAULT_F_READ          (1 << 0)
 #define VIRTIO_IOMMU_FAULT_F_WRITE         (1 << 1)
 #define VIRTIO_IOMMU_FAULT_F_EXEC          (1 << 2)
 #define VIRTIO_IOMMU_FAULT_F_ADDRESS       (1 << 8)
 
+#define VIRTIO_IOMMU_FAULT_F_DMA_UNRECOV   1
+#define VIRTIO_IOMMU_FAULT_F_PAGE_REQ      2
+
 struct virtio_iommu_fault {
    __u8    reason;
-   __u8    reserved[3];
+   __le16  flt_type;
+   __u8    reserved;
    __le32  flags;
    __le32  endpoint;
    __u8    reserved2[4];
    __le64  address;
+   __le32  pasid;
+   __u8    reserved3[4];
 };
 
 #endif
-- 
2.17.1



[PATCH RFC v1 00/15] iommu/virtio: Nested stage support with Arm

2021-01-15 Thread Vivek Gautam
This patch-series aims at enabling Nested stage translation in guests
using virtio-iommu as the paravirtualized iommu. The backend is supported
with Arm SMMU-v3 that provides nested stage-1 and stage-2 translation.

This series derives its purpose from various efforts happening to add
support for Shared Virtual Addressing (SVA) in host and guest. On Arm,
most of the support for SVA has already landed. The support for nested
stage translation and fault reporting to guest has been proposed [1].
The related changes required in VFIO [2] framework have also been put
forward.

This series proposes changes in virtio-iommu to program PASID tables
and related stage-1 page tables. A simple iommu-pasid-table library
is added for this purpose that interacts with vendor drivers to
allocate and populate PASID tables.
In Arm SMMUv3 we propose to pull the Context Descriptor (CD) management
code out of the arm-smmu-v3 driver and add it as a glue vendor layer
to support allocating CD tables and populating them with the right values.
These CD tables are essentially the PASID tables and contain stage-1
page table configurations too.
A request to set up these CD tables comes from the virtio-iommu driver using
the iommu-pasid-table library when running on Arm. The virtio-iommu driver
then passes these PASID tables to the host using the right virtio backend
and support in the VMM.

For testing we have added necessary support in kvmtool. The changes in
kvmtool are based on virtio-iommu development branch by Jean-Philippe
Brucker [3].

The tested kernel branch contains the following, in order from bottom
to top of the git history:
a) v5.11-rc3
b) arm-smmu-v3 [1] and vfio [2] changes from Eric to add nested page
   table support for Arm.
c) Smmu test engine patches from Jean-Philippe's branch [4]
d) This series
e) Domain nesting info patches [5][6][7].
f) Changes to add arm-smmu-v3 specific nesting info (to be sent to
   the list).

This kernel is tested on the Neoverse reference software stack with a
Fixed Virtual Platform (FVP). Public versions of the software stack and
FVP are available here [8][9].

A big thanks to Jean-Philippe for his contributions towards this work
and for his valuable guidance.

[1] https://lore.kernel.org/linux-iommu/20201118112151.25412-1-eric.au...@redhat.com/T/
[2] https://lore.kernel.org/kvmarm/20201116110030.32335-12-eric.au...@redhat.com/T/
[3] https://jpbrucker.net/git/kvmtool/log/?h=virtio-iommu/devel
[4] https://jpbrucker.net/git/linux/log/?h=sva/smmute
[5] https://lore.kernel.org/kvm/1599734733-6431-2-git-send-email-yi.l@intel.com/
[6] https://lore.kernel.org/kvm/1599734733-6431-3-git-send-email-yi.l@intel.com/
[7] https://lore.kernel.org/kvm/1599734733-6431-4-git-send-email-yi.l@intel.com/
[8] https://developer.arm.com/tools-and-software/open-source-software/arm-platforms-software/arm-ecosystem-fvps
[9] https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms.git/about/docs/rdn1edge/user-guide.rst

Jean-Philippe Brucker (6):
  iommu/virtio: Add headers for table format probing
  iommu/virtio: Add table format probing
  iommu/virtio: Add headers for binding pasid table in iommu
  iommu/virtio: Add support for INVALIDATE request
  iommu/virtio: Attach Arm PASID tables when available
  iommu/virtio: Add support for Arm LPAE page table format

Vivek Gautam (9):
  iommu/arm-smmu-v3: Create a Context Descriptor library
  iommu: Add a simple PASID table library
  iommu/arm-smmu-v3: Update drivers to work with iommu-pasid-table
  iommu/arm-smmu-v3: Update CD base address info for user-space
  iommu/arm-smmu-v3: Set sync op from consumer driver of cd-lib
  iommu: Add asid_bits to arm smmu-v3 stage1 table info
  iommu/virtio: Update table format probing header
  iommu/virtio: Prepare to add attach pasid table infrastructure
  iommu/virtio: Update fault type and reason info for viommu fault

 drivers/iommu/arm/arm-smmu-v3/Makefile|   2 +-
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 283 +++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  16 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 268 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   4 +-
 drivers/iommu/iommu-pasid-table.h | 140 
 drivers/iommu/virtio-iommu.c  | 692 +-
 include/uapi/linux/iommu.h|   2 +-
 include/uapi/linux/virtio_iommu.h | 158 +++-
 9 files changed, 1303 insertions(+), 262 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
 create mode 100644 drivers/iommu/iommu-pasid-table.h

-- 
2.17.1



[PATCH RFC v1 11/15] iommu/virtio: Add headers for binding pasid table in iommu

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add the required UAPI defines for binding pasid tables in virtio-iommu.
This mode allows handing stage-1 page tables over to the guest.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactor to cleanup headers for invalidation]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 68 +++
 1 file changed, 68 insertions(+)

diff --git a/include/uapi/linux/virtio_iommu.h b/include/uapi/linux/virtio_iommu.h
index 8a0624bab4b2..3481e4a3dd24 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS  3
 #define VIRTIO_IOMMU_F_PROBE   4
 #define VIRTIO_IOMMU_F_MMIO            5
+#define VIRTIO_IOMMU_F_ATTACH_TABLE    6
 
 struct virtio_iommu_range_64 {
__le64  start;
@@ -44,6 +45,8 @@ struct virtio_iommu_config {
 #define VIRTIO_IOMMU_T_MAP 0x03
 #define VIRTIO_IOMMU_T_UNMAP   0x04
 #define VIRTIO_IOMMU_T_PROBE   0x05
+#define VIRTIO_IOMMU_T_ATTACH_TABLE    0x06
+#define VIRTIO_IOMMU_T_INVALIDATE      0x07
 
 /* Status types */
 #define VIRTIO_IOMMU_S_OK  0x00
@@ -82,6 +85,37 @@ struct virtio_iommu_req_detach {
    struct virtio_iommu_req_tail    tail;
 };
 
+struct virtio_iommu_req_attach_table {
+   struct virtio_iommu_req_head    head;
+   __le32  domain;
+   __le32  endpoint;
+   __le16  format;
+   __u8    reserved[62];
+   struct virtio_iommu_req_tail    tail;
+};
+
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_LINEAR   0x0
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_4KL2     0x1
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_64KL2    0x2
+
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_TERM 0x0
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_BYPASS 0x1
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_0    0x2
+
+/* Arm SMMUv3 PASID Table Descriptor */
+struct virtio_iommu_req_attach_pst_arm {
+   struct virtio_iommu_req_head    head;
+   __le32  domain;
+   __le32  endpoint;
+   __le16  format;
+   __u8    s1fmt;
+   __u8    s1dss;
+   __le64  s1contextptr;
+   __le32  s1cdmax;
+   __u8    reserved[48];
+   struct virtio_iommu_req_tail    tail;
+};
+
 #define VIRTIO_IOMMU_MAP_F_READ    (1 << 0)
 #define VIRTIO_IOMMU_MAP_F_WRITE   (1 << 1)
 #define VIRTIO_IOMMU_MAP_F_MMIO    (1 << 2)
@@ -188,6 +222,40 @@ struct virtio_iommu_req_probe {
 */
 };
 
+#define VIRTIO_IOMMU_INVAL_G_DOMAIN    (1 << 0)
+#define VIRTIO_IOMMU_INVAL_G_PASID     (1 << 1)
+#define VIRTIO_IOMMU_INVAL_G_VA        (1 << 2)
+
+#define VIRTIO_IOMMU_INV_T_IOTLB       (1 << 0)
+#define VIRTIO_IOMMU_INV_T_DEV_IOTLB   (1 << 1)
+#define VIRTIO_IOMMU_INV_T_PASID       (1 << 2)
+
+#define VIRTIO_IOMMU_INVAL_F_PASID     (1 << 0)
+#define VIRTIO_IOMMU_INVAL_F_ARCHID    (1 << 1)
+#define VIRTIO_IOMMU_INVAL_F_LEAF      (1 << 2)
+
+struct virtio_iommu_req_invalidate {
+   struct virtio_iommu_req_head    head;
+   __le16  inv_gran;
+   __le16  inv_type;
+
+   __le16  flags;
+   __u8    reserved1[2];
+   __le32  domain;
+
+   __le32  pasid;
+   __u8    reserved2[4];
+
+   __le64  archid;
+   __le64  virt_start;
+   __le64  nr_pages;
+
+   /* Page size, in nr of bits, typically 12 for 4k, 21 for 2MB, etc. */
+   __u8    granule;
+   __u8    reserved3[11];
+   struct virtio_iommu_req_tail    tail;
+};
+
 /* Fault types */
 #define VIRTIO_IOMMU_FAULT_R_UNKNOWN   0
 #define VIRTIO_IOMMU_FAULT_R_DOMAIN    1
-- 
2.17.1



[PATCH RFC v1 12/15] iommu/virtio: Add support for INVALIDATE request

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add support for tlb invalidation ops that can send invalidation
requests to back-end virtio-iommu when stage-1 page tables are
supported.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactoring the iommu_flush_ops, and adding only one pasid sync
op that's needed with current iommu-pasid-table infrastructure.
Also updating uapi defines as required by latest changes]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 95 
 1 file changed, 95 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index ae5dfd3f8269..004ea94e3731 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -63,6 +64,8 @@ struct viommu_mapping {
 };
 
 struct viommu_mm {
+   int pasid;
+   u64 archid;
struct io_pgtable_ops   *ops;
struct viommu_domain*domain;
 };
@@ -692,6 +695,98 @@ static void viommu_event_handler(struct virtqueue *vq)
virtqueue_kick(vq);
 }
 
+/* PASID and pgtable APIs */
+
+static void __viommu_flush_pasid_tlb_all(struct viommu_domain *vdomain,
+int pasid, u64 arch_id, int type)
+{
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_PASID),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_PASID),
+   .inv_type   = cpu_to_le32(type),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid  = cpu_to_le32(pasid),
+   .archid = cpu_to_le64(arch_id),
+   };
+
+   if (viommu_send_req_sync(vdomain->viommu, &req, sizeof(req)))
+   pr_debug("could not send invalidate request\n");
+}
+
+static void viommu_flush_tlb_add(struct iommu_iotlb_gather *gather,
+unsigned long iova, size_t granule,
+void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+   struct viommu_domain *vdomain = viommu_mm->domain;
+   struct iommu_domain *domain = &vdomain->domain;
+
+   iommu_iotlb_gather_add_page(domain, gather, iova, granule);
+}
+
+static void viommu_flush_tlb_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+   struct viommu_domain *vdomain = viommu_mm->domain;
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_VA),
+   .inv_type   = cpu_to_le32(VIRTIO_IOMMU_INV_T_IOTLB),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_ARCHID),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid  = cpu_to_le32(viommu_mm->pasid),
+   .archid = cpu_to_le64(viommu_mm->archid),
+   .virt_start = cpu_to_le64(iova),
+   .nr_pages   = cpu_to_le64(size / granule),
+   .granule= ilog2(granule),
+   };
+
+   if (viommu_add_req(vdomain->viommu, &req, sizeof(req)))
+   pr_debug("could not add invalidate request\n");
+}
+
+static void viommu_flush_tlb_all(void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+
+   if (!viommu_mm->archid)
+   return;
+
+   __viommu_flush_pasid_tlb_all(viommu_mm->domain, viommu_mm->pasid,
+viommu_mm->archid,
+VIRTIO_IOMMU_INV_T_IOTLB);
+}
+
+static struct iommu_flush_ops viommu_flush_ops = {
+   .tlb_flush_all  = viommu_flush_tlb_all,
+   .tlb_flush_walk = viommu_flush_tlb_walk,
+   .tlb_add_page   = viommu_flush_tlb_add,
+};
+
+static void viommu_flush_pasid(void *cookie, int pasid, bool leaf)
+{
+   struct viommu_domain *vdomain = cookie;
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_PASID),
+   .inv_type   = cpu_to_le32(VIRTIO_IOMMU_INV_T_PASID),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_PASID),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid 
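
For readers following viommu_flush_tlb_walk() above: the request carries the range as a page count plus a log2 granule. A minimal userspace sketch of that arithmetic follows. The struct and helper names (inval_range, log2_u64, make_range) are made up for illustration and are not the UAPI layout.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative subset of an invalidate request; NOT the UAPI layout. */
struct inval_range {
	uint64_t virt_start;
	uint64_t nr_pages;
	uint8_t granule_shift;
};

/* Integer log2 of a non-zero value, rounding down. */
static unsigned int log2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Express [iova, iova + size) as a number of granule-sized pages. */
static struct inval_range make_range(uint64_t iova, uint64_t size,
				     uint64_t granule)
{
	struct inval_range r;

	assert(granule != 0);
	r.virt_start = iova;
	r.nr_pages = size / granule;
	r.granule_shift = (uint8_t)log2_u64(granule);
	return r;
}
```

With a 4K granule, a 2M walk becomes 512 pages with a granule shift of 12.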

[PATCH RFC v1 10/15] iommu/virtio: Prepare to add attach pasid table infrastructure

2021-01-15 Thread Vivek Gautam
In preparation for adding the attach pasid table op, move the
existing attach request code into a separate function.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 73 +---
 1 file changed, 51 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 12d73321dbf4..ae5dfd3f8269 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -52,6 +52,8 @@ struct viommu_dev {
/* Supported MAP flags */
u32 map_flags;
u32 probe_size;
+
+   bool                    has_map:1;
 };
 
 struct viommu_mapping {
@@ -60,6 +62,11 @@ struct viommu_mapping {
u32 flags;
 };
 
+struct viommu_mm {
+   struct io_pgtable_ops   *ops;
+   struct viommu_domain    *domain;
+};
+
 struct viommu_domain {
struct iommu_domain domain;
struct viommu_dev   *viommu;
@@ -67,12 +74,20 @@ struct viommu_domain {
unsigned int            id;
u32 map_flags;
 
+   /* Default address space when a table is bound */
+   struct viommu_mm        mm;
+
+   /* When no table is bound, use generic mappings */
spinlock_t  mappings_lock;
struct rb_root_cached   mappings;
 
unsigned long   nr_endpoints;
 };
 
+#define vdev_for_each_id(i, eid, vdev) \
+   for (i = 0; i < vdev->dev->iommu->fwspec->num_ids &&\
+   ({ eid = vdev->dev->iommu->fwspec->ids[i]; 1; }); i++)
+
 struct viommu_endpoint {
struct device   *dev;
struct viommu_dev   *viommu;
@@ -750,12 +765,40 @@ static void viommu_domain_free(struct iommu_domain *domain)
kfree(vdomain);
 }
 
+static int viommu_simple_attach(struct viommu_domain *vdomain,
+   struct viommu_endpoint *vdev)
+{
+   int i, eid, ret = 0;
+   struct virtio_iommu_req_attach req = {
+   .head.type  = VIRTIO_IOMMU_T_ATTACH,
+   .domain = cpu_to_le32(vdomain->id),
+   };
+
+   if (!vdomain->viommu->has_map)
+   return -ENODEV;
+
+   vdev_for_each_id(i, eid, vdev) {
+   req.endpoint = cpu_to_le32(eid);
+
+   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
+   if (ret)
+   return ret;
+   }
+
+   if (!vdomain->nr_endpoints) {
+   /*
+* This endpoint is the first to be attached to the domain.
+* Replay existing mappings if any (e.g. SW MSI).
+*/
+   ret = viommu_replay_mappings(vdomain);
+   }
+
+   return ret;
+}
+
 static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
-   int i;
int ret = 0;
-   struct virtio_iommu_req_attach req;
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct viommu_endpoint *vdev = dev_iommu_priv_get(dev);
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
@@ -790,25 +833,9 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
if (vdev->vdomain)
vdev->vdomain->nr_endpoints--;
 
-   req = (struct virtio_iommu_req_attach) {
-   .head.type  = VIRTIO_IOMMU_T_ATTACH,
-   .domain = cpu_to_le32(vdomain->id),
-   };
-
-   for (i = 0; i < fwspec->num_ids; i++) {
-   req.endpoint = cpu_to_le32(fwspec->ids[i]);
-
-   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
-   if (ret)
-   return ret;
-   }
-
-   if (!vdomain->nr_endpoints) {
-   /*
-* This endpoint is the first to be attached to the domain.
-* Replay existing mappings (e.g. SW MSI).
-*/
-   ret = viommu_replay_mappings(vdomain);
+   if (!vdomain->mm.ops) {
+   /* If we couldn't bind any table, use the mapping tree */
+   ret = viommu_simple_attach(vdomain, vdev);
if (ret)
return ret;
}
@@ -1142,6 +1169,8 @@ static int viommu_probe(struct virtio_device *vdev)
struct virtio_iommu_config, probe_size,
&viommu->probe_size);
 
+   viommu->has_map = virtio_
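
A note on vdev_for_each_id() introduced above: it uses a GNU C statement expression so that the loop condition also loads the current endpoint ID. Below is a minimal userspace sketch of the same shape. The fake_fwspec type and the for_each_id/sum_ids names are hypothetical stand-ins, not kernel structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ids[] array of struct iommu_fwspec. */
struct fake_fwspec {
	unsigned int num_ids;
	unsigned int ids[4];
};

/*
 * Same shape as vdev_for_each_id(): the statement expression assigns the
 * current ID and always evaluates to 1, so only the bounds check can end
 * the loop. Statement expressions are a GNU C extension (gcc, clang).
 */
#define for_each_id(i, eid, fw) \
	for (i = 0; i < (fw)->num_ids && \
	     ({ eid = (fw)->ids[i]; 1; }); i++)

/* Sum all IDs, just to have an observable result. */
static unsigned int sum_ids(const struct fake_fwspec *fw)
{
	unsigned int i, eid, sum = 0;

	assert(fw != NULL);
	for_each_id(i, eid, fw)
		sum += eid;
	return sum;
}
```

Because the bounds check comes first in the && chain, ids[i] is never read past num_ids, and an empty ID list simply skips the loop body.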

[PATCH RFC v1 09/15] iommu/virtio: Update table format probing header

2021-01-15 Thread Vivek Gautam
Add info about asid_bits and additional flags to table format
probing header.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h b/include/uapi/linux/virtio_iommu.h
index 43821e33e7af..8a0624bab4b2 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -169,7 +169,10 @@ struct virtio_iommu_probe_pasid_size {
 struct virtio_iommu_probe_table_format {
struct virtio_iommu_probe_property  head;
__le16  format;
-   __u8    reserved[2];
+   __le16  asid_bits;
+
+   __le32  flags;
+   __u8    reserved[4];
 };
 
 struct virtio_iommu_req_probe {
-- 
2.17.1
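
As an illustration of the extended header above, here is a minimal sketch of decoding the property from a little-endian byte buffer in userspace. It assumes the property head is a le16 type followed by a le16 length, which gives 16 bytes on the wire with the new fields. The table_format struct and the helper names are illustrative, not part of the UAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side view of the property; field names follow the patch. */
struct table_format {
	uint16_t type;
	uint16_t length;
	uint16_t format;
	uint16_t asid_bits;
	uint32_t flags;
};

/* Assumed wire size: 4 (head) + 2 + 2 + 4 + 4 (reserved). */
#define TABLE_FORMAT_WIRE_SIZE 16

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short. */
static int parse_table_format(const uint8_t *buf, size_t len,
			      struct table_format *out)
{
	assert(buf != NULL && out != NULL);
	if (len < TABLE_FORMAT_WIRE_SIZE)
		return -1;

	out->type = get_le16(buf) & 0xfff;	/* VIRTIO_IOMMU_PROBE_T_MASK */
	out->length = get_le16(buf + 2);
	out->format = get_le16(buf + 4);
	out->asid_bits = get_le16(buf + 6);
	out->flags = get_le32(buf + 8);
	return 0;
}
```

Decoding byte by byte keeps the sketch independent of host endianness; the driver itself would use le16_to_cpu() and le32_to_cpu() on the struct fields.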



[PATCH RFC v1 06/15] iommu/virtio: Add headers for table format probing

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add the UAPI defines required for probing the table format of the
underlying iommu hardware. The device may provide information about
hardware tables and additional capabilities for each device.
This allows the guest to correctly fabricate stage-1 page tables.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Use a single "struct virtio_iommu_probe_table_format" rather
than separate structures for page table and pasid table format.
Also update commit message.]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 44 ++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h b/include/uapi/linux/virtio_iommu.h
index 237e36a280cb..43821e33e7af 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -2,7 +2,7 @@
 /*
  * Virtio-iommu definition v0.12
  *
- * Copyright (C) 2019 Arm Ltd.
+ * Copyright (C) 2019-2021 Arm Ltd.
  */
 #ifndef _UAPI_LINUX_VIRTIO_IOMMU_H
 #define _UAPI_LINUX_VIRTIO_IOMMU_H
@@ -111,6 +111,12 @@ struct virtio_iommu_req_unmap {
 
 #define VIRTIO_IOMMU_PROBE_T_NONE  0
 #define VIRTIO_IOMMU_PROBE_T_RESV_MEM  1
+#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK    2
+#define VIRTIO_IOMMU_PROBE_T_INPUT_RANGE       3
+#define VIRTIO_IOMMU_PROBE_T_OUTPUT_SIZE       4
+#define VIRTIO_IOMMU_PROBE_T_PASID_SIZE        5
+#define VIRTIO_IOMMU_PROBE_T_PAGE_TABLE_FMT    6
+#define VIRTIO_IOMMU_PROBE_T_PASID_TABLE_FMT   7
 
 #define VIRTIO_IOMMU_PROBE_T_MASK  0xfff
 
@@ -130,6 +136,42 @@ struct virtio_iommu_probe_resv_mem {
__le64  end;
 };
 
+struct virtio_iommu_probe_page_size_mask {
+   struct virtio_iommu_probe_property  head;
+   __u8    reserved[4];
+   __le64  mask;
+};
+
+struct virtio_iommu_probe_input_range {
+   struct virtio_iommu_probe_property  head;
+   __u8    reserved[4];
+   __le64  start;
+   __le64  end;
+};
+
+struct virtio_iommu_probe_output_size {
+   struct virtio_iommu_probe_property  head;
+   __u8    bits;
+   __u8    reserved[3];
+};
+
+struct virtio_iommu_probe_pasid_size {
+   struct virtio_iommu_probe_property  head;
+   __u8    bits;
+   __u8    reserved[3];
+};
+
+/* Arm LPAE page table format */
+#define VIRTIO_IOMMU_FOMRAT_PGTF_ARM_LPAE  1
+/* Arm smmu-v3 type PASID table format */
+#define VIRTIO_IOMMU_FORMAT_PSTF_ARM_SMMU_V3   2
+
+struct virtio_iommu_probe_table_format {
+   struct virtio_iommu_probe_property  head;
+   __le16  format;
+   __u8    reserved[2];
+};
+
 struct virtio_iommu_req_probe {
struct virtio_iommu_req_head    head;
__le32  endpoint;
-- 
2.17.1



[PATCH RFC v1 05/15] iommu/arm-smmu-v3: Set sync op from consumer driver of cd-lib

2021-01-15 Thread Vivek Gautam
This change allows different consumers of arm-smmu-v3-cd-lib to set
their respective sync op for pasid entries.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c | 1 -
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c| 7 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index ec37476c8d09..acaa09acecdd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -265,7 +265,6 @@ struct iommu_vendor_psdtable_ops arm_cd_table_ops = {
.free= arm_smmu_free_cd_tables,
.prepare = arm_smmu_prepare_cd,
.write   = arm_smmu_write_ctx_desc,
-   .sync= arm_smmu_sync_cd,
 };
 
 struct iommu_pasid_table *arm_smmu_register_cd_table(struct device *dev,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2f86c6ac42b6..0c644be22b4b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1869,6 +1869,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
if (ret)
goto out_free_cd_tables;
 
+   /*
+* Strange to setup an op here?
+* cd-lib is the actual user of sync op, and therefore the platform
+* drivers should assign this sync/maintenance ops as per need.
+*/
+   tbl->ops->sync = arm_smmu_sync_cd;
+
/*
 * Note that this will end up calling arm_smmu_sync_cd() before
 * the master has been added to the devices list for this domain.
-- 
2.17.1



[PATCH RFC v1 07/15] iommu/virtio: Add table format probing

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

The device may provide information about hardware tables and additional
capabilities for each device. Parse the new probe fields.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactor to use "struct virtio_iommu_probe_table_format" rather
than separate structures for page table and pasid table format.]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 102 ++-
 1 file changed, 101 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 2bfdd5734844..12d73321dbf4 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -78,6 +78,17 @@ struct viommu_endpoint {
struct viommu_dev   *viommu;
struct viommu_domain    *vdomain;
struct list_head        resv_regions;
+
+   /* properties of the physical IOMMU */
+   u64 pgsize_mask;
+   u64 input_start;
+   u64 input_end;
+   u8  output_bits;
+   u8  pasid_bits;
+   /* Preferred PASID table format */
+   void    *pstf;
+   /* Preferred page table format */
+   void    *pgtf;
 };
 
 struct viommu_request {
@@ -457,6 +468,72 @@ static int viommu_add_resv_mem(struct viommu_endpoint *vdev,
return 0;
 }
 
+static int viommu_add_pgsize_mask(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_page_size_mask *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->pgsize_mask = le64_to_cpu(prop->mask);
+   return 0;
+}
+
+static int viommu_add_input_range(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_input_range *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->input_start   = le64_to_cpu(prop->start);
+   vdev->input_end = le64_to_cpu(prop->end);
+   return 0;
+}
+
+static int viommu_add_output_size(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_output_size *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->output_bits = prop->bits;
+   return 0;
+}
+
+static int viommu_add_pasid_size(struct viommu_endpoint *vdev,
+struct virtio_iommu_probe_pasid_size *prop,
+size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->pasid_bits = prop->bits;
+   return 0;
+}
+
+static int viommu_add_pgtf(struct viommu_endpoint *vdev, void *pgtf, size_t len)
+{
+   /* Select the first page table format available */
+   if (len < sizeof(struct virtio_iommu_probe_table_format) || vdev->pgtf)
+   return -EINVAL;
+
+   vdev->pgtf = kmemdup(pgtf, len, GFP_KERNEL);
+   if (!vdev->pgtf)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static int viommu_add_pstf(struct viommu_endpoint *vdev, void *pstf, size_t len)
+{
+   if (len < sizeof(struct virtio_iommu_probe_table_format) || vdev->pstf)
+   return -EINVAL;
+
+   vdev->pstf = kmemdup(pstf, len, GFP_KERNEL);
+   if (!vdev->pstf)
+   return -ENOMEM;
+
+   return 0;
+}
+
 static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
 {
int ret;
@@ -493,11 +570,30 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
 
while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
   cur < viommu->probe_size) {
+   void *value = prop;
len = le16_to_cpu(prop->length) + sizeof(*prop);
 
switch (type) {
case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
-   ret = viommu_add_resv_mem(vdev, (void *)prop, len);
+   ret = viommu_add_resv_mem(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
+   ret = viommu_add_pgsize_mask(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_INPUT_RANGE:
+   ret = viommu_add_input_range(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_OUTPUT_SIZE:
+   
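
The probe loop above walks a buffer of type/length/value properties. A minimal userspace sketch of that walk is given below, assuming a 4-byte head (le16 type, le16 length) and the 0xfff type mask from the header patch. The function and macro names are made up for illustration, and the explicit overrun check is an addition for the sketch, not something taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_T_NONE	0
#define PROBE_T_MASK	0xfff
#define PROP_HEAD_SIZE	4	/* le16 type + le16 length */

static uint16_t rd_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the property buffer and count entries whose type equals 'wanted'.
 * Stops at a NONE property or at the end of the buffer.
 * Returns -1 if a property claims more bytes than remain.
 */
static int count_props(const uint8_t *buf, size_t size, uint16_t wanted)
{
	size_t cur = 0;
	int count = 0;

	assert(buf != NULL);
	while (cur + PROP_HEAD_SIZE <= size) {
		uint16_t type = rd_le16(buf + cur) & PROBE_T_MASK;
		size_t len = (size_t)rd_le16(buf + cur + 2) + PROP_HEAD_SIZE;

		if (type == PROBE_T_NONE)
			break;
		if (len > size - cur)
			return -1;
		if (type == wanted)
			count++;
		cur += len;
	}
	return count;
}
```

As in the driver, the length field covers only the value, so the head size is added before advancing to the next property.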

[PATCH RFC v1 03/15] iommu/arm-smmu-v3: Update drivers to work with iommu-pasid-table

2021-01-15 Thread Vivek Gautam
Update arm-smmu-v3 context descriptor (CD) library driver to work
with iommu-pasid-table APIs. These APIs are then used in arm-smmu-v3
drivers to manage CD tables.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 127 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  16 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  47 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   7 +-
 drivers/iommu/iommu-pasid-table.h |  10 +-
 5 files changed, 144 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index 97d1786a8a70..8a7187534706 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -8,17 +8,17 @@
 #include 
 
 #include "arm-smmu-v3.h"
+#include "../../iommu-pasid-table.h"
 
-static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
+static int arm_smmu_alloc_cd_leaf_table(struct device *dev,
struct arm_smmu_l1_ctx_desc *l1_desc)
 {
size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
 
-   l1_desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
+   l1_desc->l2ptr = dmam_alloc_coherent(dev, size,
 &l1_desc->l2ptr_dma, GFP_KERNEL);
if (!l1_desc->l2ptr) {
-   dev_warn(smmu->dev,
-"failed to allocate context descriptor table\n");
+   dev_warn(dev, "failed to allocate context descriptor table\n");
return -ENOMEM;
}
return 0;
@@ -34,35 +34,39 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
WRITE_ONCE(*dst, cpu_to_le64(val));
 }
 
-static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
+static __le64 *arm_smmu_get_cd_ptr(struct iommu_vendor_psdtable_cfg *pst_cfg,
   u32 ssid)
 {
__le64 *l1ptr;
unsigned int idx;
+   struct device *dev = pst_cfg->iommu_dev;
+   struct arm_smmu_cfg_info *cfgi = &pst_cfg->vendor.cfg;
+   struct arm_smmu_s1_cfg *s1cfg = cfgi->s1_cfg;
+   struct arm_smmu_ctx_desc_cfg *cdcfg = &s1cfg->cdcfg;
struct arm_smmu_l1_ctx_desc *l1_desc;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
-   struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg;
+   struct iommu_pasid_table *tbl = pasid_table_cfg_to_table(pst_cfg);
 
-   if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
 
idx = ssid >> CTXDESC_SPLIT;
l1_desc = &cdcfg->l1_desc[idx];
if (!l1_desc->l2ptr) {
-   if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+   if (arm_smmu_alloc_cd_leaf_table(dev, l1_desc))
return NULL;
 
l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
-   arm_smmu_sync_cd(smmu_domain, ssid, false);
+   if (iommu_psdtable_sync(tbl, tbl->cookie, ssid, false))
+   return NULL;
}
idx = ssid & (CTXDESC_L2_ENTRIES - 1);
return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
 }
 
-int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
-   struct arm_smmu_ctx_desc *cd)
+static int arm_smmu_write_ctx_desc(struct iommu_vendor_psdtable_cfg *pst_cfg,
+  int ssid, void *cookie)
 {
/*
 * This function handles the following cases:
@@ -78,12 +82,15 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
u64 val;
bool cd_live;
__le64 *cdptr;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_cfg_info *cfgi = &pst_cfg->vendor.cfg;
+   struct arm_smmu_s1_cfg *s1cfg = cfgi->s1_cfg;
+   struct iommu_pasid_table *tbl = pasid_table_cfg_to_table(pst_cfg);
+   struct arm_smmu_ctx_desc *cd = cookie;
 
-   if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
+   if (WARN_ON(ssid >= (1 << s1cfg->s1cdmax)))
return -E2BIG;
 
-   cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
+   cdptr = arm_smmu_get_cd_ptr(pst_cfg, ssid);
if (!cdptr)
return -ENOMEM;
 
@@ -111,7 +118,8 @@ int

[PATCH RFC v1 04/15] iommu/arm-smmu-v3: Update CD base address info for user-space

2021-01-15 Thread Vivek Gautam
Update base address information in vendor pasid table info to pass that
to user-space for stage1 table management.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index 8a7187534706..ec37476c8d09 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -55,6 +55,9 @@ static __le64 *arm_smmu_get_cd_ptr(struct iommu_vendor_psdtable_cfg *pst_cfg,
if (arm_smmu_alloc_cd_leaf_table(dev, l1_desc))
return NULL;
 
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   pst_cfg->base = l1_desc->l2ptr_dma;
+
l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
@@ -211,6 +214,9 @@ static int arm_smmu_alloc_cd_tables(struct iommu_vendor_psdtable_cfg *pst_cfg)
goto err_free_l1;
}
 
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_64K_L2)
+   pst_cfg->base = cdcfg->cdtab_dma;
+
return 0;
 
 err_free_l1:
-- 
2.17.1



[PATCH RFC v1 02/15] iommu: Add a simple PASID table library

2021-01-15 Thread Vivek Gautam
Add a small API in iommu subsystem to handle PASID table allocation
requests from different consumer drivers, such as a paravirtualized
iommu driver. The API provides ops for allocating and freeing PASID
table, writing to it and managing the table caches.

This library also provides for registering a vendor API that attaches
to these ops. The vendor APIs would eventually perform arch level
implementations for these PASID tables.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/iommu-pasid-table.h | 134 ++
 1 file changed, 134 insertions(+)
 create mode 100644 drivers/iommu/iommu-pasid-table.h

diff --git a/drivers/iommu/iommu-pasid-table.h b/drivers/iommu/iommu-pasid-table.h
new file mode 100644
index ..bd4f57656f67
--- /dev/null
+++ b/drivers/iommu/iommu-pasid-table.h
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PASID table management for the IOMMU
+ *
+ * Copyright (C) 2021 Arm Ltd.
+ */
+#ifndef __IOMMU_PASID_TABLE_H
+#define __IOMMU_PASID_TABLE_H
+
+#include 
+
+#include "arm/arm-smmu-v3/arm-smmu-v3.h"
+
+enum pasid_table_fmt {
+   PASID_TABLE_ARM_SMMU_V3,
+   PASID_TABLE_NUM_FMTS,
+};
+
+/**
+ * struct arm_smmu_cfg_info - arm-smmu-v3 specific configuration data
+ *
+ * @s1_cfg: arm-smmu-v3 stage1 config data
+ * @feat_flag: features supported by arm-smmu-v3 implementation
+ */
+struct arm_smmu_cfg_info {
+   struct arm_smmu_s1_cfg  *s1_cfg;
+   u32 feat_flag;
+};
+
+/**
+ * struct iommu_vendor_psdtable_cfg - Configuration data for PASID tables
+ *
+ * @iommu_dev: device performing the DMA table walks
+ * @fmt: The PASID table format
+ * @base: DMA address of the allocated table, set by the vendor driver
+ * @cfg: arm-smmu-v3 specific config data
+ */
+struct iommu_vendor_psdtable_cfg {
+   struct device   *iommu_dev;
+   enum pasid_table_fmt    fmt;
+   dma_addr_t              base;
+   union {
+   struct arm_smmu_cfg_info    cfg;
+   } vendor;
+};
+
+struct iommu_vendor_psdtable_ops;
+
+/**
+ * struct iommu_pasid_table - describes a set of PASID tables
+ *
+ * @cookie: An opaque token provided by the IOMMU driver and passed back to any
+ * callback routine.
+ * @cfg: A copy of the PASID table configuration
+ * @ops: The PASID table operations in use for this set of page tables
+ */
+struct iommu_pasid_table {
+   void                                *cookie;
+   struct iommu_vendor_psdtable_cfg    cfg;
+   struct iommu_vendor_psdtable_ops    *ops;
+};
+
+#define pasid_table_cfg_to_table(pst_cfg) \
+   container_of((pst_cfg), struct iommu_pasid_table, cfg)
+
+struct iommu_vendor_psdtable_ops {
+   int (*alloc)(struct iommu_vendor_psdtable_cfg *cfg);
+   void (*free)(struct iommu_vendor_psdtable_cfg *cfg);
+   void (*prepare)(struct iommu_vendor_psdtable_cfg *cfg,
+   struct io_pgtable_cfg *pgtbl_cfg, u32 asid);
+   int (*write)(struct iommu_vendor_psdtable_cfg *cfg, int ssid,
+void *cookie);
+   void (*sync)(void *cookie, int ssid, bool leaf);
+};
+
+static inline int iommu_psdtable_alloc(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg)
+{
+   if (!tbl->ops->alloc)
+   return -ENOSYS;
+
+   return tbl->ops->alloc(cfg);
+}
+
+static inline void iommu_psdtable_free(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg)
+{
+   if (!tbl->ops->free)
+   return;
+
+   tbl->ops->free(cfg);
+}
+
+static inline int iommu_psdtable_prepare(struct iommu_pasid_table *tbl,
+struct iommu_vendor_psdtable_cfg *cfg,
+struct io_pgtable_cfg *pgtbl_cfg,
+u32 asid)
+{
+   if (!tbl->ops->prepare)
+   return -ENOSYS;
+
+   tbl->ops->prepare(cfg, pgtbl_cfg, asid);
+   return 0;
+}
+
+static inline int iommu_psdtable_write(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg,
+  int ssid, void *cookie)
+{
+   if (!tbl->ops->write)
+   return -ENOSYS;
+
+   return tbl->ops->write(cfg, ssid, cookie);
+}
+
+static inline int iommu_psdtable_sync(struct iommu_pasid_table *tbl,
+ void *cookie, int ssid, bool leaf)
+{
+   if (!tbl->ops->sync)
+   return -ENOSYS;
+
+   tbl->ops->sync(cookie, ssid, leaf);
+   return 0;
+}
+
+/* A placeholder to register 
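
The wrappers above share one pattern: every vendor callback is optional, a missing one yields -ENOSYS, and the table is recovered from its embedded config with container_of(). Here is a minimal userspace sketch of that pattern. The psd_* and demo_* names are hypothetical, and offsetof() stands in for the kernel's container_of().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct psd_cfg {
	int fmt;
};

struct psd_ops {
	int (*alloc)(struct psd_cfg *cfg);
	void (*sync)(void *cookie, int ssid, int leaf);
};

struct psd_table {
	void *cookie;
	struct psd_cfg cfg;
	struct psd_ops *ops;
};

/* Userspace equivalent of container_of(c, struct psd_table, cfg). */
#define cfg_to_table(c) \
	((struct psd_table *)((char *)(c) - offsetof(struct psd_table, cfg)))

/* A missing callback is reported rather than dereferenced. */
static int psd_alloc(struct psd_table *tbl)
{
	assert(tbl != NULL && tbl->ops != NULL);
	if (!tbl->ops->alloc)
		return -ENOSYS;
	return tbl->ops->alloc(&tbl->cfg);
}

static int psd_sync(struct psd_table *tbl, int ssid, int leaf)
{
	assert(tbl != NULL && tbl->ops != NULL);
	if (!tbl->ops->sync)
		return -ENOSYS;
	tbl->ops->sync(tbl->cookie, ssid, leaf);
	return 0;
}

/* Example vendor callbacks. */
static int demo_alloc(struct psd_cfg *cfg)
{
	cfg->fmt = 1;
	return 0;
}

static void demo_sync(void *cookie, int ssid, int leaf)
{
	(void)leaf;
	*(int *)cookie = ssid;	/* record the last synced SSID */
}
```

Leaving sync unset until the consumer assigns it mirrors what patch 05/15 of this series does: the call fails with -ENOSYS first and succeeds once a driver installs its own op.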

[PATCH RFC v1 01/15] iommu/arm-smmu-v3: Create a Context Descriptor library

2021-01-15 Thread Vivek Gautam
Para-virtualized iommu drivers in the guest may need to create and manage
context descriptor (CD) tables as part of PASID table allocations.
The PASID tables are passed to the host to configure stage-1 tables in
hardware.
Make way for a library driver for CD management so that a
para-virtualized iommu driver can call such code.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/Makefile|   2 +-
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 223 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 216 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   3 +
 4 files changed, 228 insertions(+), 216 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c

diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..ca1a05b8b8ad 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
-arm_smmu_v3-objs-y += arm-smmu-v3.o
+arm_smmu_v3-objs-y += arm-smmu-v3.o arm-smmu-v3-cd-lib.o
 arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
 arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
new file mode 100644
index ..97d1786a8a70
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arm-smmu-v3 context descriptor handling library driver
+ *
+ * Copyright (C) 2021 Arm Ltd.
+ */
+
+#include 
+
+#include "arm-smmu-v3.h"
+
+static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
+   struct arm_smmu_l1_ctx_desc *l1_desc)
+{
+   size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
+
+   l1_desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
+&l1_desc->l2ptr_dma, GFP_KERNEL);
+   if (!l1_desc->l2ptr) {
+   dev_warn(smmu->dev,
+"failed to allocate context descriptor table\n");
+   return -ENOMEM;
+   }
+   return 0;
+}
+
+static void arm_smmu_write_cd_l1_desc(__le64 *dst,
+ struct arm_smmu_l1_ctx_desc *l1_desc)
+{
+   u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
+ CTXDESC_L1_DESC_V;
+
+   /* See comment in arm_smmu_write_ctx_desc() */
+   WRITE_ONCE(*dst, cpu_to_le64(val));
+}
+
+static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
+  u32 ssid)
+{
+   __le64 *l1ptr;
+   unsigned int idx;
+   struct arm_smmu_l1_ctx_desc *l1_desc;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg;
+
+   if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
+
+   idx = ssid >> CTXDESC_SPLIT;
+   l1_desc = &cdcfg->l1_desc[idx];
+   if (!l1_desc->l2ptr) {
+   if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+   return NULL;
+
+   l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
+   arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
+   /* An invalid L1CD can be cached */
+   arm_smmu_sync_cd(smmu_domain, ssid, false);
+   }
+   idx = ssid & (CTXDESC_L2_ENTRIES - 1);
+   return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
+}
+
+int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
+   struct arm_smmu_ctx_desc *cd)
+{
+   /*
+* This function handles the following cases:
+*
+* (1) Install primary CD, for normal DMA traffic (SSID = 0).
+* (2) Install a secondary CD, for SID+SSID traffic.
+* (3) Update ASID of a CD. Atomically write the first 64 bits of the
+* CD, then invalidate the old entry and mappings.
+* (4) Quiesce the context without clearing the valid bit. Disable
+* translation, and ignore any translation fault.
+* (5) Remove a secondary CD.
+*/
+   u64 val;
+   bool cd_live;
+   __le64 *cdptr;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
+   return -E2BIG;
+
+   cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
+   if (!cdptr)
+   return -ENOMEM;
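
For the two-level case, arm_smmu_get_cd_ptr() above splits the SSID into an L1 index and a leaf index. A minimal sketch of that arithmetic follows. The constant values (a split of 10 bits, 8 dwords per CD) are assumptions about arm-smmu-v3.h and are not shown in this patch; the cd_index and cd_locate names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPLIT		10		/* assumed value of CTXDESC_SPLIT */
#define L2_ENTRIES	(1u << SPLIT)	/* entries per leaf table */
#define CD_DWORDS	8		/* assumed 64-bit words per CD */

struct cd_index {
	uint32_t l1_idx;	/* which L1 descriptor / leaf table */
	uint32_t l2_idx;	/* which CD inside that leaf table */
	uint32_t dword_off;	/* offset of the CD in 64-bit words */
};

/* Split an SSID the same way the two-level lookup does. */
static struct cd_index cd_locate(uint32_t ssid)
{
	struct cd_index ix;

	assert(ssid < (1u << 20));	/* SSIDs are at most 20 bits wide */
	ix.l1_idx = ssid >> SPLIT;
	ix.l2_idx = ssid & (L2_ENTRIES - 1);
	ix.dword_off = ix.l2_idx * CD_DWORDS;
	return ix;
}

/* Bytes allocated per leaf table: entries * (dwords << 3). */
static size_t leaf_table_bytes(void)
{
	return (size_t)L2_ENTRIES * (CD_DWORDS << 3);
}
```

Under these assumed constants, SSID 1025 lands in leaf table 1 at entry 1, and each leaf table occupies 64K.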

Re: [PATCH v4 3/3] iommu: arm-smmu-impl: Add sdm845 implementation hook

2019-09-05 Thread Vivek Gautam
On Fri, Aug 23, 2019 at 12:03 PM Vivek Gautam
 wrote:
>
> Add reset hook for sdm845 based platforms to turn off
> the wait-for-safe sequence.
>
> Understanding how wait-for-safe logic affects USB and UFS performance
> on MTP845 and DB845 boards:
>
> Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
> to address under-performance issues in real-time clients, such as
> Display, and Camera.
> On receiving an invalidation request, the SMMU forwards a SAFE request
> to these clients and waits for SAFE ack signal from real-time clients.
> The SAFE signal from such clients is used to qualify the start of
> invalidation.
> This logic is controlled by chicken bits, one for each - MDP (display),
> IFE0, and IFE1 (camera), that can be accessed only from secure software
> on sdm845.
>
> This configuration, however, degrades the performance of non-real time
> clients, such as USB, and UFS etc. This happens because, with wait-for-safe
> logic enabled the hardware tries to throttle non-real time clients while
> waiting for SAFE ack signals from real-time clients.
>
> On mtp845 and db845 devices, with wait-for-safe logic enabled by the
> bootloaders we see degraded performance of USB and UFS when kernel
> enables the smmu stage-1 translations for these clients.
> Turning off this wait-for-safe logic from the kernel gets us back the perf
> of USB and UFS devices until we re-visit this when we start seeing perf
> issues on display/camera on upstream supported SDM845 platforms.
> The bootloaders on these boards implement secure monitor callbacks to
> handle a specific command - QCOM_SCM_SVC_SMMU_PROGRAM with which the
> logic can be toggled.
>
> There are other boards such as cheza whose bootloaders don't enable this
> logic. Such boards don't implement callbacks to handle the specific SCM
> call so disabling this logic for such boards will be a no-op.
>
> This change is inspired by the downstream change from Patrick Daly
> to address performance issues with display and camera by handling
> this wait-for-safe within separate io-pagetable ops to do TLB
> maintenance. So a big thanks to him for the change and for all the
> offline discussions.
>
> Without this change the UFS reads are pretty slow:
> $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
> real    0m 22.39s
> user    0m 0.00s
> sys     0m 0.01s
>
> With this change they are back to rock!
> $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
> 300+0 records in
> 300+0 records out
> 314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
> real0m 1.03s
> user0m 0.00s
> sys 0m 0.54s
>
> Signed-off-by: Vivek Gautam 
> ---
>  drivers/iommu/arm-smmu-impl.c | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index 3f88cd078dd5..0aef87c41f9c 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -6,6 +6,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  #include "arm-smmu.h"
>
> @@ -102,7 +103,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm
> return &cs->smmu;
>  }
>
> -
>  #define ARM_MMU500_ACTLR_CPRE  (1 << 1)
>
>  #define ARM_MMU500_ACR_CACHE_LOCK  (1 << 26)
> @@ -147,6 +147,28 @@ static const struct arm_smmu_impl arm_mmu500_impl = {
> .reset = arm_mmu500_reset,
>  };
>
> +static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu)
> +{
> +   int ret;
> +
> +   arm_mmu500_reset(smmu);
> +
> +   /*
> +* To address performance degradation in non-real time clients,
> +* such as USB and UFS, turn off wait-for-safe on sdm845 based boards,
> +* such as MTP and db845, whose firmwares implement secure monitor
> +* call handlers to turn on/off the wait-for-safe logic.
> +*/
> +   ret = qcom_scm_qsmmu500_wait_safe_toggle(0);
> +   if (ret)
> +   dev_warn(smmu->dev, "Failed to turn off SAFE logic\n");
> +
> +   return 0;
> +}
> +
> +const struct arm_smmu_impl qcom_sdm845_smmu500_impl = {
> +   .reset = qcom_sdm845_smmu500_reset,
> +};
>
>  struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
>  {
> @@ -170,5 +192,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   "calxeda,smmu-secure-config-access"))
> smmu-&g

Re: [PATCH v2 0/3] soc: qcom: llcc cleanups

2019-09-04 Thread Vivek Gautam
On Wed, Sep 4, 2019 at 10:13 AM Bjorn Andersson
 wrote:
>
> On Tue 27 Aug 04:01 PDT 2019, Vivek Gautam wrote:
>
> > On Fri, Aug 2, 2019 at 11:43 AM Vivek Gautam
> >  wrote:
> > >
> > > On Thu, Jul 18, 2019 at 6:33 PM Vivek Gautam
> > >  wrote:
> > > >
> > > > To better support future versions of llcc, consolidating the
> > > > driver to llcc-qcom driver file, and taking care of the dependencies.
> > > > > v1 series is available at:
> > > > https://lore.kernel.org/patchwork/patch/1099573/
> > > >
> > > > Changes since v1:
> > > > Addressing Bjorn's comments -
> > > >  * Not using llcc-plat as the platform driver rather using a single
> > > >driver file now - llcc-qcom.
> > > >  * Removed SCT_ENTRY macro.
> > > >  * Moved few structure definitions from include/linux path to llcc-qcom
> > > >driver as they are not exposed to other subsystems.
> > >
> > > Hi Bjorn,
> > >
> > > How does this cleanup look now? Let me know if there are any
> > > improvements to make here.
> > >
> >
> > Hi Bjorn,
> >
> > Are you planning to pull this series in the next merge window?
> > There's a dt patch as well for llcc on sdm845 [1] that has been lying 
> > around.
> >
> > Let me know if you have concerns with this series. I will be happy to
> > incorporate the suggestions.
> >
>
> No concerns, this is exactly what we discussed before. Sorry for missing
> it. I've picked the patches now.
>
> > [1] https://lore.kernel.org/patchwork/patch/1099318/
> >
>
> This is part of the v5.4 pull request.

Thanks a lot Bjorn.

Best regards
Vivek

>
> Thanks,
> Bjorn
>
> > Thanks & Regards
> > Vivek
> >
> > > Best Regards
> > > Vivek
> > > >
> > > > Vivek Gautam (3):
> > > >   soc: qcom: llcc cleanup to get rid of sdm845 specific driver file
> > > >   soc: qcom: Rename llcc-slice to llcc-qcom
> > > >   soc: qcom: Make llcc-qcom a generic driver
> > > >
> > > >  drivers/soc/qcom/Kconfig   |  14 +--
> > > >  drivers/soc/qcom/Makefile  |   3 +-
> > > >  drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} | 155 
> > > > +++--
> > > >  drivers/soc/qcom/llcc-sdm845.c | 100 
> > > >  include/linux/soc/qcom/llcc-qcom.h | 104 -
> > > >  5 files changed, 152 insertions(+), 224 deletions(-)
> > > >  rename drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} (64%)
> > > >  delete mode 100644 drivers/soc/qcom/llcc-sdm845.c
> > > >
> >
> >
> >
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> > of Code Aurora Forum, hosted by The Linux Foundation



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v2 0/3] soc: qcom: llcc cleanups

2019-08-27 Thread Vivek Gautam
On Fri, Aug 2, 2019 at 11:43 AM Vivek Gautam
 wrote:
>
> On Thu, Jul 18, 2019 at 6:33 PM Vivek Gautam
>  wrote:
> >
> > To better support future versions of llcc, consolidating the
> > driver to llcc-qcom driver file, and taking care of the dependencies.
> > v1 series is available at:
> > https://lore.kernel.org/patchwork/patch/1099573/
> >
> > Changes since v1:
> > Addressing Bjorn's comments -
> >  * Not using llcc-plat as the platform driver rather using a single
> >driver file now - llcc-qcom.
> >  * Removed SCT_ENTRY macro.
> >  * Moved few structure definitions from include/linux path to llcc-qcom
> >driver as they are not exposed to other subsystems.
>
> Hi Bjorn,
>
> How does this cleanup look now? Let me know if there are any
> improvements to make here.
>

Hi Bjorn,

Are you planning to pull this series in the next merge window?
There's a dt patch as well for llcc on sdm845 [1] that has been lying around.

Let me know if you have concerns with this series. I will be happy to
incorporate the suggestions.

[1] https://lore.kernel.org/patchwork/patch/1099318/

Thanks & Regards
Vivek

> Best Regards
> Vivek
> >
> > Vivek Gautam (3):
> >   soc: qcom: llcc cleanup to get rid of sdm845 specific driver file
> >   soc: qcom: Rename llcc-slice to llcc-qcom
> >   soc: qcom: Make llcc-qcom a generic driver
> >
> >  drivers/soc/qcom/Kconfig   |  14 +--
> >  drivers/soc/qcom/Makefile  |   3 +-
> >  drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} | 155 
> > +++--
> >  drivers/soc/qcom/llcc-sdm845.c | 100 
> >  include/linux/soc/qcom/llcc-qcom.h | 104 -
> >  5 files changed, 152 insertions(+), 224 deletions(-)
> >  rename drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} (64%)
> >  delete mode 100644 drivers/soc/qcom/llcc-sdm845.c
> >



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH v4 3/3] iommu: arm-smmu-impl: Add sdm845 implementation hook

2019-08-22 Thread Vivek Gautam
Add reset hook for sdm845 based platforms to turn off
the wait-for-safe sequence.

Understanding how wait-for-safe logic affects USB and UFS performance
on MTP845 and DB845 boards:

Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
to address under-performance issues in real-time clients, such as
Display, and Camera.
On receiving an invalidation request, the SMMU forwards a SAFE request
to these clients and waits for a SAFE ack signal from real-time clients.
The SAFE signal from such clients is used to qualify the start of
invalidation.
This logic is controlled by chicken bits, one for each - MDP (display),
IFE0, and IFE1 (camera), that can be accessed only from secure software
on sdm845.

This configuration, however, degrades the performance of non-real time
clients, such as USB, and UFS etc. This happens because, with wait-for-safe
logic enabled the hardware tries to throttle non-real time clients while
waiting for SAFE ack signals from real-time clients.

On mtp845 and db845 devices, with wait-for-safe logic enabled by the
bootloaders we see degraded performance of USB and UFS when kernel
enables the smmu stage-1 translations for these clients.
Turning off this wait-for-safe logic from the kernel restores the
performance of USB and UFS devices. We can revisit this if we start seeing
performance issues on display/camera on upstream-supported SDM845 platforms.
The bootloaders on these boards implement secure monitor callbacks to
handle a specific command - QCOM_SCM_SVC_SMMU_PROGRAM with which the
logic can be toggled.

There are other boards such as cheza whose bootloaders don't enable this
logic. Such boards don't implement callbacks to handle the specific SCM
call so disabling this logic for such boards will be a no-op.

This change is inspired by the downstream change from Patrick Daly
to address performance issues with display and camera by handling
this wait-for-safe within separate io-pagetable ops to do TLB
maintenance. So a big thanks to him for the change and for all the
offline discussions.

Without this change the UFS reads are pretty slow:
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
10+0 records in
10+0 records out
10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
real    0m 22.39s
user    0m 0.00s
sys     0m 0.01s

With this change they are back to rock!
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
300+0 records in
300+0 records out
314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
real    0m 1.03s
user    0m 0.00s
sys     0m 0.54s
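
The two dd runs above can be compared with a little arithmetic. The following is only a sketch for checking the reported rates; the function names are made up for illustration and are not part of the patch.

```c
#include <assert.h>

/* Throughput in bytes per second for `bytes` transferred in `seconds`. */
double throughput_bps(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / seconds;
}

/* How many times faster the "after" run is compared to the "before" run. */
double speedup(double before_bps, double after_bps)
{
	assert(before_bps > 0.0);
	return after_bps / before_bps;
}
```

Feeding in the byte counts and times reported by dd (10485760 bytes in 22.394903 s, 314572800 bytes in 1.030541 s) gives roughly 457 KiB/s and 291 MiB/s, which matches the dd output, and a ratio of roughly 650x.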

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu-impl.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 3f88cd078dd5..0aef87c41f9c 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 
 #include "arm-smmu.h"
 
@@ -102,7 +103,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct 
arm_smmu_device *smm
return &cs->smmu;
 }
 
-
 #define ARM_MMU500_ACTLR_CPRE  (1 << 1)
 
 #define ARM_MMU500_ACR_CACHE_LOCK  (1 << 26)
@@ -147,6 +147,28 @@ static const struct arm_smmu_impl arm_mmu500_impl = {
.reset = arm_mmu500_reset,
 };
 
+static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu)
+{
+   int ret;
+
+   arm_mmu500_reset(smmu);
+
+   /*
+* To address performance degradation in non-real time clients,
+* such as USB and UFS, turn off wait-for-safe on sdm845 based boards,
+* such as MTP and db845, whose firmwares implement secure monitor
+* call handlers to turn on/off the wait-for-safe logic.
+*/
+   ret = qcom_scm_qsmmu500_wait_safe_toggle(0);
+   if (ret)
+   dev_warn(smmu->dev, "Failed to turn off SAFE logic\n");
+
+   return 0;
+}
+
+const struct arm_smmu_impl qcom_sdm845_smmu500_impl = {
+   .reset = qcom_sdm845_smmu500_reset,
+};
 
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 {
@@ -170,5 +192,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
  "calxeda,smmu-secure-config-access"))
smmu->impl = &calxeda_impl;
 
+   if (of_device_is_compatible(smmu->dev->of_node, "qcom,sdm845-smmu-500"))
+   smmu->impl = &qcom_sdm845_smmu500_impl;
+
return smmu;
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
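
For readers who want to see the shape of the hook selection outside the kernel, here is a minimal user-space sketch. Everything in it (struct smmu_dev, fw_wait_safe_toggle(), the fw_has_toggle flag) is a hypothetical stand-in rather than the driver's real types, and it assumes a single generic fallback implementation for simplicity. It only models two things from the patch: the implementation is picked by compatible string, and a failed firmware toggle does not fail the reset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct smmu_dev;

/* Per-implementation hooks; only reset is modelled in this sketch. */
struct smmu_impl {
	int (*reset)(struct smmu_dev *smmu);
};

struct smmu_dev {
	const char *compatible;        /* stand-in for the DT compatible string */
	const struct smmu_impl *impl;
	int generic_reset_done;
	int safe_logic_enabled;        /* state left behind by the bootloader */
	int fw_has_toggle;             /* does firmware handle the toggle call? */
};

/* Hypothetical stand-in for the secure monitor call. */
int fw_wait_safe_toggle(struct smmu_dev *smmu, int enable)
{
	if (!smmu->fw_has_toggle)
		return -1;
	smmu->safe_logic_enabled = enable;
	return 0;
}

int generic_mmu500_reset(struct smmu_dev *smmu)
{
	smmu->generic_reset_done = 1;
	return 0;
}

int sdm845_reset(struct smmu_dev *smmu)
{
	generic_mmu500_reset(smmu);
	/* Best effort: a failed toggle is not reported as a reset failure. */
	(void)fw_wait_safe_toggle(smmu, 0);
	return 0;
}

const struct smmu_impl generic_impl = { .reset = generic_mmu500_reset };
const struct smmu_impl sdm845_impl = { .reset = sdm845_reset };

/* Pick the implementation hooks from the compatible string. */
void smmu_impl_init(struct smmu_dev *smmu)
{
	assert(smmu && smmu->compatible);
	smmu->impl = &generic_impl;
	if (strcmp(smmu->compatible, "qcom,sdm845-smmu-500") == 0)
		smmu->impl = &sdm845_impl;
}
```

With this model, a board whose firmware handles the call ends up with the logic switched off after reset, while a board whose firmware neither enables the logic nor handles the call simply sees a no-op.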



[PATCH v4 1/3] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

2019-08-22 Thread Vivek Gautam
There are scenarios where drivers are required to make an
scm call in atomic context, such as in one of qcom's
arm-smmu-500 errata [1].

[1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/
  tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842")

Signed-off-by: Vivek Gautam 
Reviewed-by: Bjorn Andersson 
---
 drivers/firmware/qcom_scm-64.c | 136 -
 1 file changed, 92 insertions(+), 44 deletions(-)

diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 91d5ad7cf58b..b6dca32c5ac4 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -62,32 +62,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
 #define FIRST_EXT_ARG_IDX 3
 #define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)
 
-/**
- * qcom_scm_call() - Invoke a syscall in the secure world
- * @dev:   device
- * @svc_id:service identifier
- * @cmd_id:command identifier
- * @desc:  Descriptor structure containing arguments and return values
- *
- * Sends a command to the SCM and waits for the command to finish processing.
- * This should *only* be called in pre-emptible context.
-*/
-static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
-const struct qcom_scm_desc *desc,
-struct arm_smccc_res *res)
+static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
+  struct arm_smccc_res *res, u32 fn_id,
+  u64 x5, u32 type)
+{
+   u64 cmd;
+   struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+
+   cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
+ARM_SMCCC_OWNER_SIP, fn_id);
+
+   quirk.state.a6 = 0;
+
+   do {
+   arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
+   desc->args[1], desc->args[2], x5,
+   quirk.state.a6, 0, res, &quirk);
+
+   if (res->a0 == QCOM_SCM_INTERRUPTED)
+   cmd = res->a0;
+
+   } while (res->a0 == QCOM_SCM_INTERRUPTED);
+}
+
+static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
+struct arm_smccc_res *res, u32 fn_id,
+u64 x5, bool atomic)
+{
+   int retry_count = 0;
+
+   if (!atomic) {
+   do {
+   mutex_lock(&qcom_scm_lock);
+
+   __qcom_scm_call_do(desc, res, fn_id, x5,
+  ARM_SMCCC_STD_CALL);
+
+   mutex_unlock(&qcom_scm_lock);
+
+   if (res->a0 == QCOM_SCM_V2_EBUSY) {
+   if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
+   break;
+   msleep(QCOM_SCM_EBUSY_WAIT_MS);
+   }
+   }  while (res->a0 == QCOM_SCM_V2_EBUSY);
+   } else {
+   __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
+   }
+}
+
+static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+   const struct qcom_scm_desc *desc,
+   struct arm_smccc_res *res, bool atomic)
 {
int arglen = desc->arginfo & 0xf;
-   int retry_count = 0, i;
+   int i;
u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
-   u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
+   u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
dma_addr_t args_phys = 0;
void *args_virt = NULL;
size_t alloc_len;
-   struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+   gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;
 
if (unlikely(arglen > N_REGISTER_ARGS)) {
alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
-   args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
+   args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);
 
if (!args_virt)
return -ENOMEM;
@@ -117,33 +156,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, 
u32 cmd_id,
x5 = args_phys;
}
 
-   do {
-   mutex_lock(&qcom_scm_lock);
-
-   cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
-qcom_smccc_convention,
-ARM_SMCCC_OWNER_SIP, fn_id);
-
-   quirk.state.a6 = 0;
-
-   do {
-   arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
- desc->args[1], desc->args[2], x5,
- quirk.state.a6, 0, res, &quirk);
-
-   if (res->a0 == QCOM_SCM_INTERRUPTED)
-   cmd = res->a0;
-
-
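
The split between the sleeping and the atomic call path in the diff above can be modelled in user space roughly as follows. This is a sketch under stated assumptions: struct fake_fw, fake_fw_call() and the FW_* constants are invented for illustration, and a counter stands in for msleep(). The idea it shows is that the non-atomic path may retry on a busy status and sleep in between, while the atomic path makes a single call and returns whatever the firmware reported.

```c
#include <assert.h>
#include <stdbool.h>

#define FW_EBUSY      (-12)  /* hypothetical "firmware busy" status */
#define FW_MAX_RETRY  20     /* hypothetical retry budget */

/* Hypothetical firmware model: busy for the first `busy_calls` calls. */
struct fake_fw {
	int busy_calls;
	int calls_made;
	int sleeps;
};

int fake_fw_call(struct fake_fw *fw)
{
	fw->calls_made++;
	if (fw->busy_calls > 0) {
		fw->busy_calls--;
		return FW_EBUSY;
	}
	return 0;
}

int scm_call_sketch(struct fake_fw *fw, bool atomic)
{
	int retry_count = 0;
	int ret;

	assert(fw);

	/* Atomic context: one shot, no locking and no sleeping. */
	if (atomic)
		return fake_fw_call(fw);

	do {
		/* A real driver would hold a mutex around this call. */
		ret = fake_fw_call(fw);
		if (ret == FW_EBUSY) {
			if (retry_count++ > FW_MAX_RETRY)
				break;
			fw->sleeps++;  /* stands in for msleep() */
		}
	} while (ret == FW_EBUSY);

	return ret;
}
```

A caller in atomic context therefore has to be prepared to see the busy status itself, since nothing retries on its behalf.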

[PATCH v4 0/3] Qcom smmu-500 wait-for-safe handling for sdm845

2019-08-22 Thread Vivek Gautam
The previous version of the patches is at [1].

Qcom's implementation of smmu-500 on sdm845 adds a hardware logic called
wait-for-safe. This logic helps in meeting the invalidation requirements
from 'real-time clients', such as display and camera. This wait-for-safe
logic ensures that the invalidations happen after getting an ack from these
devices.
In this patch-series we are disabling this wait-for-safe logic from the
arm-smmu driver's probe as with this enabled the hardware tries to
throttle invalidations from 'non-real-time clients', such as USB and UFS.

For detailed information please refer to patch [3/3] in this series.
I have included the device tree patch too in this series for someone who
would like to test out this. Here's a branch [2] that gets display on MTP
SDM845 device.

This patch series is inspired by downstream work to handle under-performance
issues on real-time clients on sdm845. In downstream we add separate page table
ops to handle TLB maintenance and toggle wait-for-safe in the tlb_sync call so
as to achieve the required performance for display and camera [3, 4].

Changes since v3:
 * Based on arm-smmu implementation cleanup series [5] by Robin Murphy which is
   already merged in Will's tree [6].
 * Implemented the sdm845 specific reset hook which does arm_smmu_device_reset()
   followed by making SCM call to disable the wait-for-safe logic.
 * Removed dependency for SCM call on any dt flag. We invariably try to disable
   the wait-for-safe logic on sdm845. The platforms such as mtp845, and db845
   that implement handlers for this particular SCM call should be able to disable
   wait-for-safe logic.
   Other platforms such as cheza don't enable the wait-for-safe logic at all
   from their bootloaders. So there's no need to disable the same.
 * No change in SCM call patches 1 & 2.

Changes since v2:
 * Dropped the patch to add atomic io_read/write scm API.
 * Removed support for any separate page table ops to handle wait-for-safe.
   Currently just disabling this wait-for-safe logic from arm_smmu_device_probe()
   to achieve performance on USB/UFS on sdm845.
 * Added a device tree patch to add smmu option for fw-implemented support
   for SCM call to take care of SAFE toggling.

Changes since v1:
 * Addressed Will and Robin's comments:
- Dropped the patch[4] that forked out __arm_smmu_tlb_inv_range_nosync(),
  and __arm_smmu_tlb_sync().
- Cleaned up the errata patch further to use downstream polling mechanism
  for tlb sync.
 * No change in SCM call patches - patches 1 to 3.

[1] https://lore.kernel.org/patchwork/cover/1087453/
[2] https://github.com/vivekgautam1/linux/tree/v5.2-rc4/sdm845-display-working
[3] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/drivers/iommu/arm-smmu.c?h=CogSystems-msm-49/msm-4.9&id=da765c6c75266b38191b38ef086274943f353ea7
[4] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/drivers/iommu/arm-smmu.c?h=CogSystems-msm-49/msm-4.9&id=8696005aaaf745de68f57793c1a534a34345c30a
[5] https://patchwork.kernel.org/patch/11096265/
[6] https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/

Vivek Gautam (3):
  firmware: qcom_scm-64: Add atomic version of qcom_scm_call
  firmware/qcom_scm: Add scm call to handle smmu errata
  iommu: arm-smmu-impl: Add sdm845 implementation hook

 drivers/firmware/qcom_scm-32.c |   5 ++
 drivers/firmware/qcom_scm-64.c | 149 +
 drivers/firmware/qcom_scm.c|   6 ++
 drivers/firmware/qcom_scm.h|   5 ++
 drivers/iommu/arm-smmu-impl.c  |  27 +++-
 include/linux/qcom_scm.h   |   2 +
 6 files changed, 149 insertions(+), 45 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v4 2/3] firmware/qcom_scm: Add scm call to handle smmu errata

2019-08-22 Thread Vivek Gautam
Qcom's smmu-500 needs to toggle the wait-for-safe sequence to
handle TLB invalidation syncs.
A few firmwares allow doing that through the SCM interface.
Add an API to toggle wait-for-safe from firmware through an
SCM call.

Signed-off-by: Vivek Gautam 
Reviewed-by: Bjorn Andersson 
---
 drivers/firmware/qcom_scm-32.c |  5 +
 drivers/firmware/qcom_scm-64.c | 13 +
 drivers/firmware/qcom_scm.c|  6 ++
 drivers/firmware/qcom_scm.h|  5 +
 include/linux/qcom_scm.h   |  2 ++
 5 files changed, 31 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 215061c581e1..bee8729525ec 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -614,3 +614,8 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t 
addr, unsigned int val)
return qcom_scm_call_atomic2(QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
 addr, val);
 }
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool enable)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index b6dca32c5ac4..41c06dcfa9e1 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -550,3 +550,16 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t 
addr, unsigned int val)
return qcom_scm_call(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
 &desc, &res);
 }
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool en)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+
+   desc.args[0] = QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL;
+   desc.args[1] = en;
+   desc.arginfo = QCOM_SCM_ARGS(2);
+
+   return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_SMMU_PROGRAM,
+   QCOM_SCM_CONFIG_ERRATA1, &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 2ddc118dba1b..2b3b7a8c4270 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -344,6 +344,12 @@ int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, 
u32 spare)
 }
 EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);
 
+int qcom_scm_qsmmu500_wait_safe_toggle(bool en)
+{
+   return __qcom_scm_qsmmu500_wait_safe_toggle(__scm->dev, en);
+}
+EXPORT_SYMBOL(qcom_scm_qsmmu500_wait_safe_toggle);
+
 int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val)
 {
return __qcom_scm_io_readl(__scm->dev, addr, val);
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 99506bd873c0..baee744dbcfe 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -91,10 +91,15 @@ extern int __qcom_scm_restore_sec_cfg(struct device *dev, 
u32 device_id,
  u32 spare);
 #define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE3
 #define QCOM_SCM_IOMMU_SECURE_PTBL_INIT4
+#define QCOM_SCM_SVC_SMMU_PROGRAM  0x15
+#define QCOM_SCM_CONFIG_ERRATA10x3
+#define QCOM_SCM_CONFIG_ERRATA1_CLIENT_ALL 0x2
 extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
 size_t *size);
 extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
 u32 size, u32 spare);
+extern int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev,
+   bool enable);
 #define QCOM_MEM_PROT_ASSIGN_ID0x16
 extern int  __qcom_scm_assign_mem(struct device *dev,
  phys_addr_t mem_region, size_t mem_sz,
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 3f12cc77fb58..aee3d8580d89 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -57,6 +57,7 @@ extern int qcom_scm_set_remote_state(u32 state, u32 id);
 extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
 extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
 extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
+extern int qcom_scm_qsmmu500_wait_safe_toggle(bool en);
 extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
 extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
 #else
@@ -96,6 +97,7 @@ qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; 
}
 static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return 
-ENODEV; }
 static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { 
return -ENODEV; }
 static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 
spare) { return -ENODEV; }
+static inline int qcom_scm_qsmmu500_wait_safe_toggle(bool en) { return 
-ENODEV; }
 static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { 
return -ENODEV; }
 static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { 
return -ENOD
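
The layering in this patch (a public wrapper, a 64-bit backend that fills a descriptor, and a 32-bit backend that just reports -ENODEV) can be sketched in plain C as below. The three numeric constants are taken from the diff; struct scm_desc_sketch, struct scm_backend and the function names are hypothetical, and the only thing assumed about arginfo is what the 1/3 patch shows, namely that its low four bits carry the argument count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SVC_SMMU_PROGRAM    0x15  /* value from the patch */
#define CONFIG_ERRATA1      0x3   /* value from the patch */
#define ERRATA1_CLIENT_ALL  0x2   /* value from the patch */

/* Hypothetical, simplified descriptor. */
struct scm_desc_sketch {
	uint32_t svc;
	uint32_t cmd;
	uint32_t arginfo;   /* low 4 bits: number of arguments */
	uint64_t args[4];
};

/* Hypothetical backend table, one entry per calling convention. */
struct scm_backend {
	int (*wait_safe_toggle)(bool en, struct scm_desc_sketch *sent);
};

/* 64-bit style backend: build the descriptor that would go to firmware. */
int backend64_toggle(bool en, struct scm_desc_sketch *sent)
{
	memset(sent, 0, sizeof(*sent));
	sent->svc = SVC_SMMU_PROGRAM;
	sent->cmd = CONFIG_ERRATA1;
	sent->args[0] = ERRATA1_CLIENT_ALL;
	sent->args[1] = en;
	sent->arginfo = 2;  /* two plain value arguments */
	return 0;
}

/* 32-bit style backend: the call is not available. */
int backend32_toggle(bool en, struct scm_desc_sketch *sent)
{
	(void)en;
	(void)sent;
	return -ENODEV;
}

/* Public wrapper: callers never see which backend is in use. */
int scm_wait_safe_toggle(const struct scm_backend *be, bool en,
			 struct scm_desc_sketch *sent)
{
	assert(be && be->wait_safe_toggle && sent);
	return be->wait_safe_toggle(en, sent);
}
```

The point of the stub is that callers can invoke the wrapper unconditionally and treat -ENODEV like any other failure to toggle.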

Re: [PATCH v3 4/4] arm64: dts/sdm845: Enable FW implemented safe sequence handler on MTP

2019-08-11 Thread Vivek Gautam
On Tue, Aug 6, 2019 at 3:56 AM Bjorn Andersson
 wrote:
>
> On Wed 12 Jun 00:15 PDT 2019, Vivek Gautam wrote:
>
> > Indicate on MTP SDM845 that firmware implements handler to
> > TLB invalidate erratum SCM call where SAFE sequence is toggled
> > to achieve optimum performance on real-time clients, such as
> > display and camera.
> >
> > Signed-off-by: Vivek Gautam 
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index 78ec373a2b18..6a73d9744a71 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -2368,6 +2368,7 @@
> >   compatible = "qcom,sdm845-smmu-500", "arm,mmu-500";
> >   reg = <0 0x1500 0 0x8>;
> >   #iommu-cells = <2>;
> > + qcom,smmu-500-fw-impl-safe-errata;
>
> Looked back at this series and started to wonder if there is a
> case where this should not be set? I mean we're after all adding this to
> the top 845 dtsi...

My bad.
This is not valid in case of cheza. Cheza firmware doesn't implement
the safe errata handling hook.
On cheza we just have the liberty of accessing the secure registers
through scm calls - this is what
we were doing in earlier patch series handling this errata.
So, a property like this should go to mtp board's dts file.

Thanks

Vivek

>
> How about making it the default in the driver and opt out of the errata
> once there is a need?
>
> Regards,
> Bjorn
>
> >   #global-interrupts = <1>;
> >   interrupts = ,
> >,
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> > of Code Aurora Forum, hosted by The Linux Foundation
> >
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 1/1] arm64: dts: sdm845: Add device node for Last level cache controller

2019-08-04 Thread Vivek Gautam
Hi Bjorn,

On Wed, Jul 10, 2019 at 5:09 PM Vivek Gautam
 wrote:
>
> From: Sai Prakash Ranjan 
>
> Last level cache (aka. system cache) controller provides control
> over the last level cache present on SDM845. This cache lies after
> the memory noc, right before the DDR.
>
> Signed-off-by: Sai Prakash Ranjan 
> Signed-off-by: Vivek Gautam 
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index 4babff5f19b5..314241a99290 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -1275,6 +1275,13 @@
> };
> };
>
> +   cache-controller@110 {
> +   compatible = "qcom,sdm845-llcc";
> +   reg = <0 0x110 0 0x20>, <0 0x130 0 
> 0x5>;
> +   reg-names = "llcc_base", "llcc_broadcast_base";
> +   interrupts = ;
> +   };

Gentle ping. Are you planning to pick this?

Thanks
Vivek
[snip]

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] phy: qualcomm: phy-qcom-qmp: Add of_node_put() before return

2019-08-04 Thread Vivek Gautam
On Sun, Aug 4, 2019 at 9:54 PM Nishka Dasgupta  wrote:
>
> Each iteration of for_each_available_child_of_node puts the previous
> node, but in the case of a return from the middle of the loop, there is
> no put, thus causing a memory leak. Hence add an of_node_put before the
> return in two places.
> Issue found with Coccinelle.
>
> Signed-off-by: Nishka Dasgupta 
> ---
>  drivers/phy/qualcomm/phy-qcom-qmp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c 
> b/drivers/phy/qualcomm/phy-qcom-qmp.c
> index 34ff6434da8f..2f0652efebf0 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
> @@ -2094,6 +2094,7 @@ static int qcom_qmp_phy_probe(struct platform_device 
> *pdev)
> dev_err(dev, "failed to create lane%d phy, %d\n",
> id, ret);
> pm_runtime_disable(dev);
> +   of_node_put(child);
> return ret;
> }
>
> @@ -2106,6 +2107,7 @@ static int qcom_qmp_phy_probe(struct platform_device 
> *pdev)
> dev_err(qmp->dev,
> "failed to register pipe clock source\n");
> pm_runtime_disable(dev);
> +   of_node_put(child);

Nice find. Thanks for the patch.

Reviewed-by: Vivek Gautam 

Best regards
Vivek

[snip]
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
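
The leak described in the commit message can be reproduced with a toy reference count. This is a user-space sketch only: struct node, next_child() and probe_children() are hypothetical stand-ins that mimic how the iterator drops the previous child and takes a reference on the next one, so that returning from inside the loop leaves one reference held unless it is dropped explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device node: a reference count plus a flag that forces a failure. */
struct node {
	int refcount;
	int fail;
};

struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

void node_put(struct node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Mimics the iterator: put the previous child, get the next one. */
struct node *next_child(struct node *children, size_t count,
			struct node *prev)
{
	size_t idx = prev ? (size_t)(prev - children) + 1 : 0;

	node_put(prev);
	return idx < count ? node_get(&children[idx]) : NULL;
}

/* Returns -1 on the first failing child, 0 if all children succeed. */
int probe_children(struct node *children, size_t count, int put_on_error)
{
	struct node *child;

	for (child = next_child(children, count, NULL); child;
	     child = next_child(children, count, child)) {
		if (child->fail) {
			if (put_on_error)
				node_put(child);  /* the fix: drop the ref */
			return -1;
		}
	}
	return 0;
}
```

Without the explicit put, the child on which the loop bailed out keeps a reference count of one; with it, every count returns to zero.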


Re: [PATCH v2 0/3] soc: qcom: llcc cleanups

2019-08-01 Thread Vivek Gautam
On Thu, Jul 18, 2019 at 6:33 PM Vivek Gautam
 wrote:
>
> To better support future versions of llcc, consolidating the
> driver to llcc-qcom driver file, and taking care of the dependencies.
> v1 series is available at:
> https://lore.kernel.org/patchwork/patch/1099573/
>
> Changes since v1:
> Addressing Bjorn's comments -
>  * Not using llcc-plat as the platform driver rather using a single
>driver file now - llcc-qcom.
>  * Removed SCT_ENTRY macro.
>  * Moved few structure definitions from include/linux path to llcc-qcom
>driver as they are not exposed to other subsystems.

Hi Bjorn,

How does this cleanup look now? Let me know if there are any
improvements to make here.

Best Regards
Vivek
>
> Vivek Gautam (3):
>   soc: qcom: llcc cleanup to get rid of sdm845 specific driver file
>   soc: qcom: Rename llcc-slice to llcc-qcom
>   soc: qcom: Make llcc-qcom a generic driver
>
>  drivers/soc/qcom/Kconfig   |  14 +--
>  drivers/soc/qcom/Makefile  |   3 +-
>  drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} | 155 
> +++--
>  drivers/soc/qcom/llcc-sdm845.c | 100 
>  include/linux/soc/qcom/llcc-qcom.h | 104 -
>  5 files changed, 152 insertions(+), 224 deletions(-)
>  rename drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} (64%)
>  delete mode 100644 drivers/soc/qcom/llcc-sdm845.c
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/1] tty: serial: qcom_geni_serial: Update the oversampling rate

2019-08-01 Thread Vivek Gautam
For QUP IP versions 2.5 and above the oversampling rate is halved
from 32 to 16. Update this rate after reading hardware version
register, so that the clock divider value is correctly set to
achieve required baud rate.

Signed-off-by: Vivek Gautam 
---
 drivers/tty/serial/qcom_geni_serial.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c 
b/drivers/tty/serial/qcom_geni_serial.c
index 35e5f9c5d5be..318f811585cc 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -920,12 +920,13 @@ static unsigned long get_clk_cfg(unsigned long clk_freq)
return 0;
 }
 
-static unsigned long get_clk_div_rate(unsigned int baud, unsigned int *clk_div)
+static unsigned long get_clk_div_rate(unsigned int baud,
+   unsigned int sampling_rate, unsigned int *clk_div)
 {
unsigned long ser_clk;
unsigned long desired_clk;
 
-   desired_clk = baud * UART_OVERSAMPLING;
+   desired_clk = baud * sampling_rate;
ser_clk = get_clk_cfg(desired_clk);
if (!ser_clk) {
pr_err("%s: Can't find matching DFS entry for baud %d\n",
@@ -951,12 +952,20 @@ static void qcom_geni_serial_set_termios(struct uart_port 
*uport,
u32 ser_clk_cfg;
struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
unsigned long clk_rate;
+   u32 ver, sampling_rate;
 
qcom_geni_serial_stop_rx(uport);
/* baud rate */
baud = uart_get_baud_rate(uport, termios, old, 300, 400);
port->baud = baud;
-   clk_rate = get_clk_div_rate(baud, &clk_div);
+
+   sampling_rate = UART_OVERSAMPLING;
+   /* Sampling rate is halved for IP versions >= 2.5 */
+   ver = geni_se_get_qup_hw_version(&port->se);
+   if (GENI_SE_VERSION_MAJOR(ver) >= 2 && GENI_SE_VERSION_MINOR(ver) >= 5)
+   sampling_rate /= 2;
+
+   clk_rate = get_clk_div_rate(baud, sampling_rate, &clk_div);
if (!clk_rate)
goto out_restart_rx;
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
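
As a sanity check of the "2.5 and above" rule, here is a small stand-alone sketch. It takes the major and minor numbers as plain integers (how they are extracted from the hardware register is left out on purpose), and the function names are made up. It compares the version as an ordered (major, minor) pair; testing major and minor independently, as the condition in the diff does, would leave a hypothetical 3.0 at the full rate. Whether such a version exists is not stated in this thread.

```c
#include <assert.h>

#define BASE_OVERSAMPLING 32  /* rate before 2.5, per the commit message */

/* Oversampling rate for a given IP version, halved from 2.5 onwards. */
unsigned int sampling_rate_for(unsigned int major, unsigned int minor)
{
	unsigned int rate = BASE_OVERSAMPLING;

	/* Ordered comparison of the (major, minor) pair against (2, 5). */
	if (major > 2 || (major == 2 && minor >= 5))
		rate /= 2;

	return rate;
}

/* Clock the serial engine needs for a baud rate at a given oversampling. */
unsigned long desired_clk(unsigned int baud, unsigned int sampling_rate)
{
	assert(sampling_rate > 0);
	return (unsigned long)baud * sampling_rate;
}
```

For example, at 115200 baud the desired clock is 3686400 Hz with 32x oversampling and 1843200 Hz with 16x, which is why the divider comes out wrong if the rate is not adjusted.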



[PATCH 2/3] soc: qcom: Rename llcc-slice to llcc-qcom

2019-07-18 Thread Vivek Gautam
The cleaning up was done without changing the driver file name
to ensure a cleaner bisect. Change the file name now to facilitate
making the driver generic in subsequent patch.

Signed-off-by: Vivek Gautam 
---
 drivers/soc/qcom/Makefile  | 2 +-
 drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} (100%)

diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index 386bf197e0e5..caf8e0beaa57 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -20,6 +20,6 @@ obj-$(CONFIG_QCOM_SMP2P)  += smp2p.o
 obj-$(CONFIG_QCOM_SMSM)+= smsm.o
 obj-$(CONFIG_QCOM_WCNSS_CTRL) += wcnss_ctrl.o
 obj-$(CONFIG_QCOM_APR) += apr.o
-obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
+obj-$(CONFIG_QCOM_LLCC) += llcc-qcom.o
 obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
 obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
diff --git a/drivers/soc/qcom/llcc-slice.c b/drivers/soc/qcom/llcc-qcom.c
similarity index 100%
rename from drivers/soc/qcom/llcc-slice.c
rename to drivers/soc/qcom/llcc-qcom.c
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/3] soc: qcom: llcc cleanup to get rid of sdm845 specific driver file

2019-07-18 Thread Vivek Gautam
A single file should suffice to program the llcc for
various platforms. Get rid of the sdm845-specific driver file to
make way for a more generic driver.

Signed-off-by: Vivek Gautam 
---
 drivers/soc/qcom/Kconfig   |  14 ++
 drivers/soc/qcom/Makefile  |   1 -
 drivers/soc/qcom/llcc-sdm845.c | 100 -
 drivers/soc/qcom/llcc-slice.c  |  60 +++---
 include/linux/soc/qcom/llcc-qcom.h |  57 -
 5 files changed, 77 insertions(+), 155 deletions(-)
 delete mode 100644 drivers/soc/qcom/llcc-sdm845.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index a6d1bfb17279..b6cc5816a94b 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -58,17 +58,9 @@ config QCOM_LLCC
depends on ARCH_QCOM || COMPILE_TEST
help
  Qualcomm Technologies, Inc. platform specific
- Last Level Cache Controller(LLCC) driver. This provides interfaces
- to clients that use the LLCC. Say yes here to enable LLCC slice
- driver.
-
-config QCOM_SDM845_LLCC
-   tristate "Qualcomm Technologies, Inc. SDM845 LLCC driver"
-   depends on QCOM_LLCC
-   help
- Say yes here to enable the LLCC driver for SDM845. This provides
- data required to configure LLCC so that clients can start using the
- LLCC slices.
+ Last Level Cache Controller(LLCC) driver for platforms such as,
+ SDM845. This provides interfaces to clients that use the LLCC.
+ Say yes here to enable LLCC slice driver.
 
 config QCOM_MDT_LOADER
tristate
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index eeb088beb15f..386bf197e0e5 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -21,6 +21,5 @@ obj-$(CONFIG_QCOM_SMSM)   += smsm.o
 obj-$(CONFIG_QCOM_WCNSS_CTRL) += wcnss_ctrl.o
 obj-$(CONFIG_QCOM_APR) += apr.o
 obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
-obj-$(CONFIG_QCOM_SDM845_LLCC) += llcc-sdm845.o
 obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
 obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
diff --git a/drivers/soc/qcom/llcc-sdm845.c b/drivers/soc/qcom/llcc-sdm845.c
deleted file mode 100644
index 86600d97c36d..
--- a/drivers/soc/qcom/llcc-sdm845.c
+++ /dev/null
@@ -1,100 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
- *
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-
-/*
- * SCT(System Cache Table) entry contains of the following members:
- * usecase_id: Unique id for the client's use case
- * slice_id: llcc slice id for each client
- * max_cap: The maximum capacity of the cache slice provided in KB
- * priority: Priority of the client used to select victim line for replacement
- * fixed_size: Boolean indicating if the slice has a fixed capacity
- * bonus_ways: Bonus ways are additional ways to be used for any slice,
- * if client ends up using more than reserved cache ways. Bonus
- * ways are allocated only if they are not reserved for some
- * other client.
- * res_ways: Reserved ways for the cache slice, the reserved ways cannot
- * be used by any other client than the one its assigned to.
- * cache_mode: Each slice operates as a cache, this controls the mode of the
- * slice: normal or TCM(Tightly Coupled Memory)
- * probe_target_ways: Determines what ways to probe for access hit. When
- *configured to 1 only bonus and reserved ways are probed.
- *When configured to 0 all ways in llcc are probed.
- * dis_cap_alloc: Disable capacity based allocation for a client
- * retain_on_pc: If this bit is set and client has maintained active vote
- *   then the ways assigned to this client are not flushed on power
- *   collapse.
- * activate_on_init: Activate the slice immediately after the SCT is programmed
- */
-#define SCT_ENTRY(uid, sid, mc, p, fs, bway, rway, cmod, ptw, dca, rp, a) \
-   {   \
-   .usecase_id = uid,  \
-   .slice_id = sid,\
-   .max_cap = mc,  \
-   .priority = p,  \
-   .fixed_size = fs,   \
-   .bonus_ways = bway, \
-   .res_ways = rway,   \
-   .cache_mode = cmod, \
-   .probe_target_ways = ptw,   \
-   .dis_cap_alloc = dca,   \
-   .retain_on_pc = rp, \
-   .activate_on_init = a,  \
-   }
-
-static struct llcc_slice_config sdm845_data[] =  {
-   SCT_ENTRY(LLCC_CPUSS,1,  2816, 1, 0, 0xffc, 0x2,   0, 0, 1, 1, 1),
-   SCT_ENTRY(LLCC_VIDSC0,   2,  512,  2, 1, 0x0,   0x0f0, 0, 0, 1, 1, 0),
-   SCT_ENTRY(LLCC_VIDSC1,   3,  

[PATCH 3/3] soc: qcom: Make llcc-qcom a generic driver

2019-07-18 Thread Vivek Gautam
This makes way for adding future llcc versions.
Also pull the llcc-qcom specific definitions out of the include path.
The include path now contains only the definitions that are
to be exposed to other subsystems.

Signed-off-by: Vivek Gautam 
---
 drivers/soc/qcom/llcc-qcom.c   | 137 +++--
 include/linux/soc/qcom/llcc-qcom.h |  89 
 2 files changed, 116 insertions(+), 110 deletions(-)

diff --git a/drivers/soc/qcom/llcc-qcom.c b/drivers/soc/qcom/llcc-qcom.c
index 574bb5bf20bc..98563ef0ac6b 100644
--- a/drivers/soc/qcom/llcc-qcom.c
+++ b/drivers/soc/qcom/llcc-qcom.c
@@ -47,6 +47,100 @@
 
 #define BANK_OFFSET_STRIDE   0x8
 
+/**
+ * llcc_slice_config - Data associated with the llcc slice
+ * @usecase_id: Unique id for the client's use case
+ * @slice_id: llcc slice id for each client
+ * @max_cap: The maximum capacity of the cache slice provided in KB
+ * @priority: Priority of the client used to select victim line for replacement
+ * @fixed_size: Boolean indicating if the slice has a fixed capacity
+ * @bonus_ways: Bonus ways are additional ways to be used for any slice,
+ * if client ends up using more than reserved cache ways. Bonus
+ * ways are allocated only if they are not reserved for some
+ * other client.
+ * @res_ways: Reserved ways for the cache slice, the reserved ways cannot
+ * be used by any other client than the one its assigned to.
+ * @cache_mode: Each slice operates as a cache, this controls the mode of the
+ * slice: normal or TCM(Tightly Coupled Memory)
+ * @probe_target_ways: Determines what ways to probe for access hit. When
+ *configured to 1 only bonus and reserved ways are probed.
+ *When configured to 0 all ways in llcc are probed.
+ * @dis_cap_alloc: Disable capacity based allocation for a client
+ * @retain_on_pc: If this bit is set and client has maintained active vote
+ *   then the ways assigned to this client are not flushed on power
+ *   collapse.
+ * @activate_on_init: Activate the slice immediately after it is programmed
+ */
+struct llcc_slice_config {
+   u32 usecase_id;
+   u32 slice_id;
+   u32 max_cap;
+   u32 priority;
+   bool fixed_size;
+   u32 bonus_ways;
+   u32 res_ways;
+   u32 cache_mode;
+   u32 probe_target_ways;
+   bool dis_cap_alloc;
+   bool retain_on_pc;
+   bool activate_on_init;
+};
+
+/**
+ * llcc_drv_data - Data associated with the llcc driver
+ * @regmap: regmap associated with the llcc device
+ * @bcast_regmap: regmap associated with llcc broadcast offset
+ * @cfg: pointer to the data structure for slice configuration
+ * @lock: mutex associated with each slice
+ * @cfg_size: size of the config data table
+ * @max_slices: max slices as read from device tree
+ * @num_banks: Number of llcc banks
+ * @bitmap: Bit map to track the active slice ids
+ * @offsets: Pointer to the bank offsets array
+ * @ecc_irq: interrupt for llcc cache error detection and reporting
+ */
+struct llcc_drv_data {
+   struct regmap *regmap;
+   struct regmap *bcast_regmap;
+   const struct llcc_slice_config *cfg;
+   struct mutex lock;
+   u32 cfg_size;
+   u32 max_slices;
+   u32 num_banks;
+   unsigned long *bitmap;
+   u32 *offsets;
+   int ecc_irq;
+};
+
+/**
+ * llcc_edac_reg_data - llcc edac registers data for each error type
+ * @name: Name of the error
+ * @synd_reg: Syndrome register address
+ * @count_status_reg: Status register address to read the error count
+ * @ways_status_reg: Status register address to read the error ways
+ * @reg_cnt: Number of registers
+ * @count_mask: Mask value to get the error count
+ * @ways_mask: Mask value to get the error ways
+ * @count_shift: Shift value to get the error count
+ * @ways_shift: Shift value to get the error ways
+ */
+struct llcc_edac_reg_data {
+   char *name;
+   u64 synd_reg;
+   u64 count_status_reg;
+   u64 ways_status_reg;
+   u32 reg_cnt;
+   u32 count_mask;
+   u32 ways_mask;
+   u8  count_shift;
+   u8  ways_shift;
+};
+
+struct qcom_llcc_config {
+   const struct llcc_slice_config *sct_data;
+   int size;
+};
+
 static struct llcc_slice_config sdm845_data[] =  {
{ LLCC_CPUSS,1,  2816, 1, 0, 0xffc, 0x2,   0, 0, 1, 1, 1 },
{ LLCC_VIDSC0,   2,  512,  2, 1, 0x0,   0x0f0, 0, 0, 1, 1, 0 },
@@ -68,6 +162,11 @@ static struct llcc_slice_config sdm845_data[] =  {
{ LLCC_AUDHW,22, 1024, 1, 1, 0xffc, 0x2,   0, 0, 1, 1, 0 },
 };
 
+static const struct qcom_llcc_config sdm845_cfg = {
+   .sct_data   = sdm845_data,
+   .size   = ARRAY_SIZE(sdm845_data),
+};
+
 static struct llcc_drv_data *drv_data = (void *) -EPROBE_DEFER;
 
 static const struct regmap_config llcc_regmap_config = {
@@ -347,13 +446,15 @@ static struct regmap *qcom_llcc_init_mmio(s

[PATCH v2 0/3] soc: qcom: llcc cleanups

2019-07-18 Thread Vivek Gautam
To better support future versions of llcc, consolidate the
driver into the llcc-qcom driver file and take care of the dependencies.
The v1 series is available at:
https://lore.kernel.org/patchwork/patch/1099573/

Changes since v1:
Addressing Bjorn's comments -
 * Dropped llcc-plat as the platform driver; a single driver file,
   llcc-qcom, is used now.
 * Removed SCT_ENTRY macro.
 * Moved a few structure definitions from the include/linux path to the
   llcc-qcom driver as they are not exposed to other subsystems.

Vivek Gautam (3):
  soc: qcom: llcc cleanup to get rid of sdm845 specific driver file
  soc: qcom: Rename llcc-slice to llcc-qcom
  soc: qcom: Make llcc-qcom a generic driver

 drivers/soc/qcom/Kconfig   |  14 +--
 drivers/soc/qcom/Makefile  |   3 +-
 drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} | 155 +++--
 drivers/soc/qcom/llcc-sdm845.c | 100 
 include/linux/soc/qcom/llcc-qcom.h | 104 -
 5 files changed, 152 insertions(+), 224 deletions(-)
 rename drivers/soc/qcom/{llcc-slice.c => llcc-qcom.c} (64%)
 delete mode 100644 drivers/soc/qcom/llcc-sdm845.c

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 2/2] soc: qcom: llcc-plat: Make the driver more generic

2019-07-12 Thread Vivek Gautam
Hi Bjorn,


Thanks for the review.

On Thu, Jul 11, 2019 at 9:29 PM Bjorn Andersson
 wrote:
>
> On Thu 11 Jul 04:03 PDT 2019, Vivek Gautam wrote:
>
> > - Remove 'sdm845' from names, and use 'plat' instead.
> > - Move SCT_ENTRY macro to header file.
> > - Create a new config structure to assign to of-match-data.
> >
>
> I interpret the intention of these two patches as that you want to add
> some new platform without having to create one llcc-xyz.c per platform.

That's right. The intention is to avoid creating a new platform specific file.
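
To make the intent concrete, here is a rough userspace sketch of the idea: one driver file holds a per-SoC config (table pointer plus size), and a match table maps the compatible string to that config. The struct and function names are hypothetical stand-ins, the slice struct is trimmed to three fields, and the usecase ids are placeholder numbers. In the kernel the lookup itself is done by the OF match table together with of_device_get_match_data().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down slice entry (the real struct has more fields). */
struct slice_cfg {
	unsigned int usecase_id;	/* placeholder ids, not the LLCC_* values */
	unsigned int slice_id;
	unsigned int max_cap_kb;
};

/* Per-SoC bundle: table pointer plus its length. */
struct soc_cfg {
	const struct slice_cfg *sct_data;
	size_t size;
};

/* Stand-in for an of_device_id entry: compatible string -> config. */
struct match_entry {
	const char *compatible;
	const struct soc_cfg *data;
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const struct slice_cfg sdm845_slices[] = {
	{ 1, 1, 2816 },
	{ 2, 2, 512 },
};

static const struct soc_cfg sdm845_cfg = {
	.sct_data = sdm845_slices,
	.size = ARRAY_LEN(sdm845_slices),
};

static const struct match_entry match_table[] = {
	{ "qcom,sdm845-llcc", &sdm845_cfg },
	{ NULL, NULL },	/* sentinel */
};

/* Return the config for a compatible string, or NULL when unsupported. */
static const struct soc_cfg *lookup_cfg(const char *compatible)
{
	const struct match_entry *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

Adding another SoC under this scheme means one more slice table, one more config and one more match entry in the same file. A probe routine built this way would presumably want to bail out when the lookup returns NULL.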

>
> If that's the case then the only user of this macro would be in plat.c,
> so I don't see a reason for moving it to the header file.

Alright. Better to keep it in the driver file itself.

>
> > Signed-off-by: Vivek Gautam 
> > ---
> >  drivers/soc/qcom/llcc-plat.c   | 77 
> > --
> >  include/linux/soc/qcom/llcc-qcom.h | 45 ++
> >  2 files changed, 68 insertions(+), 54 deletions(-)
> >
> > diff --git a/drivers/soc/qcom/llcc-plat.c b/drivers/soc/qcom/llcc-plat.c
> > index 86600d97c36d..31cff0f75b53 100644
> > --- a/drivers/soc/qcom/llcc-plat.c
> > +++ b/drivers/soc/qcom/llcc-plat.c
> > @@ -1,6 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  /*
> > - * Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
> > + * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
> >   *
> >   */
> >
> > @@ -10,47 +10,7 @@
> >  #include 
> >  #include 
> >
> > -/*
> > - * SCT(System Cache Table) entry contains of the following members:
>
> Should have caught this during previous review, but this comment simply
> duplicates the kerneldoc for struct llcc_slice_config.

OK, I noticed it now. Will clean it up. I can remove this comment and update
the one for struct llcc_slice_config.

>
> > - * usecase_id: Unique id for the client's use case
> > - * slice_id: llcc slice id for each client
> > - * max_cap: The maximum capacity of the cache slice provided in KB
> > - * priority: Priority of the client used to select victim line for 
> > replacement
> > - * fixed_size: Boolean indicating if the slice has a fixed capacity
> > - * bonus_ways: Bonus ways are additional ways to be used for any slice,
> > - *   if client ends up using more than reserved cache ways. Bonus
> > - *   ways are allocated only if they are not reserved for some
> > - *   other client.
> > - * res_ways: Reserved ways for the cache slice, the reserved ways cannot
> > - *   be used by any other client than the one its assigned to.
> > - * cache_mode: Each slice operates as a cache, this controls the mode of 
> > the
> > - * slice: normal or TCM(Tightly Coupled Memory)
> > - * probe_target_ways: Determines what ways to probe for access hit. When
> > - *configured to 1 only bonus and reserved ways are 
> > probed.
> > - *When configured to 0 all ways in llcc are probed.
> > - * dis_cap_alloc: Disable capacity based allocation for a client
> > - * retain_on_pc: If this bit is set and client has maintained active vote
> > - *   then the ways assigned to this client are not flushed on 
> > power
> > - *   collapse.
> > - * activate_on_init: Activate the slice immediately after the SCT is 
> > programmed
> > - */
> > -#define SCT_ENTRY(uid, sid, mc, p, fs, bway, rway, cmod, ptw, dca, rp, a) \
>
> This simply maps macro arguments 1:1 to struct members, there's no need
> for a macro for this.

Sure, will remove the macro.
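
For reference, a small sketch of why the macro adds nothing: an entry built through a 1:1 macro and a plain positional initializer end up identical. The struct here is a hypothetical five-field cut-down of llcc_slice_config, and the macro takes correspondingly fewer arguments than the real SCT_ENTRY.

```c
#include <stdbool.h>

/* Hypothetical cut-down of the slice config, first five members only. */
struct slice_entry {
	unsigned int usecase_id;
	unsigned int slice_id;
	unsigned int max_cap;
	unsigned int priority;
	bool fixed_size;
};

/* Macro that maps its arguments 1:1 to struct members. */
#define SCT_ENTRY(uid, sid, mc, p, fs)	\
	{				\
		.usecase_id = uid,	\
		.slice_id = sid,	\
		.max_cap = mc,		\
		.priority = p,		\
		.fixed_size = fs,	\
	}

static const struct slice_entry via_macro = SCT_ENTRY(1, 1, 2816, 1, 0);

/* Same entry as a plain positional initializer. */
static const struct slice_entry via_positional = { 1, 1, 2816, 1, 0 };

/* Member-wise comparison of two entries. */
static bool same_entry(const struct slice_entry *a,
		       const struct slice_entry *b)
{
	return a->usecase_id == b->usecase_id &&
	       a->slice_id == b->slice_id &&
	       a->max_cap == b->max_cap &&
	       a->priority == b->priority &&
	       a->fixed_size == b->fixed_size;
}
```

The one trade-off is that the positional form depends on the member order of the struct, so reordering the members would silently change the meaning of every table row.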

>
> > - {   \
> > - .usecase_id = uid,  \
> > - .slice_id = sid,\
> > - .max_cap = mc,  \
> > - .priority = p,  \
> > - .fixed_size = fs,   \
> > - .bonus_ways = bway, \
> > - .res_ways = rway,   \
> > - .cache_mode = cmod, \
> > - .probe_target_ways = ptw,   \
> > - .dis_cap_alloc = dca,   \
> > - .retain_on_pc = rp, \
> > - .activate_on_init = a,  \
> > - }
> > -
> > -static struct llcc_slice_config sdm845_data[] =  {
> > +static const struct llcc_slice_config sdm845_data[] =  {
> >   SCT_ENTRY(LLCC_CPUSS,1,  2816, 1, 0, 0xffc, 0x2,   0, 

Re: [PATCH 1/2] soc: qcom: llcc: Rename llcc-sdm845 to llcc-plat

2019-07-12 Thread Vivek Gautam
On Thu, Jul 11, 2019 at 9:19 PM Bjorn Andersson
 wrote:
>
> On Thu 11 Jul 04:03 PDT 2019, Vivek Gautam wrote:
>
> > To avoid adding a file for each SoC supported in the future, rename
> > the file to a generic name - llcc-plat, so that llcc configuration
> > tables for other SoCs can be added in the same driver.
> >
>
> We've had a generic LLCC Kconfig option and then a specific SDM845 one,
> with this change we have two different generic options and both would
> either always be enabled or disabled.
>
> So I think you should drop QCOM_SDM845_LLCC and build both llcc-slice
> and llcc-plat into the same qcom_llcc.ko instead.

Yes, I can drop the separate llcc-slice module. But for readability, would
it still be better to maintain separate files? I will drop the SDM845
config and keep only QCOM_LLCC.

Best regards
Vivek

>
> Regards,
> Bjorn
>
> > Signed-off-by: Vivek Gautam 
> > ---
> >  drivers/soc/qcom/Kconfig| 10 +-
> >  drivers/soc/qcom/Makefile   |  2 +-
> >  drivers/soc/qcom/{llcc-sdm845.c => llcc-plat.c} |  0
> >  3 files changed, 6 insertions(+), 6 deletions(-)
> >  rename drivers/soc/qcom/{llcc-sdm845.c => llcc-plat.c} (100%)
> >
> > diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> > index a6d1bfb17279..8110d415b18e 100644
> > --- a/drivers/soc/qcom/Kconfig
> > +++ b/drivers/soc/qcom/Kconfig
> > @@ -62,13 +62,13 @@ config QCOM_LLCC
> > to clients that use the LLCC. Say yes here to enable LLCC slice
> > driver.
> >
> > -config QCOM_SDM845_LLCC
> > - tristate "Qualcomm Technologies, Inc. SDM845 LLCC driver"
> > +config QCOM_PLAT_LLCC
> > + tristate "Qualcomm Technologies, Inc. platform LLCC driver"
> >   depends on QCOM_LLCC
> >   help
> > -   Say yes here to enable the LLCC driver for SDM845. This provides
> > -   data required to configure LLCC so that clients can start using the
> > -   LLCC slices.
> > +   Say yes here to enable the LLCC driver for Qcom platforms, such as
> > +   SDM845. This provides data required to configure LLCC so that
> > +   clients can start using the LLCC slices.
> >
> >  config QCOM_MDT_LOADER
> >   tristate
> > diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> > index eeb088beb15f..3bf26667d7ee 100644
> > --- a/drivers/soc/qcom/Makefile
> > +++ b/drivers/soc/qcom/Makefile
> > @@ -21,6 +21,6 @@ obj-$(CONFIG_QCOM_SMSM) += smsm.o
> >  obj-$(CONFIG_QCOM_WCNSS_CTRL) += wcnss_ctrl.o
> >  obj-$(CONFIG_QCOM_APR) += apr.o
> >  obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
> > -obj-$(CONFIG_QCOM_SDM845_LLCC) += llcc-sdm845.o
> > +obj-$(CONFIG_QCOM_PLAT_LLCC) += llcc-plat.o
> >  obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
> >  obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
> > diff --git a/drivers/soc/qcom/llcc-sdm845.c b/drivers/soc/qcom/llcc-plat.c
> > similarity index 100%
> > rename from drivers/soc/qcom/llcc-sdm845.c
> > rename to drivers/soc/qcom/llcc-plat.c
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> > of Code Aurora Forum, hosted by The Linux Foundation
> >



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 2/2] soc: qcom: llcc-plat: Make the driver more generic

2019-07-11 Thread Vivek Gautam
- Remove 'sdm845' from names, and use 'plat' instead.
- Move SCT_ENTRY macro to header file.
- Create a new config structure to assign to of-match-data.

Signed-off-by: Vivek Gautam 
---
 drivers/soc/qcom/llcc-plat.c   | 77 --
 include/linux/soc/qcom/llcc-qcom.h | 45 ++
 2 files changed, 68 insertions(+), 54 deletions(-)

diff --git a/drivers/soc/qcom/llcc-plat.c b/drivers/soc/qcom/llcc-plat.c
index 86600d97c36d..31cff0f75b53 100644
--- a/drivers/soc/qcom/llcc-plat.c
+++ b/drivers/soc/qcom/llcc-plat.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
  *
  */
 
@@ -10,47 +10,7 @@
 #include 
 #include 
 
-/*
- * SCT(System Cache Table) entry contains of the following members:
- * usecase_id: Unique id for the client's use case
- * slice_id: llcc slice id for each client
- * max_cap: The maximum capacity of the cache slice provided in KB
- * priority: Priority of the client used to select victim line for replacement
- * fixed_size: Boolean indicating if the slice has a fixed capacity
- * bonus_ways: Bonus ways are additional ways to be used for any slice,
- * if client ends up using more than reserved cache ways. Bonus
- * ways are allocated only if they are not reserved for some
- * other client.
- * res_ways: Reserved ways for the cache slice, the reserved ways cannot
- * be used by any other client than the one its assigned to.
- * cache_mode: Each slice operates as a cache, this controls the mode of the
- * slice: normal or TCM(Tightly Coupled Memory)
- * probe_target_ways: Determines what ways to probe for access hit. When
- *configured to 1 only bonus and reserved ways are probed.
- *When configured to 0 all ways in llcc are probed.
- * dis_cap_alloc: Disable capacity based allocation for a client
- * retain_on_pc: If this bit is set and client has maintained active vote
- *   then the ways assigned to this client are not flushed on power
- *   collapse.
- * activate_on_init: Activate the slice immediately after the SCT is programmed
- */
-#define SCT_ENTRY(uid, sid, mc, p, fs, bway, rway, cmod, ptw, dca, rp, a) \
-   {   \
-   .usecase_id = uid,  \
-   .slice_id = sid,\
-   .max_cap = mc,  \
-   .priority = p,  \
-   .fixed_size = fs,   \
-   .bonus_ways = bway, \
-   .res_ways = rway,   \
-   .cache_mode = cmod, \
-   .probe_target_ways = ptw,   \
-   .dis_cap_alloc = dca,   \
-   .retain_on_pc = rp, \
-   .activate_on_init = a,  \
-   }
-
-static struct llcc_slice_config sdm845_data[] =  {
+static const struct llcc_slice_config sdm845_data[] =  {
SCT_ENTRY(LLCC_CPUSS,1,  2816, 1, 0, 0xffc, 0x2,   0, 0, 1, 1, 1),
SCT_ENTRY(LLCC_VIDSC0,   2,  512,  2, 1, 0x0,   0x0f0, 0, 0, 1, 1, 0),
SCT_ENTRY(LLCC_VIDSC1,   3,  512,  2, 1, 0x0,   0x0f0, 0, 0, 1, 1, 0),
@@ -71,30 +31,39 @@ static struct llcc_slice_config sdm845_data[] =  {
SCT_ENTRY(LLCC_AUDHW,22, 1024, 1, 1, 0xffc, 0x2,   0, 0, 1, 1, 0),
 };
 
-static int sdm845_qcom_llcc_remove(struct platform_device *pdev)
+static const struct qcom_llcc_config sdm845_cfg = {
+   .sct_data   = sdm845_data,
+   .size   = ARRAY_SIZE(sdm845_data),
+};
+
+static int qcom_plat_llcc_remove(struct platform_device *pdev)
 {
return qcom_llcc_remove(pdev);
 }
 
-static int sdm845_qcom_llcc_probe(struct platform_device *pdev)
+static int qcom_plat_llcc_probe(struct platform_device *pdev)
 {
-   return qcom_llcc_probe(pdev, sdm845_data, ARRAY_SIZE(sdm845_data));
+   const struct qcom_llcc_config *cfg;
+
+   cfg = of_device_get_match_data(&pdev->dev);
+
+   return qcom_llcc_probe(pdev, cfg->sct_data, cfg->size);
 }
 
-static const struct of_device_id sdm845_qcom_llcc_of_match[] = {
-   { .compatible = "qcom,sdm845-llcc", },
+static const struct of_device_id qcom_plat_llcc_of_match[] = {
+   { .compatible = "qcom,sdm845-llcc", .data = &sdm845_cfg },
{ }
 };
 
-static struct platform_driver sdm845_qcom_llcc_driver = {
+static struct platform_driver qcom_plat_llcc_driver = {
.driver = {
-   .name = "sdm845-llcc",
-   .of_match_table = sdm845_qcom_llcc_of_match,
+   .name = "qcom-plat-llcc",
+   .of_match_table = qcom_plat_llcc_of_match,
},
-   .probe = sdm845_qcom_llcc_probe

[PATCH 1/2] soc: qcom: llcc: Rename llcc-sdm845 to llcc-plat

2019-07-11 Thread Vivek Gautam
To avoid adding a file for each SoC supported in the future, rename
the file to a generic name - llcc-plat, so that llcc configuration
tables for other SoCs can be added in the same driver.

Signed-off-by: Vivek Gautam 
---
 drivers/soc/qcom/Kconfig| 10 +-
 drivers/soc/qcom/Makefile   |  2 +-
 drivers/soc/qcom/{llcc-sdm845.c => llcc-plat.c} |  0
 3 files changed, 6 insertions(+), 6 deletions(-)
 rename drivers/soc/qcom/{llcc-sdm845.c => llcc-plat.c} (100%)

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index a6d1bfb17279..8110d415b18e 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -62,13 +62,13 @@ config QCOM_LLCC
  to clients that use the LLCC. Say yes here to enable LLCC slice
  driver.
 
-config QCOM_SDM845_LLCC
-   tristate "Qualcomm Technologies, Inc. SDM845 LLCC driver"
+config QCOM_PLAT_LLCC
+   tristate "Qualcomm Technologies, Inc. platform LLCC driver"
depends on QCOM_LLCC
help
- Say yes here to enable the LLCC driver for SDM845. This provides
- data required to configure LLCC so that clients can start using the
- LLCC slices.
+ Say yes here to enable the LLCC driver for Qcom platforms, such as
+ SDM845. This provides data required to configure LLCC so that
+ clients can start using the LLCC slices.
 
 config QCOM_MDT_LOADER
tristate
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index eeb088beb15f..3bf26667d7ee 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -21,6 +21,6 @@ obj-$(CONFIG_QCOM_SMSM)   += smsm.o
 obj-$(CONFIG_QCOM_WCNSS_CTRL) += wcnss_ctrl.o
 obj-$(CONFIG_QCOM_APR) += apr.o
 obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
-obj-$(CONFIG_QCOM_SDM845_LLCC) += llcc-sdm845.o
+obj-$(CONFIG_QCOM_PLAT_LLCC) += llcc-plat.o
 obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
 obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
diff --git a/drivers/soc/qcom/llcc-sdm845.c b/drivers/soc/qcom/llcc-plat.c
similarity index 100%
rename from drivers/soc/qcom/llcc-sdm845.c
rename to drivers/soc/qcom/llcc-plat.c
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/1] arm64: dts: sdm845: Add device node for Last level cache controller

2019-07-10 Thread Vivek Gautam
From: Sai Prakash Ranjan 

Last level cache (aka system cache) controller provides control
over the last level cache present on SDM845. This cache lies after
the memory NoC, right before the DDR.

Signed-off-by: Sai Prakash Ranjan 
Signed-off-by: Vivek Gautam 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 4babff5f19b5..314241a99290 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -1275,6 +1275,13 @@
};
};
 
+   cache-controller@110 {
+   compatible = "qcom,sdm845-llcc";
+   reg = <0 0x110 0 0x20>, <0 0x130 0 0x5>;
+   reg-names = "llcc_base", "llcc_broadcast_base";
+   interrupts = ;
+   };
+
ufs_mem_hc: ufshc@1d84000 {
compatible = "qcom,sdm845-ufshc", "qcom,ufshc",
 "jedec,ufs-2.0";
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v3 3/4] iommu/arm-smmu: Add support to handle Qcom's wait-for-safe logic

2019-06-27 Thread Vivek Gautam
On Wed, Jun 26, 2019 at 8:18 PM Will Deacon  wrote:
>
> On Wed, Jun 26, 2019 at 12:03:02PM +0530, Vivek Gautam wrote:
> > On Tue, Jun 25, 2019 at 7:09 PM Will Deacon  wrote:
> > >
> > > On Tue, Jun 25, 2019 at 12:34:56PM +0530, Vivek Gautam wrote:
> > > > On Mon, Jun 24, 2019 at 10:33 PM Will Deacon  wrote:
> > > > > Instead, I think this needs to be part of a separate file that is
> > > > > maintained by you, which follows on from the work that Krishna is
> > > > > doing for nvidia built on top of Robin's prototype patches:
> > > > >
> > > > > http://linux-arm.org/git?p=linux-rm.git;a=shortlog;h=refs/heads/iommu/smmu-impl
> > > >
> > > > Looking at this branch quickly, it seems a separate
> > > > implementation-level configuration file can be added.
> > > > But will this also handle separate page table ops when required
> > > > in the future?
> > >
> > > Nothing's set in stone, but having the implementation-specific code
> > > constrain the page-table format (especially wrt quirks) sounds
> > > reasonable to me. I'm currently waiting for Krishna to respin the
> > > nvidia changes [1] on top of this so that we can see how well the
> > > abstractions are holding up.
> >
> > Sure. Would you want me to try Robin's branch and move the qualcomm
> > related stuff into its own implementation? Or would you like me to
> > respin this series so that you can take it in to enable SDM845 boards
> > such as MTP and dragonboard to have a sane build (debian, etc.), so
> > people benefit from it?
>
> I can't take this series without Acks on the firmware calling changes, and I
> plan to send my 5.3 patches to Joerg at the end of the week so they get some
> time in -next. In which case, I think it may be worth you having a play with
> the branch above so we can get a better idea of any additional smmu_impl hooks
> you may need.

Cool. I will play around with it and get something tangible and meaningful.

>
> > Qualcomm stuff is lying in qcom-smmu and arm-smmu and may take some
> > time to stub out the implementation related details.
>
> Not sure I follow you here. Are you talking about qcom_iommu.c?

That's right. qcom_iommu.c solved a different issue, secure context bank
allocations, when Rob forked this driver and reused some of the
arm-smmu.c stuff.

We will take a look at that once we start adding the qcom implementation.

Thanks
Vivek

>
> Will
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v2] arm64: dts: qcom: msm8996: Rename smmu nodes

2019-06-25 Thread Vivek Gautam
On Wed, Jun 19, 2019 at 5:46 AM Bjorn Andersson
 wrote:
>
> Node names shouldn't include a vendor prefix and should whenever
> possible use a generic identifier. Resolve this by renaming the smmu
> nodes "iommu".

The bindings too say so :)
Reviewed-by: Vivek Gautam 

>
> Signed-off-by: Bjorn Andersson 
> ---
>
> Changes since v1:
> - Updated commit message to talk about vendor prefix rather than qcom,
>
>  arch/arm64/boot/dts/qcom/msm8996.dtsi | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index 2ecd9d775d61..c934e00434c7 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -1163,7 +1163,7 @@
> };
> };
>
> -   vfe_smmu: arm,smmu@da {
> +   vfe_smmu: iommu@da {
> compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
> reg = <0xda 0x1>;
>
> @@ -1314,7 +1314,7 @@
> };
> };
>
> -   adreno_smmu: arm,smmu@b4 {
> +   adreno_smmu: iommu@b4 {
> compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
> reg = <0xb4 0x1>;
>
> @@ -1331,7 +1331,7 @@
> power-domains = <&mmcc GPU_GDSC>;
> };
>
> -   mdp_smmu: arm,smmu@d0 {
> +   mdp_smmu: iommu@d0 {
> compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
> reg = <0xd0 0x1>;
>
> @@ -1347,7 +1347,7 @@
> power-domains = <&mmcc MDSS_GDSC>;
> };
>
> -   lpass_q6_smmu: arm,smmu-lpass_q6@160 {
> +   lpass_q6_smmu: iommu@160 {
> compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
> reg = <0x160 0x2>;
> #iommu-cells = <1>;
> --
> 2.18.0
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-14 Thread Vivek Gautam

Hi Marc,

On 6/13/2019 5:02 PM, Marc Gonzalez wrote:

readl_poll_timeout() calls usleep_range() to sleep between reads.
usleep_range() doesn't work efficiently for tiny values.

Raise the polling delay in qcom_qmp_phy_enable() to bring it in line
with the delay in qcom_qmp_phy_com_init().

Signed-off-by: Marc Gonzalez 
---
Vivek, do you remember why you didn't use the same delay value in
qcom_qmp_phy_enable() and qcom_qmp_phy_com_init()?


The phy_qcom_init() logic came from the PCIe PHY driver in downstream
msm-3.18.

PCIe did something like the following:

-
do {
	if (pcie_phy_is_ready(dev))
		break;
	retries++;
	usleep_range(REFCLK_STABILIZATION_DELAY_US_MIN,
		     REFCLK_STABILIZATION_DELAY_US_MAX);
} while (retries < PHY_READY_TIMEOUT_COUNT);

REFCLK_STABILIZATION_DELAY_US_MIN/MAX ==> 1000/1005
PHY_READY_TIMEOUT_COUNT ==> 10
-


phy_enable() came from the downstream USB PHY driver:

/* Wait for PHY initialization to be done */
do {
	if (readl_relaxed(phy->base +
			phy->phy_reg[USB3_PHY_PCS_STATUS]) & PHYSTATUS)
		usleep_range(1, 2);
	else
		break;
} while (--init_timeout_usec);

init_timeout_usec ==> 1000
-
USB never had a COM_PHY status bit.

So clearly the resolutions were different.
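
To put rough numbers on that, here is a small sketch that multiplies out the two downstream loops using the values quoted above. `struct poll_loop` and the helper names are made up for illustration. The figures are nominal sleep budgets only: they ignore the time spent on register reads, and usleep_range() may sleep longer than requested.

```c
#include <assert.h>

/* One polling loop: how many sleeps at most, and the usleep_range() bounds. */
struct poll_loop {
	unsigned int max_iters;
	unsigned int sleep_min_us;
	unsigned int sleep_max_us;
};

/* Downstream PCIe: up to 10 retries of usleep_range(1000, 1005). */
static const struct poll_loop pcie_loop = { 10, 1000, 1005 };

/* Downstream USB: up to 1000 iterations of usleep_range(1, 2). */
static const struct poll_loop usb_loop = { 1000, 1, 2 };

/* Total requested sleep if the loop runs to its limit, lower bound. */
static unsigned int budget_min_us(struct poll_loop l)
{
	assert(l.sleep_min_us <= l.sleep_max_us);
	return l.max_iters * l.sleep_min_us;
}

/* Total requested sleep if the loop runs to its limit, upper bound. */
static unsigned int budget_max_us(struct poll_loop l)
{
	assert(l.sleep_min_us <= l.sleep_max_us);
	return l.max_iters * l.sleep_max_us;
}
```

With these inputs the PCIe loop asks for about 10 ms in total in 1 ms steps, while the USB loop asks for 1 to 2 ms in total in steps of a microsecond or two. So the two loops differ both in overall timeout and in polling granularity.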

Does this change solve an issue at hand?


---
  drivers/phy/qualcomm/phy-qcom-qmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c 
b/drivers/phy/qualcomm/phy-qcom-qmp.c
index bb522b915fa9..34ff6434da8f 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
@@ -1548,7 +1548,7 @@ static int qcom_qmp_phy_enable(struct phy *phy)
status = pcs + cfg->regs[QPHY_PCS_READY_STATUS];
mask = cfg->mask_pcs_ready;
  
-	ret = readl_poll_timeout(status, val, val & mask, 1,

+   ret = readl_poll_timeout(status, val, val & mask, 10,
 PHY_INIT_COMPLETE_TIMEOUT);
if (ret) {
dev_err(qmp->dev, "phy initialization timed-out\n");




Re: [PATCH] arm64: dts: sdm845: Add iommus property to qup1

2019-06-12 Thread Vivek Gautam




On 6/11/2019 4:51 AM, Stephen Boyd wrote:

Quoting Vivek Gautam (2019-06-06 04:17:16)

Hi Stephen,

On Thu, Jun 6, 2019 at 2:27 AM Stephen Boyd  wrote:

Quoting Vivek Gautam (2019-06-04 21:55:26)


Cheza will throw faults for anything that is programmed with TZ on mtp
as all of that should be handled in HLOS. The firmwares of all these
peripherals assume that the SID reservation is done (whether in TZ or HLOS).

I am inclined to moving the iommus property for all 'TZ' to board dts files.
MTP wouldn't need those SIDs. So, the SOC level dtsi will have just the
HLOS SIDs.

So you're saying you'd like to have the SID be <&apps_smmu 0x6c3 0x0> in
the sdm845.dtsi file and then override this on Cheza because our SID is
different (possibly because we don't use GSI)? Why can't we program the
SID in Cheza firmware to match the "HLOS" SID of 0x6c3?

Sorry my bad, I missed the overriding part.
Maybe we add the lists of SIDs in the board dts only. So, the cheza dts will
have all these SIDs -
<&apps_smmu 0x6c0 0x3>   // for both 0x6c0 (TZ) and 0x6c3 (HLOS)
<&apps_smmu 0x6d6 0x0>   // if we want to use the GSI dma.
and
MTP will have
<&apps_smmu 0x6c3 0x0>
<&apps_smmu 0x6d6 0x0>
WDYT?

I'd prefer to fix the firmware so that the HLOS SID is used even on this
board. Making Cheza use something different from MTP doesn't sound so
good. Do you know how that works? Is there some configuration register
or something that I should be looking for to see why the SID is not the
HLOS one? It's definitely generating SIDs for the TZ SID (0x6c0), but
I'd like to make sure that we can't change it because it's tied to some
hardware signal like the NS bit and/or the Execution Level. Hopefully
it's a config and then our difference from MTP can be minimized.


I don't think the SMMU limits any such programming of SIDs. It's a design
decision to program a few SIDs in TZ/Hyp, allocate the corresponding context
banks, and create the respective mappings. You should be able to see these
SMR configurations before the kernel boots up. I use a simple T32 command -

smmu.add ""  
smmu.streammaptable 

e.g. for sdm845 apps_smmu

smmu.add "apps" MMU500 a:0x1500
smmu.StreamMapTable apps

This dumps all the information regarding the smmu.



As far as I can tell, HLOS on SDM845 has always used GPI (yet another
DMA engine) to do the DMA transfers. And the GPI is the hardware block
that uses the SID of 0x6d6 above, so putting that into iommus for the
geniqup doesn't make any sense given that GPI is another node. Can you
confirm this is the case? Furthermore, the SID of 0x6c3 sounds untested?
Has it ever been generated on SDM845 MTP?


I will get back with this information.

BRs
Vivek



If we ever support GPI, I'd expect to see something like this in DT:

gpi_dma: gpi@a0 {
        reg = <0x00a0 0x6>;
        iommus = <&apps_smmu 0x6d6 0x0>;
        ...
};

geniqup@ac {
        reg = <0x00ac 0x6000>;
        iommus = <&apps_smmu 0x6c3 0x0>;

        i2c@{
                dmas = <&gpi_dma >;
        };
};

But now I'm worried that the geniqup needs the proper geniqup wrapper
clks to talk to it. Most likely the GPI is embedded inside the geniqup
wrapper and sits right next to the bus to do bus DMA mastering. From the
DT side, it means we should either put it inside the geniqup node, or we
should add the wrapper clks to the GPI node and hope things work out
with regards to clks and shared resources being used at the right time.

If we're left with trying to figure out how to express the different
SIDs depending on the CPU execution state then it may be easier to push
for GPI upstreaming and use that dma engine to "fold" the SID
numberspace into one SID for the GPI. This would avoid having to deal
with the HLOS vs. TZ SID problem by adding a whole other driver. Or we
could just rip out the non-GPI DMA support in this driver because the
SID is all confused.





[PATCH v3 0/4] Qcom smmu-500 wait-for-safe handling for sdm845

2019-06-12 Thread Vivek Gautam
Subject changed, older subject was -
Qcom smmu-500 TLB invalidation errata for sdm845.
Previous versions of the patches are at [1].

Qcom's implementation of smmu-500 on sdm845 adds a hardware logic called
wait-for-safe. This logic helps in meeting the invalidation requirements
from 'real-time clients', such as display and camera. This wait-for-safe
logic ensures that the invalidations happen after getting an ack from these
devices.
In this patch series we disable this wait-for-safe logic from the
arm-smmu driver's probe because, with it enabled, the hardware tries to
throttle invalidations from 'non-real-time clients', such as USB and UFS.

For detailed information please refer to patch [3/4] in this series.
I have included the device tree patch in this series too, for anyone who
would like to test this out. Here's a branch [2] that gets display working
on the MTP SDM845 device.

This patch series is inspired by downstream work to handle under-performance
issues on real-time clients on sdm845. In downstream we add separate page table
ops to handle TLB maintenance and toggle wait-for-safe in the tlb_sync call so
as to achieve the required performance for display and camera [3, 4].

Changes since v2:
 * Dropped the patch to add atomic io_read/write scm API.
 * Removed support for any separate page table ops to handle wait-for-safe.
   Currently just disabling this wait-for-safe logic from
   arm_smmu_device_probe() to achieve performance on USB/UFS on sdm845.
 * Added a device tree patch to add smmu option for fw-implemented support
   for SCM call to take care of SAFE toggling.

Changes since v1:
 * Addressed Will and Robin's comments:
   - Dropped the patch[4] that forked out __arm_smmu_tlb_inv_range_nosync(),
     and __arm_smmu_tlb_sync().
   - Cleaned up the errata patch further to use downstream polling mechanism
     for tlb sync.
 * No change in SCM call patches - patches 1 to 3.

[1] https://lore.kernel.org/patchwork/cover/983913/
[2] https://github.com/vivekgautam1/linux/tree/v5.2-rc4/sdm845-display-working
[3] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/drivers/iommu/arm-smmu.c?h=CogSystems-msm-49/msm-4.9&id=da765c6c75266b38191b38ef086274943f353ea7
[4] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/drivers/iommu/arm-smmu.c?h=CogSystems-msm-49/msm-4.9&id=8696005aaaf745de68f57793c1a534a34345c30a

Vivek Gautam (4):
  firmware: qcom_scm-64: Add atomic version of qcom_scm_call
  firmware/qcom_scm: Add scm call to handle smmu errata
  iommu/arm-smmu: Add support to handle Qcom's wait-for-safe logic
  arm64: dts/sdm845: Enable FW implemented safe sequence handler on MTP

 arch/arm64/boot/dts/qcom/sdm845.dtsi |   1 +
 drivers/firmware/qcom_scm-32.c   |   5 ++
 drivers/firmware/qcom_scm-64.c   | 149 ---
 drivers/firmware/qcom_scm.c  |   6 ++
 drivers/firmware/qcom_scm.h  |   5 ++
 drivers/iommu/arm-smmu.c |  16 
 include/linux/qcom_scm.h |   2 +
 7 files changed, 140 insertions(+), 44 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v3 2/4] firmware/qcom_scm: Add scm call to handle smmu errata

2019-06-12 Thread Vivek Gautam
Qcom's smmu-500 needs to toggle the wait-for-safe logic to
handle TLB invalidations.
A few firmwares allow doing that through the SCM interface.
Add an API to toggle wait-for-safe from firmware through an
SCM call.

Signed-off-by: Vivek Gautam 
Reviewed-by: Bjorn Andersson 
---
 drivers/firmware/qcom_scm-32.c |  5 +
 drivers/firmware/qcom_scm-64.c | 13 +
 drivers/firmware/qcom_scm.c|  6 ++
 drivers/firmware/qcom_scm.h|  5 +
 include/linux/qcom_scm.h   |  2 ++
 5 files changed, 31 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 215061c581e1..bee8729525ec 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -614,3 +614,8 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t 
addr, unsigned int val)
return qcom_scm_call_atomic2(QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
 addr, val);
 }
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool enable)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index b6dca32c5ac4..23de54b75cd7 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -550,3 +550,16 @@ int __qcom_scm_io_writel(struct device *dev, phys_addr_t 
addr, unsigned int val)
return qcom_scm_call(dev, QCOM_SCM_SVC_IO, QCOM_SCM_IO_WRITE,
 &desc, &res);
 }
+
+int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev, bool en)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+
+   desc.args[0] = QCOM_SCM_CONFIG_SAFE_EN_CLIENT_ALL;
+   desc.args[1] = en;
+   desc.arginfo = QCOM_SCM_ARGS(2);
+
+   return qcom_scm_call_atomic(dev, QCOM_SCM_SVC_SMMU_PROGRAM,
+   QCOM_SCM_CONFIG_SAFE_EN, &desc, &res);
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 2ddc118dba1b..2b3b7a8c4270 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -344,6 +344,12 @@ int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, 
u32 spare)
 }
 EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);
 
+int qcom_scm_qsmmu500_wait_safe_toggle(bool en)
+{
+   return __qcom_scm_qsmmu500_wait_safe_toggle(__scm->dev, en);
+}
+EXPORT_SYMBOL(qcom_scm_qsmmu500_wait_safe_toggle);
+
 int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val)
 {
return __qcom_scm_io_readl(__scm->dev, addr, val);
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 99506bd873c0..0b63ded89b41 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -91,10 +91,15 @@ extern int __qcom_scm_restore_sec_cfg(struct device *dev, 
u32 device_id,
  u32 spare);
 #define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE        3
 #define QCOM_SCM_IOMMU_SECURE_PTBL_INIT        4
+#define QCOM_SCM_SVC_SMMU_PROGRAM              0x15
+#define QCOM_SCM_CONFIG_SAFE_EN                0x3
+#define QCOM_SCM_CONFIG_SAFE_EN_CLIENT_ALL     0x2
 extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
 size_t *size);
 extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
 u32 size, u32 spare);
+extern int __qcom_scm_qsmmu500_wait_safe_toggle(struct device *dev,
+   bool enable);
 #define QCOM_MEM_PROT_ASSIGN_ID        0x16
 extern int  __qcom_scm_assign_mem(struct device *dev,
  phys_addr_t mem_region, size_t mem_sz,
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 3f12cc77fb58..aee3d8580d89 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -57,6 +57,7 @@ extern int qcom_scm_set_remote_state(u32 state, u32 id);
 extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
 extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
 extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
+extern int qcom_scm_qsmmu500_wait_safe_toggle(bool en);
 extern int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val);
 extern int qcom_scm_io_writel(phys_addr_t addr, unsigned int val);
 #else
@@ -96,6 +97,7 @@ qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; 
}
 static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return 
-ENODEV; }
 static inline int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size) { 
return -ENODEV; }
 static inline int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 
spare) { return -ENODEV; }
+static inline int qcom_scm_qsmmu500_wait_safe_toggle(bool en) { return 
-ENODEV; }
 static inline int qcom_scm_io_readl(phys_addr_t addr, unsigned int *val) { 
return -ENODEV; }
 static inline int qcom_scm_io_writel(phys_addr_t addr, unsigned int val) { 
return -ENODEV; }
 #e

Re: [PATCH] arm64: dts: sdm845: Add iommus property to qup1

2019-06-06 Thread Vivek Gautam
Hi Stephen,

On Thu, Jun 6, 2019 at 2:27 AM Stephen Boyd  wrote:
>
> Quoting Vivek Gautam (2019-06-04 21:55:26)
> > On Wed, Jun 5, 2019 at 4:16 AM Stephen Boyd  wrote:
> > >
> > > Quoting Bjorn Andersson (2019-06-04 15:37:00)
> > > > On Tue 04 Jun 15:29 PDT 2019, Stephen Boyd wrote:
> > > >
> > > > > The SMMU that sits in front of the QUP needs to be programmed properly
> > > > > so that the i2c geni driver can allocate DMA descriptors. Failure to 
> > > > > do
> > > > > this leads to faults when using devices such as an i2c touchscreen 
> > > > > where
> > > > > the transaction is larger than 32 bytes and we use a DMA buffer.
> > > > >
> > > >
> > > > I'm pretty sure I've run into this problem, but before we marked the
> > > > smmu bypass_disable and as such didn't get the fault, thanks.
> > > >
> > > > >  arm-smmu 1500.iommu: Unexpected global fault, this could be 
> > > > > serious
> > > > >  arm-smmu 1500.iommu: GFSR 0x0002, GFSYNR0 
> > > > > 0x0002, GFSYNR1 0x06c0, GFSYNR2 0x
> > > > >
> > > > > Add the right SID and mask so this works.
> > > > >
> > > > > Cc: Sibi Sankar 
> > > > > Signed-off-by: Stephen Boyd 
> > > > > ---
> > > > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 1 +
> > > > >  1 file changed, 1 insertion(+)
> > > > >
> > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > > > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > index fcb93300ca62..2e57e861e17c 100644
> > > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > > @@ -900,6 +900,7 @@
> > > > >   #address-cells = <2>;
> > > > >   #size-cells = <2>;
> > > > >   ranges;
> > > > > + iommus = <&apps_smmu 0x6c0 0x3>;
> > > >
> > > > According to the docs this stream belongs to TZ, the HLOS stream should
> > > > be 0x6c3.
> > >
> > > Aye, I saw this line in the downstream kernel but it doesn't work for
> > > me. If I specify <&apps_smmu 0x6c3 0x0> it still blows up. I wonder if
> > > my firmware perhaps is missing some initialization here to make the QUP
> > > operate in HLOS mode? Otherwise, I thought that the 0x3 at the end was
> > > the mask and so it should be split off to the second cell in the DT
> > > specifier but that seemed a little weird.
> >
> > Two things here -
> > 0x6c0 - TZ SID. Do you see above fault on MTP sdm845 devices?
>
> No. I see the above fault on Cheza.

Right, expected.

>
> > 0x6c3/0x6c6 - HLOS SIDs.

My bad, the other SID is 0x6D6.

>
> Why are there two? I see some mentions of GSI mode near these SIDs so
> maybe GSI has to be used for DMA here to get the above two SIDs at the
> IOMMU? Otherwise if we do the non-GSI mode of DMA we're going to use the
> "TZ" SID?

Yea, one for GSI, and the other one for non-GSI DMA. I am unsure at this point
about the use of the TZ SID, but I would assume this is the SID that's used by
the qup firmware, and therefore on MTP TZ programs this SID.

>
> >
> > Cheza will throw faults for anything that is programmed with TZ on mtp
> > as all of that should be handled in HLOS. The firmwares of all these
> > peripherals assume that the SID reservation is done (whether in TZ or HLOS).
> >
> > I am inclined to moving the iommus property for all 'TZ' to board dts files.
> > MTP wouldn't need those SIDs. So, the SOC level dtsi will have just the
> > HLOS SIDs.
>
> So you're saying you'd like to have the SID be <&apps_smmu 0x6c3 0x0> in
> the sdm845.dtsi file and then override this on Cheza because our SID is
> different (possibly because we don't use GSI)? Why can't we program the
> SID in Cheza firmware to match the "HLOS" SID of 0x6c3?

Sorry my bad, I missed the overriding part.
Maybe we add the lists of SIDs in the board dts only. So, the cheza dts will
have all these SIDs -
<&apps_smmu 0x6c0 0x3>   // for both 0x6c0 (TZ) and 0x6c3 (HLOS)
<&apps_smmu 0x6d6 0x0>   // if we want to use the GSI dma.
and
MTP will have
<&apps_smmu 0x6c3 0x0>
<&apps_smmu 0x6d6 0x0>
WDYT?

>
> >
> > P.S.
> > As you rightly said, the second cell in the iommus property is the mask, so
> > that the iommu is able to reserve all the SIDs that are covered by the
> > starting SID and the mask.
> >
>
> Well if 0x6c6 is another possibility maybe it should be <&apps_smmu
> 0x6c0 0x7> to cover the 0x6c3 and 0x6c6 SIDs?

The other SID was 0x6D6.
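
For reference, the <id mask> matching discussed above can be sketched in
plain C as below. This is only a userspace illustration with made-up names,
assuming the usual stream-matching rule in which bits set in the mask are
ignored when comparing an incoming SID against the id; it is not code from
the arm-smmu driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* One <id mask> pair as written in an "iommus" specifier (hypothetical type). */
struct sid_match {
    uint16_t id;
    uint16_t mask; /* bits set here are "don't care" */
};

/* True if an incoming stream ID is covered by the given <id mask> pair. */
bool sid_is_covered(struct sid_match m, uint16_t sid)
{
    uint16_t care = (uint16_t)~m.mask;   /* bits that must match exactly */

    return ((sid ^ m.id) & care) == 0;
}
```

Under that assumption <0x6c0 0x3> covers 0x6c0 through 0x6c3, and <0x6c0 0x7>
would extend that to 0x6c7, but neither reaches 0x6d6, which differs from
0x6c0 in bits outside both masks and so needs its own entry.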

Best regards
Vivek

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v2] arm64: dts: qcom: Add Dragonboard 845c

2019-06-06 Thread Vivek Gautam
Hi Bjorn,

On Thu, Jun 6, 2019 at 10:10 AM Bjorn Andersson
 wrote:
>
> This adds an initial dts for the Dragonboard 845. Supported
> functionality includes Debug UART, UFS, USB-C (peripheral), USB-A
> (host), microSD-card and Bluetooth.
>
> Initializing the SMMU is clearing the mapping used for the splash screen
> framebuffer, which causes the board to reboot. This can be worked around
> using:
>
>   fastboot oem select-display-panel none

This works well with your SMR handoff RFC series too?

>
> Signed-off-by: Bjorn Andersson 
> ---

Patch looks good, so
Reviewed-by: Vivek Gautam 

Best Regards
Vivek

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] arm64: dts: qcom: sdm845-mtp: Add Truly display

2019-06-04 Thread Vivek Gautam
On Tue, May 14, 2019 at 2:39 AM Bjorn Andersson
 wrote:
>
> Bring in the Truly display and enable the DSI channels to make the
> mdss/gpu probe, even though we're lacking LABIB, preventing us from
> seeing anything on the screen.
>
> Signed-off-by: Bjorn Andersson 
> ---

Looks good to me, and works well too with a WIP lab-ibb driver change.

Reviewed-by: Vivek Gautam 
Tested-by: Vivek Gautam 

>  arch/arm64/boot/dts/qcom/sdm845-mtp.dts | 79 +
>  1 file changed, 79 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts 
> b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> index 02b8357c8ce8..83198a19ff57 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> +++ b/arch/arm64/boot/dts/qcom/sdm845-mtp.dts
> @@ -352,6 +352,77 @@
> status = "okay";
>  };
>
> +&dsi0 {
> +   status = "okay";
> +   vdda-supply = <&vdda_mipi_dsi0_1p2>;
> +
> +   qcom,dual-dsi-mode;
> +   qcom,master-dsi;
> +
> +   ports {
> +   port@1 {
> +   endpoint {
> +   remote-endpoint = <&truly_in_0>;
> +   data-lanes = <0 1 2 3>;
> +   };
> +   };
> +   };
> +
> +   panel@0 {
> +   compatible = "truly,nt35597-2K-display";
> +   reg = <0>;
> +   vdda-supply = <&vreg_l14a_1p88>;
> +
> +   reset-gpios = <&tlmm 6 GPIO_ACTIVE_LOW>;
> +   mode-gpios = <&tlmm 52 GPIO_ACTIVE_HIGH>;
> +
> +   ports {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   port@0 {
> +   reg = <0>;
> +   truly_in_0: endpoint {
> +   remote-endpoint = <&dsi0_out>;
> +   };
> +   };
> +
> +   port@1 {
> +   reg = <1>;
> +   truly_in_1: endpoint {
> +   remote-endpoint = <&dsi1_out>;
> +   };
> +   };
> +   };
> +   };
> +};
> +
> +&dsi0_phy {
> +   status = "okay";
> +   vdds-supply = <&vdda_mipi_dsi0_pll>;
> +};
> +
> +&dsi1 {
> +   status = "okay";
> +   vdda-supply = <&vdda_mipi_dsi1_1p2>;
> +
> +   qcom,dual-dsi-mode;
> +
> +   ports {
> +   port@1 {
> +   endpoint {
> +   remote-endpoint = <&truly_in_1>;
> +   data-lanes = <0 1 2 3>;
> +   };
> +   };
> +   };
> +};
> +
> +&dsi1_phy {
> +   status = "okay";
> +   vdds-supply = <&vdda_mipi_dsi1_pll>;
> +};
> +
>  &gcc {
> protected-clocks = ,
>,
> @@ -365,6 +436,14 @@
> clock-frequency = <40>;
>  };
>
> +&mdss {
> +   status = "okay";
> +};
> +
> +&mdss_mdp {
> +   status = "okay";
> +};
> +
>  &qupv3_id_1 {
> status = "okay";
>  };
> --
> 2.18.0
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] arm64: dts: sdm845: Add iommus property to qup1

2019-06-04 Thread Vivek Gautam
On Wed, Jun 5, 2019 at 4:16 AM Stephen Boyd  wrote:
>
> Quoting Bjorn Andersson (2019-06-04 15:37:00)
> > On Tue 04 Jun 15:29 PDT 2019, Stephen Boyd wrote:
> >
> > > The SMMU that sits in front of the QUP needs to be programmed properly
> > > so that the i2c geni driver can allocate DMA descriptors. Failure to do
> > > this leads to faults when using devices such as an i2c touchscreen where
> > > the transaction is larger than 32 bytes and we use a DMA buffer.
> > >
> >
> > I'm pretty sure I've run into this problem, but before we marked the
> > smmu bypass_disable and as such didn't get the fault, thanks.
> >
> > >  arm-smmu 1500.iommu: Unexpected global fault, this could be serious
> > >  arm-smmu 1500.iommu: GFSR 0x0002, GFSYNR0 0x0002, 
> > > GFSYNR1 0x06c0, GFSYNR2 0x
> > >
> > > Add the right SID and mask so this works.
> > >
> > > Cc: Sibi Sankar 
> > > Signed-off-by: Stephen Boyd 
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index fcb93300ca62..2e57e861e17c 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -900,6 +900,7 @@
> > >   #address-cells = <2>;
> > >   #size-cells = <2>;
> > >   ranges;
> > > + iommus = <&apps_smmu 0x6c0 0x3>;
> >
> > According to the docs this stream belongs to TZ, the HLOS stream should
> > be 0x6c3.
>
> Aye, I saw this line in the downstream kernel but it doesn't work for
> me. If I specify <&apps_smmu 0x6c3 0x0> it still blows up. I wonder if
> my firmware perhaps is missing some initialization here to make the QUP
> operate in HLOS mode? Otherwise, I thought that the 0x3 at the end was
> the mask and so it should be split off to the second cell in the DT
> specifier but that seemed a little weird.

Two things here -
0x6c0 - TZ SID. Do you see above fault on MTP sdm845 devices?
0x6c3/0x6c6 - HLOS SIDs.

Cheza will throw faults for anything that is programmed with TZ on mtp
as all of that should be handled in HLOS. The firmwares of all these
peripherals assume that the SID reservation is done (whether in TZ or HLOS).

I am inclined to moving the iommus property for all 'TZ' to board dts files.
MTP wouldn't need those SIDs. So, the SOC level dtsi will have just the
HLOS SIDs.

P.S.
As you rightly said, the second cell in the iommus property is the mask, so
that the iommu is able to reserve all the SIDs that are covered by the
starting SID and the mask.


Best regards
Vivek
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] of/device: add blacklist for iommu dma_ops

2019-06-03 Thread Vivek Gautam
On Mon, Jun 3, 2019 at 4:14 PM Rob Clark  wrote:
>
> On Mon, Jun 3, 2019 at 12:57 AM Vivek Gautam
>  wrote:
> >
> >
> >
> > On 6/3/2019 11:50 AM, Tomasz Figa wrote:
> > > On Mon, Jun 3, 2019 at 4:40 AM Rob Clark  wrote:
> > >> On Fri, May 10, 2019 at 7:35 AM Rob Clark  wrote:
> > >>> On Tue, Dec 4, 2018 at 2:29 PM Rob Herring  wrote:
> > >>>> On Sat, Dec 1, 2018 at 10:54 AM Rob Clark  wrote:
> > >>>>> This solves a problem we see with drm/msm, caused by getting
> > >>>>> iommu_dma_ops while we attach our own domain and manage it directly at
> > >>>>> the iommu API level:
> > >>>>>
> > >>>>>[0038] user address but active_mm is swapper
> > >>>>>Internal error: Oops: 9605 [#1] PREEMPT SMP
> > >>>>>Modules linked in:
> > >>>>>CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 
> > >>>>> 4.19.3 #90
> > >>>>>Hardware name: xxx (DT)
> > >>>>>Workqueue: events deferred_probe_work_func
> > >>>>>pstate: 80c9 (Nzcv daif +PAN +UAO)
> > >>>>>pc : iommu_dma_map_sg+0x7c/0x2c8
> > >>>>>lr : iommu_dma_map_sg+0x40/0x2c8
> > >>>>>sp : ff80095eb4f0
> > >>>>>x29: ff80095eb4f0 x28: 
> > >>>>>x27: ffc0f9431578 x26: 
> > >>>>>x25:  x24: 0003
> > >>>>>x23: 0001 x22: ffc0fa9ac010
> > >>>>>x21:  x20: ffc0fab40980
> > >>>>>x19: ffc0fab40980 x18: 0003
> > >>>>>x17: 01c4 x16: 0007
> > >>>>>x15: 000e x14: 
> > >>>>>x13:  x12: 0028
> > >>>>>x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> > >>>>>x9 :  x8 : ffc0fab409a0
> > >>>>>x7 :  x6 : 0002
> > >>>>>x5 : 0001 x4 : 
> > >>>>>x3 : 0001 x2 : 0002
> > >>>>>x1 : ffc0f9431578 x0 : 
> > >>>>>Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> > >>>>>Call trace:
> > >>>>> iommu_dma_map_sg+0x7c/0x2c8
> > >>>>> __iommu_map_sg_attrs+0x70/0x84
> > >>>>> get_pages+0x170/0x1e8
> > >>>>> msm_gem_get_iova+0x8c/0x128
> > >>>>> _msm_gem_kernel_new+0x6c/0xc8
> > >>>>> msm_gem_kernel_new+0x4c/0x58
> > >>>>> dsi_tx_buf_alloc_6g+0x4c/0x8c
> > >>>>> msm_dsi_host_modeset_init+0xc8/0x108
> > >>>>> msm_dsi_modeset_init+0x54/0x18c
> > >>>>> _dpu_kms_drm_obj_init+0x430/0x474
> > >>>>> dpu_kms_hw_init+0x5f8/0x6b4
> > >>>>> msm_drm_bind+0x360/0x6c8
> > >>>>> try_to_bring_up_master.part.7+0x28/0x70
> > >>>>> component_master_add_with_match+0xe8/0x124
> > >>>>> msm_pdev_probe+0x294/0x2b4
> > >>>>> platform_drv_probe+0x58/0xa4
> > >>>>> really_probe+0x150/0x294
> > >>>>> driver_probe_device+0xac/0xe8
> > >>>>> __device_attach_driver+0xa4/0xb4
> > >>>>> bus_for_each_drv+0x98/0xc8
> > >>>>> __device_attach+0xac/0x12c
> > >>>>> device_initial_probe+0x24/0x30
> > >>>>> bus_probe_device+0x38/0x98
> > >>>>> deferred_probe_work_func+0x78/0xa4
> > >>>>> process_one_work+0x24c/0x3dc
> > >>>>> worker_thread+0x280/0x360
> > >>>>> kthread+0x134/0x13c
> > >>>>> ret_from_fork+0x10/0x18
> > >>>>>Code: d284 91000725 6b17039f 5400048a (f9401f40)
> > >>>>>---[ end trace f22dda57f3648e2c ]---
> > >>>>>Kernel panic - not syncing: Fatal exception
> > >>>>>SMP: stopping secondary CPUs
> > >>>>>Kernel Offset: disable

Re: [PATCH 1/1] drm/panel: truly: Add additional delay after pulling down reset gpio

2019-05-28 Thread Vivek Gautam




On 5/28/2019 2:13 PM, Marc Gonzalez wrote:

On 27/05/2019 12:26, Vivek Gautam wrote:


MTP SDM845 panel seems to need additional delay to bring panel
to a workable state. Running modetest without this change displays
blurry artifacts.

Signed-off-by: Vivek Gautam 
---
  drivers/gpu/drm/panel/panel-truly-nt35597.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panel/panel-truly-nt35597.c 
b/drivers/gpu/drm/panel/panel-truly-nt35597.c
index fc2a66c53db4..aa7153fd3be4 100644
--- a/drivers/gpu/drm/panel/panel-truly-nt35597.c
+++ b/drivers/gpu/drm/panel/panel-truly-nt35597.c
@@ -280,6 +280,7 @@ static int truly_35597_power_on(struct truly_nt35597 *ctx)
gpiod_set_value(ctx->reset_gpio, 1);
usleep_range(1, 2);
gpiod_set_value(ctx->reset_gpio, 0);
+   usleep_range(1, 2);

I'm not sure usleep_range() makes sense with these values.

AFAIU, usleep_range() is typically used for sub-jiffy sleeps, and is based
on HRT to generate an interrupt.

Once we get into jiffy granularity, it seems to me msleep() is good enough.
IIUC, it would piggy-back on the jiffy timer interrupt.

In short, why not just use msleep(10); ?


I am just maintaining the symmetry across older code.
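
As a rough illustration of the trade-off raised above, the choice of delay
primitive by duration can be sketched like this. The enum, the function name
and the exact thresholds are assumptions made for the example, loosely
following the usual rule of thumb (busy-wait for a few microseconds,
usleep_range() up to some tens of milliseconds, msleep() beyond that); the
ranges overlap in practice, so a 10 ms delay could reasonably go either way.

```c
/* Hypothetical labels for the three kernel delay primitives. */
enum sleep_primitive {
    USE_UDELAY,        /* busy-wait, no sleeping */
    USE_USLEEP_RANGE,  /* hrtimer-backed sleep */
    USE_MSLEEP         /* jiffy-based sleep */
};

/* Pick a primitive for a delay given in microseconds (illustrative cut-offs). */
enum sleep_primitive pick_sleep_primitive(unsigned long delay_us)
{
    if (delay_us < 10)
        return USE_UDELAY;
    if (delay_us < 20000)
        return USE_USLEEP_RANGE;
    return USE_MSLEEP;
}
```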

Thanks
Vivek


Regards.




Re: [PATCH v1] phy: qcom-qmp: Add msm8998 PCIe QMP PHY support

2019-03-26 Thread Vivek Gautam
Hi Marc,

On Tue, Mar 26, 2019 at 1:18 PM Kishon Vijay Abraham I  wrote:
>
> Hi,
>
> On 22/03/19 9:42 PM, Marc Gonzalez wrote:
> > Copy init sequence from downstream:
> > https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/arch/arm/boot/dts/qcom/msm8998-v2.dtsi?h=LE.UM.1.3.r3.25#n372
>
> Can't we instead have reference to HW manual or datasheet?
> >
> > Signed-off-by: Marc Gonzalez 
> > ---
> >  .../devicetree/bindings/phy/qcom-qmp-phy.txt  |   5 +
> >  drivers/phy/qualcomm/phy-qcom-qmp.c   | 110 ++
> >  drivers/phy/qualcomm/phy-qcom-qmp.h   |  12 ++
> >  3 files changed, 127 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt 
> > b/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
> > index 5d181fc3cc18..6000ae34b12b 100644
> > --- a/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
> > +++ b/Documentation/devicetree/bindings/phy/qcom-qmp-phy.txt
> > @@ -11,6 +11,7 @@ Required properties:
> >  "qcom,msm8996-qmp-usb3-phy" for 14nm USB3 phy on msm8996,
> >  "qcom,msm8998-qmp-usb3-phy" for USB3 QMP V3 phy on msm8998,
> >  "qcom,msm8998-qmp-ufs-phy" for UFS QMP phy on msm8998,
> > +"qcom,msm8998-qmp-pcie-phy" for PCIe QMP phy on msm8998,
> >  "qcom,sdm845-qmp-usb3-phy" for USB3 QMP V3 phy on sdm845,
> >  "qcom,sdm845-qmp-usb3-uni-phy" for USB3 QMP V3 UNI phy on 
> > sdm845,
> >  "qcom,sdm845-qmp-ufs-phy" for UFS QMP phy on sdm845.
> > @@ -48,6 +49,8 @@ Required properties:
> >   "aux", "cfg_ahb", "ref".
> >   For "qcom,msm8998-qmp-ufs-phy" must contain:
> >   "ref", "ref_aux".
> > + For "qcom,msm8998-qmp-pcie-phy" must contain:
> > + "aux", "cfg_ahb", "ref".
> >   For "qcom,sdm845-qmp-usb3-phy" must contain:
> >   "aux", "cfg_ahb", "ref", "com_aux".
> >   For "qcom,sdm845-qmp-usb3-uni-phy" must contain:
> > @@ -70,6 +73,8 @@ Required properties:
> >   For "qcom,msm8998-qmp-usb3-phy" must contain
> >   "phy", "common".
> >   For "qcom,msm8998-qmp-ufs-phy": no resets are listed.
> > + For "qcom,msm8998-qmp-pcie-phy" must contain:
> > + "phy", "common", "cfg".
> >   For "qcom,sdm845-qmp-usb3-phy" must contain:
> >   "phy", "common".
> >   For "qcom,sdm845-qmp-usb3-uni-phy" must contain:
>
> Please send the dt binding in a separate patch.
>
> Thanks
> Kishon

Thanks for the patch. Besides the above comments from Kishon, it looks good.
Reviewed-by: Vivek Gautam 

Best regards
Vivek

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v6 8/8] arm64: dts: qcom: sdm845: Add Q6V5 MSS node

2019-03-25 Thread Vivek Gautam
Hi Doug,


On Thu, Feb 28, 2019 at 2:34 AM Doug Anderson  wrote:
>
> Hi,
>
> On Tue, Feb 26, 2019 at 3:54 PM Doug Anderson  wrote:
> >
> > Hi,
> >
> > On Tue, Feb 5, 2019 at 9:13 PM Bjorn Andersson
> >  wrote:
> > >
> > > From: Sibi Sankar 
> > >
> > > This patch adds Q6V5 MSS remoteproc node for SDM845 SoCs.
> > >
> > > Signed-off-by: Sibi Sankar 
> > > Reviewed-by: Douglas Anderson 
> > > Signed-off-by: Bjorn Andersson 
> > > ---
> > >
> > > Changes since v5:
> > > - None
> > >
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 58 
> > >  1 file changed, 58 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
> > > b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index 560c16616ee6..5c41f6fe3e1b 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -1612,6 +1612,64 @@
> > > };
> > > };
> > >
> > > +   mss_pil: remoteproc@408 {
> > > +   compatible = "qcom,sdm845-mss-pil";
> > > +   reg = <0 0x0408 0 0x408>, <0 0x0418 0 
> > > 0x48>;
> > > +   reg-names = "qdsp6", "rmb";
> >
> > I found that when I disabled IOMMU bypass by booting with
> > "arm-smmu.disable_bypass=y" that I'd get this failure:
> >
> > ---
> >
> > [   13.633776] qcom-q6v5-mss 408.remoteproc: MBA booted, loading mpss
> > [   13.647694] arm-smmu 1500.iommu: Unexpected global fault, this
> > could be serious
> > [   13.660278] arm-smmu 1500.iommu: GFSR 0x8002, GFSYNR0
> > 0x, GFSYNR1 0x0781, GFSYNR2 0x
> > ...
> > [   14.648830] qcom-q6v5-mss 408.remoteproc: MPSS header
> > authentication timed out
> > [   14.657141] qcom-q6v5-mss 408.remoteproc: port failed halt
> > [   14.664983] remoteproc remoteproc0: can't start rproc
> > 408.remoteproc: -110
> >
> > ---
> >
> > Adding "iommus = <&apps_smmu 0x781 0>;" here fixed my problem.  NOTE
> > that I'm no expert on IOMMUs so you should confirm that this is right,
> > but if it is then maybe you could include it in the next spin of the
> > series?  I got the "0x781" just by looking at the value of the GFSYNR1
> > in the above splat.  I wasn't sure what to put for the mask so I put
> > 0x0.
>
> Upon more testing the "iommus" line that I came up with avoids the
> global fault but doesn't actually work.  I just get:
>
> qcom-q6v5-mss 408.remoteproc: failed to allocate mdt buffer
>
> I'm hoping someone from Qualcomm can help out here and say how this
> should be solved.  Thanks!

I and Sibi had a chance to look at this, and we could compare things
with MTP sdm845
device as well.

>From the 845 block diagram it's clear that one of the MPSS paths goes
through SMMU
and therefore we have the SIDs 0x780 - 0x783 reserved for these streams.
However, it is recommended to use them in a bypass mode (S2CR_TYPE_BYPASS).

On MTP devices, the secure code programs these SIDs in the SMMU and, as these
SMRs are marked secure, they are not visible to the kernel. Thus the kernel
wouldn't overwrite anything.
However, in your case there's no such reservation by the secure code.
In such a case, we may need to make the SMMU aware of these SIDs in the kernel.

And please note that adding "iommus = <&apps_smmu 0x781 0>" to the PIL
device may not be the correct thing to do, since the actual MPSS data
streams don't use the SMMU.
So, configuring the DMA path via the SMMU isn't right.
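
To illustrate how one stream-match entry could cover the whole reserved
range 0x780 - 0x783, here is a minimal sketch of SMMUv2-style ID/mask
matching. It assumes the usual convention that a set mask bit means
"ignore this ID bit"; struct smr_entry and smr_matches() are hypothetical
names used only for illustration, not the driver's API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one stream match register (SMR) entry. */
struct smr_entry {
	uint16_t id;	/* stream ID to compare against */
	uint16_t mask;	/* bits set here are ignored in the comparison */
};

/* Return true if the incoming stream ID hits this entry. */
static bool smr_matches(struct smr_entry smr, uint16_t sid)
{
	return ((sid ^ smr.id) & (uint16_t)~smr.mask) == 0;
}
```

Under this rule an entry of { 0x780, 0x3 } would match 0x780 through 0x783,
whereas { 0x781, 0x0 } (the ID and mask from the "iommus" line above) matches
only 0x781.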

Thanks & regards
Vivek

>
>
> -Doug



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-29 Thread Vivek Gautam
On Tue, Jan 29, 2019 at 8:34 PM Ard Biesheuvel
 wrote:
>
> (+ Bjorn)
>
> On Mon, 28 Jan 2019 at 12:27, Vivek Gautam  
> wrote:
> >
> > Hi Ard,
> >
> > On Thu, Jan 24, 2019 at 1:25 PM Ard Biesheuvel
> >  wrote:
> > >
> > > On Thu, 24 Jan 2019 at 07:58, Vivek Gautam  
> > > wrote:
> > > >
> > > > On Mon, Jan 21, 2019 at 7:55 PM Ard Biesheuvel
> > > >  wrote:
> > > > >
> > > > > On Mon, 21 Jan 2019 at 14:56, Robin Murphy  
> > > > > wrote:
> > > > > >
> > > > > > On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > > > > > > On Mon, 21 Jan 2019 at 14:25, Robin Murphy  
> > > > > > > wrote:
> > > > > > >>
> > > > > > >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > > > > > >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam 
> > > > > > >>>  wrote:
> > > > > > >>>>
> > > > > > >>>> Hi,
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
> > > > > > >>>>  wrote:
> > > > > > >>>>>
> > > > > > >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam 
> > > > > > >>>>>  wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> Qualcomm SoCs have an additional level of cache called as
> > > > > > >>>>>> System cache, aka. Last level cache (LLC). This cache sits 
> > > > > > >>>>>> right
> > > > > > >>>>>> before the DDR, and is tightly coupled with the memory 
> > > > > > >>>>>> controller.
> > > > > > >>>>>> The clients using this cache request their slices from this
> > > > > > >>>>>> system cache, make it active, and can then start using it.
> > > > > > >>>>>> For these clients with smmu, to start using the system cache 
> > > > > > >>>>>> for
> > > > > > >>>>>> buffers and, related page tables [1], memory attributes need 
> > > > > > >>>>>> to be
> > > > > > >>>>>> set accordingly. This series add the required support.
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>> Does this actually improve performance on reads from a 
> > > > > > >>>>> device? The
> > > > > > >>>>> non-cache coherent DMA routines perform an unconditional 
> > > > > > >>>>> D-cache
> > > > > > >>>>> invalidate by VA to the PoC before reading from the buffers 
> > > > > > >>>>> filled by
> > > > > > >>>>> the device, and I would expect the PoC to be defined as lying 
> > > > > > >>>>> beyond
> > > > > > >>>>> the LLC to still guarantee the architected behavior.
> > > > > > >>>>
> > > > > > >>>> We have seen performance improvements when running Manhattan
> > > > > > >>>> GFXBench benchmarks.
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>> Ah ok, that makes sense, since in that case, the data flow is 
> > > > > > >>> mostly
> > > > > > >>> to the device, not from the device.
> > > > > > >>>
> > > > > > >>>> As for the PoC, from my knowledge on sdm845 the system cache, 
> > > > > > >>>> aka
> > > > > > >>>> Last level cache (LLC) lies beyond the point of coherency.
> > > > > > >>>> Non-cache coherent buffers will not be cached to system cache 
> > > > > > >>>> also, and
> > > > > > >>>> no additional software cache maintenance ops are required for 
> > > > > > >>>> system cache.
> > > > > > >>>> Pratik can add mor

Re: [PATCH 2/2] iommu/arm-smmu: Add support for non-coherent page table mappings

2019-01-29 Thread Vivek Gautam
Hi Will,

On Tue, Jan 22, 2019 at 11:14 AM Will Deacon  wrote:
>
> On Mon, Jan 21, 2019 at 11:35:30AM +0530, Vivek Gautam wrote:
> > On Sun, Jan 20, 2019 at 5:31 AM Will Deacon  wrote:
> > > On Thu, Jan 17, 2019 at 02:57:18PM +0530, Vivek Gautam wrote:
> > > > Adding a device tree option for arm smmu to enable non-cacheable
> > > > memory for page tables.
> > > > We already enable a smmu feature for coherent walk based on
> > > > whether the smmu device is dma-coherent or not. Have an option
> > > > to enable non-cacheable page table memory to force set it for
> > > > particular smmu devices.
> > >
> > > Hmm, I must be missing something here. What is the difference between this
> > > new property, and simply omitting dma-coherent on the SMMU?
> >
> > So, this is what I understood from the email thread for Last level
> > cache support -
> > Robin pointed to the fact that we may need to add support for setting
> > non-cacheable
> > mappings in the TCR.
> > Currently, we don't do that for SMMUs that omit dma-coherent.
> > We rely on the interconnect to handle the configuration set in TCR,
> > and let interconnect
> > ignore the cacheability if it can't support.
>
> I think that's a bug. With that fixed, can you get what you want by omitting
> "dma-coherent"?

Based on the discussion on the first patch in this series [1], I can
update the series.
The first case is -
if QUIRK_NO_DMA is set (i.e. the IOMMU _is_ coherent), then we use a
cacheable TCR. So, we may need an additional check for this when
setting the TCR.

For the second case -
IOMMUs that are *not* coherent, i.e. ones that omit the 'dma-coherent'
property, have to access the page tables directly from memory anyway.
We take care of the CPU side of this by allocating non-coherent memory,
and making sure that we sync the PTEs from the map call.
Shouldn't we mark the TCR for these IOMMUs as non-cacheable for both the
inner and outer cacheability attributes?
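
The two cases above could be sketched roughly as below. This is only an
illustration under stated assumptions: the macro names are shortened
placeholders, the field encodings (NC = 0, WBWA = 1) and shifts (IRGN0 at
bit 8, ORGN0 at bit 10) are what I believe the LPAE TCR uses, and
tcr_walk_attrs() is a hypothetical helper, not existing io-pgtable code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder names; encodings and shifts are assumptions, see above. */
#define TCR_RGN_NC	0u	/* non-cacheable walks */
#define TCR_RGN_WBWA	1u	/* write-back, write-allocate walks */
#define TCR_IRGN0_SHIFT	8
#define TCR_ORGN0_SHIFT	10

/* Pick the table-walk cacheability from whether the SMMU walks coherently. */
static uint32_t tcr_walk_attrs(bool coherent_walk)
{
	uint32_t rgn = coherent_walk ? TCR_RGN_WBWA : TCR_RGN_NC;

	/* Use the same attribute for the inner and outer fields. */
	return (rgn << TCR_IRGN0_SHIFT) | (rgn << TCR_ORGN0_SHIFT);
}
```

Here coherent_walk would correspond to QUIRK_NO_DMA being set; the
shareability field is left out to keep the sketch short.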


[1] https://lore.kernel.org/patchwork/patch/1032939/

Regards
Vivek

>
> Will





Re: [PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables

2019-01-28 Thread Vivek Gautam
On Mon, Jan 21, 2019 at 6:43 PM Robin Murphy  wrote:
>
> On 17/01/2019 09:27, Vivek Gautam wrote:
> >  From Robin's comment [1] about touching TCR configurations -
> >
> > "TBH if we're going to touch the TCR attributes at all then we should
> > probably correct that sloppiness first - there's an occasional argument
> > for using non-cacheable pagetables even on a coherent SMMU if reducing
> > snoop traffic/latency on walks outweighs the cost of cache maintenance
> > on PTE updates, but anyone thinking they can get that by overriding
> > dma-coherent silently gets the worst of both worlds thanks to this
> > current TCR value."
> >
> > We have IO_PGTABLE_QUIRK_NO_DMA quirk present, but we don't force
> > anybody _not_ using dma-coherent smmu to have non-cacheable page table
> > mappings.
> > Having another quirk flag can help in having non-cacheable memory for
> > page tables once and for all.
> >
> > [1] https://lore.kernel.org/patchwork/patch/1020906/
> >
> > Signed-off-by: Vivek Gautam 
> > ---
> >   drivers/iommu/io-pgtable-arm.c | 17 -
> >   drivers/iommu/io-pgtable.h |  6 ++
> >   2 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 237cacd4a62b..c76919c30f1a 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg 
> > *cfg, void *cookie)
> >   struct arm_lpae_io_pgtable *data;
> >
> >   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA 
> > |
> > - IO_PGTABLE_QUIRK_NON_STRICT))
> > + IO_PGTABLE_QUIRK_NON_STRICT |
> > + IO_PGTABLE_QUIRK_NON_COHERENT))
> >   return NULL;
> >
> >   data = arm_lpae_alloc_pgtable(cfg);
> > @@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg 
> > *cfg, void *cookie)
> >   return NULL;
> >
> >   /* TCR */
> > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> > -   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> > -   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> > + reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT;
> > +
> > + if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
> > + reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
> > +ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
> > + else
> > + reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
> > +ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
> >
> >   switch (ARM_LPAE_GRANULE(data)) {
> >   case SZ_4K:
> > @@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg 
> > *cfg, void *cookie)
> >
> >   /* The NS quirk doesn't apply at stage 2 */
> >   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
> > - IO_PGTABLE_QUIRK_NON_STRICT))
> > + IO_PGTABLE_QUIRK_NON_STRICT |
> > + IO_PGTABLE_QUIRK_NON_COHERENT))
> >   return NULL;
> >
> >   data = arm_lpae_alloc_pgtable(cfg);
> > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > index 47d5ae559329..46604cf7b017 100644
> > --- a/drivers/iommu/io-pgtable.h
> > +++ b/drivers/iommu/io-pgtable.h
> > @@ -75,6 +75,11 @@ struct io_pgtable_cfg {
> >* IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
> >*  on unmap, for DMA domains using the flush queue mechanism for
> >*  delayed invalidation.
> > +  *
> > +  * IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for
> > +  *  pagetables even on a coherent SMMU for cases where reducing
> > +  *  snoop traffic/latency on walks outweighs the cost of cache
> > +  *  maintenance on PTE updates.
>
> Hmm, we can't actually "enforce" anything with this as-is - all we're
> doing is setting the attributes that the IOMMU will use for pagetable
> walks, and that has no impact on how the CPU actually writes PTEs to
> memory. In particular, in the case of a hardware-coherent IOMMU which is
> described as such, even if we make the dma_map/sync calls they still
> won't do 

Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-28 Thread Vivek Gautam
Hi Ard,

On Thu, Jan 24, 2019 at 1:25 PM Ard Biesheuvel
 wrote:
>
> On Thu, 24 Jan 2019 at 07:58, Vivek Gautam  
> wrote:
> >
> > On Mon, Jan 21, 2019 at 7:55 PM Ard Biesheuvel
> >  wrote:
> > >
> > > On Mon, 21 Jan 2019 at 14:56, Robin Murphy  wrote:
> > > >
> > > > On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > > > > On Mon, 21 Jan 2019 at 14:25, Robin Murphy  
> > > > > wrote:
> > > > >>
> > > > >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > > > >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam 
> > > > >>>  wrote:
> > > > >>>>
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>>
> > > > >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
> > > > >>>>  wrote:
> > > > >>>>>
> > > > >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam 
> > > > >>>>>  wrote:
> > > > >>>>>>
> > > > >>>>>> Qualcomm SoCs have an additional level of cache called as
> > > > >>>>>> System cache, aka. Last level cache (LLC). This cache sits right
> > > > >>>>>> before the DDR, and is tightly coupled with the memory 
> > > > >>>>>> controller.
> > > > >>>>>> The clients using this cache request their slices from this
> > > > >>>>>> system cache, make it active, and can then start using it.
> > > > >>>>>> For these clients with smmu, to start using the system cache for
> > > > >>>>>> buffers and, related page tables [1], memory attributes need to 
> > > > >>>>>> be
> > > > >>>>>> set accordingly. This series add the required support.
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>> Does this actually improve performance on reads from a device? The
> > > > >>>>> non-cache coherent DMA routines perform an unconditional D-cache
> > > > >>>>> invalidate by VA to the PoC before reading from the buffers 
> > > > >>>>> filled by
> > > > >>>>> the device, and I would expect the PoC to be defined as lying 
> > > > >>>>> beyond
> > > > >>>>> the LLC to still guarantee the architected behavior.
> > > > >>>>
> > > > >>>> We have seen performance improvements when running Manhattan
> > > > >>>> GFXBench benchmarks.
> > > > >>>>
> > > > >>>
> > > > >>> Ah ok, that makes sense, since in that case, the data flow is mostly
> > > > >>> to the device, not from the device.
> > > > >>>
> > > > >>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
> > > > >>>> Last level cache (LLC) lies beyond the point of coherency.
> > > > >>>> Non-cache coherent buffers will not be cached to system cache 
> > > > >>>> also, and
> > > > >>>> no additional software cache maintenance ops are required for 
> > > > >>>> system cache.
> > > > >>>> Pratik can add more if I am missing something.
> > > > >>>>
> > > > >>>> To take care of the memory attributes from DMA APIs side, we can 
> > > > >>>> add a
> > > > >>>> DMA_ATTR definition to take care of any dma non-coherent APIs 
> > > > >>>> calls.
> > > > >>>>
> > > > >>>
> > > > >>> So does the device use the correct inner non-cacheable, outer
> > > > >>> writeback cacheable attributes if the SMMU is in pass-through?
> > > > >>>
> > > > >>> We have been looking into another use case where the fact that the
> > > > >>> SMMU overrides memory attributes is causing issues (WC mappings used
> > > > >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> > > > >>> existing attributes, would you still need the SMMU changes?
> > > > >>
> > > > >> Even if we could force a 

Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-23 Thread Vivek Gautam
On Mon, Jan 21, 2019 at 7:55 PM Ard Biesheuvel
 wrote:
>
> On Mon, 21 Jan 2019 at 14:56, Robin Murphy  wrote:
> >
> > On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > > On Mon, 21 Jan 2019 at 14:25, Robin Murphy  wrote:
> > >>
> > >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam 
> > >>>  wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>>
> > >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
> > >>>>  wrote:
> > >>>>>
> > >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam 
> > >>>>>  wrote:
> > >>>>>>
> > >>>>>> Qualcomm SoCs have an additional level of cache called as
> > >>>>>> System cache, aka. Last level cache (LLC). This cache sits right
> > >>>>>> before the DDR, and is tightly coupled with the memory controller.
> > >>>>>> The clients using this cache request their slices from this
> > >>>>>> system cache, make it active, and can then start using it.
> > >>>>>> For these clients with smmu, to start using the system cache for
> > >>>>>> buffers and, related page tables [1], memory attributes need to be
> > >>>>>> set accordingly. This series add the required support.
> > >>>>>>
> > >>>>>
> > >>>>> Does this actually improve performance on reads from a device? The
> > >>>>> non-cache coherent DMA routines perform an unconditional D-cache
> > >>>>> invalidate by VA to the PoC before reading from the buffers filled by
> > >>>>> the device, and I would expect the PoC to be defined as lying beyond
> > >>>>> the LLC to still guarantee the architected behavior.
> > >>>>
> > >>>> We have seen performance improvements when running Manhattan
> > >>>> GFXBench benchmarks.
> > >>>>
> > >>>
> > >>> Ah ok, that makes sense, since in that case, the data flow is mostly
> > >>> to the device, not from the device.
> > >>>
> > >>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
> > >>>> Last level cache (LLC) lies beyond the point of coherency.
> > >>>> Non-cache coherent buffers will not be cached to system cache also, and
> > >>>> no additional software cache maintenance ops are required for system 
> > >>>> cache.
> > >>>> Pratik can add more if I am missing something.
> > >>>>
> > >>>> To take care of the memory attributes from DMA APIs side, we can add a
> > >>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
> > >>>>
> > >>>
> > >>> So does the device use the correct inner non-cacheable, outer
> > >>> writeback cacheable attributes if the SMMU is in pass-through?
> > >>>
> > >>> We have been looking into another use case where the fact that the
> > >>> SMMU overrides memory attributes is causing issues (WC mappings used
> > >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> > >>> existing attributes, would you still need the SMMU changes?
> > >>
> > >> Even if we could force a stage 2 mapping with the weakest pagetable
> > >> attributes (such that combining would work), there would still need to
> > >> be a way to set the TCR attributes appropriately if this behaviour is
> > >> wanted for the SMMU's own table walks as well.
> > >>
> > >
> > > Isn't that just a matter of implementing support for SMMUs that lack
> > > the 'dma-coherent' attribute?
> >
> > Not quite - in general they need INC-ONC attributes in case there
> > actually is something in the architectural outer-cacheable domain.
>
> But is it a problem to use INC-ONC attributes for the SMMU PTW on this
> chip? AIUI, the reason for the SMMU changes is to avoid the
> performance hit of snooping, which is more expensive than cache
> maintenance of SMMU page tables. So are you saying the by-VA cache
> maintenance is not relayed to this system cache, resulting in page
> table updates to be invisible to masters using INC-ONC attributes?

The reason for this SMMU chan

Re: [PATCH 1/3] iommu/arm-smmu: Move to bitmap for arm_smmu_domain atrributes

2019-01-22 Thread Vivek Gautam
On Mon, Jan 21, 2019 at 7:23 PM Robin Murphy  wrote:
>
> On 21/01/2019 05:53, Vivek Gautam wrote:
> > A number of arm_smmu_domain's attributes can be assigned based
> > on the iommu domains's attributes. These local attributes better
> > be managed by a bitmap.
> > So remove boolean flags and move to a 32-bit bitmap, and enable
> > each bits separtely.
> >
> > Signed-off-by: Vivek Gautam 
> > ---
> >   drivers/iommu/arm-smmu.c | 10 ++
> >   1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 7ebbcf1b2eb3..52b300dfc096 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -257,10 +257,11 @@ struct arm_smmu_domain {
> >   const struct iommu_gather_ops   *tlb_ops;
> >   struct arm_smmu_cfg cfg;
> >   enum arm_smmu_domain_stage  stage;
> > - boolnon_strict;
> >   struct mutexinit_mutex; /* Protects smmu pointer 
> > */
> >   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
> > TLB syncs */
> >   struct iommu_domain domain;
> > +#define ARM_SMMU_DOMAIN_ATTR_NON_STRICT  BIT(0)
> > + unsigned intattr;
> >   };
> >
> >   struct arm_smmu_option_prop {
> > @@ -901,7 +902,7 @@ static int arm_smmu_init_domain_context(struct 
> > iommu_domain *domain,
> >   if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
> >   pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
> >
> > - if (smmu_domain->non_strict)
> > + if (smmu_domain->attr & ARM_SMMU_DOMAIN_ATTR_NON_STRICT)
> >   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> >
> >   /* Non coherent page table mappings only for Stage-1 */
> > @@ -1598,7 +1599,8 @@ static int arm_smmu_domain_get_attr(struct 
> > iommu_domain *domain,
> >   case IOMMU_DOMAIN_DMA:
> >   switch (attr) {
> >   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> > - *(int *)data = smmu_domain->non_strict;
> > + *(int *)data = !!(smmu_domain->attr &
> > +   ARM_SMMU_DOMAIN_ATTR_NON_STRICT);
> >   return 0;
> >   default:
> >   return -ENODEV;
> > @@ -1638,7 +1640,7 @@ static int arm_smmu_domain_set_attr(struct 
> > iommu_domain *domain,
> >   case IOMMU_DOMAIN_DMA:
> >   switch (attr) {
> >   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> > - smmu_domain->non_strict = *(int *)data;
> > + smmu_domain->attr |= ARM_SMMU_DOMAIN_ATTR_NON_STRICT;
>
> But what if *data == 0?

Right, we need a check on *data here as well, similar to what we are
doing for QCOM_SYS_CACHE [1].

[1] https://lore.kernel.org/patchwork/patch/1033796/
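
Something along these lines is what I have in mind; a minimal sketch only,
where domain_update_attr() and DOMAIN_ATTR_NON_STRICT are hypothetical names
standing in for the real driver code.

```c
/* Hypothetical flag bit standing in for ARM_SMMU_DOMAIN_ATTR_NON_STRICT. */
#define DOMAIN_ATTR_NON_STRICT	(1u << 0)

/* Set the flag when value is non-zero, clear it when value is zero. */
static void domain_update_attr(unsigned int *attr, unsigned int flag, int value)
{
	if (value)
		*attr |= flag;
	else
		*attr &= ~flag;
}
```

This way a set_attr call with *data == 0 clears the bit instead of leaving
it set.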

Regards
Vivek

>
> Robin.
>
> >   break;
> >   default:
> >   ret = -ENODEV;
> >
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu





Re: [PATCHv3 3/4] coresight: etm4x: Add support to enable ETMv4.2

2019-01-21 Thread Vivek Gautam



On 1/18/2019 5:52 PM, Sai Prakash Ranjan wrote:

SDM845 has ETMv4.2 and can use the existing etm4x driver.
But the current etm driver checks only for ETMv4.0 and
errors out for other etm4x versions. This patch adds this
missing support to enable SoC's with ETMv4x to use same
driver by checking only the ETM architecture major version
number.

Without this change, we get below error during etm probe:

/ # dmesg | grep etm
[6.660093] coresight-etm4x: probe of 704.etm failed with error -22
[6.666902] coresight-etm4x: probe of 714.etm failed with error -22
[6.673708] coresight-etm4x: probe of 724.etm failed with error -22
[6.680511] coresight-etm4x: probe of 734.etm failed with error -22
[6.687313] coresight-etm4x: probe of 744.etm failed with error -22
[6.694113] coresight-etm4x: probe of 754.etm failed with error -22
[6.700914] coresight-etm4x: probe of 764.etm failed with error -22
[6.707717] coresight-etm4x: probe of 774.etm failed with error -22

With this change, etm probe is successful:

/ # dmesg | grep coresight
[6.659198] coresight-etm4x 704.etm: CPU0: ETM v4.2 initialized
[6.665848] coresight-etm4x 714.etm: CPU1: ETM v4.2 initialized
[6.672493] coresight-etm4x 724.etm: CPU2: ETM v4.2 initialized
[6.679129] coresight-etm4x 734.etm: CPU3: ETM v4.2 initialized
[6.685770] coresight-etm4x 744.etm: CPU4: ETM v4.2 initialized
[6.692403] coresight-etm4x 754.etm: CPU5: ETM v4.2 initialized
[6.699024] coresight-etm4x 764.etm: CPU6: ETM v4.2 initialized
[6.705646] coresight-etm4x 774.etm: CPU7: ETM v4.2 initialized

Signed-off-by: Sai Prakash Ranjan 
---
  drivers/hwtracing/coresight/coresight-etm4x.c | 2 +-
  drivers/hwtracing/coresight/coresight-etm4x.h | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c 
b/drivers/hwtracing/coresight/coresight-etm4x.c
index 53e2fb6e86f6..93d5f1f3145e 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -55,7 +55,7 @@ static void etm4_os_unlock(struct etmv4_drvdata *drvdata)
  
  static bool etm4_arch_supported(u8 arch)

  {
-   switch (arch) {
+   switch (arch >> 4) {



While this looks good, it appears that arch is a combination of the major
version and the minor version. So, would it be better to use mask and shift
macros instead of a magic-number shift?
But, frankly, it's up to Mathieu to decide on the readability of this. So, I
leave it to him.
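
For example, something like the following; a rough sketch only. The macro
names are hypothetical, and it assumes arch is laid out as
(major << 4) | minor, which is what the 0x40 -> 0x4 change and the
"ETM v4.2" log lines suggest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; assumes arch = (major << 4) | minor. */
#define ETM_ARCH_MAJOR_SHIFT	4
#define ETM_ARCH_MAJOR_MASK	0xfu
#define ETM_ARCH_MINOR_MASK	0xfu
#define ETM_ARCH_MAJOR(arch) \
	(((arch) >> ETM_ARCH_MAJOR_SHIFT) & ETM_ARCH_MAJOR_MASK)
#define ETM_ARCH_MINOR(arch)	((arch) & ETM_ARCH_MINOR_MASK)
#define ETM_ARCH_MAJOR_V4	0x4u

/* Accept any ETMv4.x by looking only at the major version field. */
static bool etm4_arch_supported(uint8_t arch)
{
	return ETM_ARCH_MAJOR(arch) == ETM_ARCH_MAJOR_V4;
}
```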


Thanks
Vivek


case ETM_ARCH_V4:
break;
default:
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h 
b/drivers/hwtracing/coresight/coresight-etm4x.h
index 52786e9d8926..05d4bd330881 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -136,7 +136,7 @@
  #define ETM_MAX_RES_SEL		16
  #define ETM_MAX_SS_CMP			8
  
-#define ETM_ARCH_V4			0x40
+#define ETM_ARCH_V4			0x4
  #define ETMv4_SYNC_MASK		0x1F
  #define ETM_CYC_THRESHOLD_MASK		0xFFF
  #define ETM_CYC_THRESHOLD_DEFAULT	0x100


Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-21 Thread Vivek Gautam
Hi,


On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel
 wrote:
>
> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam  
> wrote:
> >
> > Qualcomm SoCs have an additional level of cache called as
> > System cache, aka. Last level cache (LLC). This cache sits right
> > before the DDR, and is tightly coupled with the memory controller.
> > The clients using this cache request their slices from this
> > system cache, make it active, and can then start using it.
> > For these clients with smmu, to start using the system cache for
> > buffers and, related page tables [1], memory attributes need to be
> > set accordingly. This series add the required support.
> >
>
> Does this actually improve performance on reads from a device? The
> non-cache coherent DMA routines perform an unconditional D-cache
> invalidate by VA to the PoC before reading from the buffers filled by
> the device, and I would expect the PoC to be defined as lying beyond
> the LLC to still guarantee the architected behavior.

We have seen performance improvements when running Manhattan
GFXBench benchmarks.

As for the PoC, to my knowledge, on sdm845 the system cache, aka
Last level cache (LLC), lies beyond the point of coherency.
Non-cache-coherent buffers will not be cached in the system cache either, and
no additional software cache maintenance ops are required for the system cache.
Pratik can add more if I am missing something.

To take care of the memory attributes from the DMA API side, we can add a
DMA_ATTR definition to cover any non-coherent DMA API calls.

Regards
Vivek
>
>
>
> > This change is a realisation of following changes from downstream msm-4.9:
> > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> >
> > Changes since v2:
> >  - Split the patches into io-pgtable-arm driver and arm-smmu driver.
> >  - Converted smmu domain attributes to a bitmap, so multiple attributes
> >can be managed easily.
> >  - With addition of non-coherent page table mapping support [4], this
> >patch series now aligns with the understanding of upgrading the
> >non-coherent devices to use some level of outer cache.
> >  - Updated the macros and comments to reflect the use of QCOM_SYS_CACHE.
> >  - QCOM_SYS_CACHE can still be used at stage 2, so that doens't depend on
> >stage-1 mapping.
> >  - Added change to disable the attribute from arm_smmu_domain_set_attr()
> >when needed.
> >  - Removed the page protection controls for QCOM_SYS_CACHE at the DMA API
> >level.
> >
> > Goes on top of the non-coherent page tables support patch series [4]
> >
> > [1] https://patchwork.kernel.org/patch/10302791/
> > [2] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
> > [3] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > [4] https://lore.kernel.org/patchwork/cover/1032938/
> >
> > Vivek Gautam (3):
> >   iommu/arm-smmu: Move to bitmap for arm_smmu_domain atrributes
> >   iommu/io-pgtable-arm: Add support to use system cache
> >   iommu/arm-smmu: Add support to use system cache
> >
> >  drivers/iommu/arm-smmu.c   | 28 
> >  drivers/iommu/io-pgtable-arm.c | 15 +--
> >  drivers/iommu/io-pgtable.h |  4 
> >  include/linux/iommu.h  |  2 ++
> >  4 files changed, 43 insertions(+), 6 deletions(-)
> >
> >
> >
> > ___
> > linux-arm-kernel mailing list
> > linux-arm-ker...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel





Re: [PATCH 2/2] iommu/arm-smmu: Add support for non-coherent page table mappings

2019-01-20 Thread Vivek Gautam
Hi Will,


On Sun, Jan 20, 2019 at 5:31 AM Will Deacon  wrote:
>
> On Thu, Jan 17, 2019 at 02:57:18PM +0530, Vivek Gautam wrote:
> > Adding a device tree option for arm smmu to enable non-cacheable
> > memory for page tables.
> > We already enable a smmu feature for coherent walk based on
> > whether the smmu device is dma-coherent or not. Have an option
> > to enable non-cacheable page table memory to force set it for
> > particular smmu devices.
>
> Hmm, I must be missing something here. What is the difference between this
> new property, and simply omitting dma-coherent on the SMMU?

So, this is what I understood from the email thread on Last level
cache support -
Robin pointed out that we may need to add support for setting
non-cacheable mappings in the TCR.
Currently, we don't do that for SMMUs that omit dma-coherent.
We rely on the interconnect to handle the configuration set in the TCR,
and let the interconnect ignore the cacheability if it can't support it.

Moreover, Robin suggested that we should take care of SMMUs for which
removing snoop latency on walks, by making the mappings non-cacheable,
outweighs the cost of cache maintenance on PTE updates.

So, this change adds another property to set up these non-cacheable mappings
explicitly. As I pointed out, omitting 'dma-coherent', and the corresponding
'IO_PGTABLE_QUIRK_NO_DMA', does take care of a few things.

Should we handle the TCR settings too with this quirk?

Regards
Vivek
>
> Will





[PATCH 2/3] iommu/io-pgtable-arm: Add support to use system cache

2019-01-20 Thread Vivek Gautam
A few Qualcomm platforms, such as sdm845, have an additional outer
cache called the System cache, aka Last level cache (LLC), that
allows non-coherent devices to upgrade to using caching.

There is a fundamental assumption that non-coherent devices can't
access caches. This change adds an exception where they *can* use
some level of cache despite still being non-coherent overall.
Coherent devices that use cacheable memory, and the CPU, make use of
this system cache by default.

Looking at memory types, we have the following -
a) Normal uncached :- MAIR 0x44, inner non-cacheable,
                      outer non-cacheable;
b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
                      outer read write-back non-transient;
                      attribute setting for coherent I/O devices.
and, for non-coherent I/O devices that can allocate in the system cache,
another type gets added -
c) Normal sys-cached :- MAIR 0xf4, inner non-cacheable,
                        outer read write-back non-transient

Coherent I/O devices use the system cache by marking the memory as
normal cached.
Non-coherent I/O devices should mark the memory as normal
sys-cached in the page tables to use the system cache.
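
As a rough illustration of the encodings above, the sketch below splits a
MAIR attribute byte into its outer (upper nibble) and inner (lower nibble)
halves and packs the four attributes into MAIR0 at the slot indices used in
this patch. The macro and function names are shortened, hypothetical
stand-ins for the ARM_LPAE_MAIR_* definitions, not the driver code itself.

```c
#include <stdint.h>

/* Attribute bytes as listed above (shortened, hypothetical names). */
#define MAIR_ATTR_NC			0x44u
#define MAIR_ATTR_WBRWA			0xffu
#define MAIR_ATTR_DEVICE		0x04u
#define MAIR_ATTR_QCOM_SYS_CACHE	0xf4u

/* Slot indices, matching the ARM_LPAE_MAIR_ATTR_IDX_* values in the diff. */
#define MAIR_IDX_NC			0
#define MAIR_IDX_CACHE			1
#define MAIR_IDX_DEV			2
#define MAIR_IDX_QCOM_SYS_CACHE		3

/* Each attribute occupies one byte of the register. */
#define MAIR_ATTR_SHIFT(n)		((n) << 3)

/* Upper nibble: outer cacheability; lower nibble: inner cacheability. */
static unsigned int mair_outer(uint8_t attr) { return attr >> 4; }
static unsigned int mair_inner(uint8_t attr) { return attr & 0xfu; }

/* Pack the four attribute bytes into a 32-bit MAIR0 value. */
static uint32_t build_mair0(void)
{
	return ((uint32_t)MAIR_ATTR_NC << MAIR_ATTR_SHIFT(MAIR_IDX_NC)) |
	       ((uint32_t)MAIR_ATTR_WBRWA << MAIR_ATTR_SHIFT(MAIR_IDX_CACHE)) |
	       ((uint32_t)MAIR_ATTR_DEVICE << MAIR_ATTR_SHIFT(MAIR_IDX_DEV)) |
	       ((uint32_t)MAIR_ATTR_QCOM_SYS_CACHE
		<< MAIR_ATTR_SHIFT(MAIR_IDX_QCOM_SYS_CACHE));
}
```

For 0xf4 this gives an outer nibble of 0xf (write-back) and an inner nibble
of 0x4 (non-cacheable), which is the "normal sys-cached" type in c).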

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/io-pgtable-arm.c | 15 +--
 drivers/iommu/io-pgtable.h |  4 
 include/linux/iommu.h  |  2 ++
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index c76919c30f1a..0e55772702da 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -168,10 +168,12 @@
 #define ARM_LPAE_MAIR_ATTR_MASK0xff
 #define ARM_LPAE_MAIR_ATTR_DEVICE  0x04
 #define ARM_LPAE_MAIR_ATTR_NC  0x44
+#define ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE  0xf4
 #define ARM_LPAE_MAIR_ATTR_WBRWA   0xff
 #define ARM_LPAE_MAIR_ATTR_IDX_NC  0
 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE   1
 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2
+#define ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE  3
 
 /* IOPTE accessors */
 #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
@@ -443,6 +445,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
else if (prot & IOMMU_CACHE)
pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
+   else if (prot & IOMMU_QCOM_SYS_CACHE)
+   pte |= (ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE
+   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
} else {
pte = ARM_LPAE_PTE_HAP_FAULT;
if (prot & IOMMU_READ)
@@ -781,7 +786,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
 
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
IO_PGTABLE_QUIRK_NON_STRICT |
-   IO_PGTABLE_QUIRK_NON_COHERENT))
+   IO_PGTABLE_QUIRK_NON_COHERENT |
+   IO_PGTABLE_QUIRK_QCOM_SYS_CACHE))
return NULL;
 
data = arm_lpae_alloc_pgtable(cfg);
@@ -794,6 +800,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
   ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
+   else if (cfg->quirks & IO_PGTABLE_QUIRK_QCOM_SYS_CACHE)
+   reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
+ ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
else
reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
   ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
@@ -848,7 +857,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
  (ARM_LPAE_MAIR_ATTR_WBRWA
   << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) |
  (ARM_LPAE_MAIR_ATTR_DEVICE
-  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV));
+  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)) |
+ (ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE
+  << 
ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE));
 
cfg->arm_lpae_s1_cfg.mair[0] = reg;
cfg->arm_lpae_s1_cfg.mair[1] = 0;
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 46604cf7b017..fb237e8aa9f1 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -80,6 +80,9 @@ struct io_pgtable_cfg {
 *  pagetables even on a coherent SMMU for cases where reducing
 *  snoop traffic/latency on walks outweighs the cost of cache
 *  maintenance on PTE up

[PATCH 1/3] iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes

2019-01-20 Thread Vivek Gautam
A number of arm_smmu_domain's attributes can be assigned based
on the iommu domain's attributes. These local attributes are better
managed by a bitmap.
So remove the boolean flags and move to a 32-bit bitmap, and enable
each bit separately.
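
As a rough illustration of the bool-to-bitmap move, the sketch below is a
stand-alone model and not the driver code. The struct and helper names
(domain_model, domain_set_flag, domain_test_flag) are hypothetical. It
also shows that a setter taking a value needs an explicit clear branch if
a caller passing 0 is expected to switch the attribute off again.

```c
#include <stdbool.h>

/* Hypothetical flag, modelled on ARM_SMMU_DOMAIN_ATTR_NON_STRICT (bit 0). */
#define MODEL_ATTR_NON_STRICT (1u << 0)

/* Hypothetical stand-in for the attribute word in arm_smmu_domain. */
struct domain_model {
    unsigned int attr;
};

/* Set the flag when on is true, clear it otherwise. */
static void domain_set_flag(struct domain_model *d, unsigned int flag, bool on)
{
    if (on)
        d->attr |= flag;
    else
        d->attr &= ~flag;
}

/* Report whether the flag is currently set. */
static bool domain_test_flag(const struct domain_model *d, unsigned int flag)
{
    return (d->attr & flag) != 0;
}
```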

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7ebbcf1b2eb3..52b300dfc096 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -257,10 +257,11 @@ struct arm_smmu_domain {
const struct iommu_gather_ops   *tlb_ops;
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage  stage;
-   boolnon_strict;
struct mutexinit_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops and 
TLB syncs */
struct iommu_domain domain;
+#define ARM_SMMU_DOMAIN_ATTR_NON_STRICTBIT(0)
+   unsigned intattr;
 };
 
 struct arm_smmu_option_prop {
@@ -901,7 +902,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
 
-   if (smmu_domain->non_strict)
+   if (smmu_domain->attr & ARM_SMMU_DOMAIN_ATTR_NON_STRICT)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
/* Non coherent page table mappings only for Stage-1 */
@@ -1598,7 +1599,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   *(int *)data = smmu_domain->non_strict;
+   *(int *)data = !!(smmu_domain->attr &
+ ARM_SMMU_DOMAIN_ATTR_NON_STRICT);
return 0;
default:
return -ENODEV;
@@ -1638,7 +1640,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
*domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-   smmu_domain->non_strict = *(int *)data;
+   smmu_domain->attr |= ARM_SMMU_DOMAIN_ATTR_NON_STRICT;
break;
default:
ret = -ENODEV;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache

2019-01-20 Thread Vivek Gautam
Qualcomm SoCs have an additional level of cache called the
system cache, aka. last level cache (LLC). This cache sits right
before the DDR, and is tightly coupled with the memory controller.
The clients using this cache request their slices from this
system cache, make them active, and can then start using them.
For these clients with an SMMU, to start using the system cache for
buffers and related page tables [1], memory attributes need to be
set accordingly. This series adds the required support.

This change is a realisation of the following changes from downstream msm-4.9:
iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]

Changes since v2:
 - Split the patches into io-pgtable-arm driver and arm-smmu driver.
 - Converted smmu domain attributes to a bitmap, so multiple attributes
   can be managed easily.
 - With addition of non-coherent page table mapping support [4], this
   patch series now aligns with the understanding of upgrading the
   non-coherent devices to use some level of outer cache.
 - Updated the macros and comments to reflect the use of QCOM_SYS_CACHE.
 - QCOM_SYS_CACHE can still be used at stage 2, so that doesn't depend on
   stage-1 mapping.
 - Added change to disable the attribute from arm_smmu_domain_set_attr()
   when needed.
 - Removed the page protection controls for QCOM_SYS_CACHE at the DMA API
   level.

Goes on top of the non-coherent page tables support patch series [4]

[1] https://patchwork.kernel.org/patch/10302791/
[2] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
[3] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
[4] https://lore.kernel.org/patchwork/cover/1032938/

Vivek Gautam (3):
  iommu/arm-smmu: Move to bitmap for arm_smmu_domain attributes
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add support to use system cache

 drivers/iommu/arm-smmu.c   | 28 
 drivers/iommu/io-pgtable-arm.c | 15 +--
 drivers/iommu/io-pgtable.h |  4 
 include/linux/iommu.h  |  2 ++
 4 files changed, 43 insertions(+), 6 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 3/3] iommu/arm-smmu: Add support to use system cache

2019-01-20 Thread Vivek Gautam
A few Qualcomm platforms, such as sdm845, have an additional outer
cache called the system cache, aka. last level cache (LLC), that
allows non-coherent devices to upgrade to using caching.
This last level cache sits right before the DDR, and is tightly
coupled with the memory controller.
The cache is available to a number of devices, coherent and
non-coherent, present in the SoC system, and to CPUs.
The devices request their slices from this system cache, make them
active, and can then start using them.

Devices can set iommu domain attributes and page protection
while mapping the buffers to set the required memory attributes
to use system cache for buffers and page tables.
This change adds the support for iommu domain attributes and the
interaction with io page table driver.
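
The intended flow can be sketched in user space as below. This is only a
model under stated assumptions: domain_model, domain_set_sys_cache() and
domain_quirks() are hypothetical names, and MODEL_QUIRK_SYS_CACHE is a
placeholder bit, not the real value of IO_PGTABLE_QUIRK_QCOM_SYS_CACHE.
The attribute may only be changed before the domain is attached to an
SMMU, and it is translated into a page table quirk at context init.

```c
#include <errno.h>
#include <stddef.h>

/* Modelled on ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHE, BIT(1) in this patch. */
#define MODEL_ATTR_SYS_CACHE  (1u << 1)
/* Placeholder quirk bit for illustration only. */
#define MODEL_QUIRK_SYS_CACHE (1ul << 0)

/* Hypothetical stand-in for the fields of arm_smmu_domain used here. */
struct domain_model {
    const void *smmu;   /* non-NULL once the domain is attached */
    unsigned int attr;  /* bitmap of domain attributes */
};

/* Set or clear the attribute; refuse once the domain is attached. */
static int domain_set_sys_cache(struct domain_model *d, int enable)
{
    if (d->smmu)
        return -EPERM;

    if (enable)
        d->attr |= MODEL_ATTR_SYS_CACHE;
    else
        d->attr &= ~MODEL_ATTR_SYS_CACHE;
    return 0;
}

/* Translate domain attributes into page table quirks at context init. */
static unsigned long domain_quirks(const struct domain_model *d)
{
    unsigned long quirks = 0;

    if (d->attr & MODEL_ATTR_SYS_CACHE)
        quirks |= MODEL_QUIRK_SYS_CACHE;
    return quirks;
}
```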

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 52b300dfc096..324f3bb54c78 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -260,7 +260,8 @@ struct arm_smmu_domain {
struct mutexinit_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops and 
TLB syncs */
struct iommu_domain domain;
-#define ARM_SMMU_DOMAIN_ATTR_NON_STRICTBIT(0)
+#define ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHEBIT(1)
+#define ARM_SMMU_DOMAIN_ATTR_NON_STRICTBIT(0)
unsigned intattr;
 };
 
@@ -910,6 +911,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_COHERENT;
 
+   if (smmu_domain->attr & ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHE)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_QCOM_SYS_CACHE;
+
smmu_domain->smmu = smmu;
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops) {
@@ -1592,6 +1596,10 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case DOMAIN_ATTR_NESTING:
*(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
return 0;
+   case DOMAIN_ATTR_QCOM_SYS_CACHE:
+   *(int *)data = !!(smmu_domain->attr &
+ ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHE);
+   return 0;
default:
return -ENODEV;
}
@@ -1633,6 +1641,16 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
*domain,
else
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
break;
+   case DOMAIN_ATTR_QCOM_SYS_CACHE:
+   if (smmu_domain->smmu) {
+   ret = -EPERM;
+   goto out_unlock;
+   }
+   if (*(int *)data)
+   smmu_domain->attr |= 
ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHE;
+   else
+   smmu_domain->attr &= 
~ARM_SMMU_DOMAIN_ATTR_QCOM_SYS_CACHE;
+   break;
default:
ret = -ENODEV;
}
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 2/2] iommu/arm-smmu: Add support for non-coherent page table mappings

2019-01-17 Thread Vivek Gautam
Add a device tree option for the ARM SMMU to enable non-cacheable
memory for page tables.
We already enable an SMMU feature for coherent walk based on
whether the SMMU device is dma-coherent or not. Add an option
to force non-cacheable page table memory for particular SMMU
devices.

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/arm-smmu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index af18a7e7f917..7ebbcf1b2eb3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -188,6 +188,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_PGTBL_NON_COHERENT (1 << 1)
u32 options;
enum arm_smmu_arch_version  version;
enum arm_smmu_implementationmodel;
@@ -273,6 +274,7 @@ static bool using_legacy_binding, using_generic_binding;
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+   { ARM_SMMU_OPT_PGTBL_NON_COHERENT, "arm,smmu-pgtable-non-coherent" },
{ 0, NULL},
 };
 
@@ -902,6 +904,11 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
+   /* Non coherent page table mappings only for Stage-1 */
+   if (smmu->options & ARM_SMMU_OPT_PGTBL_NON_COHERENT &&
+   smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_COHERENT;
+
smmu_domain->smmu = smmu;
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops) {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 0/2] iommu/arm: Add support for non-coherent page tables

2019-01-17 Thread Vivek Gautam
As discussed in the Qcom system cache support thread [1], it is
imperative that we enable support for non-cacheable page tables
for SMMU implementations where removing snoop latency on walks,
by making the mappings non-cacheable, outweighs the cost of cache
maintenance on PTE updates.

This series adds a new SMMU device tree option that lets a particular
SMMU configuration set up cacheable or non-cacheable mappings for
page tables out of the box. We set a new quirk for I/O page tables,
IO_PGTABLE_QUIRK_NON_COHERENT, that lets us set different TCR
configurations.

This quirk enables non-cacheable page tables for all masters
sitting behind the SMMU. Should this control be available per
smmu_domain, as each master may have a different perf requirement?
Enabling this for the entire SMMU may not be desirable for all
masters.
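
For reference, the TCR choice driven by the quirk can be modelled in user
space as below. This is a sketch, not the driver code: the helper name
tcr_walk_attrs() is made up, the quirk bit is BIT(6) as in patch 1/2, and
the field shifts and encodings are written from my reading of the LPAE
TCR layout, so treat them as assumptions.

```c
#include <stdint.h>

/* BIT(6), as IO_PGTABLE_QUIRK_NON_COHERENT is defined in patch 1/2. */
#define MODEL_QUIRK_NON_COHERENT (1ul << 6)

/* Assumed TCR field positions and encodings (stage 1, TTBR0). */
#define TCR_SH0_SHIFT    12
#define TCR_ORGN0_SHIFT  10
#define TCR_IRGN0_SHIFT  8
#define TCR_SH_IS        3u /* inner shareable */
#define TCR_RGN_NC       0u /* non-cacheable walks */
#define TCR_RGN_WBWA     1u /* write-back write-allocate walks */

/* Pick the inner/outer walk cacheability from the quirk. */
static uint64_t tcr_walk_attrs(unsigned long quirks)
{
    uint64_t reg = (uint64_t)TCR_SH_IS << TCR_SH0_SHIFT;
    uint64_t rgn = (quirks & MODEL_QUIRK_NON_COHERENT) ? TCR_RGN_NC
                                                       : TCR_RGN_WBWA;

    reg |= rgn << TCR_IRGN0_SHIFT;
    reg |= rgn << TCR_ORGN0_SHIFT;
    return reg;
}
```

Because the quirk is applied at page table allocation time, it takes
effect for whichever domain passes it in, which is why the per-domain
question above matters.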

[1] https://lore.kernel.org/patchwork/patch/1020906/

Vivek Gautam (2):
  iommu/io-pgtable-arm: Add support for non-coherent page tables
  iommu/arm-smmu: Add support for non-coherent page table mappings

 drivers/iommu/arm-smmu.c   |  7 +++
 drivers/iommu/io-pgtable-arm.c | 17 -
 drivers/iommu/io-pgtable.h |  6 ++
 3 files changed, 25 insertions(+), 5 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables

2019-01-17 Thread Vivek Gautam
From Robin's comment [1] about touching TCR configurations -

"TBH if we're going to touch the TCR attributes at all then we should
probably correct that sloppiness first - there's an occasional argument
for using non-cacheable pagetables even on a coherent SMMU if reducing
snoop traffic/latency on walks outweighs the cost of cache maintenance
on PTE updates, but anyone thinking they can get that by overriding
dma-coherent silently gets the worst of both worlds thanks to this
current TCR value."

We have the IO_PGTABLE_QUIRK_NO_DMA quirk, but we don't force
anybody _not_ using a dma-coherent smmu to have non-cacheable page table
mappings.
Having another quirk flag can help in having non-cacheable memory for
page tables once and for all.

[1] https://lore.kernel.org/patchwork/patch/1020906/

Signed-off-by: Vivek Gautam 
---
 drivers/iommu/io-pgtable-arm.c | 17 -
 drivers/iommu/io-pgtable.h |  6 ++
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 237cacd4a62b..c76919c30f1a 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
struct arm_lpae_io_pgtable *data;
 
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
-   IO_PGTABLE_QUIRK_NON_STRICT))
+   IO_PGTABLE_QUIRK_NON_STRICT |
+   IO_PGTABLE_QUIRK_NON_COHERENT))
return NULL;
 
data = arm_lpae_alloc_pgtable(cfg);
@@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
return NULL;
 
/* TCR */
-   reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
- (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
- (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+   reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT;
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
+   reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
+  ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
+   else
+   reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
+  ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
 
switch (ARM_LPAE_GRANULE(data)) {
case SZ_4K:
@@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, 
void *cookie)
 
/* The NS quirk doesn't apply at stage 2 */
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
-   IO_PGTABLE_QUIRK_NON_STRICT))
+   IO_PGTABLE_QUIRK_NON_STRICT |
+   IO_PGTABLE_QUIRK_NON_COHERENT))
return NULL;
 
data = arm_lpae_alloc_pgtable(cfg);
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 47d5ae559329..46604cf7b017 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -75,6 +75,11 @@ struct io_pgtable_cfg {
 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 *  on unmap, for DMA domains using the flush queue mechanism for
 *  delayed invalidation.
+*
+* IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for
+*  pagetables even on a coherent SMMU for cases where reducing
+*  snoop traffic/latency on walks outweighs the cost of cache
+*  maintenance on PTE updates.
 */
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS   BIT(1)
@@ -82,6 +87,7 @@ struct io_pgtable_cfg {
#define IO_PGTABLE_QUIRK_ARM_MTK_4GBBIT(3)
#define IO_PGTABLE_QUIRK_NO_DMA BIT(4)
#define IO_PGTABLE_QUIRK_NON_STRICT BIT(5)
+   #define IO_PGTABLE_QUIRK_NON_COHERENT   BIT(6)
unsigned long   quirks;
unsigned long   pgsize_bitmap;
unsigned intias;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v1] arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core

2019-01-07 Thread Vivek Gautam
On Thu, Jan 3, 2019 at 6:18 PM Manu Gautam  wrote:
>
> QUSB2 PHY on msm8996 doesn't work well when autosuspend by
> dwc3 core using USB2PHYCFG register is enabled. One of the
> issue seen is that PHY driver reports PLL lock failure and
> fails phy_init() if dwc3 core has USB2 PHY suspend enabled.
> Fix this by using quirks to disable USB2 PHY LPM/suspend and
> dwc3 core already takes care of explicitly suspending PHY
> during suspend if quirks are specified.
>
> Signed-off-by: Manu Gautam 
> ---

This works well for db820c [1].
Tested-by: Vivek Gautam 

[1] https://github.com/vivekgautam1/linux/commits/origin/v4.20-rc5/db820c

Best regards
Vivek

>  arch/arm64/boot/dts/qcom/msm8996.dtsi | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index b29fe80d7288..1f14ca35afc2 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -911,6 +911,8 @@
> interrupts = <0 138 IRQ_TYPE_LEVEL_HIGH>;
> phys = <&hsusb_phy2>;
> phy-names = "usb2-phy";
> +   snps,dis_u2_susphy_quirk;
> +   snps,dis_enblslpm_quirk;
> };
> };
>
> @@ -940,6 +942,8 @@
> interrupts = <0 131 IRQ_TYPE_LEVEL_HIGH>;
> phys = <&hsusb_phy1>, <&ssusb_phy_0>;
> phy-names = "usb2-phy", "usb3-phy";
> +   snps,dis_u2_susphy_quirk;
> +   snps,dis_enblslpm_quirk;
> };
> };
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH v2 1/1] drm/prime: Use sg_dma_len() macro to get sg's length

2019-01-07 Thread Vivek Gautam
After mapping an sg list we should use the sg_dma_address() and
sg_dma_len() macros to read the DMA address and DMA length of each
entry, instead of accessing sg->length directly. Fix this for
sg->length in drm_prime_sg_to_page_addr_arrays().
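
To illustrate why the DMA length matters, here is a small user-space
model, not kernel code. The struct sg_model and fill_dma_addrs() are
hypothetical and only mimic the address-filling loop. When the mapping
merges two 4K entries into one 8K DMA segment, the number of mapped
entries drops to one, and only the DMA length of that entry describes
the whole segment.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for the scatterlist fields used here. */
struct sg_model {
    uint64_t dma_address; /* start of the mapped DMA segment */
    uint32_t length;      /* CPU-side length of this entry */
    uint32_t dma_length;  /* length of the mapped DMA segment */
};

/*
 * Walk the mapped entries and emit one DMA address per page.
 * Returns the number of addresses written, or -1 if addrs is too small.
 */
static int fill_dma_addrs(const struct sg_model *sgl, size_t mapped_nents,
                          uint64_t *addrs, size_t max_entries)
{
    size_t index = 0;

    for (size_t i = 0; i < mapped_nents; i++) {
        uint64_t addr = sgl[i].dma_address;
        uint32_t len = sgl[i].dma_length; /* not sgl[i].length */

        while (len > 0) {
            if (index >= max_entries)
                return -1;
            addrs[index++] = addr;
            addr += MODEL_PAGE_SIZE;
            len = len > MODEL_PAGE_SIZE ? len - MODEL_PAGE_SIZE : 0;
        }
    }
    return (int)index;
}
```

With the CPU-side length, the same walk over one mapped entry would stop
after the first page and leave the second address unset.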

Signed-off-by: Vivek Gautam 
---

Changes since v1:
 - Fixed compilation error: replaced sg_dma_length() with sg_dma_len().

This came up while debugging a dmabuf import issue that we are seeing
on the sdm845 target.
The dmabuf, which is prepared by video (venus in this case), is imported
by the drm device.
The import call flow looks as follows:

drm_gem_prime_import()
 - drm_gem_prime_import_dev()
   - dma_buf_attach() & dma_buf_map_attachment()
 - From dma_buf_map_attachment()
   - vb2_dma_sg_dmabuf_ops_map()
 - dma_map_sg(): this updates the sg->nents.

From debugging, the sg table mapping results in the sg table's 'nents'
being less than the original nents. Now the drm device prepares the page
information based on this sg table, messes up the mappings, and we start
seeing random crashes as below from drm's memory space.

Although this change doesn't fix the issue currently, it seems the
right thing to do.

One thing to notice is that, if we restore the sg->nents to
sg->orig_nents in vb2_dma_sg_dmabuf_ops_map(), we don't see the below
corruptions.

Any pointers on this will be highly appreciated.
Thanks.

--
[  338.070558] Unable to handle kernel paging request at virtual address 
4038
[  338.078751] Mem abort info:
[  338.081671]   ESR = 0x9604
[  338.084860]   Exception class = DABT (current EL), IL = 32 bits
[  338.090972]   SET = 0, FnV = 0
[  338.094139]   EA = 0, S1PTW = 0
[  338.097393] Data abort info:
[  338.100375]   ISV = 0, ISS = 0x0004
[  338.104362]   CM = 0, WnR = 0
[  338.107446] [4038] address between user and kernel address ranges
[  338.114801] Internal error: Oops: 9604 [#1] PREEMPT SMP
[  338.120527] Modules linked in: rfcomm uinput cdc_ether venus_dec venus_enc 
usbnet videobuf2_dma_sg videobuf2_memops hci_uart btqca bluetooth r8152 mii 
ath10k_snoc venus_core ath10k_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common 
ath mac80211 ecdh_generic qcom_q6v5_mss lzo lzo_compress qcom_q6v5_adsp 
qcom_common qcom_q6v5 zram bridge stp llc ipt_MASQUERADE fuse snd_seq_dummy 
snd_seq snd_seq_device cfg80211 joydev
[  338.158192] CPU: 4 PID: 3235 Comm: chrome Tainted: GW 4.19.0 
#2
[  338.165700] Hardware name: Google Cheza (rev1) (DT)
[  338.170720] pstate: 8049 (Nzcv daif +PAN -UAO)
[  338.175660] pc : drm_mm_insert_node_in_range+0xfc/0x348
[  338.181035] lr : drm_mm_insert_node_in_range+0x24/0x348
[  338.186407] sp : ff8013033b30
[  338.189816] x29: ff8013033bd0 x28: ff8008591894
[  338.195275] x27: 0010 x26: 
[  338.200734] x25:  x24: 
[  338.206194] x23:  x22: ffc0f48b7e08
[  338.211656] x21:  x20: 005d
[  338.217118] x19:  x18: 
[  338.222581] x17:  x16: 
[  338.228046] x15:  x14: 
[  338.233511] x13: 0001 x12: ffc0b1da7200
[  338.238978] x11: 0010 x10: 0010
[  338.244437] x9 : 0008 x8 : 4000
[  338.249898] x7 :  x6 : 
[  338.255361] x5 :  x4 : 
[  338.260823] x3 :  x2 : 005d
[  338.266285] x1 : ffc0b1da7100 x0 : ffc0b0215800
[  338.271748] Process chrome (pid: 3235, stack limit = 0x0900f416)
[  338.278628] Call trace:
[  338.281151]  drm_mm_insert_node_in_range+0xfc/0x348
[  338.286168]  msm_gem_map_vma+0x60/0xdc
[  338.290022]  msm_gem_get_iova+0xb4/0xf4
[  338.293967]  msm_ioctl_gem_info+0x90/0xdc
[  338.298089]  drm_ioctl_kernel+0xa8/0xe8
[  338.302043]  drm_ioctl+0x218/0x384
[  338.305547]  drm_compat_ioctl+0xd8/0xe8
[  338.309503]  __arm64_compat_sys_ioctl+0x134/0x20c
[  338.314339]  el0_svc_common+0xa0/0xf0
[  338.318108]  el0_svc_compat_handler+0x2c/0x38
[  338.322588]  el0_svc_compat+0x8/0x18
[  338.326274] Code: f94066c8 aa1f03e0 321d03e9 321c03ea (f9401d0b)
[  338.332538] ---[ end trace 5c09e60869887d87 ]---
[  338.354633] Kernel panic - not syncing: Fatal exception
[  338.360018] SMP: stopping secondary CPUs
[  338.364179] Kernel Offset: disabled
[  338.367779] CPU features: 0x0,22802a18
[  338.371643] Memory Limit: none
--

 drivers/gpu/drm/drm_prime.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 231e3f6d5f41..aa87ba9c0d7b 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -945,7 +945,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, 
struct page **pages,
 
index = 0;
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
-   

Re: [PATCH 1/1] drm/prime: Use sg_dma_len() macro to get sg's length

2019-01-07 Thread Vivek Gautam
On Mon, Jan 7, 2019 at 4:14 PM kbuild test robot  wrote:
>
> Hi Vivek,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [also build test ERROR on v5.0-rc1 next-20190107]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Vivek-Gautam/drm-prime-Use-sg_dma_len-macro-to-get-sg-s-length/20190107-181350
> config: x86_64-randconfig-x013-201901 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
>drivers/gpu/drm/drm_prime.c: In function 
> 'drm_prime_sg_to_page_addr_arrays':
> >> drivers/gpu/drm/drm_prime.c:948:9: error: implicit declaration of function 
> >> 'sg_dma_length'; did you mean 'sg_dma_len'? 
> >> [-Werror=implicit-function-declaration]
>   len = sg_dma_length(sg);
> ^
> sg_dma_len

Sorry, my fat finger :(
This should be as suggested - sg_dma_len().

Thanks
Vivek

>cc1: some warnings being treated as errors
>
> vim +948 drivers/gpu/drm/drm_prime.c
>
>926
>927  /**
>928   * drm_prime_sg_to_page_addr_arrays - convert an sg table into a page 
> array
>929   * @sgt: scatter-gather table to convert
>930   * @pages: optional array of page pointers to store the page array in
>931   * @addrs: optional array to store the dma bus address of each page
>932   * @max_entries: size of both the passed-in arrays
>933   *
>934   * Exports an sg table into an array of pages and addresses. This is 
> currently
>935   * required by the TTM driver in order to do correct fault handling.
>936   */
>937  int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct 
> page **pages,
>938   dma_addr_t *addrs, int 
> max_entries)
>939  {
>940  unsigned count;
>941  struct scatterlist *sg;
>942  struct page *page;
>943  u32 len, index;
>944  dma_addr_t addr;
>945
>946  index = 0;
>947  for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>  > 948  len = sg_dma_length(sg);
>949  page = sg_page(sg);
>950  addr = sg_dma_address(sg);
>951
>952  while (len > 0) {
>953  if (WARN_ON(index >= max_entries))
>954  return -1;
>955  if (pages)
>956  pages[index] = page;
>957  if (addrs)
>958  addrs[index] = addr;
>959
>960  page++;
>961  addr += PAGE_SIZE;
>962  len -= PAGE_SIZE;
>963  index++;
>964  }
>965  }
>966  return 0;
>967  }
>968  EXPORT_SYMBOL(drm_prime_sg_to_page_addr_arrays);
>969
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/1] drm/prime: Use sg_dma_len() macro to get sg's length

2019-01-07 Thread Vivek Gautam
After mapping an sg list we should use the sg_dma_address() and
sg_dma_len() macros to read the DMA address and DMA length of each
entry, instead of accessing sg->length directly. Fix this for
sg->length in drm_prime_sg_to_page_addr_arrays().

Signed-off-by: Vivek Gautam 
---

This came up while debugging a dmabuf import issue that we are seeing
on the sdm845 target.
The dmabuf, which is prepared by video (venus in this case), is imported
by the drm device.
The import call flow looks as follows:

drm_gem_prime_import()
 - drm_gem_prime_import_dev()
   - dma_buf_attach() & dma_buf_map_attachment()
 - From dma_buf_map_attachment()
   - vb2_dma_sg_dmabuf_ops_map()
 - dma_map_sg(): this updates the sg->nents.

From debugging, the sg table mapping results in the sg table's 'nents'
being less than the original nents. Now the drm device prepares the page
information based on this sg table, messes up the mappings, and we start
seeing random crashes as below from drm's memory space.

Although this change doesn't fix the issue currently, it seems the
right thing to do.

One thing to notice is that, if we restore the sg->nents to
sg->orig_nents in vb2_dma_sg_dmabuf_ops_map(), we don't see the below
corruptions.

Any pointers on this will be highly appreciated.
Thanks.

--
[  338.070558] Unable to handle kernel paging request at virtual address 
4038
[  338.078751] Mem abort info:
[  338.081671]   ESR = 0x9604
[  338.084860]   Exception class = DABT (current EL), IL = 32 bits
[  338.090972]   SET = 0, FnV = 0
[  338.094139]   EA = 0, S1PTW = 0
[  338.097393] Data abort info:
[  338.100375]   ISV = 0, ISS = 0x0004
[  338.104362]   CM = 0, WnR = 0
[  338.107446] [4038] address between user and kernel address ranges
[  338.114801] Internal error: Oops: 9604 [#1] PREEMPT SMP
[  338.120527] Modules linked in: rfcomm uinput cdc_ether venus_dec venus_enc 
usbnet videobuf2_dma_sg videobuf2_memops hci_uart btqca bluetooth r8152 mii 
ath10k_snoc venus_core ath10k_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common 
ath mac80211 ecdh_generic qcom_q6v5_mss lzo lzo_compress qcom_q6v5_adsp 
qcom_common qcom_q6v5 zram bridge stp llc ipt_MASQUERADE fuse snd_seq_dummy 
snd_seq snd_seq_device cfg80211 joydev
[  338.158192] CPU: 4 PID: 3235 Comm: chrome Tainted: GW 4.19.0 
#2
[  338.165700] Hardware name: Google Cheza (rev1) (DT)
[  338.170720] pstate: 8049 (Nzcv daif +PAN -UAO)
[  338.175660] pc : drm_mm_insert_node_in_range+0xfc/0x348
[  338.181035] lr : drm_mm_insert_node_in_range+0x24/0x348
[  338.186407] sp : ff8013033b30
[  338.189816] x29: ff8013033bd0 x28: ff8008591894
[  338.195275] x27: 0010 x26: 
[  338.200734] x25:  x24: 
[  338.206194] x23:  x22: ffc0f48b7e08
[  338.211656] x21:  x20: 005d
[  338.217118] x19:  x18: 
[  338.222581] x17:  x16: 
[  338.228046] x15:  x14: 
[  338.233511] x13: 0001 x12: ffc0b1da7200
[  338.238978] x11: 0010 x10: 0010
[  338.244437] x9 : 0008 x8 : 4000
[  338.249898] x7 :  x6 : 
[  338.255361] x5 :  x4 : 
[  338.260823] x3 :  x2 : 005d
[  338.266285] x1 : ffc0b1da7100 x0 : ffc0b0215800
[  338.271748] Process chrome (pid: 3235, stack limit = 0x0900f416)
[  338.278628] Call trace:
[  338.281151]  drm_mm_insert_node_in_range+0xfc/0x348
[  338.286168]  msm_gem_map_vma+0x60/0xdc
[  338.290022]  msm_gem_get_iova+0xb4/0xf4
[  338.293967]  msm_ioctl_gem_info+0x90/0xdc
[  338.298089]  drm_ioctl_kernel+0xa8/0xe8
[  338.302043]  drm_ioctl+0x218/0x384
[  338.305547]  drm_compat_ioctl+0xd8/0xe8
[  338.309503]  __arm64_compat_sys_ioctl+0x134/0x20c
[  338.314339]  el0_svc_common+0xa0/0xf0
[  338.318108]  el0_svc_compat_handler+0x2c/0x38
[  338.322588]  el0_svc_compat+0x8/0x18
[  338.326274] Code: f94066c8 aa1f03e0 321d03e9 321c03ea (f9401d0b)
[  338.332538] ---[ end trace 5c09e60869887d87 ]---
[  338.354633] Kernel panic - not syncing: Fatal exception
[  338.360018] SMP: stopping secondary CPUs
[  338.364179] Kernel Offset: disabled
[  338.367779] CPU features: 0x0,22802a18
[  338.371643] Memory Limit: none
--

 drivers/gpu/drm/drm_prime.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 231e3f6d5f41..0d9b1c43523a 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -945,7 +945,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, 
struct page **pages,
 
index = 0;
for_each_sg(sgt->sgl, sg, sgt->nents, count) {
-   len = sg->length;
+   len = sg_dma_length(sg);

Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2019-01-01 Thread Vivek Gautam
Hi Robin,

On Fri, Dec 7, 2018 at 2:54 PM Vivek Gautam  wrote:
>
> Hi Robin,
>
> On Tue, Dec 4, 2018 at 8:51 PM Robin Murphy  wrote:
> >
> > On 04/12/2018 11:01, Vivek Gautam wrote:
> > > Qualcomm SoCs have an additional level of cache called as
> > > System cache, aka. Last level cache (LLC). This cache sits right
> > > before the DDR, and is tightly coupled with the memory controller.
> > > The cache is available to all the clients present in the SoC system.
> > > The clients request their slices from this system cache, make it
> > > active, and can then start using it.
> > > For these clients with smmu, to start using the system cache for
> > > buffers and, related page tables [1], memory attributes need to be
> > > set accordingly.
> > > This change updates the MAIR and TCR configurations with correct
> > > attributes to use this system cache.
> > >
> > > To explain a little about memory attribute requirements here:
> > >
> > > Non-coherent I/O devices can't look-up into inner caches. However,
> > > coherent I/O devices can. But both can allocate in the system cache
> > > based on system policy and configured memory attributes in page
> > > tables.
> > > CPUs can access both inner and outer caches (including system cache,
> > > aka. Last level cache), and can allocate into system cache too
> > > based on memory attributes, and system policy.
> > >
> > > Further looking at memory types, we have following -
> > > a) Normal uncached :- MAIR 0x44, inner non-cacheable,
> > >outer non-cacheable;
> > > b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
> > >outer read write-back non-transient;
> > > >attribute setting for coherent I/O devices.
> > >
> > > and, for non-coherent i/o devices that can allocate in system cache
> > > another type gets added -
> > > c) Normal sys-cached/non-inner-cached :-
> > >MAIR 0xf4, inner non-cacheable,
> > >outer read write-back non-transient
> > >
> > > So, CPU will automatically use the system cache for memory marked as
> > > normal cached. The normal sys-cached is downgraded to normal non-cached
> > > memory for CPUs.
> > > Coherent I/O devices can use system cache by marking the memory as
> > > normal cached.
> > > Non-coherent I/O devices, to use system cache, should mark the memory as
> > > normal sys-cached in page tables.
> > >
> > > This change is a realisation of following changes
> > > from downstream msm-4.9:
> > > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> > >
> > > [1] https://patchwork.kernel.org/patch/10302791/
> > > [2] 
> > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
> > > [3] 
> > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > >
> > > Signed-off-by: Vivek Gautam 
> > > ---
> > >
> > > Changes since v1:
> > >   - Addressed Tomasz's comments for basing the change on
> > > "NO_INNER_CACHE" concept for non-coherent I/O devices
> > > rather than capturing "SYS_CACHE". This is to indicate
> > > clearly the intent of non-coherent I/O devices that
> > > can't access inner caches.
> >
> > That seems backwards to me - there is already a fundamental assumption
> > that non-coherent devices can't access caches. What we're adding here is
> > a weird exception where they *can* use some level of cache despite still
> > being non-coherent overall.
> >
> > In other words, it's not a case of downgrading coherent devices'
> > accesses to bypass inner caches, it's upgrading non-coherent devices'
> > accesses to hit the outer cache. That's certainly the understanding I
> > got from talking with Pratik at Plumbers, and it does appear to fit with
> > your explanation above despite the final conclusion you draw being
> > different.
>
> Thanks for the thorough review of the change.
> Right, I guess it's rather an upgrade for non-coherent devices to use
> an outer cache than a downgrade for coherent devices.
>
> >
> > I do see what Toma

Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2019-01-01 Thread Vivek Gautam
On Thu, Dec 13, 2018 at 9:20 AM Tomasz Figa  wrote:
>
> On Fri, Dec 7, 2018 at 6:25 PM Vivek Gautam  
> wrote:
> >
> > Hi Robin,
> >
> > On Tue, Dec 4, 2018 at 8:51 PM Robin Murphy  wrote:
> > >
> > > On 04/12/2018 11:01, Vivek Gautam wrote:
> > > > Qualcomm SoCs have an additional level of cache called as
> > > > System cache, aka. Last level cache (LLC). This cache sits right
> > > > before the DDR, and is tightly coupled with the memory controller.
> > > > The cache is available to all the clients present in the SoC system.
> > > > The clients request their slices from this system cache, make it
> > > > active, and can then start using it.
> > > > For these clients with smmu, to start using the system cache for
> > > > buffers and, related page tables [1], memory attributes need to be
> > > > set accordingly.
> > > > This change updates the MAIR and TCR configurations with correct
> > > > attributes to use this system cache.
> > > >
> > > > To explain a little about memory attribute requirements here:
> > > >
> > > > Non-coherent I/O devices can't look-up into inner caches. However,
> > > > coherent I/O devices can. But both can allocate in the system cache
> > > > based on system policy and configured memory attributes in page
> > > > tables.
> > > > CPUs can access both inner and outer caches (including system cache,
> > > > aka. Last level cache), and can allocate into system cache too
> > > > based on memory attributes, and system policy.
> > > >
> > > > Further looking at memory types, we have following -
> > > > a) Normal uncached :- MAIR 0x44, inner non-cacheable,
> > > >outer non-cacheable;
> > > > b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
> > > >outer read write-back non-transient;
> > > >attribute setting for coherent I/O devices.
> > > >
> > > > and, for non-coherent i/o devices that can allocate in system cache
> > > > another type gets added -
> > > > c) Normal sys-cached/non-inner-cached :-
> > > >MAIR 0xf4, inner non-cacheable,
> > > >outer read write-back non-transient
> > > >
> > > > So, CPU will automatically use the system cache for memory marked as
> > > > normal cached. The normal sys-cached is downgraded to normal non-cached
> > > > memory for CPUs.
> > > > Coherent I/O devices can use system cache by marking the memory as
> > > > normal cached.
> > > > Non-coherent I/O devices, to use system cache, should mark the memory as
> > > > normal sys-cached in page tables.
> > > >
> > > > This change is a realisation of following changes
> > > > from downstream msm-4.9:
> > > > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > > > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> > > >
> > > > [1] https://patchwork.kernel.org/patch/10302791/
> > > > [2] 
> > > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
> > > > [3] 
> > > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > > >
> > > > Signed-off-by: Vivek Gautam 
> > > > ---
> > > >
> > > > Changes since v1:
> > > >   - Addressed Tomasz's comments for basing the change on
> > > > "NO_INNER_CACHE" concept for non-coherent I/O devices
> > > > rather than capturing "SYS_CACHE". This is to indicate
> > > > clearly the intent of non-coherent I/O devices that
> > > > can't access inner caches.
> > >
> > > That seems backwards to me - there is already a fundamental assumption
> > > that non-coherent devices can't access caches. What we're adding here is
> > > a weird exception where they *can* use some level of cache despite still
> > > being non-coherent overall.
> > >
> > > In other words, it's not a case of downgrading coherent devices'
> > > accesses to bypass inner caches, it's upgrading non-coherent devices'
> > > > accesses to hit the outer cache. That's

Re: [PATCH v1] phy: qcom-ufs: Use iopoll.h readl_poll_timeout macro

2019-01-01 Thread Vivek Gautam
On Fri, Dec 21, 2018 at 9:43 PM Marc Gonzalez  wrote:
>
> The private copy of readl_poll_timeout is no longer needed.
> Use the implementation in iopoll.h instead.
>
> Signed-off-by: Marc Gonzalez 
> ---
>  drivers/phy/qualcomm/phy-qcom-ufs-i.h | 19 +--
>  1 file changed, 1 insertion(+), 18 deletions(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-ufs-i.h 
> b/drivers/phy/qualcomm/phy-qcom-ufs-i.h
> index 681644e43248..f798fb64de94 100644
> --- a/drivers/phy/qualcomm/phy-qcom-ufs-i.h
> +++ b/drivers/phy/qualcomm/phy-qcom-ufs-i.h
> @@ -23,24 +23,7 @@
>  #include 
>  #include 
>  #include 
> -
> -#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \
> -({ \
> -   ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \
> -   might_sleep_if(timeout_us); \
> -   for (;;) { \
> -   (val) = readl(addr); \
> -   if (cond) \
> -   break; \
> -   if (timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \
> -   (val) = readl(addr); \
> -   break; \
> -   } \
> -   if (sleep_us) \
> -   usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \
> -   } \
> -   (cond) ? 0 : -ETIMEDOUT; \
> -})
> +#include <linux/iopoll.h>
>
>  #define UFS_QCOM_PHY_CAL_ENTRY(reg, val)   \
> {   \
> --
> 2.17.1

Thanks for the patch. LGTM.
Reviewed-by: Vivek Gautam 
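
For readers without the kernel tree at hand, the loop that the removed private macro implemented can be sketched in plain user-space C. This is only an illustration under stated assumptions: `poll_timeout`, `read_fn`, `now_us` and `fake_read` are hypothetical names, not kernel APIs, and the sleep is a plain `nanosleep()` rather than the `usleep_range()` call quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register accessor standing in for readl(); not a kernel API. */
typedef uint32_t (*read_fn)(void *ctx);

/* Monotonic time in microseconds. */
static int64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Poll until (value & mask) == want or timeout_us elapses.
 * Returns 0 on success, -ETIMEDOUT otherwise; *val holds the last value read.
 * timeout_us == 0 means "poll forever", as in the quoted macro.
 */
static int poll_timeout(read_fn rd, void *ctx, uint32_t mask, uint32_t want,
			uint32_t *val, long sleep_us, long timeout_us)
{
	int64_t deadline = now_us() + timeout_us;

	for (;;) {
		*val = rd(ctx);
		if ((*val & mask) == want)
			break;
		if (timeout_us && now_us() > deadline) {
			*val = rd(ctx);	/* one final read after the deadline */
			break;
		}
		if (sleep_us) {
			struct timespec d = {
				.tv_sec = sleep_us / 1000000,
				.tv_nsec = (sleep_us % 1000000) * 1000,
			};
			nanosleep(&d, NULL);
		}
	}
	return ((*val & mask) == want) ? 0 : -ETIMEDOUT;
}

/* Fake device for demonstration: reports ready (bit 0) on the third read. */
static uint32_t fake_read(void *ctx)
{
	int *reads = ctx;

	return ++(*reads) >= 3 ? 0x1 : 0x0;
}
```

The final read after the deadline matters: the condition may have become true while the caller was sleeping, and the return value is computed from that last sample.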

Best regards
Vivek

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [RESEND PATCH v4 1/1] dt-bindings: arm-smmu: Add binding doc for Qcom smmu-500

2018-12-16 Thread Vivek Gautam
On Thu, Dec 13, 2018 at 4:16 PM Will Deacon  wrote:
>
> On Thu, Dec 13, 2018 at 02:35:07PM +0530, Vivek Gautam wrote:
> > Qcom's implementation of arm,mmu-500 works well with current
> > arm-smmu driver implementation. Adding a soc specific compatible
> > along with arm,mmu-500 makes the bindings future safe.
> >
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Rob Herring 
> > Cc: Will Deacon 
> > ---
> >
> > Hi Joerg,
> > I am picking this out separately from the sdm845 smmu support
> > series [1], so that this can go through iommu tree.
> > The dt patch from the series [1] can be taken through arm-soc tree.
> >
> > Hi Will,
> > As asked [2], here's the resend version of dt binding patch for sdm845.
> > Kindly ack this so that Joerg can pull this in.
>
> Acked-by: Will Deacon 

Thanks a lot Will for the Ack.

Regards
Vivek
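
As a rough illustration of why the SoC-specific string is listed ahead of "arm,mmu-500", here is a simplified sketch in plain C. It is not the kernel's OF matching code: `first_match` and both match tables are hypothetical, and the real matcher scores candidates rather than returning the first hit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the first entry of the node's compatible list (most specific first)
 * that also appears in the driver's match table, or NULL if none does.
 */
static const char *first_match(const char *const *node_compat, size_t n_node,
			       const char *const *drv_table, size_t n_drv)
{
	for (size_t i = 0; i < n_node; i++)
		for (size_t j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], drv_table[j]) == 0)
				return node_compat[i];
	return NULL;
}

/* Compatible list taken from the binding text in this patch. */
static const char *const sdm845_smmu[] = {
	"qcom,sdm845-smmu-500", "arm,mmu-500",
};

/* Hypothetical table: a driver that only knows the generic string ... */
static const char *const generic_drv[] = { "arm,mmu-500" };
/* ... and one that later gains an SoC-specific entry. */
static const char *const quirk_drv[] = { "arm,mmu-500", "qcom,sdm845-smmu-500" };
```

With the generic table the node still binds through "arm,mmu-500". Once a driver learns the SoC-specific string, the same device tree matches on it without any change, which is the "future safe" property the commit message refers to.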

>
> Joerg -- please can you take this on top of the pull request I sent already?
> Vivek included it as part of a separate series which I thought was going
> via arm-soc, but actually it needs to go with the other arm-smmu patches
> in order to avoid conflicts.
>
> Cheers,
>
> Will
>
> >  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> > b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> > index a6504b37cc21..3133f3ba7567 100644
> > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> > @@ -27,6 +27,10 @@ conditions.
> >"qcom,msm8996-smmu-v2", "qcom,smmu-v2",
> >"qcom,sdm845-smmu-v2", "qcom,smmu-v2".
> >
> > +  Qcom SoCs implementing "arm,mmu-500" must also include,
> > +  as below, SoC-specific compatibles:
> > +  "qcom,sdm845-smmu-500", "arm,mmu-500"
> > +
> >  - reg   : Base address and size of the SMMU.
> >
> >  - #global-interrupts : The number of global interrupts exposed by the
> > --
> > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> > of Code Aurora Forum, hosted by The Linux Foundation
> >





[RESEND PATCH v4 1/1] dt-bindings: arm-smmu: Add binding doc for Qcom smmu-500

2018-12-13 Thread Vivek Gautam
Qcom's implementation of arm,mmu-500 works well with current
arm-smmu driver implementation. Adding a soc specific compatible
along with arm,mmu-500 makes the bindings future safe.

Signed-off-by: Vivek Gautam 
Reviewed-by: Rob Herring 
Cc: Will Deacon 
---

Hi Joerg,
I am picking this out separately from the sdm845 smmu support
series [1], so that this can go through iommu tree.
The dt patch from the series [1] can be taken through arm-soc tree.

Hi Will,
As asked [2], here's the resend version of dt binding patch for sdm845.
Kindly ack this so that Joerg can pull this in.

Thanks
Vivek

[1] https://patchwork.kernel.org/cover/10636359/
[2] https://patchwork.kernel.org/patch/10636363/

 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index a6504b37cc21..3133f3ba7567 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -27,6 +27,10 @@ conditions.
   "qcom,msm8996-smmu-v2", "qcom,smmu-v2",
   "qcom,sdm845-smmu-v2", "qcom,smmu-v2".
 
+  Qcom SoCs implementing "arm,mmu-500" must also include,
+  as below, SoC-specific compatibles:
+  "qcom,sdm845-smmu-500", "arm,mmu-500"
+
 - reg   : Base address and size of the SMMU.
 
 - #global-interrupts : The number of global interrupts exposed by the



[PATCH 2/2] arm64: dts: msm8996: Add display smmu node

2018-12-12 Thread Vivek Gautam
From: Archit Taneja 

Add device node for display smmu, aka. mdp_smmu.

Signed-off-by: Archit Taneja 
Signed-off-by: Vivek Gautam 
---
 arch/arm64/boot/dts/qcom/msm8996.dtsi | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 197e186eac10..949e3b99fda4 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -1121,6 +1121,23 @@
power-domains = <&mmcc GPU_GDSC>;
};
 
+   mdp_smmu: arm,smmu@d0 {
+   compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
+   reg = <0xd0 0x1>;
+
+   #global-interrupts = <1>;
+   interrupts = ,
+,
+;
+   #iommu-cells = <1>;
+
+   clocks = <&mmcc SMMU_MDP_AHB_CLK>,
+<&mmcc SMMU_MDP_AXI_CLK>;
+   clock-names = "iface", "bus";
+
+   power-domains = <&mmcc MDSS_GDSC>;
+   };
+
agnoc@0 {
power-domains = <&gcc AGGRE0_NOC_GDSC>;
compatible = "simple-pm-bus";



[PATCH 1/2] arm64: dts: msm8996: Add graphics smmu node

2018-12-12 Thread Vivek Gautam
From: Jordan Crouse 

Add device node for graphics smmu, aka. adreno_smmu.

Signed-off-by: Jordan Crouse 
Signed-off-by: Vivek Gautam 
---
 arch/arm64/boot/dts/qcom/msm8996.dtsi | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 99b7495455a6..197e186eac10 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -1104,6 +1104,23 @@
};
};
 
+   adreno_smmu: arm,smmu@b4 {
+   compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
+   reg = <0xb4 0x1>;
+
+   #global-interrupts = <1>;
+   interrupts = ,
+,
+;
+   #iommu-cells = <1>;
+
+   clocks = <&mmcc GPU_AHB_CLK>,
+<&gcc GCC_MMSS_BIMC_GFX_CLK>;
+   clock-names = "iface", "bus";
+
+   power-domains = <&mmcc GPU_GDSC>;
+   };
+
agnoc@0 {
power-domains = <&gcc AGGRE0_NOC_GDSC>;
compatible = "simple-pm-bus";



[PATCH 0/2] arm64: dts: msm8996: Add display and graphics smmu

2018-12-12 Thread Vivek Gautam
The driver-side patches have now been pulled in [1], so we can enable
the SMMUs used by display and graphics.
These patches have been sitting in my test tree [2] for a while, and work
well with display and GPU enabled on msm8996.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates
[2] https://github.com/vivekgautam1/linux/tree/origin/v4.20-rc5/db820c

Archit Taneja (1):
  arm64: dts: msm8996: Add display smmu node

Jordan Crouse (1):
  arm64: dts: msm8996: Add graphics smmu node

 arch/arm64/boot/dts/qcom/msm8996.dtsi | 34 ++
 1 file changed, 34 insertions(+)




Re: [PATCH v4 1/2] dt-bindings: arm-smmu: Add binding doc for Qcom smmu-500

2018-12-12 Thread Vivek Gautam
Hi Will,

On Fri, Oct 12, 2018 at 11:37 AM Vivek Gautam
 wrote:
>
>
>
> On 10/12/2018 3:46 AM, Rob Herring wrote:
> > On Thu, 11 Oct 2018 15:19:29 +0530, Vivek Gautam wrote:
> >> Qcom's implementation of arm,mmu-500 works well with current
> >> arm-smmu driver implementation. Adding a soc specific compatible
> >> along with arm,mmu-500 makes the bindings future safe.
> >>
> >> Signed-off-by: Vivek Gautam 
> >> ---
> >>
> >> Changes since v3:
> >>   - Refined language more to state things directly for the bindings
> >> description.
> >>
> >>   Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
> >>   1 file changed, 4 insertions(+)
> >>
> > Reviewed-by: Rob Herring 
>
> Thank you Rob.
>

Can you please pick this one as well to your tree? This goes on top of the
bindings patch for "qcom,smmu-v2". So, it can't go through Andy's tree.
Will ask Andy to pick the second patch of the series, that adds the dt node.

I guess that because I sent this one along with the dt patch, I
mistakenly added you to the 'cc' list rather than the 'to' list.
Let me know if you would like me to resend it.

Thanks
Vivek



Re: [PATCH 1/1] media: venus: core: Set dma maximum segment size

2018-12-07 Thread Vivek Gautam



On 12/7/2018 3:38 PM, Stanimir Varbanov wrote:

Hi Vivek,

Thanks for the patch!

On 12/5/18 10:31 AM, Vivek Gautam wrote:

Turning on CONFIG_DMA_API_DEBUG_SG results in the following error:

[  460.308650] [ cut here ]
[  460.313490] qcom-venus aa0.video-codec: DMA-API: mapping sg segment longer than device claims to support [len=4194304] [max=65536]
[  460.326017] WARNING: CPU: 3 PID: 3555 at src/kernel/dma/debug.c:1301 debug_dma_map_sg+0x174/0x254
[  460.33] Modules linked in: venus_dec venus_enc videobuf2_dma_sg videobuf2_memops hci_uart btqca bluetooth venus_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common ath10k_snoc ath10k_core ath lzo lzo_compress zram joydev
[  460.375811] CPU: 3 PID: 3555 Comm: V4L2DecoderThre Tainted: G        W         4.19.1 #82
[  460.384223] Hardware name: Google Cheza (rev1) (DT)
[  460.389251] pstate: 6049 (nZCv daif +PAN -UAO)
[  460.394191] pc : debug_dma_map_sg+0x174/0x254
[  460.398680] lr : debug_dma_map_sg+0x174/0x254
[  460.403162] sp : ff80200c37d0
[  460.406583] x29: ff80200c3830 x28: 0001
[  460.412056] x27:  x26: ffc0f785ea80
[  460.417532] x25:  x24: ffc0f4ea1290
[  460.423001] x23: ffc09e700300 x22: ffc0f4ea1290
[  460.428470] x21: ff8009037000 x20: 0001
[  460.433936] x19: ff80091b x18: 
[  460.439411] x17:  x16: f251
[  460.444885] x15: 0006 x14: 0720072007200720
[  460.450354] x13: ff800af536e0 x12: 
[  460.455822] x11:  x10: 
[  460.461288] x9 : 537944d9c6c48d00 x8 : 537944d9c6c48d00
[  460.466758] x7 :  x6 : ffc0f8d98f80
[  460.472230] x5 :  x4 : 
[  460.477703] x3 : 008a x2 : ffc0fdb13948
[  460.483170] x1 : ffc0fdb0b0b0 x0 : 007a
[  460.488640] Call trace:
[  460.491165]  debug_dma_map_sg+0x174/0x254
[  460.495307]  vb2_dma_sg_alloc+0x260/0x2dc [videobuf2_dma_sg]
[  460.501150]  __vb2_queue_alloc+0x164/0x374 [videobuf2_common]
[  460.507076]  vb2_core_reqbufs+0xfc/0x23c [videobuf2_common]
[  460.512815]  vb2_reqbufs+0x44/0x5c [videobuf2_v4l2]
[  460.517853]  v4l2_m2m_reqbufs+0x44/0x78 [v4l2_mem2mem]
[  460.523144]  v4l2_m2m_ioctl_reqbufs+0x1c/0x28 [v4l2_mem2mem]
[  460.528976]  v4l_reqbufs+0x30/0x40
[  460.532480]  __video_do_ioctl+0x36c/0x454
[  460.536610]  video_usercopy+0x25c/0x51c
[  460.540572]  video_ioctl2+0x38/0x48
[  460.544176]  v4l2_ioctl+0x60/0x74
[  460.547602]  do_video_ioctl+0x948/0x3520
[  460.551648]  v4l2_compat_ioctl32+0x60/0x98
[  460.555872]  __arm64_compat_sys_ioctl+0x134/0x20c
[  460.560718]  el0_svc_common+0x9c/0xe4
[  460.564498]  el0_svc_compat_handler+0x2c/0x38
[  460.568982]  el0_svc_compat+0x8/0x18
[  460.572672] ---[ end trace ce209b87b2f3af88 ]---

From the above warning one would deduce that the sg segment will overflow
the device's capacity. In reality, the hardware can accommodate larger
sg segments.
So, initialize the max segment size properly to weed out this warning.

Based on a similar patch sent by Sean Paul for mdss:
https://patchwork.kernel.org/patch/10671457/

Signed-off-by: Vivek Gautam 
---
  drivers/media/platform/qcom/venus/core.c | 8 
  1 file changed, 8 insertions(+)

Acked-by: Stanimir Varbanov 



Thanks Stan.
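
The check behind the warning can be approximated in a few lines of plain C. This is a sketch only: `first_oversized` is a hypothetical helper, the two lengths come from the log above, and raising the limit to a 32-bit value is an assumption, since the diff itself is not reproduced in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Limit reported as max=65536 in the log above, in effect until the driver sets its own. */
#define DEFAULT_MAX_SEG_SIZE 65536u

/* Return the index of the first segment longer than max_seg, or -1 if all fit. */
static long first_oversized(const unsigned int *seg_len, size_t n,
			    unsigned int max_seg)
{
	for (size_t i = 0; i < n; i++)
		if (seg_len[i] > max_seg)
			return (long)i;
	return -1;
}
```

With the default limit a 4194304-byte segment is flagged. Once the device advertises a larger maximum, the same scatterlist passes the check.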


Best regards
Vivek



Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-12-07 Thread Vivek Gautam
Hi Robin,

On Tue, Dec 4, 2018 at 8:51 PM Robin Murphy  wrote:
>
> On 04/12/2018 11:01, Vivek Gautam wrote:
> > Qualcomm SoCs have an additional level of cache called as
> > System cache, aka. Last level cache (LLC). This cache sits right
> > before the DDR, and is tightly coupled with the memory controller.
> > The cache is available to all the clients present in the SoC system.
> > The clients request their slices from this system cache, make it
> > active, and can then start using it.
> > For these clients with smmu, to start using the system cache for
> > buffers and, related page tables [1], memory attributes need to be
> > set accordingly.
> > This change updates the MAIR and TCR configurations with correct
> > attributes to use this system cache.
> >
> > To explain a little about memory attribute requirements here:
> >
> > Non-coherent I/O devices can't look-up into inner caches. However,
> > coherent I/O devices can. But both can allocate in the system cache
> > based on system policy and configured memory attributes in page
> > tables.
> > CPUs can access both inner and outer caches (including system cache,
> > aka. Last level cache), and can allocate into system cache too
> > based on memory attributes, and system policy.
> >
> > Further looking at memory types, we have following -
> > a) Normal uncached :- MAIR 0x44, inner non-cacheable,
> >outer non-cacheable;
> > b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
> >outer read write-back non-transient;
> >attribute setting for coherent I/O devices.
> >
> > and, for non-coherent i/o devices that can allocate in system cache
> > another type gets added -
> > c) Normal sys-cached/non-inner-cached :-
> >MAIR 0xf4, inner non-cacheable,
> >outer read write-back non-transient
> >
> > So, CPU will automatically use the system cache for memory marked as
> > normal cached. The normal sys-cached is downgraded to normal non-cached
> > memory for CPUs.
> > Coherent I/O devices can use system cache by marking the memory as
> > normal cached.
> > Non-coherent I/O devices, to use system cache, should mark the memory as
> > normal sys-cached in page tables.
> >
> > This change is a realisation of following changes
> > from downstream msm-4.9:
> > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> >
> > [1] https://patchwork.kernel.org/patch/10302791/
> > [2] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
> > [3] 
> > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> >
> > Signed-off-by: Vivek Gautam 
> > ---
> >
> > Changes since v1:
> >   - Addressed Tomasz's comments for basing the change on
> > "NO_INNER_CACHE" concept for non-coherent I/O devices
> > rather than capturing "SYS_CACHE". This is to indicate
> > clearly the intent of non-coherent I/O devices that
> > can't access inner caches.
>
> That seems backwards to me - there is already a fundamental assumption
> that non-coherent devices can't access caches. What we're adding here is
> a weird exception where they *can* use some level of cache despite still
> being non-coherent overall.
>
> In other words, it's not a case of downgrading coherent devices'
> accesses to bypass inner caches, it's upgrading non-coherent devices'
> accesses to hit the outer cache. That's certainly the understanding I
> got from talking with Pratik at Plumbers, and it does appear to fit with
> your explanation above despite the final conclusion you draw being
> different.

Thanks for the thorough review of the change.
Right, I guess it's rather an upgrade for non-coherent devices to use
an outer cache than a downgrade for coherent devices.
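
To make the three attribute values in the commit message easier to compare, here is a minimal decoding sketch in plain C (not kernel code). It assumes the usual MAIR layout, with the outer attribute in bits [7:4] and the inner attribute in bits [3:0], and only handles the two nibble encodings mentioned in this thread. All function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Cacheability of one 4-bit MAIR field, reduced to the cases used in the thread. */
enum cache_attr { ATTR_NON_CACHEABLE, ATTR_WRITE_BACK, ATTR_OTHER };

static enum cache_attr decode_nibble(uint8_t nibble)
{
	if (nibble == 0x4)
		return ATTR_NON_CACHEABLE;	/* normal, non-cacheable */
	if (nibble == 0xf)
		return ATTR_WRITE_BACK;		/* write-back, non-transient, R+W allocate */
	return ATTR_OTHER;			/* encodings not discussed here */
}

/* Outer attribute lives in bits [7:4], inner attribute in bits [3:0]. */
static enum cache_attr mair_outer(uint8_t attr) { return decode_nibble(attr >> 4); }
static enum cache_attr mair_inner(uint8_t attr) { return decode_nibble(attr & 0xf); }

/* Name the three memory types listed in the commit message. */
static const char *mair_type_name(uint8_t attr)
{
	enum cache_attr in = mair_inner(attr), out = mair_outer(attr);

	if (in == ATTR_NON_CACHEABLE && out == ATTR_NON_CACHEABLE)
		return "normal uncached";
	if (in == ATTR_WRITE_BACK && out == ATTR_WRITE_BACK)
		return "normal cached";
	if (in == ATTR_NON_CACHEABLE && out == ATTR_WRITE_BACK)
		return "normal sys-cached";
	return "other";
}
```

Read this way, 0xf4 differs from 0x44 only in the outer nibble, which matches the "upgrade for non-coherent devices" reading: the inner caches stay bypassed and only the outer (system) cache is enabled.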

>
> I do see what Tomasz meant in terms of the TCR attributes, but what we
> currently do there is a little unintuitive and not at all representative
> of actual mapping attributes - I'll come back to that inline.
>
> >   drivers/iommu/arm-smmu.c   | 15 +++
> >   drivers/iommu/dma-iommu.c  |  3 +++
> >   drivers/iommu/io-pgtable-arm.c | 22 +-
> >   drivers/iommu/io-pgtable.h |  5 +
> >   include/linux/iommu.h  | 

Re: [PATCH v6 2/5] phy: qcom-qmp: Utilize fully-specified DT registers

2018-12-07 Thread Vivek Gautam
*
>  * Get memory resources for each phy lane:
> -* Resources are indexed as: tx -> 0; rx -> 1; pcs -> 2; and
> -* pcs_misc (optional) -> 3.
> +* Resources are indexed as: tx -> 0; rx -> 1; pcs -> 2.
> +* For dual lane PHYs: tx2 -> 3, rx2 -> 4, pcs_misc (optional) -> 5
> +* For single lane PHYs: pcs_misc (optional) -> 3.
>  */
> qphy->tx = of_iomap(np, 0);
> if (!qphy->tx)
> @@ -1630,7 +1630,32 @@ int qcom_qmp_phy_create(struct device *dev, struct 
> device_node *np, int id)
> if (!qphy->pcs)
> return -ENOMEM;
>
> -   qphy->pcs_misc = of_iomap(np, 3);
> +   /*
> +* If this is a dual-lane PHY, then there should be registers for the
> +* second lane. Some old device trees did not specify this, so fall
> +* back to old legacy behavior of assuming they can be reached at an
> +* offset from the first lane.
> +*/
> +   if (qmp->cfg->is_dual_lane_phy) {
> +   qphy->tx2 = of_iomap(np, 3);
> +   qphy->rx2 = of_iomap(np, 4);
> +   if (!qphy->tx2 || !qphy->rx2) {
> +   dev_warn(dev,
> +"Underspecified device tree, falling back to 
> legacy register regions\n");
> +
> +   /* In the old version, pcs_misc is at index 3. */
> +   qphy->pcs_misc = qphy->tx2;
> +   qphy->tx2 = qphy->tx + QMP_PHY_LEGACY_LANE_STRIDE;
> +   qphy->rx2 = qphy->rx + QMP_PHY_LEGACY_LANE_STRIDE;
> +
> +   } else {
> +   qphy->pcs_misc = of_iomap(np, 5);
> +   }
> +
> +   } else {
> +   qphy->pcs_misc = of_iomap(np, 3);
> +   }
> +
> if (!qphy->pcs_misc)
> dev_vdbg(dev, "PHY pcs_misc-reg not used\n");
>
> --
> 2.18.1
>

Tested on db820c [1]. USB, PCIe come up.
Tested-by: Vivek Gautam 

[1] https://github.com/vivekgautam1/linux/tree/origin/v4.20-rc5/db820c
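
The fallback in the quoted hunk can be modelled outside the kernel as follows. This is a sketch under assumptions: `regions[i]` stands in for `of_iomap(np, i)`, and `LEGACY_LANE_STRIDE` is a made-up value, because the real `QMP_PHY_LEGACY_LANE_STRIDE` definition is not part of the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stride, for illustration only. */
#define LEGACY_LANE_STRIDE 0x400

struct lane_regs {
	uint8_t *tx, *rx, *pcs;
	uint8_t *tx2, *rx2;	/* second lane, dual-lane PHYs only */
	uint8_t *pcs_misc;	/* optional, may be NULL */
};

/*
 * regions[i] models of_iomap(np, i): a mapped base, or NULL when the device
 * tree has no i-th "reg" entry.
 */
static void resolve_lanes(struct lane_regs *q, uint8_t *const regions[6],
			  bool dual_lane)
{
	q->tx = regions[0];
	q->rx = regions[1];
	q->pcs = regions[2];
	q->tx2 = q->rx2 = NULL;

	if (!dual_lane) {
		q->pcs_misc = regions[3];
		return;
	}

	q->tx2 = regions[3];
	q->rx2 = regions[4];
	if (!q->tx2 || !q->rx2) {
		/* Old layout: index 3 was pcs_misc; derive lane 2 from lane 1. */
		q->pcs_misc = q->tx2;
		q->tx2 = q->tx + LEGACY_LANE_STRIDE;
		q->rx2 = q->rx + LEGACY_LANE_STRIDE;
	} else {
		q->pcs_misc = regions[5];
	}
}
```

The order of assignments in the legacy branch is important: `pcs_misc` has to take the value mapped at index 3 before `tx2` is overwritten with the derived address.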

BRs



Re: [PATCH v1 4/4] phy: qcom-qmp: Expose provided clocks to DT

2018-12-07 Thread Vivek Gautam
On Fri, Nov 30, 2018 at 3:46 AM Evan Green  wrote:
>
> Register a simple clock provider for the PHY pipe clock sources so that
> device tree users can point at these clocks via phandles to the lane
> nodes.
>
> Signed-off-by: Evan Green 
> ---
>
>  drivers/phy/qualcomm/phy-qcom-qmp.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c 
> b/drivers/phy/qualcomm/phy-qcom-qmp.c
> index 8204d55e2d650..b4006818e1b65 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
> @@ -1542,6 +1542,11 @@ static int qcom_qmp_phy_clk_init(struct device *dev)
> return devm_clk_bulk_get(dev, num, qmp->clks);
>  }
>
> +static void phy_pipe_clk_release_provider(void *res)
> +{
> +   of_clk_del_provider(res);
> +}
> +
>  /*
>   * Register a fixed rate pipe clock.
>   *
> @@ -1588,7 +1593,23 @@ static int phy_pipe_clk_register(struct qcom_qmp *qmp, 
> struct device_node *np)
> fixed->fixed_rate = 12500;
> fixed->hw.init = &init;
>
> -   return devm_clk_hw_register(qmp->dev, &fixed->hw);
> +   ret = devm_clk_hw_register(qmp->dev, &fixed->hw);
> +   if (ret)
> +   return ret;
> +
> +   ret = of_clk_add_hw_provider(np, of_clk_hw_simple_get, &fixed->hw);
> +   if (ret)
> +   return ret;
> +
> +   /*
> +* Roll a devm action because the clock provider is the child node, 
> but
> +* the child node is not actually a device.
> +*/
> +   ret = devm_add_action(qmp->dev, phy_pipe_clk_release_provider, np);
> +   if (ret)
> +   phy_pipe_clk_release_provider(np);
> +
> +   return ret;
>  }
>
>  static const struct phy_ops qcom_qmp_phy_gen_ops = {
> --
> 2.18.1
>

Tested on db820c [1]
Tested-by: Vivek Gautam 

[1] https://github.com/vivekgautam1/linux/tree/origin/v4.20-rc5/db820c
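
The "roll a devm action" step in the quoted hunk follows a common pattern: queue a cleanup callback, and if queuing fails, run the cleanup right away so nothing is leaked. A toy model in plain C, with every name hypothetical and no relation to the kernel's actual devres implementation:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 4

typedef void (*action_fn)(void *data);

/* Toy stand-in for a device's managed-resource list. */
struct toy_dev {
	struct { action_fn fn; void *data; } actions[MAX_ACTIONS];
	size_t n;
};

/* Queue a cleanup action; returns 0, or -1 when the list is full. */
static int toy_add_action(struct toy_dev *dev, action_fn fn, void *data)
{
	if (dev->n == MAX_ACTIONS)
		return -1;
	dev->actions[dev->n].fn = fn;
	dev->actions[dev->n].data = data;
	dev->n++;
	return 0;
}

/* On teardown, run the queued actions in reverse order of registration. */
static void toy_release_all(struct toy_dev *dev)
{
	while (dev->n > 0) {
		dev->n--;
		dev->actions[dev->n].fn(dev->actions[dev->n].data);
	}
}

/* Cleanup callback: here it only clears a "provider registered" flag. */
static void release_provider(void *data)
{
	*(int *)data = 0;
}

/*
 * Mirrors the tail of phy_pipe_clk_register(): the provider is already
 * registered, so if the cleanup cannot be queued it must run immediately.
 */
static int register_provider(struct toy_dev *dev, int *registered)
{
	int ret;

	*registered = 1;
	ret = toy_add_action(dev, release_provider, registered);
	if (ret)
		release_provider(registered);
	return ret;
}
```

Either way the caller ends in a consistent state: on success the cleanup runs at teardown, and on failure the provider is already gone when the error is returned.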

BRs
Vivek



Re: [PATCH v1 2/4] arm64: dts: qcom: msm8996: Fix QMP PHY #clock-cells

2018-12-07 Thread Vivek Gautam
On Fri, Nov 30, 2018 at 3:45 AM Evan Green  wrote:
>
> Move #clock-cells into the child node and set it to 0 to conform to the
> proper binding specification.
>
> Signed-off-by: Evan Green 
> ---
>
>  arch/arm64/boot/dts/qcom/msm8996.dtsi | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index 13bb96444df00..4af740ca0880f 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -767,7 +767,6 @@
> phy@34000 {
> compatible = "qcom,msm8996-qmp-pcie-phy";
> reg = <0x34000 0x488>;
> -   #clock-cells = <1>;
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
> @@ -790,6 +789,7 @@
> reg = <0x035000 0x130>,
> <0x035200 0x200>,
> <0x035400 0x1dc>;
> +   #clock-cells = <0>;
> #phy-cells = <0>;
>
> clock-output-names = "pcie_0_pipe_clk_src";
> @@ -803,6 +803,7 @@
> reg = <0x036000 0x130>,
> <0x036200 0x200>,
> <0x036400 0x1dc>;
> +   #clock-cells = <0>;
> #phy-cells = <0>;
>
> clock-output-names = "pcie_1_pipe_clk_src";
> @@ -816,6 +817,7 @@
> reg = <0x037000 0x130>,
> <0x037200 0x200>,
> <0x037400 0x1dc>;
> +   #clock-cells = <0>;
> #phy-cells = <0>;
>
> clock-output-names = "pcie_2_pipe_clk_src";
> @@ -829,7 +831,6 @@
> phy@741 {
> compatible = "qcom,msm8996-qmp-usb3-phy";
> reg = <0x741 0x1c4>;
> -   #clock-cells = <1>;
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
> @@ -851,6 +852,7 @@
> reg = <0x7410200 0x200>,
>     <0x7410400 0x130>,
> <0x7410600 0x1a8>;
> +   #clock-cells = <0>;
> #phy-cells = <0>;
>
> clock-output-names = "usb3_phy_pipe_clk_src";
> --
> 2.18.1
>

Tested on db820c [1]
Tested-by: Vivek Gautam 

[1] https://github.com/vivekgautam1/linux/tree/origin/v4.20-rc5/db820c
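
To show what `#clock-cells = <0>` means for a consumer, here is a small sketch in plain C of how a flat "clocks" cell array is split into entries. `parse_clocks`, `clk_spec` and the toy provider table are hypothetical and only model the idea; they are not the kernel's OF clock parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded "clocks" entry: which provider, plus its specifier cells. */
struct clk_spec {
	uint32_t phandle;
	uint32_t args[2];
	size_t n_args;
};

/* Hypothetical lookup of a provider's #clock-cells by phandle. */
typedef int (*cells_fn)(uint32_t phandle);

/*
 * Walk a flat cell array <phandle args... phandle args...>. Each provider's
 * #clock-cells says how many argument cells follow its phandle.
 * Returns the number of entries decoded, or -1 on malformed input.
 */
static int parse_clocks(const uint32_t *cells, size_t n, cells_fn clock_cells,
			struct clk_spec *out, size_t max_out)
{
	size_t i = 0, count = 0;

	while (i < n) {
		int nc = clock_cells(cells[i]);

		if (nc < 0 || nc > 2 || count == max_out ||
		    i + 1 + (size_t)nc > n)
			return -1;
		out[count].phandle = cells[i++];
		out[count].n_args = (size_t)nc;
		for (int k = 0; k < nc; k++)
			out[count].args[k] = cells[i++];
		count++;
	}
	return (int)count;
}

/* Toy providers: phandle 1 is a PHY lane node (0 cells), 2 a clock controller (1 cell). */
static int toy_cells(uint32_t phandle)
{
	return phandle == 1 ? 0 : phandle == 2 ? 1 : -1;
}
```

With zero cells the phandle of the lane node alone identifies the pipe clock, so the consumer has no extra cell to fill in. That matches the point in the series that nothing was ever specified for the single cell the old binding required.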

BRs
Vivek



Re: [PATCH v1 0/4] phy: qcom-qmp: Fix clock-cells binding and provider

2018-12-06 Thread Vivek Gautam
Hi,

On Fri, Dec 7, 2018 at 10:15 AM Kishon Vijay Abraham I  wrote:
>
> Vivek,
>
> On 04/12/18 6:07 PM, Vivek Gautam wrote:
> > Hi Kishon,
> >
> > On Tue, Dec 4, 2018 at 1:44 PM Kishon Vijay Abraham I  wrote:
> >>
> >> Hi Andy Gross, David Brown, Vivek,
> >>
> >> On 30/11/18 3:43 AM, Evan Green wrote:
> >>> This series fixes the QMP PHY bindings, which had specified #clock-cells
> >>> in the parent node, and had set it to 1. Putting it in the parent node is
> >>> wrong because the clock providers are the child nodes, so this change
> >>> moves it there. Having it set to 1 is also wrong, since nothing is ever
> >>> specified as to what should go in that cell. So this changes it to zero.
> >>> Finally, this change completes a little bit of code to actually allow 
> >>> these
> >>> exposed clocks to be pointed at in DT.
> >>>
> >>> I had no idea how to fix up ipq8074.dtsi. It seems to be completely wrong 
> >>> in
> >>> that it doesn't specify #clock-cells at all, has no child nodes, and
> >>> specifies clock-output-names in the parent node. As far as I can tell this
> >>> doesn't work at all. But I can't add the child nodes myself because I 
> >>> don't know
> >>> 1) how many there are, and 2) the registers in them. I also have no way 
> >>> to test it.
> >>>
> >>> Speaking of testing, I was able to test this on sdm845, but haven't 
> >>> tested msm8996.
> >>
> >> Can someone help test this series in msm8996?
> >
> > Sure, will give it a try tomorrow.
>
> I'm planning to close the merge by today. Can you test this series please?

Sorry, got held up with an issue yesterday. Will update you in a couple of hours.

Thanks
Vivek

[snip]




Re: [PATCH v10 0/6] Support for Qualcomm UFS QMP PHY on SDM845

2018-12-04 Thread Vivek Gautam
On Tue, Oct 23, 2018 at 10:07 AM Can Guo  wrote:
>
> This patch series adds support for UFS QMP PHY on SDM845 and the
> compatible string for it. This patch series depends on the current
> proposed QMP V3 USB3 UNI PHY support for sdm845 driver [1], on
> the DT bindings for the QMP V3 USB3 PHYs based driver [2], and also
> rebased on updated pipe_clk initialization sequence [3]. This series
> can only be merged once the dependent patches do.
> [1] 
> http://lists-archives.com/linux-kernel/29071659-dt-bindings-phy-qcom-qmp-update-bindings-for-sdm845.html
> [2] 
> http://lists-archives.com/linux-kernel/29071660-phy-qcom-qmp-add-qmp-v3-usb3-uni-phy-support-for-sdm845.html
> [3] https://patchwork.kernel.org/patch/10376551/

Besides my comment on PATCH 4/6, I have already reviewed the entire series,
and it looks good.
If adding new bindings for sdm845 needs further review, can you separate out
just the phy patches from this series (patches 1, 2, 3 & 6) and re-send them?
We can ask Kishon if he can pull them in for this merge window.
Thanks.

best regards
Vivek

>
> Changes since v9:
> - Incorporated review comments from Rob.
>
> Changes since v8:
> - Add one new change to support ufs core reset.
> - Incorporated review comments from Evan, Vivek.
>
> Changes since v7:
> - Add one new change to update UFS PHY power on sequence
> - Incorporated review comments from Evan, Vivek and Manu.
>
> Changes since v6:
> - Add one new change to clean up some structs and field
> - Updates the PHY power control sequence.
> - Incorporated review comments from Vivek and Manu.
>
> Changes since v5:
> - Updates the PHY power control sequence.
> - Updates UFS PHY power on condition check.
>
> Changes since v4:
> - Adds 'ref_aux' clock back to SDM845 UFS PHY clock list.
> - Power on PHY before serdes configuration starts.
> - Updates the UFS PHY initialization sequence.
> - Updates a few UFS PHY registers.
> - Incorporated review comments from Vivek and Manu.
>
> Changes since v3:
> - Incorporated review comments from Vivek and Rob.
>
> Changes since v2:
> - Incorporated review comments from Vivek and Rob.
> - Remove "ref_aux" from sdm845 ufs phy clock list structure.
>
> Changes since v1:
> - Incorporated review comments from Vivek and Manu.
> - Update the commit title of patch 2.
>
> Can Guo (5):
>   phy: Update PHY power control sequence
>   phy: General struct and field cleanup
>   phy: Add QMP phy based UFS phy support for sdm845
>   scsi: ufs: Power on phy after it is initialized
>   dt-bindings: phy-qcom-qmp: Add UFS phy compatible string for sdm845
>
> Dov Levenglick (1):
>   scsi: ufs: Add core reset support
>
>  .../devicetree/bindings/phy/qcom-qmp-phy.txt |   4 +-
>  drivers/phy/qualcomm/phy-qcom-qmp.c          | 216 +++--
>  drivers/phy/qualcomm/phy-qcom-qmp.h          |  15 ++
>  drivers/scsi/ufs/ufs-qcom.c                  |  34 +++-
>  drivers/scsi/ufs/ufs-qcom.h                  |   1 +
>  drivers/scsi/ufs/ufshcd-pltfrm.c             |  22 +++
>  drivers/scsi/ufs/ufshcd.c                    |  13 ++
>  drivers/scsi/ufs/ufshcd.h                    |  12 ++
>  8 files changed, 296 insertions(+), 21 deletions(-)
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v10 4/6] scsi: ufs: Add core reset support

2018-12-04 Thread Vivek Gautam
On Tue, Oct 23, 2018 at 10:06 AM Can Guo  wrote:
>
> From: Dov Levenglick 
>
> Enables core reset support. Add full initialization of the PHY and the
> controller before initializing UFS PHY and during link recovery.
>
> Signed-off-by: Dov Levenglick 
> Signed-off-by: Amit Nischal 
> Signed-off-by: Subhash Jadavani 
> Signed-off-by: Can Guo 
> ---
>  drivers/scsi/ufs/ufs-qcom.c      | 30 ++
>  drivers/scsi/ufs/ufshcd-pltfrm.c | 22 ++
>  drivers/scsi/ufs/ufshcd.c        | 13 +
>  drivers/scsi/ufs/ufshcd.h        | 12
>  4 files changed, 77 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
> index 2b38db2..698b92d 100644
> --- a/drivers/scsi/ufs/ufs-qcom.c
> +++ b/drivers/scsi/ufs/ufs-qcom.c
> @@ -616,6 +616,35 @@ static int ufs_qcom_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
> return err;
>  }
>
> +static int ufs_qcom_core_reset(struct ufs_hba *hba)
> +{
> +   int ret = -ENOTSUPP;
> +
> +   if (!hba->core_reset) {

This check doesn't make much sense.
You call this ".core_reset" callback only when "hba->core_reset" is available.
Why do we need to check this again here?

> +   dev_err(hba->dev, "%s: failed, err = %d\n", __func__,
> +   ret);
> +   goto out;
> +   }
> +
> +   ret = reset_control_assert(hba->core_reset);
> +   if (ret) {
> +   dev_err(hba->dev, "core_reset assert failed, err = %d\n",
> +   ret);
> +   goto out;
> +   }
> +
> +   /* As per spec, delay is required to let reset assert go through */
> +   usleep_range(1, 2);
> +
> +   ret = reset_control_deassert(hba->core_reset);
> +   if (ret)
> +   dev_err(hba->dev, "core_reset deassert failed, err = %d\n",
> +   ret);
> +
> +out:
> +   return ret;
> +}
> +
>  struct ufs_qcom_dev_params {
> u32 pwm_rx_gear; /* pwm rx gear to work in */
> u32 pwm_tx_gear; /* pwm tx gear to work in */
> @@ -1670,6 +1699,7 @@ static void ufs_qcom_dump_dbg_regs(struct ufs_hba *hba)
> .apply_dev_quirks   = ufs_qcom_apply_dev_quirks,
> .suspend            = ufs_qcom_suspend,
> .resume             = ufs_qcom_resume,
> +   .core_reset         = ufs_qcom_core_reset,
> .dbg_register_dump  = ufs_qcom_dump_dbg_regs,
>  };
>
> diff --git a/drivers/scsi/ufs/ufshcd-pltfrm.c b/drivers/scsi/ufs/ufshcd-pltfrm.c
> index e82bde0..dab11a7 100644
> --- a/drivers/scsi/ufs/ufshcd-pltfrm.c
> +++ b/drivers/scsi/ufs/ufshcd-pltfrm.c
> @@ -42,6 +42,22 @@
>
>  #define UFSHCD_DEFAULT_LANES_PER_DIRECTION 2
>
> +static int ufshcd_parse_reset_info(struct ufs_hba *hba)
> +{
> +   int ret = 0;
> +
> +   hba->core_reset = devm_reset_control_get_optional_exclusive(hba->dev,
> +   "rst");
> +   if (IS_ERR(hba->core_reset)) {
> +   ret = PTR_ERR(hba->core_reset);

First, you need to check for -EPROBE_DEFER here and return it, as the
reset controller may not have probed yet when this driver probes.

Second, this whole parsing step could just as well move to the vops
(variant ops), since the variant driver is the one that knows about the
resets. Moreover, not all qcom ufs controllers have the reset, so I am
leaning towards adding an of_match_data field and a corresponding
compatible binding for sdm845 (and maybe for future SoCs too), so that
we can make this reset mandatory for SoCs where things won't work
without it.
Simply acknowledging the absence of the reset and marking it as NULL
won't help 845 and its siblings, which need the reset.

Or do we have any other way to make this reset mandatory for 845?
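
To make the suggestion above concrete, here is a rough userspace model of
the decision I have in mind. It is only a sketch, not driver code: the
struct ufs_variant_data type, its reset_required field, the helper name
ufs_check_reset() and the choice of -ENODEV for a missing mandatory reset
are all assumptions made for illustration. EPROBE_DEFER is defined locally
because it is a kernel-internal errno.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace <errno.h> */
#endif

/* Hypothetical per-SoC match data; the field name is an assumption. */
struct ufs_variant_data {
	bool reset_required;	/* true for SoCs that cannot work without the reset */
};

/*
 * Model of what probe should return after looking up the optional reset:
 *   lookup_err != 0       -> propagate it (this includes -EPROBE_DEFER, so
 *                            that probing is retried once the reset
 *                            controller is available)
 *   no error, not found   -> acceptable only if the SoC does not need it
 *   no error, found       -> success
 */
static int ufs_check_reset(const struct ufs_variant_data *data,
			   int lookup_err, bool found)
{
	if (lookup_err)
		return lookup_err;
	if (!found && data->reset_required)
		return -ENODEV;	/* assumed error code for a missing mandatory reset */
	return 0;
}
```

With something like this, an SoC flagged as needing the reset fails probe
when the reset is absent, while older controllers keep probing as before.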

> +   dev_err(hba->dev, "core_reset unavailable,err = %d\n",
> +   ret);
> +   hba->core_reset = NULL;
> +   }
> +
> +   return ret;
> +}
> +
>  static int ufshcd_parse_clock_info(struct ufs_hba *hba)
>  {
> int ret = 0;
> @@ -340,6 +356,12 @@ int ufshcd_pltfrm_init(struct platform_device *pdev,
> goto dealloc_host;
> }
>
> +   err = ufshcd_parse_reset_info(hba);
> +   if (err) {
> +   dev_err(&pdev->dev, "%s: reset parse failed %d\n",
> +   __func__, err);
> +   }
> +
> pm_runtime_set_active(&pdev->dev);
> pm_runtime_enable(&pdev->dev);
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index a355d98..d18c3af 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -3657,6 +3657,15 @@ static int ufshcd_link_recovery(struct ufs_hba *hba)
> ufshcd_set_eh_in_progress(hba);
> spin_unlock_irqrestore(hba->host->host_lock, flags);
>
> +   if (hba->core_reset) {
> +   ret = ufshcd_vops_core_reset(hba);
> +   if (ret)
> +  

Re: [PATCH v1 0/4] phy: qcom-qmp: Fix clock-cells binding and provider

2018-12-04 Thread Vivek Gautam
Hi Kishon,

On Tue, Dec 4, 2018 at 1:44 PM Kishon Vijay Abraham I  wrote:
>
> Hi Andy Gross, David Brown, Vivek,
>
> On 30/11/18 3:43 AM, Evan Green wrote:
> > This series fixes the QMP PHY bindings, which had specified #clock-cells
> > in the parent node, and had set it to 1. Putting it in the parent node is
> > wrong because the clock providers are the child nodes, so this change
> > moves it there. Having it set to 1 is also wrong, since nothing is ever
> > specified as to what should go in that cell. So this changes it to zero.
> > Finally, this change completes a little bit of code to actually allow these
> > exposed clocks to be pointed at in DT.
> >
> > I had no idea how to fix up ipq8074.dtsi. It seems to be completely wrong in
> > that it doesn't specify #clock-cells at all, has no child nodes, and
> > specifies clock-output-names in the parent node. As far as I can tell this
> > doesn't work at all. But I can't add the child nodes myself because I don't
> > know 1) how many there are, and 2) the registers in them. I also have no way
> > to test it.
> >
> > Speaking of testing, I was able to test this on sdm845, but haven't tested
> > msm8996.
>
> Can someone help test this series in msm8996?

Sure, will give it a try tomorrow.

Thanks
Vivek

>
> Thanks
> Kishon
>
> >
> > This patch sits atop the UFS device nodes series [1].
> >
> > [1] https://lore.kernel.org/lkml/20181026173544.136037-1-evgr...@chromium.org/
> >
> >
> >
> > Evan Green (4):
> >   dt-bindings: phy-qcom-qmp: Move #clock-cells to child
> >   arm64: dts: qcom: msm8996: Fix QMP PHY #clock-cells
> >   arm64: dts: qcom: sdm845: Fix QMP PHY #clock-cells
> >   phy: qcom-qmp: Expose provided clocks to DT
> >
> >  .../devicetree/bindings/phy/qcom-qmp-phy.txt | 11 -
> >  arch/arm64/boot/dts/qcom/msm8996.dtsi        |  6 +++--
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi         |  4 ++--
> >  drivers/phy/qualcomm/phy-qcom-qmp.c          | 23 ++-
> >  4 files changed, 33 insertions(+), 11 deletions(-)
> >



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-12-04 Thread Vivek Gautam
Qualcomm SoCs have an additional level of cache called the
system cache, a.k.a. last level cache (LLC). This cache sits right
before the DDR, and is tightly coupled with the memory controller.
The cache is available to all the clients present in the SoC system.
The clients request their slices from this system cache, make them
active, and can then start using them.
For these clients with an smmu, to start using the system cache for
buffers and related page tables [1], memory attributes need to be
set accordingly.
This change updates the MAIR and TCR configurations with the correct
attributes to use this system cache.

To explain a little about memory attribute requirements here:

Non-coherent I/O devices can't look up into the inner caches, whereas
coherent I/O devices can. Both, however, can allocate in the system cache
based on system policy and the memory attributes configured in the page
tables.
CPUs can access both inner and outer caches (including the system cache,
a.k.a. last level cache), and can allocate into the system cache too,
based on memory attributes and system policy.

Further looking at memory types, we have the following -
a) Normal uncached :- MAIR 0x44, inner non-cacheable,
  outer non-cacheable;
b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
  outer read write-back non-transient;
  attribute setting for coherent I/O devices.

and, for non-coherent I/O devices that can allocate in the system cache,
another type gets added -
c) Normal sys-cached/non-inner-cached :-
  MAIR 0xf4, inner non-cacheable,
  outer read write-back non-transient

So, the CPU will automatically use the system cache for memory marked as
normal cached. The normal sys-cached type is downgraded to normal non-cached
memory for CPUs.
Coherent I/O devices can use the system cache by marking the memory as
normal cached.
Non-coherent I/O devices, to use the system cache, should mark the memory as
normal sys-cached in page tables.
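
As an illustration of the three memory types listed above, the following
small userspace sketch splits a MAIR attribute byte into its outer (bits
[7:4]) and inner (bits [3:0]) cacheability fields. It is not part of this
change; the enum and helper names are made up for the example, and only the
two encodings used here (0x4 for non-cacheable, 0b11RW for write-back
non-transient) are decoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of one 4-bit cacheability field of a Normal memory attribute. */
enum cache_attr { ATTR_NON_CACHEABLE, ATTR_WRITE_BACK, ATTR_OTHER };

static enum cache_attr decode_nibble(uint8_t nibble)
{
	if (nibble == 0x4)
		return ATTR_NON_CACHEABLE;
	if ((nibble & 0xc) == 0xc)	/* 0b11RW: write-back non-transient */
		return ATTR_WRITE_BACK;
	return ATTR_OTHER;		/* encodings not used in this description */
}

/* Bits [7:4] of the attribute byte describe the outer cache. */
static enum cache_attr mair_outer(uint8_t attr)
{
	return decode_nibble(attr >> 4);
}

/* Bits [3:0] of the attribute byte describe the inner cache. */
static enum cache_attr mair_inner(uint8_t attr)
{
	return decode_nibble(attr & 0xf);
}

/* Type (c): cacheable in the outer/system cache only, never in inner caches. */
static bool mair_is_sys_cached_only(uint8_t attr)
{
	return mair_outer(attr) == ATTR_WRITE_BACK &&
	       mair_inner(attr) == ATTR_NON_CACHEABLE;
}
```

Of the three values above, only 0xf4 satisfies mair_is_sys_cached_only(),
which is the property non-coherent I/O devices rely on.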

This change is a realisation of the following changes
from downstream msm-4.9:
iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]

[1] https://patchwork.kernel.org/patch/10302791/
[2] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=bf762276796e79ca90014992f4d9da5593fa7d51
[3] 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602

Signed-off-by: Vivek Gautam 
---

Changes since v1:
 - Addressed Tomasz's comments for basing the change on
   "NO_INNER_CACHE" concept for non-coherent I/O devices
   rather than capturing "SYS_CACHE". This is to indicate
   clearly the intent of non-coherent I/O devices that
   can't access inner caches.

 drivers/iommu/arm-smmu.c       | 15 +++
 drivers/iommu/dma-iommu.c      |  3 +++
 drivers/iommu/io-pgtable-arm.c | 22 +-
 drivers/iommu/io-pgtable.h     |  5 +
 include/linux/iommu.h          |  3 +++
 5 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index ba18d89d4732..047f7ff95b0d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -255,6 +255,7 @@ struct arm_smmu_domain {
struct mutex init_mutex; /* Protects smmu pointer */
spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */
struct iommu_domain domain;
+   bool no_inner_cache;
 };
 
 struct arm_smmu_option_prop {
@@ -897,6 +898,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
+   if (smmu_domain->no_inner_cache)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NO_IC;
+
smmu_domain->smmu = smmu;
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops) {
@@ -1579,6 +1583,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
case DOMAIN_ATTR_NESTING:
*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
return 0;
+   case DOMAIN_ATTR_NO_IC:
+   *((int *)data) = smmu_domain->no_inner_cache;
+   return 0;
default:
return -ENODEV;
}
@@ -1619,6 +1626,14 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
else
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
break;
+   case DOMAIN_ATTR_NO_IC:
+   if (smmu_domain->smmu) {
+   ret = -EPERM;
+   goto out_

[PATCH v19 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant

2018-12-03 Thread Vivek Gautam
qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
clock and power requirements.
On msm8996, multiple cores, viz. mdss, video, etc. use this
smmu. On sdm845, this smmu is used with gpu.
Add bindings for the same.

Signed-off-by: Vivek Gautam 
Reviewed-by: Rob Herring 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b6b11642b3a9..ba18d89d4732 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -120,6 +120,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -2030,6 +2031,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);
 
 static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -2038,6 +2040,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+   { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v19 3/5] iommu/arm-smmu: Add the device_link between masters and smmu

2018-12-03 Thread Vivek Gautam
From: Sricharan R 

Finally add the device link between the master device and
smmu, so that the smmu gets runtime enabled/disabled only when the
master needs it. This is done from add_device callback which gets
called once when the master is added to the smmu.

Signed-off-by: Sricharan R 
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1917d214c4d9..b6b11642b3a9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1500,6 +1500,9 @@ static int arm_smmu_add_device(struct device *dev)
 
iommu_device_link(&smmu->iommu, dev);
 
+   device_link_add(dev, smmu->dev,
+   DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return 0;
 
 out_cfg_free:
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-22 Thread Vivek Gautam

Hi Tomasz, Jordan,


On 11/21/2018 9:18 AM, Tomasz Figa wrote:

Hi Jordan, Vivek,

On Wed, Nov 21, 2018 at 12:41 AM Jordan Crouse  wrote:

On Tue, Nov 20, 2018 at 03:24:37PM +0530, Vivek Gautam wrote:

dma_map_sg() expects a DMA domain. However, the drm devices
have been traditionally using unmanaged iommu domain which
is non-dma type. Using dma mapping APIs with that domain is bad.

Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
to do the cache maintenance.

Signed-off-by: Vivek Gautam 
Suggested-by: Tomasz Figa 
---

Tested on an MTP sdm845:
https://github.com/vivekgautam1/linux/tree/v4.19/sdm845-mtp-display-working

  drivers/gpu/drm/msm/msm_gem.c | 27 ---
  1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 00c795ced02c..d7a7af610803 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
   struct drm_device *dev = obj->dev;
   struct page **p;
   int npages = obj->size >> PAGE_SHIFT;
+ struct scatterlist *s;
+ int i;

   if (use_pages(obj))
   p = drm_gem_get_pages(obj);
@@ -107,9 +109,19 @@ static struct page **get_pages(struct drm_gem_object *obj)
   /* For non-cached buffers, ensure the new pages are clean
* because display controller, GPU, etc. are not coherent:
*/
- if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
- dma_map_sg(dev->dev, msm_obj->sgt->sgl,
- msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+ if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED)) {
+ /*
+  * Fake up the SG table so that dma_sync_sg_*()
+  * can be used to flush the pages associated with it.
+  */

We aren't really faking.  The table is real, we are just slightly abusing the
sg_dma_address() which makes this comment a bit misleading. Instead I would
probably say something like:

/* dma_sync_sg_* flushes pages using sg_dma_address() so point it at the
  * physical page for the right behavior */

Or something like that.


It's actually quite complicated, but I agree that the comment isn't
very precise. The cases are as follows:
- arm64 iommu_dma_ops use sg_phys()
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm64/mm/dma-mapping.c#L599
- swiotlb_dma_ops used on arm64 if no IOMMU is available use
sg->dma_address directly:
https://elixir.bootlin.com/linux/v4.20-rc3/source/kernel/dma/swiotlb.c#L832
- arm_dma_ops use sg_dma_address():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1130
- arm iommu_ops use sg_page():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1869

Sounds like a mess...

Thanks for the review.

Technically, the assignment below covers all of the above cases. How about
an even simpler version of the suggested comment:

/* dma_sync_sg_* flushes physical pages, so point sg->dma_address to
 * the physical one for the right behavior.
 */





+ for_each_sg(msm_obj->sgt->sgl, s,
+ msm_obj->sgt->nents, i)
+ sg_dma_address(s) = sg_phys(s);
+

I'm wondering - wouldn't we want to do this association for cached buffers too, so
we could sync them correctly in cpu_prep and cpu_fini?  Maybe it wouldn't hurt
to put this association in the main path (obviously the sync should stay inside
the conditional for uncached buffers).



Sure, I will move this out of the conditional check.


I guess it wouldn't hurt indeed. Note that cpu_prep/fini seem to be
missing the sync call currently.


I can't say I understand the usage of cpu_prep and cpu_fini(). But I can add
the necessary support if you can point me in the right direction.
Thanks

Best regards
Vivek


P.S. Jordan, not sure if it's my Gmail or your email client, but your
message had all the recipients in a Reply-to header, except you, so
pressing Reply to all in my case led to a message that didn't have you
in recipients anymore...

Best regards,
Tomasz




Re: [PATCH 1/2] arm64: dts: qcom: msm8996: Add VFE SMMU node

2018-11-19 Thread Vivek Gautam
Hi Todor,

On Mon, Nov 19, 2018 at 2:57 PM Todor Tomov  wrote:
>
> Add VFE SMMU node.
>
> Signed-off-by: Todor Tomov 
> ---
>
> This patch depends on patchset:
> https://lore.kernel.org/patchwork/cover/1013166/
>
>  arch/arm64/boot/dts/qcom/msm8996.dtsi | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index 13bb964..a4d087e5 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -950,6 +950,23 @@
> };
> };
>
> +   vfe_smmu: arm,smmu@da {
> +   compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
> +   reg = <0xda 0x1>;
> +
> +   #global-interrupts = <1>;
> +   interrupts = ,
> +,
> +;
> +   power-domains = <&mmcc MMAGIC_CAMSS_GDSC>;
> +   clocks = <&mmcc SMMU_VFE_AHB_CLK>,
> +<&mmcc SMMU_VFE_AXI_CLK>;
> +   clock-names = "iface",
> + "bus";
> +   #iommu-cells = <1>;
> +   status = "ok";

No point in adding this status here.
Rest looks good to me.

Reviewed-by: Vivek Gautam 

Best regards
Vivek

> +   };
> +
> agnoc@0 {
> power-domains = <&gcc AGGRE0_NOC_GDSC>;
> compatible = "simple-pm-bus";
> --
> 2.7.4
>


-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 1/2] dt-bindings: phy: Add Qualcomm Synopsys High-Speed USB PHY binding

2018-11-18 Thread Vivek Gautam
On Mon, Nov 19, 2018 at 12:29 PM Shawn Guo  wrote:
>
> On Sat, Nov 17, 2018 at 09:13:38AM -0600, Rob Herring wrote:
> 
> > > > > +- qcom,init-seq:
> > > > > +Value type: 
> > > > > +Definition: Should contain a sequence of  
> > > > > tuples to
> > > > > +program 'value' into phy register at 'offset' with 
> > > > > 'delay'
> > > > > +   in us afterwards.
> > > >
> > > > If we wanted this type of thing in DT, we'd have a generic binding (or
> > > > forth).
> > >
> > > Right now, this is a Qualcomm USB PHY specific binding - first used in
> > > qcom,usb-hs-phy.txt and I extended it a bit for my phy.  As this is not
> > > such a good hardware description, I'm a little hesitant to make it
> > > generic for other platforms to use in general.  What about we put it off
> > > a little bit until we see more platforms need the same thing?
> >
> > I'm not saying I want it generic. Quite the opposite. I don't think we
> > should have it either generically or vendor specific. The main thing I
> > have a problem with is the timing information because then we're more
> > than just data. Without that we're talking about a bunch of properties
> > for register fields or just raw register values in DT. That becomes
> > more of a judgement call. There's not too much value in making a
> > driver translate a bunch of properties just to stuff them into
> > registers on init. But then just allowing any raw register value to be
> > in DT could be easily abused.
>
> Rob,
>
> I agree with your comments.  Honestly, I'm not comfortable with this
> 'qcom,init-seq' thing in the first impression.  The similar existence in
> mainline qcom,usb-hs-phy.txt makes me think it might be acceptable with
> the timing data added.  Okay, I know your position on this now.
>
> @Sriharsha,
>
> Seeing that 'qcom,init-seq' is being configured with exactly the same
> values for both HS phys in SoC level dts file (qcs404.dtsi), I think
> such settings can be moved into driver code as SoC specific data.
> Unless you have a different view on this, I will do it with v4.

The phy-qcom-qmp and phy-qcom-qusb2 drivers already keep such SoC-specific
init sequences in driver code, in case you would like pointers from them.

Thanks
Vivek

>
> Shawn



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v2 1/2] phy: qcom-qusb2: Use HSTX_TRIM fused value as is

2018-10-24 Thread Vivek Gautam

On 2018-10-25 11:46, Vivek Gautam wrote:

Hi Manu,

On 10/16/2018 12:52 PM, Manu Gautam wrote:


Fix HSTX_TRIM tuning logic which instead of using fused value
as HSTX_TRIM, incorrectly performs bitwise OR operation with
existing default value.

Fixes: ca04d9d3e1b1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Signed-off-by: Manu Gautam 
Reviewed-by: Douglas Anderson 
---
drivers/phy/qualcomm/phy-qcom-qusb2.c | 19 ++-
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-qusb2.c b/drivers/phy/qualcomm/phy-qcom-qusb2.c
index e70e425f26f5..9d6c88064158 100644
--- a/drivers/phy/qualcomm/phy-qcom-qusb2.c
+++ b/drivers/phy/qualcomm/phy-qcom-qusb2.c
@@ -402,10 +402,10 @@ static void qusb2_phy_set_tune2_param(struct qusb2_phy *qphy)

/*
* Read efuse register having TUNE2/1 parameter's high nibble.
- * If efuse register shows value as 0x0, or if we fail to find
- * a valid efuse register settings, then use default value
- * as 0xB for high nibble that we have already set while
- * configuring phy.
+ * If efuse register shows value as 0x0 (indicating value is not
+ * fused), or if we fail to find a valid efuse register setting,
+ * then use default value for high nibble that we have already
+ * set while configuring the phy.
*/
val = nvmem_cell_read(qphy->cell, NULL);
if (IS_ERR(val) || !val[0]) {
@@ -415,12 +415,13 @@ static void qusb2_phy_set_tune2_param(struct qusb2_phy *qphy)

/* Fused TUNE1/2 value is the higher nibble only */
if (cfg->update_tune1_with_efuse)
-qusb2_setbits(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE1],
-  val[0] << 0x4);
+qusb2_write_mask(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE1],
+ val[0] << HSTX_TRIM_SHIFT,
+ HSTX_TRIM_MASK);
else
-qusb2_setbits(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE2],
-  val[0] << 0x4);
-
+qusb2_write_mask(qphy->base, cfg->regs[QUSB2PHY_PORT_TUNE2],
+ val[0] << HSTX_TRIM_SHIFT,
+ HSTX_TRIM_MASK);
}

static int qusb2_phy_set_mode(struct phy *phy, enum phy_mode mode)


Thanks for the patch.
Acked-by: Vivek Gautam 
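
For anyone following along, the difference between the old and the new
behaviour can be shown with a small userspace sketch. This is only an
illustration: the helper names are made up, the mask value assumes that
HSTX_TRIM occupies bits [7:4] of the tune register (the definitions are not
quoted in this mail), and the low nibble of the example register value is
arbitrary.

```c
#include <stdint.h>

/* Assumed field layout: HSTX_TRIM in bits [7:4] of the tune register. */
#define HSTX_TRIM_SHIFT	4
#define HSTX_TRIM_MASK	(0xfu << HSTX_TRIM_SHIFT)

/* Old behaviour: OR the fused nibble into whatever is already programmed. */
static uint8_t tune_setbits(uint8_t reg, uint8_t fused)
{
	return reg | (uint8_t)(fused << HSTX_TRIM_SHIFT);
}

/* Fixed behaviour: clear the field first, then write the fused nibble as is. */
static uint8_t tune_write_mask(uint8_t reg, uint8_t fused)
{
	reg &= (uint8_t)~HSTX_TRIM_MASK;
	reg |= (uint8_t)(fused << HSTX_TRIM_SHIFT) & HSTX_TRIM_MASK;
	return reg;
}
```

With a default high nibble of 0xB (say the register holds 0xb3) and a fused
value of 0x4, tune_setbits() yields 0xf3, i.e. neither the default nor the
fused trim, whereas tune_write_mask() yields 0x43.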



My bad. Didn't notice the HTML mode. Resending so that it reaches the
lists as well.


Thanks
Vivek


Regards
Vivek

