[Freedreno] [PATCH] iommu: arm-smmu: Set SCTLR.HUPCF bit

2018-09-27 Thread Rob Clark
We seem to need to set either this or CFCFG (stall), otherwise gpu
faults trigger problems with other in-flight transactions from the
GPU causing CP errors, etc.

In the ARM SMMU spec, the 'Hit under previous context fault' bit is
described as:

 '0' - Stall or terminate subsequent transactions in the presence
   of an outstanding context fault
 '1' - Process all subsequent transactions independently of any
   outstanding context fault.

Since we don't enable CFCFG (stall) the behavior of terminating
other transactions makes sense.  And is probably not what we want
(and definately not what we want for GPU).

Signed-off-by: Rob Clark 
---
So I hit this issue a long time back on 820 (msm8996) and at the
time I solved it with a patch that enabled CFCFG.  And it resurfaced
more recently on sdm845.  But at the time CFCFG was rejected, iirc
because of concern that it would cause problems on other non-qcom
arm smmu implementations.  And I think I forgot to send this version
of the solution.

If enabling HUPCF is anticipated to cause problems on other ARM
SMMU implementations, I think I can come up with a variant of this
patch which conditionally enables it for snapdragon.

Either way, I'd really like to get some variant of this fix merged
(and probably it would be a good idea for stable kernel branches
too), since current behaviour with the GPU means faults turn into
a fantastic cascade of fail.

 drivers/iommu/arm-smmu-regs.h | 1 +
 drivers/iommu/arm-smmu.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..2291925eb800 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -178,6 +178,7 @@ enum arm_smmu_s2cr_privcfg {
 #define ARM_SMMU_CB_ATSR   0x8f0
 
 #define SCTLR_S1_ASIDPNE   (1 << 12)
+#define SCTLR_HUPCF(1 << 8)
 #define SCTLR_CFCFG(1 << 7)
 #define SCTLR_CFIE (1 << 6)
 #define SCTLR_CFRE (1 << 5)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7a96bcf94a6..47ffc9aade72 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -713,9 +713,9 @@ static void arm_smmu_write_context_bank(struct 
arm_smmu_device *smmu, int idx)
reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
+   reg |= SCTLR_HUPCF;
if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
reg |= SCTLR_E;
-
writel_relaxed(reg, cb_base + ARM_SMMU_CB_SCTLR);
 }
 
-- 
2.17.1

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH] drm/msm/A6xx: Send the right perf index value to GMU

2018-09-27 Thread Jordan Crouse
On Thu, Sep 27, 2018 at 10:16:22PM +0530, Sharat Masetty wrote:
> The index of the perf table was being set in the wrong bit position
> in the register. With this fix, the GPU clock can be seen running at
> desired frequency.

Thank you!  Without the patch measured GPU frequency appears to be about 414 Mhz
and 710 Mhz with the patch.  Thats a great way to double your
productivity.

Tested-by: Jordan Crouse 

> Signed-off-by: Sharat Masetty 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 3762c8d..a38ef1b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -78,7 +78,7 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int 
> index)
>   gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
>  
>   gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
> - ((index << 24) & 0xff) | (3 & 0xf));
> + ((3 & 0xf) << 28) | index);
>  
>   /*
>* Send an invalid index as a vote for the bus bandwidth and let the
> -- 
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH] drm/msm/A6xx: Send the right perf index value to GMU

2018-09-27 Thread Sharat Masetty
The index of the perf table was being set in the wrong bit position
in the register. With this fix, the GPU clock can be seen running at
desired frequency.

Signed-off-by: Sharat Masetty 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 3762c8d..a38ef1b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -78,7 +78,7 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int 
index)
gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
 
gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
-   ((index << 24) & 0xff) | (3 & 0xf));
+   ((3 & 0xf) << 28) | index);
 
/*
 * Send an invalid index as a vote for the bus bandwidth and let the
-- 
1.9.1

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 6/9] PM / OPP: dt-bindings: Add opp-interconnect-bw

2018-09-27 Thread Georgi Djakov
Hi Jordan,

Thanks for the patch!

On 08/27/2018 06:11 PM, Jordan Crouse wrote:
> Add the "opp-interconnect-bw" property to specify the
> average and peak bandwidth for an interconnect path for
> a specific operating power point. A separate bandwidth
> pair can be specified for each of the interconnects
> defined for the device by appending the interconnect
> name to the property.
> 
> Signed-off-by: Jordan Crouse 
> ---
>  Documentation/devicetree/bindings/opp/opp.txt | 36 +++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/opp/opp.txt 
> b/Documentation/devicetree/bindings/opp/opp.txt
> index c396c4c0af92..d714c084f36d 100644
> --- a/Documentation/devicetree/bindings/opp/opp.txt
> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> @@ -170,6 +170,11 @@ Optional properties:
>functioning of the current device at the current OPP (where this property 
> is
>present).
>  
> +- opp-interconnect-bw-: This is an array of pairs specifying the 
> average
> +  and peak bandwidth in bytes per second for the interconnect path known by
> +  'name'.  This should match the name(s) specified by interconnect-names in 
> the
> +  device definition.
> +
>  Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states 
> together.
>  
>  / {
> @@ -543,3 +548,34 @@ Example 6: opp-microvolt-, opp-microamp-:
>   };
>   };
>  };
> +
> +Example 7: opp-interconnect-bw:
> +(example: leaf device with frequency and BW quotas)
> +
> +/ {
> + soc {
> + gpu@500 {
> + ...
> + interconnects = < 26  512>;
> + interconnect-names = "port0";
> + ...
> + operating-points-v2 = <_opp_table>;
> + };
> + };
> +
> + gpu_opp_table: opp_table0 {
> + compatible = "operating-points-v2";
> +
> + opp-71000 {
> + op-hz = /bits/ 64 <71000>;
> + /* Set peak bandwidth @ 7216000 KB/s */
> + opp-interconnect-bw-port0 = /bits/ 64 <0 721600>;

This seems a bit long. I would suggest the following instead.
If there is only one path:
/* average bandwidth = 0 KB/s,  peak bandwidth = 7216000 KB/s */
opp-bw-KBps = <0 7216000>;
or  
opp-bw-MBps = <0 7216>;

If there are multiple paths:
opp-bw-KBps-port0 = <0 7216000>;
opp-bw-KBps-port1 = <0 100>;

The above follows a convention similar to opp-microvolt, where at
runtime the platform can pick a  and a matching opp-bw-KBps-
property. If the platform doesn't pick a specific  or  does
not match with any of the opp-bw-KBps- properties, then
opp-bw-KBps shall be used if present.
For now supporting only KBps values seems enough to cover all present
platforms, so we can start with this and based on the requirements of
future platforms we can add MBps etc.

Thanks,
Georgi

> + };
> +
> + opp-21000 {
> + op-hz = /bits/ 64 <21000>;
> + /* Set peak bandwidth @ 120 KB/s */
> + opp-interconnect-bw-port0 = /bits/ 64 <0 12>;
> + };
> + };
> +};
> 
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH] drm/msm/A6xx: Send the right perf index value to GMU

2018-09-27 Thread Sharat Masetty
The index of the perf table was being set in the wrong bit position
in the register. With this fix, the GPU clock can be seen running at
desired frequency.

Signed-off-by: Sharat Masetty 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 3762c8d..421456e1 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -78,7 +78,7 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int 
index)
gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
 
gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
-   ((index << 24) & 0xff) | (3 & 0xf));
+   ((3 & 0xf) << 28) | (index & 0xff));
 
/*
 * Send an invalid index as a vote for the bus bandwidth and let the
-- 
1.9.1

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v16 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-09-27 Thread Vivek Gautam
On Wed, Sep 26, 2018 at 8:57 PM Robin Murphy  wrote:
>
> On 30/08/18 15:45, Vivek Gautam wrote:
> > From: Sricharan R 
> >
> > The smmu needs to be functional only when the respective
> > master's using it are active. The device_link feature
> > helps to track such functional dependencies, so that the
> > iommu gets powered when the master device enables itself
> > using pm_runtime. So by adapting the smmu driver for
> > runtime pm, above said dependency can be addressed.
> >
> > This patch adds the pm runtime/sleep callbacks to the
> > driver and also the functions to parse the smmu clocks
> > from DT and enable them in resume/suspend.
> >
> > Also, while we enable the runtime pm add a pm sleep suspend
> > callback that pushes devices to low power state by turning
> > the clocks off in a system sleep.
> > Also add corresponding clock enable path in resume callback.
> >
> > Signed-off-by: Sricharan R 
> > Signed-off-by: Archit Taneja 
> > [vivek: rework for clock and pm ops]
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Tomasz Figa 
> > Tested-by: Srinivas Kandagatla 
> > ---
> >   drivers/iommu/arm-smmu.c | 77 
> > ++--
> >   1 file changed, 74 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index fd1b80ef9490..d900e007c3c9 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -48,6 +48,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >
> > @@ -205,6 +206,8 @@ struct arm_smmu_device {
> >   u32 num_global_irqs;
> >   u32 num_context_irqs;
> >   unsigned int*irqs;
> > + struct clk_bulk_data*clks;
> > + int num_clks;
> >
> >   u32 cavium_id_base; /* Specific to Cavium 
> > */
> >
> > @@ -1896,10 +1899,12 @@ static int arm_smmu_device_cfg_probe(struct 
> > arm_smmu_device *smmu)
> >   struct arm_smmu_match_data {
> >   enum arm_smmu_arch_version version;
> >   enum arm_smmu_implementation model;
> > + const char * const *clks;
> > + int num_clks;
> >   };
> >
> >   #define ARM_SMMU_MATCH_DATA(name, ver, imp) \
> > -static struct arm_smmu_match_data name = { .version = ver, .model = imp }
> > +static const struct arm_smmu_match_data name = { .version = ver, .model = 
> > imp }
> >
> >   ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
> >   ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
> > @@ -1918,6 +1923,23 @@ static const struct of_device_id arm_smmu_of_match[] 
> > = {
> >   };
> >   MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
> >
> > +static void arm_smmu_fill_clk_data(struct arm_smmu_device *smmu,
> > +const char * const *clks)
> > +{
> > + int i;
> > +
> > + if (smmu->num_clks < 1)
> > + return;
> > +
> > + smmu->clks = devm_kcalloc(smmu->dev, smmu->num_clks,
> > +   sizeof(*smmu->clks), GFP_KERNEL);
> > + if (!smmu->clks)
> > + return;
> > +
> > + for (i = 0; i < smmu->num_clks; i++)
> > + smmu->clks[i].id = clks[i];
> > +}
> > +
> >   #ifdef CONFIG_ACPI
> >   static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
> >   {
> > @@ -2000,6 +2022,9 @@ static int arm_smmu_device_dt_probe(struct 
> > platform_device *pdev,
> >   data = of_device_get_match_data(dev);
> >   smmu->version = data->version;
> >   smmu->model = data->model;
> > + smmu->num_clks = data->num_clks;
> > +
> > + arm_smmu_fill_clk_data(smmu, data->clks);
> >
> >   parse_driver_options(smmu);
> >
> > @@ -2098,6 +2123,14 @@ static int arm_smmu_device_probe(struct 
> > platform_device *pdev)
> >   smmu->irqs[i] = irq;
> >   }
> >
> > + err = devm_clk_bulk_get(smmu->dev, smmu->num_clks, smmu->clks);
> > + if (err)
> > + return err;
> > +
> > + err = clk_bulk_prepare_enable(smmu->num_clks, smmu->clks);
> > + if (err)
> > + return err;
> > +
>
> Hmm, if we error out beyond here it looks like we should strictly
> balance that prepare/enable before devres does the clk_bulk_put(),
> however the probe error path is starting to look like it needs a bit of
> love in general, so I might just spin a cleanup patch on top (and even
> then only for the sake of not being a bad example; SMMU probe failure is
> never a realistic situation for the system to actually recover from).

Sure Robin. Thanks for the review on the series.
Let me know, I can spin a change for probe failure path cleanup.

Best regards
Vivek

>
> Otherwise,
>
> Reviewed-by: Robin Murphy 
>
> >   err = arm_smmu_device_cfg_probe(smmu);
> >   if (err)
> >   return err;
> > @@ -2184,6 +2217,9 @@ static int arm_smmu_device_remove(struct 
> > platform_device 

Re: [Freedreno] [PATCH v16 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-09-27 Thread Vivek Gautam
Hi Robin,

On Wed, Sep 26, 2018 at 9:29 PM Robin Murphy  wrote:
>
> On 30/08/18 15:45, Vivek Gautam wrote:
> > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
> > clock and power requirements.
> > On msm8996, multiple cores, viz. mdss, video, etc. use this
> > smmu. On sdm845, this smmu is used with gpu.
> > Add bindings for the same.
> >
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Rob Herring 
> > Reviewed-by: Tomasz Figa 
> > Tested-by: Srinivas Kandagatla 
> > ---
> >   drivers/iommu/arm-smmu.c | 13 +
> >   1 file changed, 13 insertions(+)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 166c8c6da24f..411e5ac57c64 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -119,6 +119,7 @@ enum arm_smmu_implementation {
> >   GENERIC_SMMU,
> >   ARM_MMU500,
> >   CAVIUM_SMMUV2,
> > + QCOM_SMMUV2,
>
> Hmm, it seems we don't actually need this right now, but maybe that just
> means there's more imp-def registers and/or errata to come ;)
>
> Either way I guess there's no real harm in having it.

Thanks for the review.

Best regards
Vivek

>
> Reviewed-by: Robin Murphy 
>
> >   };
> >
> >   struct arm_smmu_s2cr {
> > @@ -1970,6 +1971,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, 
> > GENERIC_SMMU);
> >   ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> >   ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
> >
> > +static const char * const qcom_smmuv2_clks[] = {
> > + "bus", "iface",
> > +};
> > +
> > +static const struct arm_smmu_match_data qcom_smmuv2 = {
> > + .version = ARM_SMMU_V2,
> > + .model = QCOM_SMMUV2,
> > + .clks = qcom_smmuv2_clks,
> > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
> > +};
> > +
> >   static const struct of_device_id arm_smmu_of_match[] = {
> >   { .compatible = "arm,smmu-v1", .data = _generic_v1 },
> >   { .compatible = "arm,smmu-v2", .data = _generic_v2 },
> > @@ -1977,6 +1989,7 @@ static const struct of_device_id arm_smmu_of_match[] 
> > = {
> >   { .compatible = "arm,mmu-401", .data = _mmu401 },
> >   { .compatible = "arm,mmu-500", .data = _mmu500 },
> >   { .compatible = "cavium,smmu-v2", .data = _smmuv2 },
> > + { .compatible = "qcom,smmu-v2", .data = _smmuv2 },
> >   { },
> >   };
> >   MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
> >
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno