Re: [PATCH v5 1/5] drm/msm/adreno: Implement SMEM-based speed bin

2024-07-29 Thread Akhil P Oommen
On Mon, Jul 29, 2024 at 02:40:30PM +0200, Konrad Dybcio wrote:
> 
> 
> On 29.07.2024 2:13 PM, Konrad Dybcio wrote:
> > On 16.07.2024 1:56 PM, Konrad Dybcio wrote:
> >> On 15.07.2024 10:04 PM, Akhil P Oommen wrote:
> >>> On Tue, Jul 09, 2024 at 12:45:29PM +0200, Konrad Dybcio wrote:
> >>>> On recent (SM8550+) Snapdragon platforms, the GPU speed bin data is
> >>>> abstracted through SMEM, instead of being directly available in a fuse.
> >>>>
> >>>> Add support for SMEM-based speed binning, which includes getting
> >>>> "feature code" and "product code" from said source and parsing them
> >>>> to form a value that we can match OPPs against.
> >>>>
> >>>> Due to the product code being ignored in the context of Adreno on
> >>>> production parts (as of SM8650), hardcode it to SOCINFO_PC_UNKNOWN.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >> [...]
> >>
> >>>>  
> >>>> -	if (adreno_read_speedbin(dev, &speedbin) || !speedbin)
> >>>> +	if (adreno_read_speedbin(adreno_gpu, dev, &speedbin) || !speedbin)
> >>>>  		speedbin = 0xffff;
> >>>> -	adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin);
> >>>> +	adreno_gpu->speedbin = speedbin;
> >>> There are some chipsets which use both Speedbin and Socinfo data for
> >>> SKU detection [1].
> >> 0_0
> >>
> >>
> >>> We don't need to worry about that logic for now. But
> >>> I am worried about mixing Speedbin and SKU_ID in the UABI with this patch.
> >>> It will be difficult when we have to expose both to userspace.
> >>>
> >>> I think we can use a separate bitfield to expose FCODE/PCODE. Currently,
> >>> the lower 32 bits are reserved for chipid and bits 33-48 are reserved for speedbin,
> >>> so I think we can use the rest of the 16 bits for SKU_ID. And within that
> >>> 16 bits, 12 bits should be sufficient for FCODE and the rest 8 bits
> >>> reserved for future PCODE.
> >> Right, sounds reasonable. Hopefully nothing overflows..
> > +CC Elliot
> > 
> > Would you know whether these sizes ^ are going to be sufficient for
> > the foreseeable future?
> 
> Also Akhil, 12 + 8 > 16.. did you mean 8 bits for both P and FCODE? Or
> 12 for FCODE and 4 for PCODE?

Sorry, "8 bits" was a typo. You are right, 12 bits for FCODE and 4 bits for PCODE.

-Akhil

> 
> Konrad


Re: [PATCH] drm/msm/adreno: Fix error return if missing firmware-name

2024-07-16 Thread Akhil P Oommen
On Tue, Jul 16, 2024 at 09:06:30AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> -ENODEV is used to signify that there is no zap shader for the platform,
> and the CPU can directly take the GPU out of secure mode.  We want to
> use this return code when there is no zap-shader node, but not when the
> node exists without a firmware-name property.  That case we want to
> treat as if the needed fw was not found.
> 
> Signed-off-by: Rob Clark 
> ---

Reviewed-by: Akhil P Oommen 

-Akhil

>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index b46e7e93b3ed..0d84be3be0b7 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -99,7 +99,7 @@ static int zap_shader_load_mdt(struct msm_gpu *gpu, const char *fwname,
>* was a bad idea, and is only provided for backwards
>* compatibility for older targets.
>*/
> - return -ENODEV;
> + return -ENOENT;
>   }
>  
>   if (IS_ERR(fw)) {
> -- 
> 2.45.2
> 


Re: [PATCH v5 1/5] drm/msm/adreno: Implement SMEM-based speed bin

2024-07-15 Thread Akhil P Oommen
On Tue, Jul 09, 2024 at 12:45:29PM +0200, Konrad Dybcio wrote:
> On recent (SM8550+) Snapdragon platforms, the GPU speed bin data is
> abstracted through SMEM, instead of being directly available in a fuse.
> 
> Add support for SMEM-based speed binning, which includes getting
> "feature code" and "product code" from said source and parsing them
> to form a value that we can match OPPs against.
> 
> Due to the product code being ignored in the context of Adreno on
> production parts (as of SM8650), hardcode it to SOCINFO_PC_UNKNOWN.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 14 +-
>  drivers/gpu/drm/msm/adreno/adreno_device.c |  2 ++
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 42 +++---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  7 -
>  4 files changed, 54 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index bcaec86ac67a..0d8682c28ba4 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2117,18 +2117,20 @@ static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse)
>   return UINT_MAX;
>  }
>  
> -static int a6xx_set_supported_hw(struct device *dev, const struct adreno_info *info)
> +static int a6xx_set_supported_hw(struct adreno_gpu *adreno_gpu,
> +  struct device *dev,
> +  const struct adreno_info *info)
>  {
>   u32 supp_hw;
>   u32 speedbin;
>   int ret;
>  
> -	ret = adreno_read_speedbin(dev, &speedbin);
> +	ret = adreno_read_speedbin(adreno_gpu, dev, &speedbin);
>   /*
> -  * -ENOENT means that the platform doesn't support speedbin which is
> -  * fine
> +  * -ENOENT/EOPNOTSUPP means that the platform doesn't support speedbin
> +  * which is fine
>*/
> - if (ret == -ENOENT) {
> + if (ret == -ENOENT || ret == -EOPNOTSUPP) {
>   return 0;
>   } else if (ret) {
>   dev_err_probe(dev, ret,
> @@ -2283,7 +2285,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>  
>   a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx);
>  
> -	ret = a6xx_set_supported_hw(&pdev->dev, config->info);
> +	ret = a6xx_set_supported_hw(adreno_gpu, &pdev->dev, config->info);
>   if (ret) {
>   a6xx_llc_slices_destroy(a6xx_gpu);
>   kfree(a6xx_gpu);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index cfc74a9e2646..0842ea76e616 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -6,6 +6,8 @@
>   * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved.
>   */
>  
> +#include 
> +
>  #include "adreno_gpu.h"
>  
>  bool hang_debug = false;
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 1c6626747b98..cf6652c4439d 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -21,6 +21,9 @@
>  #include "msm_gem.h"
>  #include "msm_mmu.h"
>  
> +#include <linux/soc/qcom/smem.h>
> +#include <linux/soc/qcom/socinfo.h>
> +
>  static u64 address_space_size = 0;
>  MODULE_PARM_DESC(address_space_size, "Override for size of processes private GPU address space");
>  module_param(address_space_size, ullong, 0600);
> @@ -1061,9 +1064,40 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem)
>  adreno_ocmem->hdl);
>  }
>  
> -int adreno_read_speedbin(struct device *dev, u32 *speedbin)
> +int adreno_read_speedbin(struct adreno_gpu *adreno_gpu,
> +  struct device *dev, u32 *fuse)
>  {
> - return nvmem_cell_read_variable_le_u32(dev, "speed_bin", speedbin);
> + int ret;
> +
> + /*
> +  * Try reading the speedbin via a nvmem cell first
> +  * -ENOENT means "no nvmem-cells" and essentially means "old DT" or
> +  * "nvmem fuse is irrelevant", simply assume it's fine.
> +  */
> + ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", fuse);
> + if (!ret)
> + return 0;
> + else if (ret != -ENOENT)
> +		return dev_err_probe(dev, ret, "Couldn't read the speed bin fuse value\n");
> +
> +#ifdef CONFIG_QCOM_SMEM
> + u32 fcode;
> +
> + /*
> +  * Only check the feature code - the product code only matters for
> +  * proto SoCs unavailable outside Qualcomm labs, as far as GPU bin
> +  * matching is concerned.
> +  *
> +  * Ignore EOPNOTSUPP, as not all SoCs expose this info through SMEM.
> +  */
> +	ret = qcom_smem_get_feature_code(&fcode);
> + if (!ret)
> + *fuse = ADRENO_SKU_ID(fcode);
> + else if (ret != -EOPNOTSUPP)
> +		return dev_err_probe(dev, ret, "Couldn't get feature code from SMEM\n");
> +#endif
> +
> + return ret;
>  }
>  
>  int adreno_gpu_init(struct drm_device *drm, 

Re: [PATCH v2 3/5] drm/msm/adreno: Introduce gmu_chipid for a740 & a750

2024-06-30 Thread Akhil P Oommen
On Sat, Jun 29, 2024 at 03:06:22PM +0200, Konrad Dybcio wrote:
> On 29.06.2024 3:49 AM, Akhil P Oommen wrote:
> > To simplify, introduce the new gmu_chipid for a740 & a750 GPUs.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> 
> This gets rid of getting patchid from dts, but I suppose that's fine,
> as we can just add a new entry to the id table
> 
> [...]
> 
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -771,7 +771,7 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
> > 	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > const struct a6xx_info *a6xx_info = adreno_gpu->info->a6xx;
> > u32 fence_range_lower, fence_range_upper;
> > -   u32 chipid, chipid_min = 0;
> > +   u32 chipid = 0;
> 
> The initialization doesn't seem necessary

Rob, would it be possible to fix this up when you pick this patch?

-Akhil.

> 
> otherwise:
> 
> Reviewed-by: Konrad Dybcio 
> 
> Konrad


Re: [PATCH v4 4/5] drm/msm/adreno: Redo the speedbin assignment

2024-06-30 Thread Akhil P Oommen
On Tue, Jun 25, 2024 at 08:28:09PM +0200, Konrad Dybcio wrote:
> There is no need to reinvent the wheel for simple read-match-set logic.
> 
> Make speedbin discovery and assignment generation-independent.
> 
> This implicitly removes the bogus 0x80 / BIT(7) speed bin on A5xx,
> which has no representation in hardware whatsoever.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c   | 34 
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 56 -
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 51 ++
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 --
>  4 files changed, 45 insertions(+), 99 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index c003f970189b..eed6a2eb1731 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -1704,38 +1704,6 @@ static const struct adreno_gpu_funcs funcs = {
>   .get_timestamp = a5xx_get_timestamp,
>  };
>  
> -static void check_speed_bin(struct device *dev)
> -{
> - struct nvmem_cell *cell;
> - u32 val;
> -
> - /*
> -  * If the OPP table specifies a opp-supported-hw property then we have
> -  * to set something with dev_pm_opp_set_supported_hw() or the table
> -  * doesn't get populated so pick an arbitrary value that should
> -  * ensure the default frequencies are selected but not conflict with any
> -  * actual bins
> -  */
> - val = 0x80;
> -
> - cell = nvmem_cell_get(dev, "speed_bin");
> -
> - if (!IS_ERR(cell)) {
> - void *buf = nvmem_cell_read(cell, NULL);
> -
> - if (!IS_ERR(buf)) {
> - u8 bin = *((u8 *) buf);
> -
> - val = (1 << bin);
> - kfree(buf);
> - }
> -
> - nvmem_cell_put(cell);
> - }
> -
> -	devm_pm_opp_set_supported_hw(dev, &val, 1);
> -}
> -
>  struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
>  {
>   struct msm_drm_private *priv = dev->dev_private;
> @@ -1763,8 +1731,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
>  
>   a5xx_gpu->lm_leakage = 0x4E001A;
>  
> - check_speed_bin(>dev);
> -
>   nr_rings = 4;
>  
>   if (config->info->revn == 510)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 8ace096bb68c..f038e5f1fe59 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2112,55 +2112,6 @@ static bool a6xx_progress(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
>   return progress;
>  }
>  
> -static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse)
> -{
> - if (!info->speedbins)
> - return UINT_MAX;
> -
> - for (int i = 0; info->speedbins[i].fuse != SHRT_MAX; i++)
> - if (info->speedbins[i].fuse == fuse)
> - return BIT(info->speedbins[i].speedbin);
> -
> - return UINT_MAX;
> -}
> -
> -static int a6xx_set_supported_hw(struct adreno_gpu *adreno_gpu,
> -  struct device *dev,
> -  const struct adreno_info *info)
> -{
> - u32 supp_hw;
> - u32 speedbin;
> - int ret;
> -
> -	ret = adreno_read_speedbin(adreno_gpu, dev, &speedbin);
> - /*
> -  * -ENOENT means that the platform doesn't support speedbin which is
> -  * fine
> -  */
> - if (ret == -ENOENT) {
> - return 0;
> - } else if (ret) {
> - dev_err_probe(dev, ret,
> -		      "failed to read speed-bin. Some OPPs may not be supported by hardware\n");
> - return ret;
> - }
> -
> - supp_hw = fuse_to_supp_hw(info, speedbin);
> -
> - if (supp_hw == UINT_MAX) {
> - DRM_DEV_ERROR(dev,
> -		"missing support for speed-bin: %u. Some OPPs may not be supported by hardware\n",
> -		speedbin);
> - supp_hw = BIT(0); /* Default */
> - }
> -
> -	ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
> - if (ret)
> - return ret;
> -
> - return 0;
> -}
> -
>  static const struct adreno_gpu_funcs funcs = {
>   .base = {
>   .get_param = adreno_get_param,
> @@ -2292,13 +2243,6 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>  
>   a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx);
>  
> -	ret = a6xx_set_supported_hw(adreno_gpu, &pdev->dev, config->info);
> - if (ret) {
> - a6xx_llc_slices_destroy(a6xx_gpu);
> - kfree(a6xx_gpu);
> - return ERR_PTR(ret);
> - }
> -
>   if (is_a7xx)
>  		ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs_a7xx, 1);
>   else if (adreno_has_gmu_wrapper(adreno_gpu))
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 

Re: [PATCH v4 1/5] drm/msm/adreno: Implement SMEM-based speed bin

2024-06-30 Thread Akhil P Oommen
On Tue, Jun 25, 2024 at 08:28:06PM +0200, Konrad Dybcio wrote:
> On recent (SM8550+) Snapdragon platforms, the GPU speed bin data is
> abstracted through SMEM, instead of being directly available in a fuse.
> 
> Add support for SMEM-based speed binning, which includes getting
> "feature code" and "product code" from said source and parsing them
> to form a value that we can match OPPs against.
> 
> Due to the product code being ignored in the context of Adreno on
> production parts (as of SM8650), hardcode it to SOCINFO_PC_UNKNOWN.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  8 +++---
>  drivers/gpu/drm/msm/adreno/adreno_device.c |  2 ++
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 41 +++---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  7 -
>  4 files changed, 50 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c98cdb1e9326..8ace096bb68c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2124,13 +2124,15 @@ static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse)
>   return UINT_MAX;
>  }
>  
> -static int a6xx_set_supported_hw(struct device *dev, const struct adreno_info *info)
> +static int a6xx_set_supported_hw(struct adreno_gpu *adreno_gpu,
> +  struct device *dev,
> +  const struct adreno_info *info)
>  {
>   u32 supp_hw;
>   u32 speedbin;
>   int ret;
>  
> -	ret = adreno_read_speedbin(dev, &speedbin);
> +	ret = adreno_read_speedbin(adreno_gpu, dev, &speedbin);
>   /*
>* -ENOENT means that the platform doesn't support speedbin which is
>* fine
> @@ -2290,7 +2292,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>  
>   a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx);
>  
> -	ret = a6xx_set_supported_hw(&pdev->dev, config->info);
> +	ret = a6xx_set_supported_hw(adreno_gpu, &pdev->dev, config->info);
>   if (ret) {
>   a6xx_llc_slices_destroy(a6xx_gpu);
>   kfree(a6xx_gpu);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 1e789ff6945e..e514346088f9 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -6,6 +6,8 @@
>   * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved.
>   */
>  
> +#include 
> +
>  #include "adreno_gpu.h"
>  
>  bool hang_debug = false;
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 1c6626747b98..6ffd02f38499 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -21,6 +21,9 @@
>  #include "msm_gem.h"
>  #include "msm_mmu.h"
>  
> +#include <linux/soc/qcom/smem.h>
> +#include <linux/soc/qcom/socinfo.h>
> +
>  static u64 address_space_size = 0;
>  MODULE_PARM_DESC(address_space_size, "Override for size of processes private GPU address space");
>  module_param(address_space_size, ullong, 0600);
> @@ -1061,9 +1064,39 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem)
>  adreno_ocmem->hdl);
>  }
>  
> -int adreno_read_speedbin(struct device *dev, u32 *speedbin)
> +int adreno_read_speedbin(struct adreno_gpu *adreno_gpu,
> +  struct device *dev, u32 *fuse)
>  {
> - return nvmem_cell_read_variable_le_u32(dev, "speed_bin", speedbin);
> + u32 fcode;
> + int ret;
> +
> + /*
> +  * Try reading the speedbin via a nvmem cell first
> +  * -ENOENT means "no nvmem-cells" and essentially means "old DT" or
> +  * "nvmem fuse is irrelevant", simply assume it's fine.
> +  */
> + ret = nvmem_cell_read_variable_le_u32(dev, "speed_bin", fuse);
> + if (!ret)
> + return 0;
> + else if (ret != -ENOENT)
> +		return dev_err_probe(dev, ret, "Couldn't read the speed bin fuse value\n");
> +
> +#ifdef CONFIG_QCOM_SMEM
> + /*
> +  * Only check the feature code - the product code only matters for
> +  * proto SoCs unavailable outside Qualcomm labs, as far as GPU bin
> +  * matching is concerned.
> +  *
> +  * Ignore EOPNOTSUPP, as not all SoCs expose this info through SMEM.
> +  */
> +	ret = qcom_smem_get_feature_code(&fcode);
> + if (!ret)
> + *fuse = ADRENO_SKU_ID(fcode);
> + else if (ret != -EOPNOTSUPP)
> +		return dev_err_probe(dev, ret, "Couldn't get feature code from SMEM\n");
> +#endif
> +
> + return 0;
>  }
>  
>  int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
> @@ -1102,9 +1135,9 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>   devm_pm_opp_set_clkname(dev, "core");
>   }
>  
> -	if (adreno_read_speedbin(dev, &speedbin) || !speedbin)
> + if 

Re: [PATCH v4 1/5] drm/msm/adreno: Split up giant device table

2024-06-30 Thread Akhil P Oommen
On Sat, Jun 29, 2024 at 06:32:05AM -0700, Rob Clark wrote:
> On Fri, Jun 28, 2024 at 6:58 PM Akhil P Oommen wrote:
> >
> > On Tue, Jun 18, 2024 at 09:42:47AM -0700, Rob Clark wrote:
> > > From: Rob Clark 
> > >
> > > Split into a separate table per generation, in preparation to move each
> > > gen's device table to it's own file.
> > >
> > > Signed-off-by: Rob Clark 
> > > Reviewed-by: Dmitry Baryshkov 
> > > Reviewed-by: Konrad Dybcio 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 67 +-
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h| 10 
> > >  2 files changed, 63 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index c3703a51287b..a57659eaddc2 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > @@ -20,7 +20,7 @@ bool allow_vram_carveout = false;
> > >  MODULE_PARM_DESC(allow_vram_carveout, "Allow using VRAM Carveout, in place of IOMMU");
> > >  module_param_named(allow_vram_carveout, allow_vram_carveout, bool, 0600);
> > >
> > > -static const struct adreno_info gpulist[] = {
> > > +static const struct adreno_info a2xx_gpus[] = {
> > >   {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x0200),
> > >   .family = ADRENO_2XX_GEN1,
> > > @@ -54,7 +54,12 @@ static const struct adreno_info gpulist[] = {
> > >   .gmem  = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > >   .init  = a2xx_gpu_init,
> > > - }, {
> > > + }
> > > +};
> > > +DECLARE_ADRENO_GPULIST(a2xx);
> > > +
> > > +static const struct adreno_info a3xx_gpus[] = {
> > > + {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x03000512),
> > >   .family = ADRENO_3XX,
> > >   .fw = {
> > > @@ -116,7 +121,12 @@ static const struct adreno_info gpulist[] = {
> > >   .gmem  = SZ_1M,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > >   .init  = a3xx_gpu_init,
> > > - }, {
> > > + }
> > > +};
> > > +DECLARE_ADRENO_GPULIST(a3xx);
> > > +
> > > +static const struct adreno_info a4xx_gpus[] = {
> > > + {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x04000500),
> > >   .family = ADRENO_4XX,
> > >   .revn  = 405,
> > > @@ -149,7 +159,12 @@ static const struct adreno_info gpulist[] = {
> > >   .gmem  = (SZ_1M + SZ_512K),
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > >   .init  = a4xx_gpu_init,
> > > - }, {
> > > + }
> > > +};
> > > +DECLARE_ADRENO_GPULIST(a4xx);
> > > +
> > > +static const struct adreno_info a5xx_gpus[] = {
> > > + {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x05000600),
> > >   .family = ADRENO_5XX,
> > >   .revn = 506,
> > > @@ -274,7 +289,12 @@ static const struct adreno_info gpulist[] = {
> > >   .quirks = ADRENO_QUIRK_LMLOADKILL_DISABLE,
> > >   .init = a5xx_gpu_init,
> > >   .zapfw = "a540_zap.mdt",
> > > - }, {
> > > + }
> > > +};
> > > +DECLARE_ADRENO_GPULIST(a5xx);
> > > +
> > > +static const struct adreno_info a6xx_gpus[] = {
> > > + {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x0601),
> > >   .family = ADRENO_6XX_GEN1,
> > >   .revn = 610,
> > > @@ -520,7 +540,12 @@ static const struct adreno_info gpulist[] = {
> > >   .zapfw = "a690_zap.mdt",
> > >   .hwcg = a690_hwcg,
> > >   .address_space_size = SZ_16G,
> > > - }, {
> > > + }
> > > +};
> > > +DECLARE_ADRENO_GPULIST(a6xx);
> > > +
> > > +static const struct adreno_info a7xx_gpus[] = {
> > > + {
> > >   .chip_ids = ADRENO_CHIP_IDS(0x07000200),
> > >   .family = ADRENO_6XX_GEN1, /* NOT a mistake! */
> > >   .fw = {
> > > @@ -582,7 +607,17 @@ static const struct adreno_info gpulist[] = {

Re: [PATCH v2 0/5] Support for Adreno X1-85 GPU

2024-06-28 Thread Akhil P Oommen
On Sat, Jun 29, 2024 at 07:19:33AM +0530, Akhil P Oommen wrote:
> This series adds support for the Adreno X1-85 GPU found in Qualcomm's
> compute series chipset, Snapdragon X1 Elite (x1e80100). In this new
> naming scheme for Adreno GPUs, 'X' stands for the compute series, '1'
> denotes the 1st generation, and '8' & '5' denote the tier and the SKU to
> which it belongs.
> 
> X1-85 has major focus on doubling core clock frequency and bandwidth
> throughput. It has a dedicated collapsible Graphics MX rail (gmxc) to
> power the memories and double the number of data channels to improve
> bandwidth to DDR.
> 
> Mesa has the necessary bits present already to support this GPU. We are
> able to bring up Gnome desktop by hardcoding "0x43050a01" as
> chipid. Also, verified glxgears and glmark2. We have plans to add the
> new chipid support to Mesa in next few weeks, but these patches can go in
> right away to get included in v6.11.
> 
> This series is rebased on top of msm-next branch. P3 cherry-picks cleanly on
> qcom/for-next.

A typo here: P5 cherry-picks cleanly on qcom/for-next.

-Akhil
> 
> P1, P2 & P3 for Rob Clark
> P4 for Will Deacon
> P5 for Bjorn to pick up.
> 
> Changes in v2:
> - Minor update to compatible pattern, '[x]' -> 'x'
> - Increased address space size (Rob)
> - Introduced gmu_chipid in a6xx_info (Rob)
> - Improved fallback logic for gmxc (Dmitry)
> - Rebased on top of msm-next
> - Picked a new patch for arm-mmu bindings update
> - Reordered gpu & gmu reg enties to match schema
> 
> Akhil P Oommen (5):
>   dt-bindings: display/msm/gmu: Add Adreno X185 GMU
>   drm/msm/adreno: Add support for X185 GPU
>   drm/msm/adreno: Introduce gmu_chipid for a740 & a750
>   dt-bindings: arm-smmu: Add X1E80100 GPU SMMU
>   arm64: dts: qcom: x1e80100: Add gpu support
> 
>  .../devicetree/bindings/display/msm/gmu.yaml  |   4 +
>  .../devicetree/bindings/iommu/arm,smmu.yaml   |   3 +-
>  arch/arm64/boot/dts/qcom/x1e80100.dtsi| 195 ++
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c |  20 ++
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  34 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |   2 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   5 +
>  8 files changed, 239 insertions(+), 25 deletions(-)
> 
> -- 
> 2.45.1
> 


Re: [PATCH v4 1/5] drm/msm/adreno: Split up giant device table

2024-06-28 Thread Akhil P Oommen
On Tue, Jun 18, 2024 at 09:42:47AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Split into a separate table per generation, in preparation to move each
> gen's device table to it's own file.
> 
> Signed-off-by: Rob Clark 
> Reviewed-by: Dmitry Baryshkov 
> Reviewed-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 67 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h| 10 
>  2 files changed, 63 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index c3703a51287b..a57659eaddc2 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -20,7 +20,7 @@ bool allow_vram_carveout = false;
>  MODULE_PARM_DESC(allow_vram_carveout, "Allow using VRAM Carveout, in place of IOMMU");
>  module_param_named(allow_vram_carveout, allow_vram_carveout, bool, 0600);
>  
> -static const struct adreno_info gpulist[] = {
> +static const struct adreno_info a2xx_gpus[] = {
>   {
>   .chip_ids = ADRENO_CHIP_IDS(0x0200),
>   .family = ADRENO_2XX_GEN1,
> @@ -54,7 +54,12 @@ static const struct adreno_info gpulist[] = {
>   .gmem  = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
>   .init  = a2xx_gpu_init,
> - }, {
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a2xx);
> +
> +static const struct adreno_info a3xx_gpus[] = {
> + {
>   .chip_ids = ADRENO_CHIP_IDS(0x03000512),
>   .family = ADRENO_3XX,
>   .fw = {
> @@ -116,7 +121,12 @@ static const struct adreno_info gpulist[] = {
>   .gmem  = SZ_1M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
>   .init  = a3xx_gpu_init,
> - }, {
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a3xx);
> +
> +static const struct adreno_info a4xx_gpus[] = {
> + {
>   .chip_ids = ADRENO_CHIP_IDS(0x04000500),
>   .family = ADRENO_4XX,
>   .revn  = 405,
> @@ -149,7 +159,12 @@ static const struct adreno_info gpulist[] = {
>   .gmem  = (SZ_1M + SZ_512K),
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
>   .init  = a4xx_gpu_init,
> - }, {
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a4xx);
> +
> +static const struct adreno_info a5xx_gpus[] = {
> + {
>   .chip_ids = ADRENO_CHIP_IDS(0x05000600),
>   .family = ADRENO_5XX,
>   .revn = 506,
> @@ -274,7 +289,12 @@ static const struct adreno_info gpulist[] = {
>   .quirks = ADRENO_QUIRK_LMLOADKILL_DISABLE,
>   .init = a5xx_gpu_init,
>   .zapfw = "a540_zap.mdt",
> - }, {
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a5xx);
> +
> +static const struct adreno_info a6xx_gpus[] = {
> + {
>   .chip_ids = ADRENO_CHIP_IDS(0x0601),
>   .family = ADRENO_6XX_GEN1,
>   .revn = 610,
> @@ -520,7 +540,12 @@ static const struct adreno_info gpulist[] = {
>   .zapfw = "a690_zap.mdt",
>   .hwcg = a690_hwcg,
>   .address_space_size = SZ_16G,
> - }, {
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a6xx);
> +
> +static const struct adreno_info a7xx_gpus[] = {
> + {
>   .chip_ids = ADRENO_CHIP_IDS(0x07000200),
>   .family = ADRENO_6XX_GEN1, /* NOT a mistake! */
>   .fw = {
> @@ -582,7 +607,17 @@ static const struct adreno_info gpulist[] = {
>   .init = a6xx_gpu_init,
>   .zapfw = "gen70900_zap.mbn",
>   .address_space_size = SZ_16G,
> - },
> + }
> +};
> +DECLARE_ADRENO_GPULIST(a7xx);
> +
> +static const struct adreno_gpulist *gpulists[] = {
> +	&a2xx_gpulist,
> +	&a3xx_gpulist,
> +	&a4xx_gpulist,
> +	&a5xx_gpulist,
> +	&a6xx_gpulist,
> +	&a6xx_gpulist,

Typo. a6xx_gpulist -> a7xx_gpulist.

-Akhil

>  };
>  
>  MODULE_FIRMWARE("qcom/a300_pm4.fw");
> @@ -617,13 +652,17 @@ MODULE_FIRMWARE("qcom/yamato_pm4.fw");
>  static const struct adreno_info *adreno_info(uint32_t chip_id)
>  {
>   /* identify gpu: */
> -	for (int i = 0; i < ARRAY_SIZE(gpulist); i++) {
> -		const struct adreno_info *info = &gpulist[i];
> - if (info->machine && !of_machine_is_compatible(info->machine))
> - continue;
> - for (int j = 0; info->chip_ids[j]; j++)
> - if (info->chip_ids[j] == chip_id)
> - return info;
> +	for (int i = 0; i < ARRAY_SIZE(gpulists); i++) {
> +		for (int j = 0; j < gpulists[i]->gpus_count; j++) {
> +			const struct adreno_info *info = &gpulists[i]->gpus[j];
> +
> +			if (info->machine && !of_machine_is_compatible(info->machine))
> +				continue;
> +
> + for (int k = 0; info->chip_ids[k]; k++)
> + if (info->chip_ids[k] == chip_id)
> + 

[PATCH v2 5/5] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-28 Thread Akhil P Oommen
Add the necessary dt nodes for gpu support in X1E80100.

Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- Reordered gpu & gmu reg enties to match schema

 arch/arm64/boot/dts/qcom/x1e80100.dtsi | 195 +
 1 file changed, 195 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
index 5f90a0b3c016..f043204aa12f 100644
--- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
+++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2985,6 +2986,200 @@ tcsr: clock-controller@1fc {
#reset-cells = <1>;
};
 
+   gpu: gpu@3d0 {
+   compatible = "qcom,adreno-43050c01", "qcom,adreno";
+   reg = <0x0 0x03d0 0x0 0x4>,
+ <0x0 0x03d9e000 0x0 0x1000>,
+ <0x0 0x03d61000 0x0 0x800>;
+
+   reg-names = "kgsl_3d0_reg_memory",
+   "cx_mem",
+   "cx_dbgc";
+
+   interrupts = ;
+
+   iommus = <&adreno_smmu 0 0x0>,
+            <&adreno_smmu 1 0x0>;
+
+   operating-points-v2 = <&gpu_opp_table>;
+
+   qcom,gmu = <&gmu>;
+   #cooling-cells = <2>;
+
+   interconnects = <&gem_noc MASTER_GFX3D 0 &mc_virt SLAVE_EBI1 0>;
+   interconnect-names = "gfx-mem";
+
+   zap-shader {
+   memory-region = <&gpu_microcode_mem>;
+   firmware-name = "qcom/gen70500_zap.mbn";
+   };
+
+   gpu_opp_table: opp-table {
+   compatible = "operating-points-v2";
+
+   opp-11 {
+   opp-hz = /bits/ 64 <11>;
+   opp-level = 
;
+   opp-peak-kBps = <1650>;
+   };
+
+   opp-10 {
+   opp-hz = /bits/ 64 <10>;
+   opp-level = 
;
+   opp-peak-kBps = <14398438>;
+   };
+
+   opp-92500 {
+   opp-hz = /bits/ 64 <92500>;
+   opp-level = 
;
+   opp-peak-kBps = <14398438>;
+   };
+
+   opp-8 {
+   opp-hz = /bits/ 64 <8>;
+   opp-level = ;
+   opp-peak-kBps = <12449219>;
+   };
+
+   opp-74400 {
+   opp-hz = /bits/ 64 <74400>;
+   opp-level = 
;
+   opp-peak-kBps = <10687500>;
+   };
+
+   opp-68700 {
+   opp-hz = /bits/ 64 <68700>;
+   opp-level = 
;
+   opp-peak-kBps = <8171875>;
+   };
+
+   opp-55000 {
+   opp-hz = /bits/ 64 <55000>;
+   opp-level = ;
+   opp-peak-kBps = <6074219>;
+   };
+
+   opp-39000 {
+   opp-hz = /bits/ 64 <39000>;
+   opp-level = 
;
+   opp-peak-kBps = <300>;
+   };
+
+   opp-3 {
+   opp-hz = /bits/ 64 <3>;
+   opp-level = 
;
+   opp-peak-kBps = <2136719>;
+   };
+   };
+   };
+
+   gmu: gmu@3d6a000 {
+   compatible = "qcom,adreno-gmu-x185.1", "qcom,adreno-gmu";
+   reg = <0x0 0x03d6a000 0x0 0x35000>,
+ <0x0 0x03d5 0x0 0x1>,
+

[PATCH v2 4/5] dt-bindings: arm-smmu: Add X1E80100 GPU SMMU

2024-06-28 Thread Akhil P Oommen
Update the devicetree bindings to support the gpu present in
X1E80100 platform.

Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- New patch in v2

 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 5c130cf06a21..7ef225d4d783 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -95,6 +95,7 @@ properties:
   - qcom,sm8450-smmu-500
   - qcom,sm8550-smmu-500
   - qcom,sm8650-smmu-500
+  - qcom,x1e80100-smmu-500
   - const: qcom,adreno-smmu
   - const: qcom,smmu-500
   - const: arm,mmu-500
@@ -520,6 +521,7 @@ allOf:
 - enum:
 - qcom,sm8550-smmu-500
 - qcom,sm8650-smmu-500
+- qcom,x1e80100-smmu-500
 - const: qcom,adreno-smmu
 - const: qcom,smmu-500
 - const: arm,mmu-500
@@ -557,7 +559,6 @@ allOf:
   - qcom,sdx65-smmu-500
   - qcom,sm6350-smmu-500
   - qcom,sm6375-smmu-500
-  - qcom,x1e80100-smmu-500
 then:
   properties:
 clock-names: false
-- 
2.45.1



[PATCH v2 3/5] drm/msm/adreno: Introduce gmu_chipid for a740 & a750

2024-06-28 Thread Akhil P Oommen
To simplify, introduce the new gmu_chipid for a740 & a750 GPUs.

Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- New patch in v2

 drivers/gpu/drm/msm/adreno/a6xx_catalog.c |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 23 +--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index c507681648ac..bdafca7267a8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1206,6 +1206,7 @@ static const struct adreno_info a7xx_gpus[] = {
.a6xx = &(const struct a6xx_info) {
.hwcg = a740_hwcg,
.protect = &a730_protect,
+   .gmu_chipid = 0x7020100,
},
.address_space_size = SZ_16G,
}, {
@@ -1241,6 +1242,7 @@ static const struct adreno_info a7xx_gpus[] = {
.zapfw = "gen70900_zap.mbn",
.a6xx = &(const struct a6xx_info) {
.protect = &a730_protect,
+   .gmu_chipid = 0x7090100,
},
.address_space_size = SZ_16G,
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 20034aa2fad8..e4c430504daa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -771,7 +771,7 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
const struct a6xx_info *a6xx_info = adreno_gpu->info->a6xx;
u32 fence_range_lower, fence_range_upper;
-   u32 chipid, chipid_min = 0;
+   u32 chipid = 0;
int ret;
 
/* Vote veto for FAL10 */
@@ -833,27 +833,6 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
 
if (a6xx_info->gmu_chipid) {
chipid = a6xx_info->gmu_chipid;
-   /* NOTE: A730 may also fall in this if-condition with a future GMU fw update. */
-   } else if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
-   /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
-   chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
-
-   /*
-* The min part has a 1-1 mapping for each GPU SKU.
-* This chipid that the GMU expects corresponds to the "GENX_Y_Z" naming,
-* where X = major, Y = minor, Z = patchlevel, e.g. GEN7_2_1 for prod A740.
-*/
-   if (adreno_is_a740(adreno_gpu))
-   chipid_min = 2;
-   else if (adreno_is_a750(adreno_gpu))
-   chipid_min = 9;
-   else
-   return -EINVAL;
-
-   chipid |= FIELD_PREP(GENMASK(23, 16), chipid_min);
-
-   /* Get the patchid (which may vary) from the device tree */
-   chipid |= FIELD_PREP(GENMASK(15, 8), adreno_patchid(adreno_gpu));
} else {
/*
 * Note that the GMU has a slightly different layout for
-- 
2.45.1



[PATCH v2 1/5] dt-bindings: display/msm/gmu: Add Adreno X185 GMU

2024-06-28 Thread Akhil P Oommen
Document Adreno X185 GMU in the dt-binding specification.

Signed-off-by: Akhil P Oommen 
Reviewed-by: Krzysztof Kozlowski 
---

Changes in v2:
- Minor update to compatible pattern, '[x]' -> 'x'

 Documentation/devicetree/bindings/display/msm/gmu.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index b3837368a260..b1bd372996d5 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -23,6 +23,9 @@ properties:
   - items:
   - pattern: '^qcom,adreno-gmu-[67][0-9][0-9]\.[0-9]$'
   - const: qcom,adreno-gmu
+  - items:
+  - pattern: '^qcom,adreno-gmu-x[1-9][0-9][0-9]\.[0-9]$'
+  - const: qcom,adreno-gmu
   - const: qcom,adreno-gmu-wrapper
 
   reg:
@@ -225,6 +228,7 @@ allOf:
   - qcom,adreno-gmu-730.1
   - qcom,adreno-gmu-740.1
   - qcom,adreno-gmu-750.1
+  - qcom,adreno-gmu-x185.1
 then:
   properties:
 reg:
-- 
2.45.1



[PATCH v2 2/5] drm/msm/adreno: Add support for X185 GPU

2024-06-28 Thread Akhil P Oommen
Add support in drm/msm driver for the Adreno X185 gpu found in
Snapdragon X1 Elite chipset.

Signed-off-by: Akhil P Oommen 
---

Changes in v2:
- Increased address space size (Rob)
- Introduced gmu_chipid in a6xx_info (Rob)
- Improved fallback logic for gmxc (Dmitry)

 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 18 ++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 13 +++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  5 +
 5 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 53e33ff78411..c507681648ac 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1208,6 +1208,24 @@ static const struct adreno_info a7xx_gpus[] = {
.protect = &a730_protect,
},
.address_space_size = SZ_16G,
+   }, {
+   .chip_ids = ADRENO_CHIP_IDS(0x43050c01), /* "C512v2" */
+   .family = ADRENO_7XX_GEN2,
+   .fw = {
+   [ADRENO_FW_SQE] = "gen70500_sqe.fw",
+   [ADRENO_FW_GMU] = "gen70500_gmu.bin",
+   },
+   .gmem = 3 * SZ_1M,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
+ ADRENO_QUIRK_HAS_HW_APRIV,
+   .init = a6xx_gpu_init,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a740_hwcg,
.protect = &a730_protect,
+   .gmu_chipid = 0x7050001,
+   },
+   .address_space_size = SZ_256G,
}, {
.chip_ids = ADRENO_CHIP_IDS(0x43051401), /* "C520v2" */
.family = ADRENO_7XX_GEN3,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 0e3dfd4c2bc8..20034aa2fad8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -769,6 +769,7 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
 {
struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+   const struct a6xx_info *a6xx_info = adreno_gpu->info->a6xx;
u32 fence_range_lower, fence_range_upper;
u32 chipid, chipid_min = 0;
int ret;
@@ -830,8 +831,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, unsigned int state)
 */
gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
 
+   if (a6xx_info->gmu_chipid) {
+   chipid = a6xx_info->gmu_chipid;
/* NOTE: A730 may also fall in this if-condition with a future GMU fw update. */
-   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
+   } else if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
/* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
 
@@ -1329,7 +1332,13 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device *dev, u32 *votes,
if (!pri_count)
return -EINVAL;
 
-   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
+   /*
+* Some targets have a separate gfx mxc rail. So try to read that first and then fall back
+* to regular mx rail if it is missing
+*/
+   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
+   if (IS_ERR(sec) && sec != ERR_PTR(-EPROBE_DEFER))
+   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
if (IS_ERR(sec))
return PTR_ERR(sec);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c98cdb1e9326..092e0a1dd612 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1031,7 +1031,7 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_UCHE_CLIENT_PF, BIT(7) | 0x1);
 
/* Set weights for bicubic filtering */
-   if (adreno_is_a650_family(adreno_gpu)) {
+   if (adreno_is_a650_family(adreno_gpu) || adreno_is_x185(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_0, 0);
gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_1,
0x3fe05ff4);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 1c3cc6df70fe..e3e5c53ae8af 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -21,6 +21,7 @@ extern bool hang_debug;
 struct a6xx_info {
const struct adreno_reglist *hwcg;
const struct adreno_protect *protect;
+   u32 gmu_chipid;
 };

[PATCH v2 0/5] Support for Adreno X1-85 GPU

2024-06-28 Thread Akhil P Oommen
This series adds support for the Adreno X1-85 GPU found in Qualcomm's
compute series chipset, Snapdragon X1 Elite (x1e80100). In this new
naming scheme for Adreno GPUs, 'X' stands for the compute series, '1'
denotes the 1st generation, and '8' and '5' denote the tier and the SKU
to which it belongs.

X1-85 has major focus on doubling core clock frequency and bandwidth
throughput. It has a dedicated collapsible Graphics MX rail (gmxc) to
power the memories and double the number of data channels to improve
bandwidth to DDR.

Mesa already has the necessary bits to support this GPU. We are
able to bring up a Gnome desktop by hardcoding "0x43050a01" as the
chipid. Also, verified glxgears and glmark2. We plan to add the
new chipid support to Mesa in the next few weeks, but these patches can go in
right away to get included in v6.11.

This series is rebased on top of msm-next branch. P3 cherry-picks cleanly on
qcom/for-next.

P1, P2 & P3 for Rob Clark
P4 for Will Deacon
P5 for Bjorn to pick up.

Changes in v2:
- Minor update to compatible pattern, '[x]' -> 'x'
- Increased address space size (Rob)
- Introduced gmu_chipid in a6xx_info (Rob)
- Improved fallback logic for gmxc (Dmitry)
- Rebased on top of msm-next
- Picked a new patch for arm-mmu bindings update
- Reordered gpu & gmu reg entries to match schema

Akhil P Oommen (5):
  dt-bindings: display/msm/gmu: Add Adreno X185 GMU
  drm/msm/adreno: Add support for X185 GPU
  drm/msm/adreno: Introduce gmu_chipid for a740 & a750
  dt-bindings: arm-smmu: Add X1E80100 GPU SMMU
  arm64: dts: qcom: x1e80100: Add gpu support

 .../devicetree/bindings/display/msm/gmu.yaml  |   4 +
 .../devicetree/bindings/iommu/arm,smmu.yaml   |   3 +-
 arch/arm64/boot/dts/qcom/x1e80100.dtsi| 195 ++
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c |  20 ++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  34 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |   2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   5 +
 8 files changed, 239 insertions(+), 25 deletions(-)

-- 
2.45.1



Re: [PATCH] drm/msm/a6xx: request memory region

2024-06-26 Thread Akhil P Oommen
<< snip >>

> > > > > > @@ -1503,7 +1497,7 @@ static void __iomem *a6xx_gmu_get_mmio(struct 
> > > > > > platform_device *pdev,
> > > > > > return ERR_PTR(-EINVAL);
> > > > > > }
> > > > > >
> > > > > > -   ret = ioremap(res->start, resource_size(res));
> > > > > > +   ret = devm_ioremap_resource(&pdev->dev, res);
> > > > >
> > > > > So, this doesn't actually work, failing in __request_region_locked(),
> > > > > because the gmu region partially overlaps with the gpucc region (which
> > > > > is busy).  I think this is intentional, since gmu is controlling the
> > > > > gpu clocks, etc.  In particular REG_A6XX_GPU_CC_GX_GDSCR is in this
> > > > > overlapping region.  Maybe Akhil knows more about GMU.
> > > >
> > > > We don't really need to map gpucc region from driver on behalf of gmu.
> > > > Since we don't access any gpucc register from drm-msm driver, we can
> > > > update the range size to correct this. But due to backward compatibility
> > > > requirement with older dt, can we still enable region locking? I prefer
> > > > it if that is possible.
> > >
> > > Actually, when I reduced the region size to not overlap with gpucc,
> > > the region is smaller than REG_A6XX_GPU_CC_GX_GDSCR * 4.
> > >
> > > So I guess that register is actually part of gpucc?
> >
> > Yes. It has *GPU_CC* in its name. :P
> >
> > I just saw that we program this register on legacy a6xx targets to
> > ensure retention is really ON before collapsing gdsc. So we can't
> > avoid mapping gpucc region in legacy a6xx GPUs. That is unfortunate!
> 
> I guess we could still use devm_ioremap().. idk if there is a better
> way to solve this

Can we do it without breaking backward compatibility with dt?

-Akhil

> 
> BR,
> -R
> 
> > -Akhil.
> >
> > >
> > > BR,
> > > -R


Re: [PATCH v2 1/2] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-26 Thread Akhil P Oommen
On Wed, Jun 26, 2024 at 09:59:39AM +0200, Daniel Vetter wrote:
> On Tue, Jun 25, 2024 at 08:54:41PM +0200, Konrad Dybcio wrote:
> > Memory barriers help ensure instruction ordering, NOT time and order
> > of actual write arrival at other observers (e.g. memory-mapped IP).
> > On architectures employing weak memory ordering, the latter can be a
> > giant pain point, and it has been as part of this driver.
> > 
> > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > readl/writel, which include r/w (respectively) barriers.
> > 
> > Replace the barriers with a readback (or drop altogether where possible)
> > that ensures the previous writes have exited the write buffer (as the CPU
> > must flush the write to the register it's trying to read back).
> > 
> > Signed-off-by: Konrad Dybcio 
> 
> In PCI, some of these readbacks are actually part of the spec and are
> called posting reads. I'd very much recommend drivers create a small wrapper
> function for these cases with a void return value, because it makes the
> code so much more legible and easier to understand.

For Adreno, which is configured via MMIO, we don't need to do this often.
GBIF_HALT is a scenario where we need to be extra careful, as it can potentially
cause some internal lockup. Another scenario I can think of is GPU soft reset,
where we need to keep a delay on the CPU side after triggering. We should
closely scrutinize any other instance that comes up. So I feel a good
justification as a comment here would be enough, to remind the reader. Think of
it as a way to discourage the use by making it hard.

This is a bit subjective, I am fine if you have a strong opinion on this.

-Akhil.

> -Sima
> 
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  4 +---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 10 ++
> >  2 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index 0e3dfd4c2bc8..09d640165b18 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -466,9 +466,7 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > int ret;
> > u32 val;
> >  
> > -   gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > -   /* Wait for the register to finish posting */
> > -   wmb();
> > +   gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> >  
> > ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
> > val & (1 << 1), 100, 1);
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index c98cdb1e9326..4083d0cad782 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -855,14 +855,16 @@ static int hw_init(struct msm_gpu *gpu)
> > /* Clear GBIF halt in case GX domain was not collapsed */
> > if (adreno_is_a619_holi(adreno_gpu)) {
> > gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > +   gpu_read(gpu, REG_A6XX_GBIF_HALT);
> > +
> > gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> > -   /* Let's make extra sure that the GPU can access the memory.. */
> > -   mb();
> > +   gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
> > } else if (a6xx_has_gbif(adreno_gpu)) {
> > gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > +   gpu_read(gpu, REG_A6XX_GBIF_HALT);
> > +
> > gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> > -   /* Let's make extra sure that the GPU can access the memory.. */
> > -   mb();
> > +   gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT);
> > }
> >  
> > /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
> > 
> > -- 
> > 2.45.2
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-26 Thread Akhil P Oommen
On Mon, Jun 24, 2024 at 03:57:35PM +0200, Konrad Dybcio wrote:
> 
> 
> On 6/23/24 13:06, Akhil P Oommen wrote:
> > Add the necessary dt nodes for gpu support in X1E80100.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> 
> [...]
> 
> > +
> > +   opp-11 {
> > +   opp-hz = /bits/ 64 <11>;
> > +   opp-level = ;
> > +   opp-peak-kBps = <1650>;
> 
> No speedbins?

Coming soon! I am waiting for some confirmations on some SKU related
data. This is the lowest Fmax among all SKUs which we can safely enable
for now.

-Akhil.
> 
> Konrad


Re: [PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-26 Thread Akhil P Oommen
On Mon, Jun 24, 2024 at 12:23:42AM +0300, Dmitry Baryshkov wrote:
> On Sun, Jun 23, 2024 at 04:36:30PM GMT, Akhil P Oommen wrote:
> > Add the necessary dt nodes for gpu support in X1E80100.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> > 
> >  arch/arm64/boot/dts/qcom/x1e80100.dtsi | 195 +
> >  1 file changed, 195 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> > index 5f90a0b3c016..3e887286bab4 100644
> > --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> > @@ -6,6 +6,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> 
> 
> > +   gmu: gmu@3d6a000 {
> > +   compatible = "qcom,adreno-gmu-x185.1", "qcom,adreno-gmu";
> > +   reg = <0x0 0x03d5 0x0 0x1>,
> > + <0x0 0x03d6a000 0x0 0x35000>,
> > + <0x0 0x0b28 0x0 0x1>;
> > +   reg-names =  "rscc", "gmu", "gmu_pdc";
> 
The @address should match the first resource defined for a device.

I will reorder this and move gmu to first.

-Akhil.

> 
> > +
> -- 
> With best wishes
> Dmitry


Re: [PATCH v1 2/3] drm/msm/adreno: Add support for X185 GPU

2024-06-26 Thread Akhil P Oommen
On Wed, Jun 26, 2024 at 11:43:08AM -0700, Rob Clark wrote:
> On Wed, Jun 26, 2024 at 1:24 AM Akhil P Oommen  
> wrote:
> >
> > On Mon, Jun 24, 2024 at 03:53:48PM +0200, Konrad Dybcio wrote:
> > >
> > >
> > > On 6/23/24 13:06, Akhil P Oommen wrote:
> > > > Add support in drm/msm driver for the Adreno X185 gpu found in
> > > > Snapdragon X1 Elite chipset.
> > > >
> > > > Signed-off-by: Akhil P Oommen 
> > > > ---
> > > >
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 19 +++
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  6 ++
> > > >   drivers/gpu/drm/msm/adreno/adreno_device.c | 14 ++
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 +
> > > >   4 files changed, 36 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > index 0e3dfd4c2bc8..168a4bddfaf2 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > @@ -830,8 +830,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > > > unsigned int state)
> > > >  */
> > > > gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
> > > > +   if (adreno_is_x185(adreno_gpu)) {
> > > > +   chipid = 0x7050001;
> > >
> > > What's wrong with using the logic below?
> >
> > patchid is BITS(7, 0), not (15, 8) in the case of x185. Due to the
> > changes in the chipid scheme within the a7x family, this is a bit
> > confusing. I will try to improve here in another series.
> 
> I'm thinking we should just add gmu_chipid to struct a6xx_info, tbh
> 
> Maybe to start with, we can fall back to the existing logic if
> a6xx_info::gmu_chipid is zero so we don't have to add it for _every_
> a6xx/a7xx

Agree, I was thinking the same.

-Akhil.
> 
> BR,
> -R
> 
> > >
> > > > /* NOTE: A730 may also fall in this if-condition with a future GMU 
> > > > fw update. */
> > > > -   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > > > +   } else if (adreno_is_a7xx(adreno_gpu) && 
> > > > !adreno_is_a730(adreno_gpu)) {
> > > > /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 
> > > > */
> > > > chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
> > > > @@ -1329,9 +1331,18 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct 
> > > > device *dev, u32 *votes,
> > > > if (!pri_count)
> > > > return -EINVAL;
> > > > -   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > > > -   if (IS_ERR(sec))
> > > > -   return PTR_ERR(sec);
> > > > +   /*
> > > > +* Some targets have a separate gfx mxc rail. So try to read that 
> > > > first and then fall back
> > > > +* to regular mx rail if it is missing
> > > > +*/
> > > > +   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
> > > > +   if (PTR_ERR_OR_ZERO(sec) == -EPROBE_DEFER) {
> > > > +   return -EPROBE_DEFER;
> > > > +   } else if (IS_ERR(sec)) {
> > > > +   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > > > +   if (IS_ERR(sec))
> > > > +   return PTR_ERR(sec);
> > > > +   }
> > >
> > > I assume GMXC would always be used if present, although please use the
> > > approach Dmitry suggested
> >
> > Correct.
> >
> > -Akhil
> > >
> > >
> > > The rest looks good!
> > >
> > > Konrad


Re: [PATCH v1 2/3] drm/msm/adreno: Add support for X185 GPU

2024-06-26 Thread Akhil P Oommen
On Mon, Jun 24, 2024 at 07:28:06AM -0700, Rob Clark wrote:
> On Mon, Jun 24, 2024 at 7:25 AM Rob Clark  wrote:
> >
> > On Sun, Jun 23, 2024 at 4:08 AM Akhil P Oommen  
> > wrote:
> > >
> > > Add support in drm/msm driver for the Adreno X185 gpu found in
> > > Snapdragon X1 Elite chipset.
> > >
> > > Signed-off-by: Akhil P Oommen 
> > > ---
> > >
> > >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 19 +++
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  6 ++
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 14 ++
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 +
> > >  4 files changed, 36 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 0e3dfd4c2bc8..168a4bddfaf2 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -830,8 +830,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > > unsigned int state)
> > >  */
> > > gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
> > >
> > > +   if (adreno_is_x185(adreno_gpu)) {
> > > +   chipid = 0x7050001;
> > > /* NOTE: A730 may also fall in this if-condition with a future 
> > > GMU fw update. */
> > > -   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > > +   } else if (adreno_is_a7xx(adreno_gpu) && 
> > > !adreno_is_a730(adreno_gpu)) {
> > > /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 
> > > 7 */
> > > chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
> > >
> > > @@ -1329,9 +1331,18 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct 
> > > device *dev, u32 *votes,
> > > if (!pri_count)
> > > return -EINVAL;
> > >
> > > -   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > > -   if (IS_ERR(sec))
> > > -   return PTR_ERR(sec);
> > > +   /*
> > > +* Some targets have a separate gfx mxc rail. So try to read that 
> > > first and then fall back
> > > +* to regular mx rail if it is missing
> > > +*/
> > > +   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
> > > +   if (PTR_ERR_OR_ZERO(sec) == -EPROBE_DEFER) {
> > > +   return -EPROBE_DEFER;
> > > +   } else if (IS_ERR(sec)) {
> > > +   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > > +   if (IS_ERR(sec))
> > > +   return PTR_ERR(sec);
> > > +   }
> > >
> > > sec_count >>= 1;
> > > if (!sec_count)
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index 973872ad0474..97837f7f2a40 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -1319,9 +1319,7 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
> > > count = ARRAY_SIZE(a660_protect);
> > > count_max = 48;
> > > BUILD_BUG_ON(ARRAY_SIZE(a660_protect) > 48);
> > > -   } else if (adreno_is_a730(adreno_gpu) ||
> > > -  adreno_is_a740(adreno_gpu) ||
> > > -  adreno_is_a750(adreno_gpu)) {
> > > +   } else if (adreno_is_a7xx(adreno_gpu)) {
> > > regs = a730_protect;
> > > count = ARRAY_SIZE(a730_protect);
> > > count_max = 48;
> > > @@ -1891,7 +1889,7 @@ static int hw_init(struct msm_gpu *gpu)
> > > gpu_write(gpu, REG_A6XX_UCHE_CLIENT_PF, BIT(7) | 0x1);
> > >
> > > /* Set weights for bicubic filtering */
> > > -   if (adreno_is_a650_family(adreno_gpu)) {
> > > +   if (adreno_is_a650_family(adreno_gpu) || 
> > > adreno_is_x185(adreno_gpu)) {
> > > gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_0, 0);
> > > gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_1,
> > > 0x3fe05ff4);
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index c3703

Re: [PATCH v1 2/3] drm/msm/adreno: Add support for X185 GPU

2024-06-26 Thread Akhil P Oommen
On Mon, Jun 24, 2024 at 03:53:48PM +0200, Konrad Dybcio wrote:
> 
> 
> On 6/23/24 13:06, Akhil P Oommen wrote:
> > Add support in drm/msm driver for the Adreno X185 gpu found in
> > Snapdragon X1 Elite chipset.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> > 
> >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 19 +++
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  6 ++
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 14 ++
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 +
> >   4 files changed, 36 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index 0e3dfd4c2bc8..168a4bddfaf2 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -830,8 +830,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > unsigned int state)
> >  */
> > gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
> > +   if (adreno_is_x185(adreno_gpu)) {
> > +   chipid = 0x7050001;
> 
> What's wrong with using the logic below?

patchid is BITS(7, 0), not (15, 8) in the case of x185. Due to the
changes in the chipid scheme within the a7x family, this is a bit
confusing. I will try to improve here in another series.

> 
> > /* NOTE: A730 may also fall in this if-condition with a future GMU fw 
> > update. */
> > -   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > +   } else if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
> > chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
> > @@ -1329,9 +1331,18 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct 
> > device *dev, u32 *votes,
> > if (!pri_count)
> > return -EINVAL;
> > -   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > -   if (IS_ERR(sec))
> > -   return PTR_ERR(sec);
> > +   /*
> > +* Some targets have a separate gfx mxc rail. So try to read that first 
> > and then fall back
> > +* to regular mx rail if it is missing
> > +*/
> > +   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
> > +   if (PTR_ERR_OR_ZERO(sec) == -EPROBE_DEFER) {
> > +   return -EPROBE_DEFER;
> > +   } else if (IS_ERR(sec)) {
> > +   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > +   if (IS_ERR(sec))
> > +   return PTR_ERR(sec);
> > +   }
> 
> I assume GMXC would always be used if present, although please use the
> approach Dmitry suggested

Correct.

-Akhil
> 
> 
> The rest looks good!
> 
> Konrad


Re: [PATCH v1 2/3] drm/msm/adreno: Add support for X185 GPU

2024-06-26 Thread Akhil P Oommen
On Mon, Jun 24, 2024 at 12:21:30AM +0300, Dmitry Baryshkov wrote:
> On Sun, Jun 23, 2024 at 04:36:29PM GMT, Akhil P Oommen wrote:
> > Add support in drm/msm driver for the Adreno X185 gpu found in
> > Snapdragon X1 Elite chipset.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> > 
> >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 19 +++
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  6 ++
> >  drivers/gpu/drm/msm/adreno/adreno_device.c | 14 ++
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 +
> >  4 files changed, 36 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index 0e3dfd4c2bc8..168a4bddfaf2 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -830,8 +830,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > unsigned int state)
> >  */
> > gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
> >  
> > +   if (adreno_is_x185(adreno_gpu)) {
> > +   chipid = 0x7050001;
> > /* NOTE: A730 may also fall in this if-condition with a future GMU fw 
> > update. */
> > -   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > +   } else if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
> > /* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
> > chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
> >  
> > @@ -1329,9 +1331,18 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct 
> > device *dev, u32 *votes,
> > if (!pri_count)
> > return -EINVAL;
> >  
> > -   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > -   if (IS_ERR(sec))
> > -   return PTR_ERR(sec);
> > +   /*
> > +* Some targets have a separate gfx mxc rail. So try to read that first 
> > and then fall back
> > +* to regular mx rail if it is missing
> 
> Can we use compatibles / flags to detect this?

I prefer the current approach so that we don't need to keep adding
checks here for future targets. If gmxc is preferred, we have to use that
in all targets.

> 
> > +*/
> > +   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
> > +   if (PTR_ERR_OR_ZERO(sec) == -EPROBE_DEFER) {
> > +   return -EPROBE_DEFER;
> > +   } else if (IS_ERR(sec)) {
> > +   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
> > +   if (IS_ERR(sec))
> > +   return PTR_ERR(sec);
> > +   }
> 
> The following code might be slightly more idiomatic:
> 
>   sec = cmd_db_read_aux_data("gmxc.lvl", &sec_count);
>   if (IS_ERR(sec) && sec != ERR_PTR(-EPROBE_DEFER))
>   sec = cmd_db_read_aux_data("mx.lvl", &sec_count);
>   if (IS_ERR(sec))
>   return PTR_ERR(sec);
>
Ack. This is neater!

> 
> >  
> > sec_count >>= 1;
> > if (!sec_count)
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index 973872ad0474..97837f7f2a40 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -1319,9 +1319,7 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
> > count = ARRAY_SIZE(a660_protect);
> > count_max = 48;
> > BUILD_BUG_ON(ARRAY_SIZE(a660_protect) > 48);
> > -   } else if (adreno_is_a730(adreno_gpu) ||
> > -  adreno_is_a740(adreno_gpu) ||
> > -  adreno_is_a750(adreno_gpu)) {
> > +   } else if (adreno_is_a7xx(adreno_gpu)) {
> > regs = a730_protect;
> > count = ARRAY_SIZE(a730_protect);
> > count_max = 48;
> > @@ -1891,7 +1889,7 @@ static int hw_init(struct msm_gpu *gpu)
> > gpu_write(gpu, REG_A6XX_UCHE_CLIENT_PF, BIT(7) | 0x1);
> >  
> > /* Set weights for bicubic filtering */
> > -   if (adreno_is_a650_family(adreno_gpu)) {
> > +   if (adreno_is_a650_family(adreno_gpu) || adreno_is_x185(adreno_gpu)) {
> > gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_0, 0);
> > gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_1,
> > 0x3fe05ff4);
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index c3703a51287b..139c7d828749 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/d

Re: [PATCH v1 1/3] dt-bindings: display/msm/gmu: Add Adreno X185 GMU

2024-06-26 Thread Akhil P Oommen
On Sun, Jun 23, 2024 at 02:40:14PM +0200, Krzysztof Kozlowski wrote:
> On 23/06/2024 13:06, Akhil P Oommen wrote:
> > Document Adreno X185 GMU in the dt-binding specification.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> > 
> >  Documentation/devicetree/bindings/display/msm/gmu.yaml | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
> > b/Documentation/devicetree/bindings/display/msm/gmu.yaml
> > index b3837368a260..9aa7151fd66f 100644
> > --- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
> > +++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
> > @@ -23,6 +23,9 @@ properties:
> >- items:
> >- pattern: '^qcom,adreno-gmu-[67][0-9][0-9]\.[0-9]$'
> >- const: qcom,adreno-gmu
> > +  - items:
> > +  - pattern: '^qcom,adreno-gmu-[x][1-9][0-9][0-9]\.[0-9]$'
> 
> '[x]' is odd. Should be just 'x'.

Ack

-Akhil
> 
> 
> Best regards,
> Krzysztof
> 


Re: [PATCH v2 0/2] Clean up barriers

2024-06-25 Thread Akhil P Oommen
On Tue, Jun 25, 2024 at 08:54:40PM +0200, Konrad Dybcio wrote:
> Changes in v3:
> - Drop the wrapper functions
> - Drop the readback in GMU code
> - Split the commit in two
> 
> Link to v2: 
> https://lore.kernel.org/linux-arm-msm/20240509-topic-adreno-v2-1-b82a9f99b...@linaro.org/
> 
> Changes in v2:
> - Introduce gpu_write_flush() and use it
> - Don't accidentally break a630 by trying to write to non-existent GBIF
> 
> Link to v1: 
> https://lore.kernel.org/linux-arm-msm/20240508-topic-adreno-v1-1-1babd05c1...@linaro.org/
> 
> Signed-off-by: Konrad Dybcio 
> ---
> Konrad Dybcio (2):
>   drm/msm/adreno: De-spaghettify the use of memory barriers
>   Revert "drm/msm/a6xx: Poll for GBIF unhalt status in hw_init"
> 
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  4 +---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 ++
>  2 files changed, 7 insertions(+), 11 deletions(-)
> ---
> base-commit: 0fc4bfab2cd45f9acb86c4f04b5191e114e901ed
> change-id: 20240625-adreno_barriers-29f356742418

for the whole series:
Reviewed-by: Akhil P Oommen 

-Akhil

> 
> Best regards,
> -- 
> Konrad Dybcio 
> 


Re: [PATCH] drm/msm/a6xx: request memory region

2024-06-25 Thread Akhil P Oommen
On Tue, Jun 25, 2024 at 11:03:42AM -0700, Rob Clark wrote:
> On Tue, Jun 25, 2024 at 10:59 AM Akhil P Oommen  wrote:
> >
> > On Fri, Jun 21, 2024 at 02:09:58PM -0700, Rob Clark wrote:
> > > On Sat, Jun 8, 2024 at 8:44 AM Kiarash Hajian
> > >  wrote:
> > > >
> > > > The driver's memory regions are currently just ioremap()ed, but not
> > > > reserved through a request. That's not a bug, but having the request is
> > > > a little more robust.
> > > >
> > > > Implement the region-request through the corresponding managed
> > > > devres-function.
> > > >
> > > > Signed-off-by: Kiarash Hajian 
> > > > ---
> > > > Changes in v6:
> > > > -Fix compile error
> > > > -Link to v5: 
> > > > https://lore.kernel.org/all/20240607-memory-v1-1-8664f52fc...@gmail.com
> > > >
> > > > Changes in v5:
> > > > - Fix error handling problems.
> > > > - Link to v4: 
> > > > https://lore.kernel.org/r/20240512-msm-adreno-memory-region-v4-1-3881a6408...@gmail.com
> > > >
> > > > Changes in v4:
> > > > - Combine v3 commits into a single commit
> > > > - Link to v3: 
> > > > https://lore.kernel.org/r/20240512-msm-adreno-memory-region-v3-0-0a728ad45...@gmail.com
> > > >
> > > > Changes in v3:
> > > > - Remove redundant devm_iounmap calls, relying on devres for 
> > > > automatic resource cleanup.
> > > >
> > > > Changes in v2:
> > > > - update the subject prefix to "drm/msm/a6xx:", to match the 
> > > > majority of other changes to this file.
> > > > ---
> > > >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 33 
> > > > +++--
> > > >  1 file changed, 11 insertions(+), 22 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > index 8bea8ef26f77..d26cc6254ef9 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > @@ -525,7 +525,7 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
> > > > bool pdc_in_aop = false;
> > > >
> > > > if (IS_ERR(pdcptr))
> > > > -   goto err;
> > > > +   return;
> > > >
> > > > if (adreno_is_a650(adreno_gpu) ||
> > > > adreno_is_a660_family(adreno_gpu) ||
> > > > @@ -541,7 +541,7 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
> > > > if (!pdc_in_aop) {
> > > > seqptr = a6xx_gmu_get_mmio(pdev, "gmu_pdc_seq");
> > > > if (IS_ERR(seqptr))
> > > > -   goto err;
> > > > +   return;
> > > > }
> > > >
> > > > /* Disable SDE clock gating */
> > > > @@ -633,12 +633,6 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu 
> > > > *gmu)
> > > > wmb();
> > > >
> > > > a6xx_rpmh_stop(gmu);
> > > > -
> > > > -err:
> > > > -   if (!IS_ERR_OR_NULL(pdcptr))
> > > > -   iounmap(pdcptr);
> > > > -   if (!IS_ERR_OR_NULL(seqptr))
> > > > -   iounmap(seqptr);
> > > >  }
> > > >
> > > >  /*
> > > > @@ -1503,7 +1497,7 @@ static void __iomem *a6xx_gmu_get_mmio(struct 
> > > > platform_device *pdev,
> > > > return ERR_PTR(-EINVAL);
> > > > }
> > > >
> > > > -   ret = ioremap(res->start, resource_size(res));
> > > > +   ret = devm_ioremap_resource(>dev, res);
> > >
> > > So, this doesn't actually work, failing in __request_region_locked(),
> > > because the gmu region partially overlaps with the gpucc region (which
> > > is busy).  I think this is intentional, since gmu is controlling the
> > > gpu clocks, etc.  In particular REG_A6XX_GPU_CC_GX_GDSCR is in this
> > > overlapping region.  Maybe Akhil knows more about GMU.
> >
> > We don't really need to map gpucc region from driver on behalf of gmu.
> > Since we don't access any gpucc register from drm-msm driver, we can
> > update the range size to correct

Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-25 Thread Akhil P Oommen
On Tue, Jun 18, 2024 at 10:08:23PM +0530, Akhil P Oommen wrote:
> On Tue, Jun 04, 2024 at 07:35:04PM +0200, Konrad Dybcio wrote:
> > 
> > 
> > On 5/14/24 20:38, Akhil P Oommen wrote:
> > > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> > > > Memory barriers help ensure instruction ordering, NOT time and order
> > > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > > On architectures employing weak memory ordering, the latter can be a
> > > > giant pain point, and it has been as part of this driver.
> > > > 
> > > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > > readl/writel, which include r/w (respectively) barriers.
> > > > 
> > > > Replace the barriers with a readback that ensures the previous writes
> > > > have exited the write buffer (as the CPU must flush the write to the
> > > > register it's trying to read back) and subsequently remove the hack
> > > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> > > > status in hw_init").
> > > > 
> > > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in 
> > > > hw_init")
> > > > Signed-off-by: Konrad Dybcio 
> > > > ---
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
> > > >   2 files changed, 6 insertions(+), 13 deletions(-)
> > > 
> > > I prefer this version compared to the v2. A helper routine is
> > > unnecessary here because:
> > > 1. there are very few scenarios where we have to read back the same
> > > register.
> > > 2. we may accidently readback a write only register.
> > 
> > Which would still trigger an address dependency on the CPU, no?
> 
> Yes, but it is not a good idea to read a write-only register. We can't be
> sure about its effect on the endpoint.
> 
> > 
> > > 
> > > > 
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > index 0e3dfd4c2bc8..4135a53b55a7 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > > > int ret;
> > > > u32 val;
> > > > -   gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > > > -   /* Wait for the register to finish posting */
> > > > -   wmb();
> > > > +   gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> > > > +   gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);
> > > 
> > > This is unnecessary because we are polling on a register on the same port 
> > > below. But I think we
> > > can replace "wmb()" above with "mb()" to avoid reordering between read
> > > and write IO instructions.
> > 
> > Ok on the dropping readback part
> > 
> > + AFAIU from Will's response, we can drop the barrier as well

Yes, let's drop the barrier.

> 
> Lets wait a bit on Will's response on compiler reordering.
> 
> > 
> > > 
> > > > ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
> > > > val & (1 << 1), 100, 1);
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > index 973872ad0474..0acbc38b8e70 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu)
> > > > }
> > > > /* Clear GBIF halt in case GX domain was not collapsed */
> > > > +   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > > 
> > > We need a full barrier here to avoid reordering. Also, lets add a
> > > comment about why we are doing this odd looking sequence.

Please ignore this.

> > > 
> > > > +   gpu_read(gpu, REG_A6XX_GBIF_HALT);
> > > > if (adreno_is_a619_holi(adreno_gpu)) {
> > > > -   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > > > gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> > > > -   /* Let's make extra sure that the GPU can a

Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-25 Thread Akhil P Oommen
On Thu, Jun 20, 2024 at 02:04:01PM +0100, Will Deacon wrote:
> On Tue, Jun 18, 2024 at 09:41:58PM +0530, Akhil P Oommen wrote:
> > On Tue, Jun 04, 2024 at 03:40:56PM +0100, Will Deacon wrote:
> > > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> > > > On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> > > > > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> > > > > > If I understand correctly, you don't need any memory barrier.
> > > > > > writel()/readl()'s are ordered to the same endpoint. That goes for 
> > > > > > all
> > > > > > the reordering/barrier comments mentioned below too.
> > > > > > 
> > > > > > device-io.rst:
> > > > > > 
> > > > > > The read and write functions are defined to be ordered. That is 
> > > > > > the
> > > > > > compiler is not permitted to reorder the I/O sequence. When the 
> > > > > > ordering
> > > > > > can be compiler optimised, you can use __readb() and friends to
> > > > > > indicate the relaxed ordering. Use this with care.
> > > > > > 
> > > > > > memory-barriers.txt:
> > > > > > 
> > > > > >  (*) readX(), writeX():
> > > > > > 
> > > > > > The readX() and writeX() MMIO accessors take a pointer to 
> > > > > > the
> > > > > > peripheral being accessed as an __iomem * parameter. For 
> > > > > > pointers
> > > > > > mapped with the default I/O attributes (e.g. those returned 
> > > > > > by
> > > > > > ioremap()), the ordering guarantees are as follows:
> > > > > > 
> > > > > > 1. All readX() and writeX() accesses to the same peripheral 
> > > > > > are ordered
> > > > > >with respect to each other. This ensures that MMIO 
> > > > > > register accesses
> > > > > >by the same CPU thread to a particular device will 
> > > > > > arrive in program
> > > > > >order.
> > > > > > 
> > > > > 
> > > > > In arm64, a writel followed by readl translates to roughly the 
> > > > > following
> > > > > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am 
> > > > > not
> > > > > sure what is stopping compiler from reordering  __raw_writel() and 
> > > > > __raw_readl()
> > > > > above? I am assuming iomem cookie is ignored during compilation.
> > > > 
> > > > It seems to me that is due to some usage of volatile there in
> > > > __raw_writel() etc, but to be honest after reading about volatile and
> > > > some threads from gcc mailing lists, I don't have a confident answer :)
> > > > 
> > > > > 
> > > > > Added Will to this thread if he can throw some light on this.
> > > > 
> > > > Hopefully Will can school us.
> > > 
> > > The ordering in this case is ensured by the memory attributes used for
> > > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> > > (as it the case for ioremap()), the "nR" part means "no reordering", so
> > > readX() and writeX() to that region are ordered wrt each other.
> > 
> > But that avoids only HW reordering, doesn't it? What about *compiler 
> > reordering* in the
> > case of a writel following by a readl which translates to:
> > 1: dmb_wmb()
> > 2: __raw_writel() -> roughly "asm volatile('str')
> > 3: __raw_readl() -> roughly "asm volatile('ldr')
> > 4: dmb_rmb()
> > 
> > Is the 'volatile' keyword sufficient to avoid reordering between (2) and 
> > (3)? Or
> > do we need a "memory" clobber to inhibit reordering?
> > 
> > This is still not clear to me even after going through some compiler 
> > documentation.
> 
> I don't think the compiler should reorder volatile asm blocks wrt each
> other.
> 

Thanks Will for confirmation.

-Akhil.

> Will


Re: [PATCH] drm/msm/a6xx: request memory region

2024-06-25 Thread Akhil P Oommen
On Fri, Jun 21, 2024 at 02:09:58PM -0700, Rob Clark wrote:
> On Sat, Jun 8, 2024 at 8:44 AM Kiarash Hajian
>  wrote:
> >
> > The driver's memory regions are currently just ioremap()ed, but not
> > reserved through a request. That's not a bug, but having the request is
> > a little more robust.
> >
> > Implement the region-request through the corresponding managed
> > devres-function.
> >
> > Signed-off-by: Kiarash Hajian 
> > ---
> > Changes in v6:
> > -Fix compile error
> > -Link to v5: 
> > https://lore.kernel.org/all/20240607-memory-v1-1-8664f52fc...@gmail.com
> >
> > Changes in v5:
> > - Fix error handling problems.
> > - Link to v4: 
> > https://lore.kernel.org/r/20240512-msm-adreno-memory-region-v4-1-3881a6408...@gmail.com
> >
> > Changes in v4:
> > - Combine v3 commits into a single commit
> > - Link to v3: 
> > https://lore.kernel.org/r/20240512-msm-adreno-memory-region-v3-0-0a728ad45...@gmail.com
> >
> > Changes in v3:
> > - Remove redundant devm_iounmap calls, relying on devres for automatic 
> > resource cleanup.
> >
> > Changes in v2:
> > - update the subject prefix to "drm/msm/a6xx:", to match the majority 
> > of other changes to this file.
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 33 
> > +++--
> >  1 file changed, 11 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index 8bea8ef26f77..d26cc6254ef9 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -525,7 +525,7 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
> > bool pdc_in_aop = false;
> >
> > if (IS_ERR(pdcptr))
> > -   goto err;
> > +   return;
> >
> > if (adreno_is_a650(adreno_gpu) ||
> > adreno_is_a660_family(adreno_gpu) ||
> > @@ -541,7 +541,7 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
> > if (!pdc_in_aop) {
> > seqptr = a6xx_gmu_get_mmio(pdev, "gmu_pdc_seq");
> > if (IS_ERR(seqptr))
> > -   goto err;
> > +   return;
> > }
> >
> > /* Disable SDE clock gating */
> > @@ -633,12 +633,6 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
> > wmb();
> >
> > a6xx_rpmh_stop(gmu);
> > -
> > -err:
> > -   if (!IS_ERR_OR_NULL(pdcptr))
> > -   iounmap(pdcptr);
> > -   if (!IS_ERR_OR_NULL(seqptr))
> > -   iounmap(seqptr);
> >  }
> >
> >  /*
> > @@ -1503,7 +1497,7 @@ static void __iomem *a6xx_gmu_get_mmio(struct 
> > platform_device *pdev,
> > return ERR_PTR(-EINVAL);
> > }
> >
> > -   ret = ioremap(res->start, resource_size(res));
> > +   ret = devm_ioremap_resource(>dev, res);
> 
> So, this doesn't actually work, failing in __request_region_locked(),
> because the gmu region partially overlaps with the gpucc region (which
> is busy).  I think this is intentional, since gmu is controlling the
> gpu clocks, etc.  In particular REG_A6XX_GPU_CC_GX_GDSCR is in this
> overlapping region.  Maybe Akhil knows more about GMU.

We don't really need to map gpucc region from driver on behalf of gmu.
Since we don't access any gpucc register from drm-msm driver, we can
update the range size to correct this. But due to backward compatibility
requirement with older dt, can we still enable region locking? I prefer
it if that is possible.

FYI, kgsl accesses gpucc registers to ensure gdsc has collapsed. So
gpucc region has to be mapped by kgsl and that is reflected in the kgsl
device tree.

-Akhil

> 
> BR,
> -R
> 
> > if (!ret) {
> > DRM_DEV_ERROR(>dev, "Unable to map the %s 
> > registers\n", name);
> > return ERR_PTR(-EINVAL);
> > @@ -1613,13 +1607,13 @@ int a6xx_gmu_wrapper_init(struct a6xx_gpu 
> > *a6xx_gpu, struct device_node *node)
> > gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> > if (IS_ERR(gmu->mmio)) {
> > ret = PTR_ERR(gmu->mmio);
> > -   goto err_mmio;
> > +   goto err_cleanup;
> > }
> >
> > gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> > if (IS_ERR(gmu->cxpd)) {
> > ret = PTR_ERR(gmu->cxpd);
> > -   goto err_mmio;
> > +   goto err_cleanup;
> > }
> >
> > if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> > @@ -1635,7 +1629,7 @@ int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, 
> > struct device_node *node)
> > gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> > if (IS_ERR(gmu->gxpd)) {
> > ret = PTR_ERR(gmu->gxpd);
> > -   goto err_mmio;
> > +   goto err_cleanup;
> > }
> >
> > gmu->initialized = true;
> > @@ -1645,9 +1639,7 @@ int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, 
> > struct 

Re: [PATCH v1 0/3] Support for Adreno X1-85 GPU

2024-06-24 Thread Akhil P Oommen
On Sun, Jun 23, 2024 at 01:11:48PM +0200, Krzysztof Kozlowski wrote:
> On 23/06/2024 13:06, Akhil P Oommen wrote:
> > This series adds support for the Adreno X1-85 GPU found in Qualcomm's
> > compute series chipset, Snapdragon X1 Elite (x1e80100). In this new
> > naming scheme for Adreno GPU, 'X' stands for compute series, '1' denotes
> > 1st generation and '8' & '5' denotes the tier and the SKU which it
> > belongs.
> > 
> > X1-85 has major focus on doubling core clock frequency and bandwidth
> > throughput. It has a dedicated collapsible Graphics MX rail (gmxc) to
> > power the memories and double the number of data channels to improve
> > bandwidth to DDR.
> > 
> > Mesa has the necessary bits present already to support this GPU. We are
> > able to bring up Gnome desktop by hardcoding "0x43050a01" as
> > chipid. Also, verified glxgears and glmark2. We have plans to add the
> > new chipid support to Mesa in next few weeks, but these patches can go in
> > right away to get included in v6.11.
> > 
> > This series is rebased on top of v6.10-rc4. P3 cherry-picks cleanly on
> > qcom/for-next.
> > 
> > P1 & P2 for Rob, P3 for Bjorn to pick up.
> 
> Which Rob?

Sorry for the confusion! I meant Rob Clark whom I had added in the "To:"
list.

-Akhil

> 
> Why bindings cannot go as usual way - via the subsystem?
> 
> Best regards,
> Krzysztof
> 
> 


Re: [PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-23 Thread Akhil P Oommen
On Sun, Jun 23, 2024 at 03:40:06PM -0500, Bjorn Andersson wrote:
> On Sun, Jun 23, 2024 at 08:46:30PM GMT, Akhil P Oommen wrote:
> > On Sun, Jun 23, 2024 at 02:53:17PM +0200, Krzysztof Kozlowski wrote:
> > > On 23/06/2024 14:28, Akhil P Oommen wrote:
> > > > On Sun, Jun 23, 2024 at 01:17:16PM +0200, Krzysztof Kozlowski wrote:
> > > >> On 23/06/2024 13:06, Akhil P Oommen wrote:
> > > >>> Add the necessary dt nodes for gpu support in X1E80100.
> > > >>>
> > > >>> Signed-off-by: Akhil P Oommen 
> > > >>> ---
> > > >>> + gmu: gmu@3d6a000 {
> > > >>> + compatible = "qcom,adreno-gmu-x185.1", 
> > > >>> "qcom,adreno-gmu";
> > > >>> + reg = <0x0 0x03d5 0x0 0x1>,
> > > >>> +   <0x0 0x03d6a000 0x0 0x35000>,
> > > >>> +   <0x0 0x0b28 0x0 0x1>;
> > > >>> + reg-names =  "rscc", "gmu", "gmu_pdc";
> > > >>
> > > >> Really, please start testing your patches. Your internal instructions
> > > >> tells you to do that, so please follow it carefully. Don't use the
> > > >> community as the tool, because you do not want to run checks and
> > > >> investigate results.
> > > > 
> > > > This was obviously tested before (and retested now) and everything 
> > > > works. I am
> > > > confused about what you meant. Could you please elaborate a bit? The 
> > > > device
> > > > and the compilation/test setup is new for me, so I am wondering if I
> > > > made any silly mistake!
> > > 
> > > Eh, your DTS is not correct, but this could not be pointed out by tests,
> > > because the binding does not work. :(
> > 
> > I reordered both "reg" and "reg-names" arrays based on the address.
> 
> The @3d6a000 should match the first reg entry.
> 
> > Not sure if
> > that is what we are talking about here. Gpu driver uses 
> > platform_get_resource_byname()
> > to query mmio resources.
> > 
> > I will retest dt-bindings and dts checks after picking the patches you
> > just posted and report back. Is the schema supposed to enforce strict
> > order?
> 
> In your second hunk in patch 1, you are defining the order of reg,
> reg-names, clocks, and clock-names. This creates an ABI between DTB and
> implementation where ordering is significant - regardless of Linux using
> platform_get_resource_byname().

I will revert this to the original order. Thanks for the clarification, 
Bjorn/Krzysztof.

-Akhil.
> 
> Regards,
> Bjorn
> 
> > 
> > -Akhil.
> > > 
> > > I'll fix up the binding and then please test on top of my patch (see
> > > your internal guideline about necessary tests before sending any binding
> > > or DTS patch).
> > > 
> > > Best regards,
> > > Krzysztof
> > > 


Re: [PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-23 Thread Akhil P Oommen
On Sun, Jun 23, 2024 at 02:53:17PM +0200, Krzysztof Kozlowski wrote:
> On 23/06/2024 14:28, Akhil P Oommen wrote:
> > On Sun, Jun 23, 2024 at 01:17:16PM +0200, Krzysztof Kozlowski wrote:
> >> On 23/06/2024 13:06, Akhil P Oommen wrote:
> >>> Add the necessary dt nodes for gpu support in X1E80100.
> >>>
> >>> Signed-off-by: Akhil P Oommen 
> >>> ---
> >>> + gmu: gmu@3d6a000 {
> >>> + compatible = "qcom,adreno-gmu-x185.1", 
> >>> "qcom,adreno-gmu";
> >>> + reg = <0x0 0x03d5 0x0 0x1>,
> >>> +   <0x0 0x03d6a000 0x0 0x35000>,
> >>> +   <0x0 0x0b28 0x0 0x1>;
> >>> + reg-names =  "rscc", "gmu", "gmu_pdc";
> >>
> >> Really, please start testing your patches. Your internal instructions
> >> tells you to do that, so please follow it carefully. Don't use the
> >> community as the tool, because you do not want to run checks and
> >> investigate results.
> > 
> > This was obviously tested before (and retested now) and everything works. I 
> > am
> > confused about what you meant. Could you please elaborate a bit? The device
> > and the compilation/test setup is new for me, so I am wondering if I
> > made any silly mistake!
> 
> Eh, your DTS is not correct, but this could not be pointed out by tests,
> because the binding does not work. :(

I reordered both "reg" and "reg-names" arrays based on the address. Not sure if
that is what we are talking about here. Gpu driver uses 
platform_get_resource_byname()
to query mmio resources.

I will retest dt-bindings and dts checks after picking the patches you
just posted and report back. Is the schema supposed to enforce strict
order?

-Akhil.
> 
> I'll fix up the binding and then please test on top of my patch (see
> your internal guideline about necessary tests before sending any binding
> or DTS patch).
> 
> Best regards,
> Krzysztof
> 


Re: [PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-23 Thread Akhil P Oommen
On Sun, Jun 23, 2024 at 01:17:16PM +0200, Krzysztof Kozlowski wrote:
> On 23/06/2024 13:06, Akhil P Oommen wrote:
> > Add the necessary dt nodes for gpu support in X1E80100.
> > 
> > Signed-off-by: Akhil P Oommen 
> > ---
> > +   gmu: gmu@3d6a000 {
> > +   compatible = "qcom,adreno-gmu-x185.1", 
> > "qcom,adreno-gmu";
> > +   reg = <0x0 0x03d5 0x0 0x1>,
> > + <0x0 0x03d6a000 0x0 0x35000>,
> > + <0x0 0x0b28 0x0 0x1>;
> > +   reg-names =  "rscc", "gmu", "gmu_pdc";
> 
> Really, please start testing your patches. Your internal instructions
> tells you to do that, so please follow it carefully. Don't use the
> community as the tool, because you do not want to run checks and
> investigate results.

This was obviously tested before (and retested now) and everything works. I am
confused about what you meant. Could you please elaborate a bit? The device
and the compilation/test setup is new for me, so I am wondering if I
made any silly mistake!

-Akhil.

> 
> NAK.
> 
> Best regards,
> Krzysztof
> 


[PATCH v1 3/3] arm64: dts: qcom: x1e80100: Add gpu support

2024-06-23 Thread Akhil P Oommen
Add the necessary dt nodes for gpu support in X1E80100.

Signed-off-by: Akhil P Oommen 
---

 arch/arm64/boot/dts/qcom/x1e80100.dtsi | 195 +
 1 file changed, 195 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi 
b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
index 5f90a0b3c016..3e887286bab4 100644
--- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
+++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2985,6 +2986,200 @@ tcsr: clock-controller@1fc {
#reset-cells = <1>;
};
 
+   gpu: gpu@3d0 {
+   compatible = "qcom,adreno-43050c01", "qcom,adreno";
+   reg = <0x0 0x03d0 0x0 0x4>,
+ <0x0 0x03d61000 0x0 0x800>,
+ <0x0 0x03d9e000 0x0 0x1000>;
+
+   reg-names = "kgsl_3d0_reg_memory",
+   "cx_dbgc",
+   "cx_mem";
+
+   interrupts = ;
+
+   iommus = <_smmu 0 0x0>,
+<_smmu 1 0x0>;
+
+   operating-points-v2 = <_opp_table>;
+
+   qcom,gmu = <>;
+   #cooling-cells = <2>;
+
+   interconnects = <_noc MASTER_GFX3D 0 _virt 
SLAVE_EBI1 0>;
+   interconnect-names = "gfx-mem";
+
+   zap-shader {
+   memory-region = <_microcode_mem>;
+   firmware-name = "qcom/gen70500_zap.mbn";
+   };
+
+   gpu_opp_table: opp-table {
+   compatible = "operating-points-v2";
+
+   opp-11 {
+   opp-hz = /bits/ 64 <11>;
+   opp-level = 
;
+   opp-peak-kBps = <1650>;
+   };
+
+   opp-10 {
+   opp-hz = /bits/ 64 <10>;
+   opp-level = 
;
+   opp-peak-kBps = <14398438>;
+   };
+
+   opp-92500 {
+   opp-hz = /bits/ 64 <92500>;
+   opp-level = 
;
+   opp-peak-kBps = <14398438>;
+   };
+
+   opp-8 {
+   opp-hz = /bits/ 64 <8>;
+   opp-level = ;
+   opp-peak-kBps = <12449219>;
+   };
+
+   opp-74400 {
+   opp-hz = /bits/ 64 <74400>;
+   opp-level = 
;
+   opp-peak-kBps = <10687500>;
+   };
+
+   opp-68700 {
+   opp-hz = /bits/ 64 <68700>;
+   opp-level = 
;
+   opp-peak-kBps = <8171875>;
+   };
+
+   opp-55000 {
+   opp-hz = /bits/ 64 <55000>;
+   opp-level = ;
+   opp-peak-kBps = <6074219>;
+   };
+
+   opp-39000 {
+   opp-hz = /bits/ 64 <39000>;
+   opp-level = 
;
+   opp-peak-kBps = <300>;
+   };
+
+   opp-3 {
+   opp-hz = /bits/ 64 <3>;
+   opp-level = 
;
+   opp-peak-kBps = <2136719>;
+   };
+   };
+   };
+
+   gmu: gmu@3d6a000 {
+   compatible = "qcom,adreno-gmu-x185.1", 
"qcom,adreno-gmu";
+   reg = <0x0 0x03d5 0x0 0x1>,
+ <0x0 0x03d6a000 0x0 0x35000>,
+   

[PATCH v1 1/3] dt-bindings: display/msm/gmu: Add Adreno X185 GMU

2024-06-23 Thread Akhil P Oommen
Document Adreno X185 GMU in the dt-binding specification.

Signed-off-by: Akhil P Oommen 
---

 Documentation/devicetree/bindings/display/msm/gmu.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index b3837368a260..9aa7151fd66f 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -23,6 +23,9 @@ properties:
   - items:
   - pattern: '^qcom,adreno-gmu-[67][0-9][0-9]\.[0-9]$'
   - const: qcom,adreno-gmu
+  - items:
+  - pattern: '^qcom,adreno-gmu-[x][1-9][0-9][0-9]\.[0-9]$'
+  - const: qcom,adreno-gmu
   - const: qcom,adreno-gmu-wrapper
 
   reg:
@@ -225,6 +228,7 @@ allOf:
   - qcom,adreno-gmu-730.1
   - qcom,adreno-gmu-740.1
   - qcom,adreno-gmu-750.1
+  - qcom,adreno-gmu-x185.1
 then:
   properties:
 reg:
-- 
2.45.1



[PATCH v1 2/3] drm/msm/adreno: Add support for X185 GPU

2024-06-23 Thread Akhil P Oommen
Add support to the drm/msm driver for the Adreno X185 GPU found in the
Snapdragon X1 Elite chipset.

Signed-off-by: Akhil P Oommen 
---

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 19 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  6 ++
 drivers/gpu/drm/msm/adreno/adreno_device.c | 14 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 +
 4 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 0e3dfd4c2bc8..168a4bddfaf2 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -830,8 +830,10 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
unsigned int state)
 */
gmu_write(gmu, REG_A6XX_GMU_CM3_CFG, 0x4052);
 
+   if (adreno_is_x185(adreno_gpu)) {
+   chipid = 0x7050001;
/* NOTE: A730 may also fall in this if-condition with a future GMU fw 
update. */
-   if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
+   } else if (adreno_is_a7xx(adreno_gpu) && !adreno_is_a730(adreno_gpu)) {
/* A7xx GPUs have obfuscated chip IDs. Use constant maj = 7 */
chipid = FIELD_PREP(GENMASK(31, 24), 0x7);
 
@@ -1329,9 +1331,18 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device 
*dev, u32 *votes,
if (!pri_count)
return -EINVAL;
 
-   sec = cmd_db_read_aux_data("mx.lvl", _count);
-   if (IS_ERR(sec))
-   return PTR_ERR(sec);
+   /*
+* Some targets have a separate gfx mxc rail. So try to read that first 
and then fall back
+* to regular mx rail if it is missing
+*/
+   sec = cmd_db_read_aux_data("gmxc.lvl", _count);
+   if (PTR_ERR_OR_ZERO(sec) == -EPROBE_DEFER) {
+   return -EPROBE_DEFER;
+   } else if (IS_ERR(sec)) {
+   sec = cmd_db_read_aux_data("mx.lvl", _count);
+   if (IS_ERR(sec))
+   return PTR_ERR(sec);
+   }
 
sec_count >>= 1;
if (!sec_count)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 973872ad0474..97837f7f2a40 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1319,9 +1319,7 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
count = ARRAY_SIZE(a660_protect);
count_max = 48;
BUILD_BUG_ON(ARRAY_SIZE(a660_protect) > 48);
-   } else if (adreno_is_a730(adreno_gpu) ||
-  adreno_is_a740(adreno_gpu) ||
-  adreno_is_a750(adreno_gpu)) {
+   } else if (adreno_is_a7xx(adreno_gpu)) {
regs = a730_protect;
count = ARRAY_SIZE(a730_protect);
count_max = 48;
@@ -1891,7 +1889,7 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_UCHE_CLIENT_PF, BIT(7) | 0x1);
 
/* Set weights for bicubic filtering */
-   if (adreno_is_a650_family(adreno_gpu)) {
+   if (adreno_is_a650_family(adreno_gpu) || adreno_is_x185(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_0, 0);
gpu_write(gpu, REG_A6XX_TPL1_BICUBIC_WEIGHTS_TABLE_1,
0x3fe05ff4);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index c3703a51287b..139c7d828749 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -568,6 +568,20 @@ static const struct adreno_info gpulist[] = {
.zapfw = "a740_zap.mdt",
.hwcg = a740_hwcg,
.address_space_size = SZ_16G,
+   }, {
+   .chip_ids = ADRENO_CHIP_IDS(0x43050c01), /* "C512v2" */
+   .family = ADRENO_7XX_GEN2,
+   .fw = {
+   [ADRENO_FW_SQE] = "gen70500_sqe.fw",
+   [ADRENO_FW_GMU] = "gen70500_gmu.bin",
+   },
+   .gmem = 3 * SZ_1M,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
+ ADRENO_QUIRK_HAS_HW_APRIV,
+   .init = a6xx_gpu_init,
+   .hwcg = a740_hwcg,
+   .address_space_size = SZ_16G,
}, {
.chip_ids = ADRENO_CHIP_IDS(0x43051401), /* "C520v2" */
.family = ADRENO_7XX_GEN3,
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 77526892eb8c..d9ea8e0f6ad5 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -448,6 +448,11 @@ static inline int adreno_is_a750(struct adreno_gpu *gpu)
return gpu->info->chip_ids[0] == 0x43051401;
 }
 
+static inline int adreno_is_x185(struct adreno_gpu *gpu)
+{
+   return gpu->info->chip_ids[0] == 0x43050c01;
+}

[PATCH v1 0/3] Support for Adreno X1-85 GPU

2024-06-23 Thread Akhil P Oommen
This series adds support for the Adreno X1-85 GPU found in Qualcomm's
compute series chipset, Snapdragon X1 Elite (x1e80100). In this new
naming scheme for Adreno GPUs, 'X' stands for the compute series, '1'
denotes the 1st generation, and '8' & '5' denote the tier and the SKU
to which it belongs.

X1-85 focuses on doubling core clock frequency and bandwidth
throughput. It has a dedicated collapsible Graphics MX rail (gmxc) to
power the memories, and doubles the number of data channels to improve
bandwidth to DDR.

Mesa already has the necessary bits to support this GPU. We are able
to bring up the GNOME desktop by hardcoding "0x43050a01" as the
chipid. Also verified glxgears and glmark2. We plan to add the new
chipid support to Mesa in the next few weeks, but these patches can go
in right away to be included in v6.11.

This series is rebased on top of v6.10-rc4. P3 cherry-picks cleanly on
qcom/for-next.

P1 & P2 for Rob, P3 for Bjorn to pick up.


Akhil P Oommen (3):
  dt-bindings: display/msm/gmu: Add Adreno X185 GMU
  drm/msm/adreno: Add support for X185 GPU
  arm64: dts: qcom: x1e80100: Add gpu support

 .../devicetree/bindings/display/msm/gmu.yaml  |   4 +
 arch/arm64/boot/dts/qcom/x1e80100.dtsi| 195 ++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  19 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |   6 +-
 drivers/gpu/drm/msm/adreno/adreno_device.c|  14 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   5 +
 6 files changed, 235 insertions(+), 8 deletions(-)

-- 
2.45.1



Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Akhil P Oommen
On Tue, Jun 04, 2024 at 07:35:04PM +0200, Konrad Dybcio wrote:
> 
> 
> On 5/14/24 20:38, Akhil P Oommen wrote:
> > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> > > Memory barriers help ensure instruction ordering, NOT time and order
> > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > On architectures employing weak memory ordering, the latter can be a
> > > giant pain point, and it has been as part of this driver.
> > > 
> > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > readl/writel, which include r/w (respectively) barriers.
> > > 
> > > Replace the barriers with a readback that ensures the previous writes
> > > have exited the write buffer (as the CPU must flush the write to the
> > > register it's trying to read back) and subsequently remove the hack
> > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> > > status in hw_init").
> > > 
> > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in 
> > > hw_init")
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
> > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
> > >   2 files changed, 6 insertions(+), 13 deletions(-)
> > 
> > I prefer this version compared to the v2. A helper routine is
> > unnecessary here because:
> > 1. there are very few scenarios where we have to read back the same
> > register.
> > 2. we may accidently readback a write only register.
> 
> Which would still trigger an address dependency on the CPU, no?

Yes, but it is not a good idea to read a write-only register. We can't be
sure about its effect on the endpoint.

> 
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 0e3dfd4c2bc8..4135a53b55a7 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > >   int ret;
> > >   u32 val;
> > > - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > > - /* Wait for the register to finish posting */
> > > - wmb();
> > > + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> > > + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);
> > 
> > This is unnecessary because we are polling on a register on the same port 
> > below. But I think we
> > can replace "wmb()" above with "mb()" to avoid reordering between read
> > and write IO instructions.
> 
> Ok on the dropping readback part
> 
> + AFAIU from Will's response, we can drop the barrier as well

Lets wait a bit on Will's response on compiler reordering.

> 
> > 
> > >   ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
> > >   val & (1 << 1), 100, 1);
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index 973872ad0474..0acbc38b8e70 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu)
> > >   }
> > >   /* Clear GBIF halt in case GX domain was not collapsed */
> > > + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > 
> > We need a full barrier here to avoid reordering. Also, lets add a
> > comment about why we are doing this odd looking sequence.
> > 
> > > + gpu_read(gpu, REG_A6XX_GBIF_HALT);
> > >   if (adreno_is_a619_holi(adreno_gpu)) {
> > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > >   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> > > - /* Let's make extra sure that the GPU can access the memory.. */
> > > - mb();
> > 
> > We need a full barrier here.
> > 
> > > + gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
> > >   } else if (a6xx_has_gbif(adreno_gpu)) {
> > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > >   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> > > - /* Let's make extra sure that the GPU can access the memory.. */
> > > - mb();
> > 
> > We need a full barrier here.
> 
> Not sure we do between REG_A6XX

Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Akhil P Oommen
On Tue, Jun 04, 2024 at 03:40:56PM +0100, Will Deacon wrote:
> On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> > On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> > > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> > > > If I understand correctly, you don't need any memory barrier.
> > > > writel()/readl()'s are ordered to the same endpoint. That goes for all
> > > > the reordering/barrier comments mentioned below too.
> > > > 
> > > > device-io.rst:
> > > > 
> > > > The read and write functions are defined to be ordered. That is the
> > > > compiler is not permitted to reorder the I/O sequence. When the 
> > > > ordering
> > > > can be compiler optimised, you can use __readb() and friends to
> > > > indicate the relaxed ordering. Use this with care.
> > > > 
> > > > memory-barriers.txt:
> > > > 
> > > >  (*) readX(), writeX():
> > > > 
> > > > The readX() and writeX() MMIO accessors take a pointer to 
> > > > the
> > > > peripheral being accessed as an __iomem * parameter. For 
> > > > pointers
> > > > mapped with the default I/O attributes (e.g. those returned 
> > > > by
> > > > ioremap()), the ordering guarantees are as follows:
> > > > 
> > > > 1. All readX() and writeX() accesses to the same peripheral 
> > > > are ordered
> > > >with respect to each other. This ensures that MMIO 
> > > > register accesses
> > > >by the same CPU thread to a particular device will 
> > > > arrive in program
> > > >order.
> > > > 
> > > 
> > > In arm64, a writel followed by readl translates to roughly the following
> > > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> > > sure what is stopping compiler from reordering  __raw_writel() and 
> > > __raw_readl()
> > > above? I am assuming iomem cookie is ignored during compilation.
> > 
> > It seems to me that is due to some usage of volatile there in
> > __raw_writel() etc, but to be honest after reading about volatile and
> > some threads from gcc mailing lists, I don't have a confident answer :)
> > 
> > > 
> > > Added Will to this thread if he can throw some light on this.
> > 
> > Hopefully Will can school us.
> 
> The ordering in this case is ensured by the memory attributes used for
> ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> (as it the case for ioremap()), the "nR" part means "no reordering", so
> readX() and writeX() to that region are ordered wrt each other.

But that avoids only HW reordering, doesn't it? What about *compiler
reordering* in the case of a writel followed by a readl, which translates to:
1: dmb_wmb()
2: __raw_writel() -> roughly "asm volatile('str')
3: __raw_readl() -> roughly "asm volatile('ldr')
4: dmb_rmb()

Is the 'volatile' keyword sufficient to avoid reordering between (2) and (3)? Or
do we need a "memory" clobber to inhibit reordering?

This is still not clear to me even after going through some compiler
documentation.

-Akhil.

> 
> Note that guarantee _doesn't_ apply to other flavours of ioremap(), so
> e.g. ioremap_wc() won't give you the ordering.
> 
> Hope that helps,
> 
> Will


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-05-16 Thread Akhil P Oommen
On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> On Wed, May 15, 2024 at 12:08:49AM GMT, Akhil P Oommen wrote:
> > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> > > Memory barriers help ensure instruction ordering, NOT time and order
> > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > On architectures employing weak memory ordering, the latter can be a
> > > giant pain point, and it has been as part of this driver.
> > > 
> > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > readl/writel, which include r/w (respectively) barriers.
> > > 
> > > Replace the barriers with a readback that ensures the previous writes
> > > have exited the write buffer (as the CPU must flush the write to the
> > > register it's trying to read back) and subsequently remove the hack
> > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> > > status in hw_init").
> 
> For what its worth, I've been eyeing (but haven't tested) sending some
> patches to clean up dsi_phy_write_udelay/ndelay(). There's no ordering
> guarantee between a writel() and a delay(), so the expected "write then
> delay" sequence might not be happening.. you need to write, read, delay.
> 
> memory-barriers.txt:
> 
>   5. A readX() by a CPU thread from the peripheral will complete before
>  any subsequent delay() loop can begin execution on the same thread.
>  This ensures that two MMIO register writes by the CPU to a peripheral
>  will arrive at least 1us apart if the first write is immediately read
>  back with readX() and udelay(1) is called prior to the second
>  writeX():
> 
>   writel(42, DEVICE_REGISTER_0); // Arrives at the device...
>   readl(DEVICE_REGISTER_0);
>   udelay(1);
>   writel(42, DEVICE_REGISTER_1); // ...at least 1us before this.

Yes, udelay orders only with readl(). I saw a patch from Will Deacon
which fixes this for arm64 a few years back:
https://lore.kernel.org/all/1543251228-30001-1-git-send-email-will.dea...@arm.com/T/

But this is needed only when you write IO and then do a CPU-side wait, not
when you poll IO to check status.

> 
> > > 
> > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in 
> > > hw_init")
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
> > >  2 files changed, 6 insertions(+), 13 deletions(-)
> > 
> > I prefer this version compared to the v2. A helper routine is
> > unnecessary here because:
> > 1. there are very few scenarios where we have to read back the same
> > register.
> > 2. we may accidently readback a write only register.
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 0e3dfd4c2bc8..4135a53b55a7 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > >   int ret;
> > >   u32 val;
> > >  
> > > - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > > - /* Wait for the register to finish posting */
> > > - wmb();
> > > + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> > > + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);
> > 
> > This is unnecessary because we are polling on a register on the same port 
> > below. But I think we
> > can replace "wmb()" above with "mb()" to avoid reordering between read
> > and write IO instructions.
> 
> If I understand correctly, you don't need any memory barrier.
> writel()/readl()'s are ordered to the same endpoint. That goes for all
> the reordering/barrier comments mentioned below too.
> 
> device-io.rst:
> 
> The read and write functions are defined to be ordered. That is the
> compiler is not permitted to reorder the I/O sequence. When the ordering
> can be compiler optimised, you can use __readb() and friends to
> indicate the relaxed ordering. Use this with care.
> 
> memory-barriers.txt:
> 
>  (*) readX(), writeX():
> 
>   The readX() and writeX() MMIO accessors take a pointer to the
>   peripheral being accessed as an __iomem * parameter. For pointers
>   mapped with the default I/O attributes (e.g.

Re: [PATCH] drm/msm: Add obj flags to gpu devcoredump

2024-05-14 Thread Akhil P Oommen
On Mon, May 13, 2024 at 08:51:47AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> When debugging faults, it is useful to know how the BO is mapped (cached
> vs WC, gpu readonly, etc).
> 
> Signed-off-by: Rob Clark 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 1 +
>  drivers/gpu/drm/msm/msm_gpu.c   | 6 --
>  drivers/gpu/drm/msm/msm_gpu.h   | 1 +
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index b7bbef2eeff4..d9ea15994ae9 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -887,6 +887,7 @@ void adreno_show(struct msm_gpu *gpu, struct 
> msm_gpu_state *state,
>   drm_printf(p, "  - iova: 0x%016llx\n",
>   state->bos[i].iova);
>   drm_printf(p, "size: %zd\n", state->bos[i].size);
> + drm_printf(p, "flags: 0x%x\n", state->bos[i].flags);
>   drm_printf(p, "name: %-32s\n", state->bos[i].name);
>  
>   adreno_show_object(p, &state->bos[i].data,
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index d14ec058906f..ceaee23a4d22 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -222,14 +222,16 @@ static void msm_gpu_crashstate_get_bo(struct 
> msm_gpu_state *state,
>   struct drm_gem_object *obj, u64 iova, bool full)
>  {
> + struct msm_gpu_state_bo *state_bo = &state->bos[state->nr_bos];
> + struct msm_gem_object *msm_obj = to_msm_bo(obj);
>  
>   /* Don't record write only objects */
>   state_bo->size = obj->size;
> + state_bo->flags = msm_obj->flags;
>   state_bo->iova = iova;
>  
> - BUILD_BUG_ON(sizeof(state_bo->name) != sizeof(to_msm_bo(obj)->name));
> + BUILD_BUG_ON(sizeof(state_bo->name) != sizeof(msm_obj->name));
>  
> - memcpy(state_bo->name, to_msm_bo(obj)->name, sizeof(state_bo->name));
> + memcpy(state_bo->name, msm_obj->name, sizeof(state_bo->name));
>  
>   if (full) {
>   void *ptr;
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 685470b84708..05bb247e7210 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -527,6 +527,7 @@ struct msm_gpu_submitqueue {
>  struct msm_gpu_state_bo {
>   u64 iova;
>   size_t size;
> + u32 flags;
>   void *data;
>   bool encoded;
>   char name[32];
> -- 
> 2.45.0
> 


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-05-14 Thread Akhil P Oommen
On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> Memory barriers help ensure instruction ordering, NOT time and order
> of actual write arrival at other observers (e.g. memory-mapped IP).
> On architectures employing weak memory ordering, the latter can be a
> giant pain point, and it has been as part of this driver.
> 
> Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> readl/writel, which include r/w (respectively) barriers.
> 
> Replace the barriers with a readback that ensures the previous writes
> have exited the write buffer (as the CPU must flush the write to the
> register it's trying to read back) and subsequently remove the hack
> introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> status in hw_init").
> 
> Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in hw_init")
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
>  2 files changed, 6 insertions(+), 13 deletions(-)

I prefer this version compared to the v2. A helper routine is
unnecessary here because:
1. there are very few scenarios where we have to read back the same
register.
2. we may accidently readback a write only register.

> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 0e3dfd4c2bc8..4135a53b55a7 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
>   int ret;
>   u32 val;
>  
> - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> - /* Wait for the register to finish posting */
> - wmb();
> + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);

This is unnecessary because we are polling on a register on the same port 
below. But I think we
can replace "wmb()" above with "mb()" to avoid reordering between read
and write IO instructions.

>  
>   ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
>   val & (1 << 1), 100, 1);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 973872ad0474..0acbc38b8e70 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu)
>   }
>  
>   /* Clear GBIF halt in case GX domain was not collapsed */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);

We need a full barrier here to avoid reordering. Also, lets add a
comment about why we are doing this odd looking sequence.

> + gpu_read(gpu, REG_A6XX_GBIF_HALT);
>   if (adreno_is_a619_holi(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> - /* Let's make extra sure that the GPU can access the memory.. */
> - mb();

We need a full barrier here.

> + gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
>   } else if (a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> - /* Let's make extra sure that the GPU can access the memory.. */
> - mb();

We need a full barrier here.

> + gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT);
>   }
>  
> - /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
> - if (adreno_is_a7xx(adreno_gpu) && a6xx_has_gbif(adreno_gpu))
> - spin_until(!gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK));
> -

Why is this removed?

-Akhil

>   gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
>  
>   if (adreno_is_a619_holi(adreno_gpu))
> 
> ---
> base-commit: 93a39e4766083050ca0ecd6a3548093a3b9eb60c
> change-id: 20240508-topic-adreno-a2d199cd4152
> 
> Best regards,
> -- 
> Konrad Dybcio 
> 


Re: [PATCH v4 04/16] drm/msm: move msm_gpummu.c to adreno/a2xx_gpummu.c

2024-03-25 Thread Akhil P Oommen
On Sun, Mar 24, 2024 at 01:13:55PM +0200, Dmitry Baryshkov wrote:
> On Sun, 24 Mar 2024 at 11:55, Akhil P Oommen  wrote:
> >
> > On Sat, Mar 23, 2024 at 12:56:56AM +0200, Dmitry Baryshkov wrote:
> > > The msm_gpummu.c implementation is used only on A2xx and it is tied to
> > > the A2xx registers. Rename the source file accordingly.
> > >
> >
> > There are very few functions in this file and a2xx_gpu.c is a relatively
> > small source file too. Shall we just move them to a2xx_gpu.c instead of
> > renaming?
> 
> I'd prefer to keep them separate, at least within this series. Let's
> leave that to Rob's discretion.

Sounds good.

Reviewed-by: Akhil P Oommen 

-Akhil

> 
> > -Akhil
> >
> > > Signed-off-by: Dmitry Baryshkov 
> > > ---
> > >  drivers/gpu/drm/msm/Makefile   |  2 +-
> > >  drivers/gpu/drm/msm/adreno/a2xx_gpu.c  |  4 +-
> > >  drivers/gpu/drm/msm/adreno/a2xx_gpu.h  |  4 ++
> > >  .../drm/msm/{msm_gpummu.c => adreno/a2xx_gpummu.c} | 45 
> > > --
> > >  drivers/gpu/drm/msm/msm_mmu.h  |  5 ---
> > >  5 files changed, 31 insertions(+), 29 deletions(-)
> 
> 
> -- 
> With best wishes
> Dmitry


Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-25 Thread Akhil P Oommen
On Sun, Mar 24, 2024 at 12:57:43PM +0200, Dmitry Baryshkov wrote:
> On Sun, 24 Mar 2024 at 12:30, Akhil P Oommen  wrote:
> >
> > On Sat, Mar 23, 2024 at 12:57:02AM +0200, Dmitry Baryshkov wrote:
> > > Generate DRM/MSM headers on the fly during kernel build. This removes a
> > > need to push register changes to Mesa with the following manual
> > > synchronization step. Existing headers will be removed in the following
> > > commits (split away to ease reviews).
> >
> > Is this approach common in the upstream kernel? Isn't it a bit awkward from
> > a legal perspective to rely on a source file outside of the kernel during
> > compilation?
> 
> As long as the source file for that file is available. For examples of
> non-trivial generated files see
> arch/arm64/include/generated/sysreg-defs.h and
> arch/arm64/include/generated/cpucap-defs.h

I see that the xml files include a GPL-compatible license, so I guess
those are fine. The gen_header.py script doesn't include any license.
Shouldn't it have one?

-Akhil.

> 
> -- 
> With best wishes
> Dmitry


Re: [PATCH v4 10/16] drm/msm: generate headers on the fly

2024-03-24 Thread Akhil P Oommen
On Sat, Mar 23, 2024 at 12:57:02AM +0200, Dmitry Baryshkov wrote:
> Generate DRM/MSM headers on the fly during kernel build. This removes a
> need to push register changes to Mesa with the following manual
> synchronization step. Existing headers will be removed in the following
> commits (split away to ease reviews).

Is this approach common in the upstream kernel? Isn't it a bit awkward from
a legal perspective to rely on a source file outside of the kernel during
compilation?

-Akhil

> 
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/.gitignore |  1 +
>  drivers/gpu/drm/msm/Makefile   | 97 
> +-
>  drivers/gpu/drm/msm/msm_drv.c  |  3 +-
>  drivers/gpu/drm/msm/msm_gpu.c  |  2 +-
>  4 files changed, 80 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/.gitignore b/drivers/gpu/drm/msm/.gitignore
> new file mode 100644
> index ..9ab870da897d
> --- /dev/null
> +++ b/drivers/gpu/drm/msm/.gitignore
> @@ -0,0 +1 @@
> +generated/
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index 26ed4f443149..c861de58286c 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -1,10 +1,11 @@
>  # SPDX-License-Identifier: GPL-2.0
>  ccflags-y := -I $(srctree)/$(src)
> +ccflags-y += -I $(obj)/generated
>  ccflags-y += -I $(srctree)/$(src)/disp/dpu1
>  ccflags-$(CONFIG_DRM_MSM_DSI) += -I $(srctree)/$(src)/dsi
>  ccflags-$(CONFIG_DRM_MSM_DP) += -I $(srctree)/$(src)/dp
>  
> -msm-y := \
> +adreno-y := \
>   adreno/adreno_device.o \
>   adreno/adreno_gpu.o \
>   adreno/a2xx_gpu.o \
> @@ -18,7 +19,11 @@ msm-y := \
>   adreno/a6xx_gmu.o \
>   adreno/a6xx_hfi.o \
>  
> -msm-$(CONFIG_DRM_MSM_HDMI) += \
> +adreno-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
> +
> +adreno-$(CONFIG_DRM_MSM_GPU_STATE)   += adreno/a6xx_gpu_state.o
> +
> +msm-display-$(CONFIG_DRM_MSM_HDMI) += \
>   hdmi/hdmi.o \
>   hdmi/hdmi_audio.o \
>   hdmi/hdmi_bridge.o \
> @@ -31,7 +36,7 @@ msm-$(CONFIG_DRM_MSM_HDMI) += \
>   hdmi/hdmi_phy_8x74.o \
>   hdmi/hdmi_pll_8960.o \
>  
> -msm-$(CONFIG_DRM_MSM_MDP4) += \
> +msm-display-$(CONFIG_DRM_MSM_MDP4) += \
>   disp/mdp4/mdp4_crtc.o \
>   disp/mdp4/mdp4_dsi_encoder.o \
>   disp/mdp4/mdp4_dtv_encoder.o \
> @@ -42,7 +47,7 @@ msm-$(CONFIG_DRM_MSM_MDP4) += \
>   disp/mdp4/mdp4_kms.o \
>   disp/mdp4/mdp4_plane.o \
>  
> -msm-$(CONFIG_DRM_MSM_MDP5) += \
> +msm-display-$(CONFIG_DRM_MSM_MDP5) += \
>   disp/mdp5/mdp5_cfg.o \
>   disp/mdp5/mdp5_cmd_encoder.o \
>   disp/mdp5/mdp5_ctl.o \
> @@ -55,7 +60,7 @@ msm-$(CONFIG_DRM_MSM_MDP5) += \
>   disp/mdp5/mdp5_plane.o \
>   disp/mdp5/mdp5_smp.o \
>  
> -msm-$(CONFIG_DRM_MSM_DPU) += \
> +msm-display-$(CONFIG_DRM_MSM_DPU) += \
>   disp/dpu1/dpu_core_perf.o \
>   disp/dpu1/dpu_crtc.o \
>   disp/dpu1/dpu_encoder.o \
> @@ -85,14 +90,16 @@ msm-$(CONFIG_DRM_MSM_DPU) += \
>   disp/dpu1/dpu_vbif.o \
>   disp/dpu1/dpu_writeback.o
>  
> -msm-$(CONFIG_DRM_MSM_MDSS) += \
> +msm-display-$(CONFIG_DRM_MSM_MDSS) += \
>   msm_mdss.o \
>  
> -msm-y += \
> +msm-display-y += \
>   disp/mdp_format.o \
>   disp/mdp_kms.o \
>   disp/msm_disp_snapshot.o \
>   disp/msm_disp_snapshot_util.o \
> +
> +msm-y += \
>   msm_atomic.o \
>   msm_atomic_tracepoints.o \
>   msm_debugfs.o \
> @@ -115,12 +122,12 @@ msm-y += \
>   msm_submitqueue.o \
>   msm_gpu_tracepoints.o \
>  
> -msm-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
> - dp/dp_debug.o
> +msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
>  
> -msm-$(CONFIG_DRM_MSM_GPU_STATE)  += adreno/a6xx_gpu_state.o
> +msm-display-$(CONFIG_DEBUG_FS) += \
> + dp/dp_debug.o
>  
> -msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
> +msm-display-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
>   dp/dp_catalog.o \
>   dp/dp_ctrl.o \
>   dp/dp_display.o \
> @@ -130,21 +137,69 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
>   dp/dp_audio.o \
>   dp/dp_utils.o
>  
> -msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
> -
> -msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
> +msm-display-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
>  
> -msm-$(CONFIG_DRM_MSM_DSI) += dsi/dsi.o \
> +msm-display-$(CONFIG_DRM_MSM_DSI) += dsi/dsi.o \
>   dsi/dsi_cfg.o \
>   dsi/dsi_host.o \
>   dsi/dsi_manager.o \
>   dsi/phy/dsi_phy.o
>  
> -msm-$(CONFIG_DRM_MSM_DSI_28NM_PHY) += dsi/phy/dsi_phy_28nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_20NM_PHY) += dsi/phy/dsi_phy_20nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_28NM_8960_PHY) += dsi/phy/dsi_phy_28nm_8960.o
> -msm-$(CONFIG_DRM_MSM_DSI_14NM_PHY) += dsi/phy/dsi_phy_14nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_10NM_PHY) += dsi/phy/dsi_phy_10nm.o
> -msm-$(CONFIG_DRM_MSM_DSI_7NM_PHY) += dsi/phy/dsi_phy_7nm.o
> +msm-display-$(CONFIG_DRM_MSM_DSI_28NM_PHY) += dsi/phy/dsi_phy_28nm.o
> 

Re: [PATCH v4 04/16] drm/msm: move msm_gpummu.c to adreno/a2xx_gpummu.c

2024-03-24 Thread Akhil P Oommen
On Sat, Mar 23, 2024 at 12:56:56AM +0200, Dmitry Baryshkov wrote:
> The msm_gpummu.c implementation is used only on A2xx and it is tied to
> the A2xx registers. Rename the source file accordingly.
> 

There are very few functions in this file and a2xx_gpu.c is a relatively
small source file too. Shall we just move them to a2xx_gpu.c instead of
renaming?

-Akhil

> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/Makefile   |  2 +-
>  drivers/gpu/drm/msm/adreno/a2xx_gpu.c  |  4 +-
>  drivers/gpu/drm/msm/adreno/a2xx_gpu.h  |  4 ++
>  .../drm/msm/{msm_gpummu.c => adreno/a2xx_gpummu.c} | 45 
> --
>  drivers/gpu/drm/msm/msm_mmu.h  |  5 ---
>  5 files changed, 31 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
> index b21ae2880c71..26ed4f443149 100644
> --- a/drivers/gpu/drm/msm/Makefile
> +++ b/drivers/gpu/drm/msm/Makefile
> @@ -8,6 +8,7 @@ msm-y := \
>   adreno/adreno_device.o \
>   adreno/adreno_gpu.o \
>   adreno/a2xx_gpu.o \
> + adreno/a2xx_gpummu.o \
>   adreno/a3xx_gpu.o \
>   adreno/a4xx_gpu.o \
>   adreno/a5xx_gpu.o \
> @@ -113,7 +114,6 @@ msm-y += \
>   msm_ringbuffer.o \
>   msm_submitqueue.o \
>   msm_gpu_tracepoints.o \
> - msm_gpummu.o
>  
>  msm-$(CONFIG_DEBUG_FS) += adreno/a5xx_debugfs.o \
>   dp/dp_debug.o
> diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> index 0d8133f3174b..0dc255ddf5ce 100644
> --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
> @@ -113,7 +113,7 @@ static int a2xx_hw_init(struct msm_gpu *gpu)
>   uint32_t *ptr, len;
>   int i, ret;
>  
> - msm_gpummu_params(gpu->aspace->mmu, &pt_base, &tran_error);
> + a2xx_gpummu_params(gpu->aspace->mmu, &pt_base, &tran_error);
>  
>   DBG("%s", gpu->name);
>  
> @@ -469,7 +469,7 @@ static struct msm_gpu_state *a2xx_gpu_state_get(struct 
> msm_gpu *gpu)
>  static struct msm_gem_address_space *
>  a2xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
>  {
> - struct msm_mmu *mmu = msm_gpummu_new(&pdev->dev, gpu);
> + struct msm_mmu *mmu = a2xx_gpummu_new(&pdev->dev, gpu);
>   struct msm_gem_address_space *aspace;
>  
>   aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
> diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> index 161a075f94af..53702f19990f 100644
> --- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.h
> @@ -19,4 +19,8 @@ struct a2xx_gpu {
>  };
>  #define to_a2xx_gpu(x) container_of(x, struct a2xx_gpu, base)
>  
> +struct msm_mmu *a2xx_gpummu_new(struct device *dev, struct msm_gpu *gpu);
> +void a2xx_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
> + dma_addr_t *tran_error);
> +
>  #endif /* __A2XX_GPU_H__ */
> diff --git a/drivers/gpu/drm/msm/msm_gpummu.c 
> b/drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> similarity index 67%
> rename from drivers/gpu/drm/msm/msm_gpummu.c
> rename to drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> index f7d1945e0c9f..39641551eeb6 100644
> --- a/drivers/gpu/drm/msm/msm_gpummu.c
> +++ b/drivers/gpu/drm/msm/adreno/a2xx_gpummu.c
> @@ -5,30 +5,33 @@
>  
>  #include "msm_drv.h"
>  #include "msm_mmu.h"
> -#include "adreno/adreno_gpu.h"
> -#include "adreno/a2xx.xml.h"
>  
> -struct msm_gpummu {
> +#include "adreno_gpu.h"
> +#include "a2xx_gpu.h"
> +
> +#include "a2xx.xml.h"
> +
> +struct a2xx_gpummu {
>   struct msm_mmu base;
>   struct msm_gpu *gpu;
>   dma_addr_t pt_base;
>   uint32_t *table;
>  };
> -#define to_msm_gpummu(x) container_of(x, struct msm_gpummu, base)
> +#define to_a2xx_gpummu(x) container_of(x, struct a2xx_gpummu, base)
>  
>  #define GPUMMU_VA_START SZ_16M
>  #define GPUMMU_VA_RANGE (0xfff * SZ_64K)
>  #define GPUMMU_PAGE_SIZE SZ_4K
>  #define TABLE_SIZE (sizeof(uint32_t) * GPUMMU_VA_RANGE / GPUMMU_PAGE_SIZE)
>  
> -static void msm_gpummu_detach(struct msm_mmu *mmu)
> +static void a2xx_gpummu_detach(struct msm_mmu *mmu)
>  {
>  }
>  
> -static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t iova,
> +static int a2xx_gpummu_map(struct msm_mmu *mmu, uint64_t iova,
>   struct sg_table *sgt, size_t len, int prot)
>  {
> - struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> + struct a2xx_gpummu *gpummu = to_a2xx_gpummu(mmu);
>   unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
>   struct sg_dma_page_iter dma_iter;
>   unsigned prot_bits = 0;
> @@ -53,9 +56,9 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
> iova,
>   return 0;
>  }
>  
> -static int msm_gpummu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len)
> +static int a2xx_gpummu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len)
>  {
> - struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> + struct a2xx_gpummu *gpummu = 

Re: [PATCH] drm/msm/a6xx: Fix recovery vs runpm race

2023-12-22 Thread Akhil P Oommen
On Mon, Dec 18, 2023 at 07:59:24AM -0800, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> a6xx_recover() is relying on the gpu lock to serialize against incoming
> submits doing a runpm get, as it tries to temporarily balance out the
> runpm gets with puts in order to power off the GPU.  Unfortunately this
> gets worse when we (in a later patch) will move the runpm get out of the
> scheduler thread/work to move it out of the fence signaling path.
> 
> Instead we can just simplify the whole thing by using force_suspend() /
> force_resume() instead of trying to be clever.

In some places, we take a pm_runtime vote and access the GPU registers
assuming the GPU will stay powered until we drop the vote;
a6xx_get_timestamp() is an example. If we do a force suspend, it may
cause bus errors from those threads. Then you would have to serialize
every place we do runtime_get/put with a mutex. Or is there a better
way to handle the 'later patch' you mentioned?

-Akhil.

> 
> Reported-by: David Heidelberg 
> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10272
> Fixes: abe2023b4cea ("drm/msm/gpu: Push gpu lock down past runpm")
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 268737e59131..a5660d63535b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1244,12 +1244,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
>   dev_pm_genpd_add_notifier(gmu->cxpd, &gmu->pd_nb);
>   dev_pm_genpd_synced_poweroff(gmu->cxpd);
>  
> - /* Drop the rpm refcount from active submits */
> - if (active_submits)
> - pm_runtime_put(&gpu->pdev->dev);
> -
> - /* And the final one from recover worker */
> - pm_runtime_put_sync(&gpu->pdev->dev);
> + pm_runtime_force_suspend(&gpu->pdev->dev);
>  
>   if (!wait_for_completion_timeout(&gmu->pd_gate, msecs_to_jiffies(1000)))
>   DRM_DEV_ERROR(&gpu->pdev->dev, "cx gdsc didn't collapse\n");
> @@ -1258,10 +1253,7 @@ static void a6xx_recover(struct msm_gpu *gpu)
>  
>   pm_runtime_use_autosuspend(&gpu->pdev->dev);
>  
> - if (active_submits)
> - pm_runtime_get(&gpu->pdev->dev);
> -
> - pm_runtime_get_sync(&gpu->pdev->dev);
> + pm_runtime_force_resume(&gpu->pdev->dev);
>  
>   gpu->active_submits = active_submits;
>   mutex_unlock(&gpu->active_lock);
> -- 
> 2.43.0
> 


Re: [PATCH v2 1/1] drm/msm/adreno: Add support for SM7150 SoC machine

2023-12-07 Thread Akhil P Oommen
On Thu, Nov 23, 2023 at 12:03:56AM +0300, Danila Tikhonov wrote:
> 
> sc7180/sm7125 (atoll) expects speedbins from atoll.dtsi:
> And has a parameter: /delete-property/ qcom,gpu-speed-bin;
> 107 for 504Mhz max freq, pwrlevel 4
> 130 for 610Mhz max freq, pwrlevel 3
> 159 for 750Mhz max freq, pwrlevel 5
> 169 for 800Mhz max freq, pwrlevel 2
> 174 for 825Mhz max freq, pwrlevel 1 (Downstream says 172, but that's
> probably a typo)
A bit confused: where do you see 172 in the downstream code? It is 174
in the downstream code when I checked.
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8 MHz + 2, rounded up to zero decimal places.
> 
> sm7150 (sdmmagpie) expects speedbins from sdmmagpie-gpu.dtsi:
> 128 for 610Mhz max freq, pwrlevel 3
> 146 for 700Mhz max freq, pwrlevel 2
> 167 for 800Mhz max freq, pwrlevel 4
> 172 for 504Mhz max freq, pwrlevel 1
> For the rest of the speed bins, the speed-bin value is calculated as
> FMAX/4.8 MHz, rounded up to zero decimal places.
> 
> Creating a new entry does not make much sense.
> I can suggest expanding the standard entry:
> 
> .speedbins = ADRENO_SPEEDBINS(
>     { 0, 0 },
>     /* sc7180/sm7125 */
>     { 107, 3 },
>     { 130, 4 },
>     { 159, 5 },
>     { 168, 1 }, has already
>     { 174, 2 }, has already
>     /* sm7150 */
>     { 128, 1 },
>     { 146, 2 },
>     { 167, 3 },
>     { 172, 4 }, ),
> 

A difference I see between atoll and sdmmagpie is that the former
doesn't support 180 MHz. If you want to do the same, then you need to use
a new bit in the supported-hw bitfield instead of reusing an existing one.
Generally it is better to stick exactly to what downstream does.

-Akhil.

> All the best,
> Danila
> 
> On 11/22/23 23:28, Konrad Dybcio wrote:
> > 
> > 
> > On 10/16/23 16:32, Dmitry Baryshkov wrote:
> > > On 26/09/2023 23:03, Konrad Dybcio wrote:
> > > > On 26.09.2023 21:10, Danila Tikhonov wrote:
> > > > > 
> > > > > I think you mean by name downstream dt - sdmmagpie-gpu.dtsi
> > > > > 
> > > > > You can see the forked version of the mainline here:
> > > > > https://github.com/sm7150-mainline/linux/blob/next/arch/arm64/boot/dts/qcom/sm7150.dtsi
> > > > > 
> > > > > 
> > > > > All fdt that we got here, if it is useful for you:
> > > > > https://github.com/sm7150-mainline/downstream-fdt
> > > > > 
> > > > > Best wishes, Danila
> > > > Taking a look at downstream, atoll.dtsi (SC7180) includes
> > > > sdmmagpie-gpu.dtsi.
> > > > 
> > > > Bottom line is, they share the speed bins, so it should be
> > > > fine to just extend the existing entry.
> > > 
> > > But then atoll.dtsi rewrites speed bins and pwrlevel bins. So they
> > > are not shared.
> > +Akhil
> > 
> > could you please check internally?
> > 
> > Konrad
> 


Re: [Freedreno] [PATCH 1/7] drm/msm/a6xx: Fix unknown speedbin case

2023-10-17 Thread Akhil P Oommen
On Tue, Oct 17, 2023 at 01:22:27AM +0530, Akhil P Oommen wrote:
> 
> On Tue, Sep 26, 2023 at 08:24:36PM +0200, Konrad Dybcio wrote:
> > 
> > When opp-supported-hw is present under an OPP node, but no form of
> > opp_set_supported_hw() has been called, that OPP is ignored by the API
> > and marked as unsupported.
> > 
> > Before Commit c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to
> > device table"), an unknown speedbin would result in marking all OPPs
> > as available, but it's better to avoid potentially overclocking the
> > silicon - the GMU will simply refuse to power up the chip.
> > 
> > Currently, the Adreno speedbin code does just that (AND returns an
> > invalid error, (int)UINT_MAX). Fix that by defaulting to speedbin 0
> > (which is conveniently always bound to fuseval == 0).
> 
> Wish we documented somewhere that we should reserve BIT(0) for fuse
> val=0 always and assume that would be the super SKU.
Aah! I got this backward. Fuseval=0 is the super SKU, and it is not safe
to fall back to that blindly. Ideally, we should fall back to the lowest
common denominator SKU, but it is difficult to predict that upfront and
assign BIT(0).

Anyway, I can't see a better way to handle this.

-Akhil

> 
> Reviewed-by: Akhil P Oommen 
> 
> -Akhil
> 
> > 
> > Fixes: c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to device 
> > table")
> > Signed-off-by: Konrad Dybcio 
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index d4e85e24002f..522ca7fe6762 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -2237,7 +2237,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> > const struct adreno_info *i
> > DRM_DEV_ERROR(dev,
> > "missing support for speed-bin: %u. Some OPPs may not 
> > be supported by hardware\n",
> > speedbin);
> > -   return UINT_MAX;
> > +   supp_hw = BIT(0); /* Default */
> > }
> >  
> > ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
> > 
> > -- 
> > 2.42.0
> > 


Re: [PATCH 2/7] drm/msm/adreno: Add ZAP firmware name to A635

2023-10-17 Thread Akhil P Oommen


On Tue, Oct 17, 2023 at 12:33:45AM -0700, Rob Clark wrote:
> 
> On Mon, Oct 16, 2023 at 1:12 PM Akhil P Oommen  
> wrote:
> >
> > On Tue, Sep 26, 2023 at 08:24:37PM +0200, Konrad Dybcio wrote:
> > >
> > > Some (many?) devices with A635 expect a ZAP shader to be loaded.
> > >
> > > Set the file name to allow for that.
> > >
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index fa527935ffd4..16527fe8584d 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > @@ -454,6 +454,7 @@ static const struct adreno_info gpulist[] = {
> > >   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > >   ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > > + .zapfw = "a660_zap.mbn",
> >
> > sc7280 doesn't have a TZ and so no zap shader support. Can we handle
> > this using "firmware-name" property in your top level platform dt? Zap
> > firmwares are signed with different keys for each OEMs. So there is
> > cross-compatibility anyway.
I had a typo here. I meant "no cross compatibility".

> 
> I think this ends up working out because the version of sc7280 that
> doesn't have TZ also doesn't have the associated mem-region/etc..  but
> maybe we should deprecate the zapfw field as in practice it isn't
> useful (ie. always overridden by firmware-name).
Sounds good.

> 
> Fwiw there are windows laptops with sc7180/sc7280 which do use zap fw.
Aah! right.
> 
> BR,
> -R
> 
> >
> > -Ahil.
> >
> > >   .hwcg = a660_hwcg,
> > >   .address_space_size = SZ_16G,
> > >   .speedbins = ADRENO_SPEEDBINS(
> > >
> > > --
> > > 2.42.0
> > >


Re: [Freedreno] [PATCH 6/7] arm64: dts: qcom: sc7280: Mark Adreno SMMU as DMA coherent

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:41PM +0200, Konrad Dybcio wrote:
> 
> The SMMUs on sc7280 are cache-coherent. APPS_SMMU is marked as such,
> mark the GPU one as well.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  arch/arm64/boot/dts/qcom/sc7280.dtsi | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
> b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> index 0d96d1454c49..edaca6c2cf8c 100644
> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> @@ -2783,6 +2783,7 @@ adreno_smmu: iommu@3da {
>   "gpu_cc_hub_aon_clk";
>  
>   power-domains = <&gpucc GPU_CC_CX_GDSC>;
> + dma-coherent;
>   };
>  
>   remoteproc_mpss: remoteproc@408 {
> 
> -- 
> 2.42.0
> 


Re: [PATCH 5/7] arm64: dts: qcom: sc7280: Fix up GPU SIDs

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:40PM +0200, Konrad Dybcio wrote:
> 
> GPU_SMMU SID 1 is meant for Adreno LPAC (Low Priority Async Compute).
> On platforms that support it (in firmware), it is necessary to
> describe that link, or Adreno register access will hang the board.
> 
> Add that and fix up the SMR mask of SID 0, which seems to have been
> copypasted from another SoC.
> 
> Fixes: 96c471970b7b ("arm64: dts: qcom: sc7280: Add gpu support")
> Signed-off-by: Konrad Dybcio 
> ---
>  arch/arm64/boot/dts/qcom/sc7280.dtsi | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
> b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> index c38ddf267ef5..0d96d1454c49 100644
> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
> @@ -2603,7 +2603,8 @@ gpu: gpu@3d0 {
>   "cx_mem",
>   "cx_dbgc";
>   interrupts = ;
> - iommus = <&adreno_smmu 0 0x401>;
> + iommus = <&adreno_smmu 0 0x400>,
> +  <&adreno_smmu 1 0x400>;
Aren't both functionally the same? 0x401 works fine on sc7280. You might
be seeing an issue due to Qcom TZ policies on your platform. I am okay
with the change, but can you please reword the commit text?

-Akhil.

>   operating-points-v2 = <&gpu_opp_table>;
>   qcom,gmu = <&gmu>;
>   interconnects = <&gem_noc MASTER_GFX3D 0 &mc_virt 
> SLAVE_EBI1 0>;
> 
> -- 
> 2.42.0
> 


Re: [PATCH 2/7] drm/msm/adreno: Add ZAP firmware name to A635

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:37PM +0200, Konrad Dybcio wrote:
> 
> Some (many?) devices with A635 expect a ZAP shader to be loaded.
> 
> Set the file name to allow for that.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index fa527935ffd4..16527fe8584d 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -454,6 +454,7 @@ static const struct adreno_info gpulist[] = {
>   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>   ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
> + .zapfw = "a660_zap.mbn",

sc7280 doesn't have a TZ and so no zap shader support. Can we handle
this using the "firmware-name" property in your top level platform dt? Zap
firmwares are signed with different keys for each OEM. So there is
no cross-compatibility anyway.

-Akhil.

>   .hwcg = a660_hwcg,
>   .address_space_size = SZ_16G,
>   .speedbins = ADRENO_SPEEDBINS(
> 
> -- 
> 2.42.0
> 


Re: [PATCH 1/7] drm/msm/a6xx: Fix unknown speedbin case

2023-10-16 Thread Akhil P Oommen
On Tue, Sep 26, 2023 at 08:24:36PM +0200, Konrad Dybcio wrote:
> 
> When opp-supported-hw is present under an OPP node, but no form of
> opp_set_supported_hw() has been called, that OPP is ignored by the API
> and marked as unsupported.
> 
> Before Commit c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to
> device table"), an unknown speedbin would result in marking all OPPs
> as available, but it's better to avoid potentially overclocking the
> silicon - the GMU will simply refuse to power up the chip.
> 
> Currently, the Adreno speedbin code does just that (AND returns an
> invalid error, (int)UINT_MAX). Fix that by defaulting to speedbin 0
> (which is conveniently always bound to fuseval == 0).

Wish we documented somewhere that we should reserve BIT(0) for fuse
val=0 always and assume that would be the super SKU.

Reviewed-by: Akhil P Oommen 

-Akhil

> 
> Fixes: c928a05e4415 ("drm/msm/adreno: Move speedbin mapping to device table")
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index d4e85e24002f..522ca7fe6762 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2237,7 +2237,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> const struct adreno_info *i
>   DRM_DEV_ERROR(dev,
>   "missing support for speed-bin: %u. Some OPPs may not 
> be supported by hardware\n",
>   speedbin);
> - return UINT_MAX;
> + supp_hw = BIT(0); /* Default */
>   }
>  
> ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
> 
> -- 
> 2.42.0
> 


Re: [Freedreno] [PATCH 12/12] drm/msm/adreno: Switch to chip-id for identifying GPU

2023-07-17 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 03:06:36PM -0700, Rob Clark wrote:
> 
> On Thu, Jul 13, 2023 at 2:39 PM Akhil P Oommen  
> wrote:
> >
> > On Fri, Jul 07, 2023 at 06:45:42AM +0300, Dmitry Baryshkov wrote:
> > >
> > > On 07/07/2023 00:10, Rob Clark wrote:
> > > > From: Rob Clark 
> > > >
> > > > Since the revision becomes an opaque identifier with future GPUs, move
> > > > away from treating different ranges of bits as having a given meaning.
> > > > This means that we need to explicitly list different patch revisions in
> > > > the device table.
> > > >
> > > > Signed-off-by: Rob Clark 
> > > > ---
> > > >   drivers/gpu/drm/msm/adreno/a4xx_gpu.c  |   2 +-
> > > >   drivers/gpu/drm/msm/adreno/a5xx_gpu.c  |  11 +-
> > > >   drivers/gpu/drm/msm/adreno/a5xx_power.c|   2 +-
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  |  13 ++-
> > > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |   9 +-
> > > >   drivers/gpu/drm/msm/adreno/adreno_device.c | 128 ++---
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c|  16 +--
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  51 
> > > >   8 files changed, 122 insertions(+), 110 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > index 715436cb3996..8b4cdf95f445 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > > > @@ -145,7 +145,7 @@ static void a4xx_enable_hwcg(struct msm_gpu *gpu)
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_DELAY_HLSQ, 0x0022);
> > > > /* Early A430's have a timing issue with SP/TP power collapse;
> > > >disabling HW clock gating prevents it. */
> > > > -   if (adreno_is_a430(adreno_gpu) && adreno_gpu->rev.patchid < 2)
> > > > +   if (adreno_is_a430(adreno_gpu) && adreno_patchid(adreno_gpu) < 2)
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0);
> > > > else
> > > > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0xAAAAAAAA);
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > > > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > index f0803e94ebe5..70d2b5342cd9 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > > @@ -1744,6 +1744,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device 
> > > > *dev)
> > > > struct msm_drm_private *priv = dev->dev_private;
> > > > struct platform_device *pdev = priv->gpu_pdev;
> > > > struct adreno_platform_config *config = pdev->dev.platform_data;
> > > > +   const struct adreno_info *info;
> > > > struct a5xx_gpu *a5xx_gpu = NULL;
> > > > struct adreno_gpu *adreno_gpu;
> > > > struct msm_gpu *gpu;
> > > > @@ -1770,7 +1771,15 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device 
> > > > *dev)
> > > > nr_rings = 4;
> > > > -   if (adreno_cmp_rev(ADRENO_REV(5, 1, 0, ANY_ID), config->rev))
> > > > +   /*
> > > > +* Note that we wouldn't have been able to get this far if there is 
> > > > not
> > > > +* a device table entry for this chip_id
> > > > +*/
> > > > +   info = adreno_find_info(config->chip_id);
> > > > +   if (WARN_ON(!info))
> > > > +   return ERR_PTR(-EINVAL);
> > > > +
> > > > +   if (info->revn == 510)
> > > > nr_rings = 1;
> > > > ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, nr_rings);
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c 
> > > > b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > index 0e63a1429189..7705f8010484 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > > > @@ -179,7 +179,7 @@ static void a540_lm_setup(struct msm_gpu *gpu)
> > > > /* The battery current limiter isn't enabled for A540 */
> > > > config = AGC_LM_CONFIG_BCL_DISABLED;
> > > > -   config |= adreno_gpu->rev.patchid << 
> > > > AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > > > +   config |= adreno_patchid(adreno_gp

Re: [Freedreno] [PATCH 05/12] drm/msm/adreno: Use quirk to identify cached-coherent support

2023-07-17 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 03:25:33PM -0700, Rob Clark wrote:
> 
> On Thu, Jul 13, 2023 at 1:06 PM Akhil P Oommen  
> wrote:
> >
> > On Thu, Jul 06, 2023 at 02:10:38PM -0700, Rob Clark wrote:
> > >
> > > From: Rob Clark 
> > >
> > > It is better to explicitly list it.  With the move to opaque chip-id's
> > > for future devices, we should avoid trying to infer things like
> > > generation from the numerical value.
> > >
> > > Signed-off-by: Rob Clark 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 +++---
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> > >  2 files changed, 17 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > index f469f951a907..3c531da417b9 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > @@ -256,6 +256,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   }, {
> > >   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > > @@ -266,6 +267,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a615_zap.mdt",
> > >   .hwcg = a615_hwcg,
> > > @@ -278,6 +280,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a630_zap.mdt",
> > >   .hwcg = a630_hwcg,
> > > @@ -290,6 +293,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a640_zap.mdt",
> > >   .hwcg = a640_hwcg,
> > > @@ -302,7 +306,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M + SZ_128K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a650_zap.mdt",
> > >   .hwcg = a650_hwcg,
> > > @@ -316,7 +321,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_1M + SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .zapfw = "a660_zap.mdt",
> > >   .hwcg = a660_hwcg,
> > > @@ -329,7 +335,8 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_512K,
> > >   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > > - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> > > + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > > + ADRENO_QUIRK_HAS_HW_APRIV,
> > >   .init = a6xx_gpu_init,
> > >   .hwcg = a660_hwcg,
> > >   .address_space_size = SZ_16G,
> > > @@ -342,6 +349,7 @@ static const struct adreno_info gpulist[] = {
> > >   },
> > >   .gmem = SZ_2M,
> > >   .inactive_period = DRM_MSM_INACTIVE_

Re: [Freedreno] [PATCH 06/12] drm/msm/adreno: Allow SoC specific gpu device table entries

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 02:40:47AM +0200, Konrad Dybcio wrote:
> 
> On 6.07.2023 23:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > There are cases where there are differences due to SoC integration.
> > Such as cache-coherency support, and (in the next patch) e-fuse to
> > speedbin mappings.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> of_machine_is_compatible is rather used in extremely desperate
> situations :/ I'm not sure this is the correct way to do this..
> 
> Especially since there's a direct correlation between GMU presence
> and ability to do cached coherent.
> 
> The GMU mandates presence of RPMh (as most of what the GMU does is
> talk to AOSS through its RSC).
> 
> To achieve I/O coherency, there must be some memory that both the
> CPU and GPU (and possibly others) can access through some sort of
> a negotiator/manager.
> 
> In our case, I believe that's LLC. And guess what that implies.
> MEMNOC instead of BIMC. And guess what that implies. RPMh!
> 
> Now, we know GMU => RPMh, but does it work the other way around?

I don't think we should tie GPU I/O coherency to RPMh or LLC. These
features depend more on the SoC architecture than on the GPU architecture.

-Akhil

> 
> Yes. GMU wrapper was a hack because probably nobody in the Adreno team
> would have imagined that somebody would be crazy enough to fork
> multiple year old designs multiple times and release them as new
> SoCs with updated arm cores and 5G..
> 
> (Except for A612 which has a "Reduced GMU" but that zombie still talks
> to RPMh. And A612 is IO-coherent. So I guess it works anyway.)
> 
> Konrad
> 
> >  drivers/gpu/drm/msm/adreno/adreno_device.c | 34 +++---
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> >  2 files changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 3c531da417b9..e62bc895a31f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -258,6 +258,32 @@ static const struct adreno_info gpulist[] = {
> > .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > .init = a6xx_gpu_init,
> > +   }, {
> > +   .machine = "qcom,sm4350",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > +   }, {
> > +   .machine = "qcom,sm6375",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > }, {
> > .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > .revn = 619,
> > @@ -409,6 +435,8 @@ const struct adreno_info *adreno_info(struct adreno_rev 
> > rev)
> > /* identify gpu: */
> > for (i = 0; i < ARRAY_SIZE(gpulist); i++) {
> > const struct adreno_info *info = &gpulist[i];
> > +   if (info->machine && !of_machine_is_compatible(info->machine))
> > +   continue;
> > if (adreno_cmp_rev(info->rev, rev))
> > return info;
> > }
> > @@ -563,6 +591,8 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > config.rev.minor, config.rev.patchid);
> >  
> > priv->is_a2xx = config.rev.core == 2;
> > +   priv->has_cached_coherent =
> > +   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT);
> >  
> > gpu = info->init(drm);
> > if (IS_ERR(gpu)) {
> > @@ -574,10 +604,6 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > if (ret)
> > return ret;
> >  
> > -   priv->has_cached_coherent =
> > -   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> > -   !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
> > -
> > return 0;
> >  }
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index e08d41337169..d5335b99c64c 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -61,6 +61,7 @@ extern const struct adreno_reglist a612_hwcg[], 
> > a615_hwcg[], a630_hwcg[], a640_h
> >  extern const struct adreno_reglist a660_hwcg[], a690_hwcg[];
> >  
> >  struct 

Re: [Freedreno] [PATCH 12/12] drm/msm/adreno: Switch to chip-id for identifying GPU

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 06:45:42AM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 00:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > Since the revision becomes an opaque identifier with future GPUs, move
> > away from treating different ranges of bits as having a given meaning.
> > This means that we need to explicitly list different patch revisions in
> > the device table.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/msm/adreno/a4xx_gpu.c  |   2 +-
> >   drivers/gpu/drm/msm/adreno/a5xx_gpu.c  |  11 +-
> >   drivers/gpu/drm/msm/adreno/a5xx_power.c|   2 +-
> >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c  |  13 ++-
> >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |   9 +-
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 128 ++---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c|  16 +--
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  51 
> >   8 files changed, 122 insertions(+), 110 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > index 715436cb3996..8b4cdf95f445 100644
> > --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> > @@ -145,7 +145,7 @@ static void a4xx_enable_hwcg(struct msm_gpu *gpu)
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_DELAY_HLSQ, 0x0022);
> > /* Early A430's have a timing issue with SP/TP power collapse;
> >disabling HW clock gating prevents it. */
> > -   if (adreno_is_a430(adreno_gpu) && adreno_gpu->rev.patchid < 2)
> > +   if (adreno_is_a430(adreno_gpu) && adreno_patchid(adreno_gpu) < 2)
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0);
> > else
> > gpu_write(gpu, REG_A4XX_RBBM_CLOCK_CTL, 0xAAAAAAAA);
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index f0803e94ebe5..70d2b5342cd9 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -1744,6 +1744,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
> > struct msm_drm_private *priv = dev->dev_private;
> > struct platform_device *pdev = priv->gpu_pdev;
> > struct adreno_platform_config *config = pdev->dev.platform_data;
> > +   const struct adreno_info *info;
> > struct a5xx_gpu *a5xx_gpu = NULL;
> > struct adreno_gpu *adreno_gpu;
> > struct msm_gpu *gpu;
> > @@ -1770,7 +1771,15 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
> > nr_rings = 4;
> > -   if (adreno_cmp_rev(ADRENO_REV(5, 1, 0, ANY_ID), config->rev))
> > +   /*
> > +* Note that we wouldn't have been able to get this far if there is not
> > +* a device table entry for this chip_id
> > +*/
> > +   info = adreno_find_info(config->chip_id);
> > +   if (WARN_ON(!info))
> > +   return ERR_PTR(-EINVAL);
> > +
> > +   if (info->revn == 510)
> > nr_rings = 1;
> > ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, nr_rings);
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > index 0e63a1429189..7705f8010484 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
> > @@ -179,7 +179,7 @@ static void a540_lm_setup(struct msm_gpu *gpu)
> > /* The battery current limiter isn't enabled for A540 */
> > config = AGC_LM_CONFIG_BCL_DISABLED;
> > -   config |= adreno_gpu->rev.patchid << AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > +   config |= adreno_patchid(adreno_gpu) << AGC_LM_CONFIG_GPU_VERSION_SHIFT;
> > /* For now disable GPMU side throttling */
> > config |= AGC_LM_CONFIG_THROTTLE_DISABLE;
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > index f1bb20574018..a9ba547a120c 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > @@ -790,10 +790,15 @@ static int a6xx_gmu_fw_start(struct a6xx_gmu *gmu, 
> > unsigned int state)
> > gmu_write(gmu, REG_A6XX_GMU_AHB_FENCE_RANGE_0,
> > (1 << 31) | (0xa << 18) | (0xa0));
> > -   chipid = adreno_gpu->rev.core << 24;
> > -   chipid |= adreno_gpu->rev.major << 16;
> > -   chipid |= adreno_gpu->rev.minor << 12;
> > -   chipid |= adreno_gpu->rev.patchid << 8;
> > +   /* Note that the GMU has a slightly different layout for
> > +* chip_id, for whatever reason, so a bit of massaging
> > +* is needed.  The upper 16b are the same, but minor and
> > +* patchid are packed in four bits each with the lower
> > +* 8b unused:
> > +*/
> > +   chipid  = adreno_gpu->chip_id & 0xffff0000;
> > +   chipid |= (adreno_gpu->chip_id << 4) & 0xf000; /* minor */
> > +   chipid |= (adreno_gpu->chip_id << 8) & 0x0f00; /* patchid */
> 
> I'd beg for explicit FIELD_GET and FIELD_PREP here.
> 
> > gmu_write(gmu, REG_A6XX_GMU_HFI_SFR_ADDR, chipid);
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > 

Re: [Freedreno] [PATCH 06/12] drm/msm/adreno: Allow SoC specific gpu device table entries

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 05:34:04AM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 00:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > There are cases where there are differences due to SoC integration.
> > Such as cache-coherency support, and (in the next patch) e-fuse to
> > speedbin mappings.
> 
> I have the feeling that we are trying to circumvent the way DT works. I'd
> suggest adding explicit SoC-compatible strings to Adreno bindings and then
> using of_device_id::data and then of_device_get_match_data().
> 
Just thinking: how about a unique compatible string which we match to
identify gpu->info, and drop the chip-id check completely here?

-Akhil

> > 
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 34 +++---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
> >   2 files changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 3c531da417b9..e62bc895a31f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -258,6 +258,32 @@ static const struct adreno_info gpulist[] = {
> > .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
> > .init = a6xx_gpu_init,
> > +   }, {
> > +   .machine = "qcom,sm4350",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > +   }, {
> > +   .machine = "qcom,sm6375",
> > +   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > +   .revn = 619,
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a630_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a619_gmu.bin",
> > +   },
> > +   .gmem = SZ_512K,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a615_zap.mdt",
> > +   .hwcg = a615_hwcg,
> > }, {
> > .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> > .revn = 619,
> > @@ -409,6 +435,8 @@ const struct adreno_info *adreno_info(struct adreno_rev 
> > rev)
> > /* identify gpu: */
> > for (i = 0; i < ARRAY_SIZE(gpulist); i++) {
> > const struct adreno_info *info = &gpulist[i];
> > +   if (info->machine && !of_machine_is_compatible(info->machine))
> > +   continue;
> > if (adreno_cmp_rev(info->rev, rev))
> > return info;
> > }
> > @@ -563,6 +591,8 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > config.rev.minor, config.rev.patchid);
> > priv->is_a2xx = config.rev.core == 2;
> > +   priv->has_cached_coherent =
> > +   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT);
> > gpu = info->init(drm);
> > if (IS_ERR(gpu)) {
> > @@ -574,10 +604,6 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> > if (ret)
> > return ret;
> > -   priv->has_cached_coherent =
> > -   !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> > -   !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
> > -
> > return 0;
> >   }
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index e08d41337169..d5335b99c64c 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -61,6 +61,7 @@ extern const struct adreno_reglist a612_hwcg[], 
> > a615_hwcg[], a630_hwcg[], a640_h
> >   extern const struct adreno_reglist a660_hwcg[], a690_hwcg[];
> >   struct adreno_info {
> > +   const char *machine;
> > struct adreno_rev rev;
> > uint32_t revn;
> > const char *fw[ADRENO_FW_MAX];
> 
> -- 
> With best wishes
> Dmitry
> 


Re: [Freedreno] [PATCH 05/12] drm/msm/adreno: Use quirk to identify cached-coherent support

2023-07-13 Thread Akhil P Oommen
On Thu, Jul 06, 2023 at 02:10:38PM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> It is better to explicitly list it.  With the move to opaque chip-id's
> for future devices, we should avoid trying to infer things like
> generation from the numerical value.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 +++---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|  1 +
>  2 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index f469f951a907..3c531da417b9 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -256,6 +256,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   }, {
>   .rev = ADRENO_REV(6, 1, 9, ANY_ID),
> @@ -266,6 +267,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a615_zap.mdt",
>   .hwcg = a615_hwcg,
> @@ -278,6 +280,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a630_zap.mdt",
>   .hwcg = a630_hwcg,
> @@ -290,6 +293,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a640_zap.mdt",
>   .hwcg = a640_hwcg,
> @@ -302,7 +306,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M + SZ_128K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a650_zap.mdt",
>   .hwcg = a650_hwcg,
> @@ -316,7 +321,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_1M + SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a660_zap.mdt",
>   .hwcg = a660_hwcg,
> @@ -329,7 +335,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_512K,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .hwcg = a660_hwcg,
>   .address_space_size = SZ_16G,
> @@ -342,6 +349,7 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_2M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
>   .init = a6xx_gpu_init,
>   .zapfw = "a640_zap.mdt",
>   .hwcg = a640_hwcg,
> @@ -353,7 +361,8 @@ static const struct adreno_info gpulist[] = {
>   },
>   .gmem = SZ_4M,
>   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> - .quirks = ADRENO_QUIRK_HAS_HW_APRIV,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
>   .init = a6xx_gpu_init,
>   .zapfw = "a690_zap.mdt",
>   .hwcg = a690_hwcg,
> @@ -565,9 +574,9 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   if (ret)
>   return ret;
>  
> - if (config.rev.core >= 6)
> - if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
> - priv->has_cached_coherent = true;
> + priv->has_cached_coherent =
> + !!(info->quirks & ADRENO_QUIRK_HAS_CACHED_COHERENT) &&
> + !adreno_has_gmu_wrapper(to_adreno_gpu(gpu));
>  
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index a7c4a2c536e3..e08d41337169 100644
> 

Re: [PATCH 02/12] drm/msm/adreno: Remove redundant gmem size param

2023-07-13 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 01:22:56AM +0200, Konrad Dybcio wrote:
> 
> On 6.07.2023 23:10, Rob Clark wrote:
> > From: Rob Clark 
> > 
> > Even in the ocmem case, the allocated ocmem buffer size should match the
> > requested size.
> > 
> > Signed-off-by: Rob Clark 
> > ---
> [...]
> 
> > +
> > +   WARN_ON(ocmem_hdl->len != adreno_gpu->info->gmem);
> I believe this should be an error condition. If the sizes are mismatched,
> best case scenario you get suboptimal perf and worst case scenario your
> system explodes.

No, the worst-case scenarios are subtle bugs like random corruptions,
pagefaults, etc., which you debug for months. ;)

-Akhil.

> 
> Very nice cleanup though!
> 
> Konrad
> >  
> > return 0;
> >  }
> > @@ -1097,7 +1098,6 @@ int adreno_gpu_init(struct drm_device *drm, struct 
> > platform_device *pdev,
> >  
> > adreno_gpu->funcs = funcs;
> > adreno_gpu->info = adreno_info(config->rev);
> > -   adreno_gpu->gmem = adreno_gpu->info->gmem;
> > adreno_gpu->revn = adreno_gpu->info->revn;
> > adreno_gpu->rev = *rev;
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index 6830c3776c2d..aaf09c642dc6 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -77,7 +77,6 @@ struct adreno_gpu {
> > struct msm_gpu base;
> > struct adreno_rev rev;
> > const struct adreno_info *info;
> > -   uint32_t gmem;  /* actual gmem size */
> > uint32_t revn;  /* numeric revision name */
> > uint16_t speedbin;
> > const struct adreno_gpu_funcs *funcs;


Re: [Freedreno] [PATCH] drm/msm/a6xx: Fix misleading comment

2023-07-13 Thread Akhil P Oommen
On Fri, Jun 30, 2023 at 09:20:43AM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> The range is actually len+1.
> 
> Signed-off-by: Rob Clark 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index eea2e60ce3b7..edf76a4b16bd 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -39,8 +39,8 @@ struct a6xx_gpu {
>  
>  /*
>   * Given a register and a count, return a value to program into
> - * REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len
> - * registers starting at _reg.
> + * REG_CP_PROTECT_REG(n) - this will block both reads and writes for
> + * _len + 1 registers starting at _reg.
>   */
>  #define A6XX_PROTECT_NORDWR(_reg, _len) \
>   ((1 << 31) | \
> -- 
> 2.41.0
> 


Re: [PATCH v2 2/3] drm/msm: Fix IS_ERR() vs NULL check in a5xx_submit_in_rb()

2023-07-13 Thread Akhil P Oommen
On Thu, Jul 13, 2023 at 10:05:55AM +0800, Gaosheng Cui wrote:
> 
> The msm_gem_get_vaddr() returns an ERR_PTR() on failure, we should
> use IS_ERR() to check the return value.
> 
> Fixes: 6a8bd08d0465 ("drm/msm: add sudo flag to submit ioctl")
> Signed-off-by: Gaosheng Cui 
> Reviewed-by: Abhinav Kumar 
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index a99310b68793..a499e3b350fc 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -89,7 +89,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
> msm_gem_submit *submit
>* since we've already mapped it once in
>* submit_reloc()
>*/
> - if (WARN_ON(!ptr))
> + if (WARN_ON(IS_ERR(ptr)))
nit: can we make this an IS_ERR_OR_NULL() check to retain the current
validation? A NULL here is catastrophic. Yeah, I see that the current
implementation of ...get_vaddr() doesn't return a NULL.

Reviewed-by: Akhil P Oommen 

-Akhil

>   return;
>  
>   for (i = 0; i < dwords; i++) {
> -- 
> 2.25.1
> 


Re: [PATCH] drm/msm/adreno: Fix snapshot BINDLESS_DATA size

2023-07-13 Thread Akhil P Oommen
On Tue, Jul 11, 2023 at 10:54:07AM -0700, Rob Clark wrote:
> 
> From: Rob Clark 
> 
> The incorrect size was causing "CP | AHB bus error" when snapshotting
> the GPU state on a6xx gen4 (a660 family).
> 
> Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> index 790f55e24533..e788ed72eb0d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
> @@ -206,7 +206,7 @@ static const struct a6xx_shader_block {
>   SHADER(A6XX_SP_LB_3_DATA, 0x800),
>   SHADER(A6XX_SP_LB_4_DATA, 0x800),
>   SHADER(A6XX_SP_LB_5_DATA, 0x200),
> - SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000),
> + SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x800),
>   SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280),
>   SHADER(A6XX_SP_UAV_DATA, 0x80),
>   SHADER(A6XX_SP_INST_TAG, 0x80),
> -- 
> 2.41.0
> 
Reviewed-by: Akhil P Oommen 

-Akhil


Re: [Freedreno] [PATCH] drm/msm: Check for the GPU IOMMU during bind

2023-07-09 Thread Akhil P Oommen
On Fri, Jul 07, 2023 at 08:27:18PM +0300, Dmitry Baryshkov wrote:
> 
> On 07/07/2023 18:03, Jordan Crouse wrote:
> > On Thu, Jul 06, 2023 at 09:55:13PM +0300, Dmitry Baryshkov wrote:
> > > 
> > > On 10/03/2023 00:20, Jordan Crouse wrote:
> > > > While booting with amd,imageon on a headless target the GPU probe was
> > > > failing with -ENOSPC in get_pages() from msm_gem.c.
> > > > 
> > > > Investigation showed that the driver was using the default 16MB VRAM
> > > > carveout because msm_use_mmu() was returning false since headless 
> > > > devices
> > > > use a dummy parent device. Avoid this by extending the existing is_a2xx
> > > > priv member to check the GPU IOMMU state on all platforms and use that
> > > > check in msm_use_mmu().
> > > > 
> > > > This works for memory allocations but it doesn't prevent the VRAM 
> > > > carveout
> > > > from being created because that happens before we have a chance to check
> > > > the GPU IOMMU state in adreno_bind.
> > > > 
> > > > There are a number of possible options to resolve this but none of them 
> > > > are
> > > > very clean. The easiest way is to likely specify vram=0 as module 
> > > > parameter
> > > > on headless devices so that the memory doesn't get wasted.
> > > 
> > > This patch was on my plate for quite a while, please excuse me for
> > > taking it so long.
> > 
> > No worries. I'm also chasing a bunch of other stuff too.
> > 
> > > I see the following problem with the current code. We have two different
> > > instances than can access memory: MDP/DPU and GPU. And each of them can
> > > either have or miss the MMU.
> > > 
> > > For some time I toyed with the idea of determining whether the allocated
> > > BO is going to be used by display or by GPU, but then I abandoned it. We
> > > can have display BOs being filled by GPU, so handling it this way would
> > > complicate things a lot.
> > > 
> > > This actually rings a tiny bell in my head with the idea of splitting
> > > the display and GPU parts to two different drivers, but I'm not sure
> > > what would be the overall impact.
> > 
> > As I now exclusively work on headless devices I would be 100% for this,
> > but I'm sure that our laptop friends might not agree :)
> 
> I do not know here. This is probably a question to Rob, as he better
> understands the interaction between GPU and display parts of the userspace.

I fully support this if it is feasible.

In our architecture, display and GPU are completely independent subsystems.
Like Jordan mentioned, there are IoT products without a display. And I wouldn't
be surprised if there is a product with just a display that uses software
rendering.

-Akhil

> 
> > 
> > > More on the msm_use_mmu() below.
> > > 
> > > > 
> > > > Signed-off-by: Jordan Crouse 
> > > > ---
> > > > 
> > > >drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
> > > >drivers/gpu/drm/msm/msm_drv.c  | 7 +++
> > > >drivers/gpu/drm/msm/msm_drv.h  | 2 +-
> > > >3 files changed, 9 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > > > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > index 36f062c7582f..4f19da28f80f 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > > > @@ -539,7 +539,11 @@ static int adreno_bind(struct device *dev, struct 
> > > > device *master, void *data)
> > > >DBG("Found GPU: %u.%u.%u.%u", config.rev.core, config.rev.major,
> > > >config.rev.minor, config.rev.patchid);
> > > > 
> > > > - priv->is_a2xx = config.rev.core == 2;
> > > > + /*
> > > > +  * A2xx has a built in IOMMU and all other IOMMU enabled targets 
> > > > will
> > > > +  * have an ARM IOMMU attached
> > > > +  */
> > > > + priv->has_gpu_iommu = config.rev.core == 2 || 
> > > > device_iommu_mapped(dev);
> > > >priv->has_cached_coherent = config.rev.core >= 6;
> > > > 
> > > >gpu = info->init(drm);
> > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c 
> > > > b/drivers/gpu/drm/msm/msm_drv.c
> > > > index aca48c868c14..a125a351ec90 100644
> > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > @@ -318,11 +318,10 @@ bool msm_use_mmu(struct drm_device *dev)
> > > >struct msm_drm_private *priv = dev->dev_private;
> > > > 
> > > >/*
> > > > -  * a2xx comes with its own MMU
> > > > -  * On other platforms IOMMU can be declared specified either for 
> > > > the
> > > > -  * MDP/DPU device or for its parent, MDSS device.
> > > > +  * Return true if the GPU or the MDP/DPU or parent MDSS device 
> > > > has an
> > > > +  * IOMMU
> > > > */
> > > > - return priv->is_a2xx ||
> > > > + return priv->has_gpu_iommu ||
> > > >device_iommu_mapped(dev->dev) ||
> > > >device_iommu_mapped(dev->dev->parent);
> > > 
> > > I have a generic feeling that both old an new 

Re: [Freedreno] [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-17 Thread Akhil P Oommen
On Sat, Jun 17, 2023 at 02:00:50AM +0200, Konrad Dybcio wrote:
> 
> On 16.06.2023 19:54, Akhil P Oommen wrote:
> > On Thu, Jun 15, 2023 at 11:43:04PM +0200, Konrad Dybcio wrote:
> >>
> >> On 10.06.2023 00:06, Akhil P Oommen wrote:
> >>> On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> >>>>
> >>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>> but don't implement the associated GMUs. This is due to the fact that
> >>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>
> >>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>
> >>>> This is essentially a register region which is convenient to model
> >>>> as a device. We'll use it for managing the GDSCs. The register
> >>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> [...]
> 
> >>>> +
> >>>> +ret = clk_bulk_prepare_enable(gpu->nr_clocks, gpu->grp_clks);
> >>>> +if (ret)
> >>>> +goto err_bulk_clk;
> >>>> +
> >>>> +/* If anything goes south, tear the GPU down piece by piece.. */
> >>>> +if (ret) {
> >>>> +err_bulk_clk:
> >>>
> >>> Goto jump directly to another block looks odd to me. Why do you need this 
> >>> label
> >>> anyway?
> >> If clk_bulk_prepare_enable() fails, trying to proceed will hang the
> >> platform with unclocked accesses. We need to unwind everything that
> >> has been done up until that point, in reverse order.
> > 
> > I missed this response from you earlier.
> > 
> > But you are checking for 'ret' twice here. You will end up here even
> > if you don't jump! So "if (ret) goto err_bulk_clk;" looks
> > unnecessary.
> > 
> > -Akhil.
> Ohhh right, silly mistake on my part ;)
> 
> I already sent out a v9 since.. Please check it out and if you
> have any further comments, I'll fix this, and if not.. Perhaps I
> could fix it in an incremental patch if that revision is gtg?

Incremental patch is fine as there is no functional issue.

-Akhil.

> 
> Konrad
> > 
> >>
> >>>
> >>>> +pm_runtime_put(gmu->gxpd);
> >>>> +pm_runtime_put(gmu->dev);
> >>>> +dev_pm_opp_set_opp(&gpu->pdev->dev, NULL);
> >>>> +}
> >>>> +err_set_opp:
> >>>
> >>> Generally, it is better to name the label based on what you do here. For
> >>> eg: "unlock_lock:".
> >> That seems to be a mixed bag all throughout the kernel, I've seen many
> >> usages of err_(what went wrong)
> >>
> >>>
> >>> Also, this function is small enough that it is better to return directly
> >>> in case of error. I think that would be more readable.
> >> Not really, adding the necessary cleanup steps in `if (ret)`
> >> blocks would roughly double the function's size.
> >>
> >>>
> >>>> +mutex_unlock(&a6xx_gpu->gmu.lock);
> >>>> +
> >>>> +if (!ret)
> >>>> +msm_devfreq_resume(gpu);
> >>>> +
> >>>> +return ret;
> >>>> +}
> >>>> +
> >>>> +static int a6xx_gmu_pm_suspend(struct msm_gpu *gpu)
> >>>>  {
> >>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >>>>  struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
> >>>> @@ -1720,7 +1799,40 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
> >>>>  return 0;
> >>>>  }
> >>>>  
> >>>> -static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
> >>>> +static int a6xx_pm_suspend(struct msm_gpu *gpu)
> >>>> +{
> >>

Re: [Freedreno] [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-16 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 11:43:04PM +0200, Konrad Dybcio wrote:
> 
> On 10.06.2023 00:06, Akhil P Oommen wrote:
> > On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> >>
> >> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >> but don't implement the associated GMUs. This is due to the fact that
> >> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>
> >> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >> the actual name that Qualcomm uses in their downstream kernels).
> >>
> >> This is essentially a register region which is convenient to model
> >> as a device. We'll use it for managing the GDSCs. The register
> >> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 211 
> >> 
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
> >>  6 files changed, 277 insertions(+), 35 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index 5ba8cba69383..385ca3a12462 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -1437,6 +1437,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> >> struct platform_device *pdev,
> >>  
> >>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>  {
> >> +  struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> >>struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >>struct platform_device *pdev = to_platform_device(gmu->dev);
> >>  
> >> @@ -1462,10 +1463,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>gmu->mmio = NULL;
> >>gmu->rscc = NULL;
> >>  
> >> -  a6xx_gmu_memory_free(gmu);
> >> +  if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> >> +  a6xx_gmu_memory_free(gmu);
> >>  
> >> -  free_irq(gmu->gmu_irq, gmu);
> >> -  free_irq(gmu->hfi_irq, gmu);
> >> +  free_irq(gmu->gmu_irq, gmu);
> >> +  free_irq(gmu->hfi_irq, gmu);
> >> +  }
> >>  
> >>/* Drop reference taken in of_find_device_by_node */
> >>put_device(gmu->dev);
> >> @@ -1484,6 +1487,69 @@ static int cxpd_notifier_cb(struct notifier_block 
> >> *nb,
> >>return 0;
> >>  }
> >>  
> >> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> >> *node)
> >> +{
> >> +  struct platform_device *pdev = of_find_device_by_node(node);
> >> +  struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >> +  int ret;
> >> +
> >> +  if (!pdev)
> >> +  return -ENODEV;
> >> +
> >> +  gmu->dev = &pdev->dev;
> >> +
> >> +  of_dma_configure(gmu->dev, node, true);
> >> +
> >> +  pm_runtime_enable(gmu->dev);
> >> +
> >> +  /* Mark legacy for manual SPTPRAC control */
> >> +  gmu->legacy = true;
> >> +
> >> +  /* Map the GMU registers */
> >> +  gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> >> +  if (IS_ERR(gmu->mmio)) {
> >> +  ret = PTR_ERR(gmu->mmio);
> >> +  goto err_mmio;
> >> +  }
> >> +
> >> +  gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> >> +  if (IS_ERR(gmu->cxpd)) {
> >> +  ret = PTR_ERR(gmu->cxpd);
> >> +  goto err_mmio;
> >> +  }
> >> +
> >> +  if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> >> +  ret = -ENODEV;
> >> +  goto detach_cxpd;
> >> +  }
> >> +
> >> +  init_completion(&gmu->pd_gate);
> >> +  complete_all(&gmu->pd_gate);
&g

Re: [PATCH] drm/msm/adreno: Update MODULE_FIRMWARE macros

2023-06-16 Thread Akhil P Oommen
On Fri, Jun 16, 2023 at 02:28:15PM +0200, Juerg Haefliger wrote:
> 
> Add missing MODULE_FIRMWARE macros and remove some for firmwares that
> the driver no longer references.
> 
> Signed-off-by: Juerg Haefliger 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 23 ++
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8cff86e9d35c..9f70d7c1a72a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -364,17 +364,32 @@ MODULE_FIRMWARE("qcom/a330_pm4.fw");
>  MODULE_FIRMWARE("qcom/a330_pfp.fw");
>  MODULE_FIRMWARE("qcom/a420_pm4.fw");
>  MODULE_FIRMWARE("qcom/a420_pfp.fw");
> +MODULE_FIRMWARE("qcom/a506_zap.mdt");
> +MODULE_FIRMWARE("qcom/a508_zap.mdt");
> +MODULE_FIRMWARE("qcom/a512_zap.mdt");
>  MODULE_FIRMWARE("qcom/a530_pm4.fw");
>  MODULE_FIRMWARE("qcom/a530_pfp.fw");
>  MODULE_FIRMWARE("qcom/a530v3_gpmu.fw2");
>  MODULE_FIRMWARE("qcom/a530_zap.mdt");
> -MODULE_FIRMWARE("qcom/a530_zap.b00");
> -MODULE_FIRMWARE("qcom/a530_zap.b01");
> -MODULE_FIRMWARE("qcom/a530_zap.b02");
Why are these not required when "qcom/a530_zap.mdt" is present?

mdt & b0* binaries are different partitions of the same secure
firmware. Even though we specify only the .mdt file here, the PIL driver
will load the *.b0* file automatically. OTOH, "*.mbn" is a standalone
unified binary format.

If the requirement is to ensure that all the necessary firmware files are
part of your distribution, you should include the *.b0* files here too.

-Akhil

> +MODULE_FIRMWARE("qcom/a540_gpmu.fw2");
> +MODULE_FIRMWARE("qcom/a540_zap.mdt");
> +MODULE_FIRMWARE("qcom/a615_zap.mdt");
>  MODULE_FIRMWARE("qcom/a619_gmu.bin");
>  MODULE_FIRMWARE("qcom/a630_sqe.fw");
>  MODULE_FIRMWARE("qcom/a630_gmu.bin");
> -MODULE_FIRMWARE("qcom/a630_zap.mbn");
> +MODULE_FIRMWARE("qcom/a630_zap.mdt");
> +MODULE_FIRMWARE("qcom/a640_gmu.bin");
> +MODULE_FIRMWARE("qcom/a640_zap.mdt");
> +MODULE_FIRMWARE("qcom/a650_gmu.bin");
> +MODULE_FIRMWARE("qcom/a650_sqe.fw");
> +MODULE_FIRMWARE("qcom/a650_zap.mdt");
> +MODULE_FIRMWARE("qcom/a660_gmu.bin");
> +MODULE_FIRMWARE("qcom/a660_sqe.fw");
> +MODULE_FIRMWARE("qcom/a660_zap.mdt");
> +MODULE_FIRMWARE("qcom/leia_pfp_470.fw");
> +MODULE_FIRMWARE("qcom/leia_pm4_470.fw");
> +MODULE_FIRMWARE("qcom/yamato_pfp.fw");
> +MODULE_FIRMWARE("qcom/yamato_pm4.fw");
>  
>  static inline bool _rev_match(uint8_t entry, uint8_t id)
>  {
> -- 
> 2.37.2
> 


Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 10:59:23PM +0200, Konrad Dybcio wrote:
> 
> On 15.06.2023 22:11, Akhil P Oommen wrote:
> > On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> >>
> >> On 6.06.2023 19:18, Akhil P Oommen wrote:
> >>> On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> >>>>
> >>>> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >>>> GPUs and reuse it in a6xx_gmu_force_off().
> >>>>
> >>>> This helper, contrary to the original usage in GMU code paths, adds
> >>>> a write memory barrier which together with the necessary delay should
> >>>> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >>>> execution going crazy.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>>>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> index b86be123ecd0..5ba8cba69383 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>>>  a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>>>  
> >>>>  /* Reset GPU core blocks */
> >>>> -gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >>>> -udelay(100);
> >>>> +a6xx_gpu_sw_reset(gpu, true);
> >>>>  }
> >>>>  
> >>>>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct 
> >>>> a6xx_gmu *gmu)
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> index e3ac3f045665..083ccb5bcb4e 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> >>>> adreno_gpu *adreno_gpu, bool gx_
> >>>>  gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>>>  }
> >>>>  
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >>>> +{
> >>>> +gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >>>> +/* Add a barrier to avoid bad surprises */
> >>> Can you please make this comment a bit more clear? Highlight that we
> >>> should ensure the register is posted at hw before polling.
> >>>
> >>> I think this barrier is required only during assert.
> >> Generally it should not be strictly required at all, but I'm thinking
> >> that it'd be good to keep it in both cases, so that:
> >>
> >> if (assert)
> >>we don't keep writing things to the GPU if it's in reset
> >> else
> >>we don't start writing things to the GPU becomes it comes
> >>out of reset
> >>
> >> Also, if you squint hard enough at the commit message, you'll notice
> >> I intended for this so only be a wmb, but for some reason generalized
> >> it.. Perhaps that's another thing I should fix!
> >> for v9..
> > 
> > wmb() doesn't provide any ordering guarantee with the delay loop.
> Hm, fair.. I'm still not as fluent with memory access knowledge as I'd
> like to be..
> 
> > A common practice is to just read back the same register before
> > the loop because a readl followed by delay() is guaranteed to be ordered.
> So, how should I proceed? Keep the r/w barrier, or add a readback and
> a tiiiny (perhaps even using ndelay instead of udelay?) delay on de-assert?

readback + delay (with a value similar to downstream). This path is
exercised rarely.

-Akhil.

> 
> Konrad
> > 
> > -Akhil.
> >>
> >> Konrad
> >>>
> >>> -Akhil.
> >>>> +mb();
> >>>> +
> >>>> +/* The reset line needs to be asserted for at least 100 us */
> >>>> +if (assert)
> >>>> +udelay(100);
> >>>> +}
> >>>> +
> >>>>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>>>  {
> >>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> >>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> index 9580def06d45..aa70390ee1c6 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct 
> >>>> msm_gpu *gpu);
> >>>>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>>>  
> >>>>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, 
> >>>> bool gx_off);
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>>>  
> >>>>  #endif /* __A6XX_GPU_H__ */
> >>>>
> >>>> -- 
> >>>> 2.40.1
> >>>>
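[Editorial sketch] The readback-before-delay pattern agreed on above can be illustrated in a small, self-contained C model. This is a hedged sketch, not the driver code: the MMIO register is mocked with a plain variable, and `mock_gpu_write`/`mock_gpu_read`/`mock_udelay` stand in for the kernel's `gpu_write()`/`gpu_read()`/`udelay()` on `REG_A6XX_RBBM_SW_RESET_CMD`. The point it shows is structural: a read of the just-written register is ordered against the write, so the delay is guaranteed to start only after the reset assertion has been posted to the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Mocked MMIO state; in the kernel this is REG_A6XX_RBBM_SW_RESET_CMD. */
static uint32_t mock_reg;
static int writes_posted;

static void mock_gpu_write(uint32_t val)
{
	mock_reg = val;
	writes_posted++;
}

static uint32_t mock_gpu_read(void)
{
	return mock_reg;
}

static void mock_udelay(unsigned long us)
{
	(void)us; /* no-op in this sketch */
}

/* Sketch of the pattern: write, read back to post the write, then delay. */
static void sw_reset(int assert_line)
{
	mock_gpu_write(assert_line ? 1 : 0);

	/* Reading back the same register orders the write before the delay. */
	(void)mock_gpu_read();

	/* The reset line needs to stay asserted for at least 100 us. */
	if (assert_line)
		mock_udelay(100);
}
```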


Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> 
> On 6.06.2023 19:18, Akhil P Oommen wrote:
> > On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> >>
> >> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >> GPUs and reuse it in a6xx_gmu_force_off().
> >>
> >> This helper, contrary to the original usage in GMU code paths, adds
> >> a write memory barrier which together with the necessary delay should
> >> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >> execution going crazy.
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index b86be123ecd0..5ba8cba69383 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>  
> >>/* Reset GPU core blocks */
> >> -  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >> -  udelay(100);
> >> +  a6xx_gpu_sw_reset(gpu, true);
> >>  }
> >>  
> >>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct 
> >> a6xx_gmu *gmu)
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> index e3ac3f045665..083ccb5bcb4e 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> >> adreno_gpu *adreno_gpu, bool gx_
> >>gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>  }
> >>  
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >> +{
> >> +  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >> +  /* Add a barrier to avoid bad surprises */
> > Can you please make this comment a bit more clear? Highlight that we
> > should ensure the register is posted at hw before polling.
> > 
> > I think this barrier is required only during assert.
> Generally it should not be strictly required at all, but I'm thinking
> that it'd be good to keep it in both cases, so that:
> 
> if (assert)
>   we don't keep writing things to the GPU if it's in reset
> else
> we don't start writing things to the GPU before it comes
>   out of reset
> 
> Also, if you squint hard enough at the commit message, you'll notice
> I intended for this to only be a wmb, but for some reason generalized
> it.. Perhaps that's another thing I should fix!
> for v9..

wmb() doesn't provide any ordering guarantee with the delay loop.
A common practice is to just read back the same register before
the loop because a readl followed by delay() is guaranteed to be ordered.

-Akhil.
> 
> Konrad
> > 
> > -Akhil.
> >> +  mb();
> >> +
> >> +  /* The reset line needs to be asserted for at least 100 us */
> >> +  if (assert)
> >> +  udelay(100);
> >> +}
> >> +
> >>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>  {
> >>struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> index 9580def06d45..aa70390ee1c6 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
> >> *gpu);
> >>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>  
> >>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, 
> >> bool gx_off);
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>  
> >>  #endif /* __A6XX_GPU_H__ */
> >>
> >> -- 
> >> 2.40.1
> >>


Re: [PATCH v8 18/18] drm/msm/a6xx: Add A610 speedbin support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:37PM +0200, Konrad Dybcio wrote:
> 
> A610 is implemented on at least three SoCs: SM6115 (bengal), SM6125
> (trinket) and SM6225 (khaje). Trinket does not support speed binning
> (only a single SKU exists) and we don't yet support khaje upstream.
> Hence, add a fuse mapping table for bengal to allow for per-chip
> frequency limiting.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index d046af5f6de2..c304fa118cff 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2098,6 +2098,30 @@ static bool a6xx_progress(struct msm_gpu *gpu, struct 
> msm_ringbuffer *ring)
>   return progress;
>  }
>  
> +static u32 a610_get_speed_bin(u32 fuse)
> +{
> + /*
> +  * There are (at least) three SoCs implementing A610: SM6125 (trinket),
> +  * SM6115 (bengal) and SM6225 (khaje). Trinket does not have 
> speedbinning,
> +  * as only a single SKU exists and we don't support khaje upstream yet.
> +  * Hence, this matching table is only valid for bengal and can be easily
> +  * expanded if need be.
> +  */
> +
> + if (fuse == 0)
> + return 0;
> + else if (fuse == 206)
> + return 1;
> + else if (fuse == 200)
> + return 2;
> + else if (fuse == 157)
> + return 3;
> + else if (fuse == 127)
> + return 4;
> +
> + return UINT_MAX;
> +}
> +
>  static u32 a618_get_speed_bin(u32 fuse)
>  {
>   if (fuse == 0)
> @@ -2195,6 +2219,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_gpu *adreno_gpu, u3
>  {
>   u32 val = UINT_MAX;
>  
> + if (adreno_is_a610(adreno_gpu))
> + val = a610_get_speed_bin(fuse);
> +

Didn't you update here to convert to 'else if' in one of the earlier
patches??

Reviewed-by: Akhil P Oommen 

-Akhil.
>   if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> 
> -- 
> 2.40.1
> 
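[Editorial sketch] The bengal fuse table above feeds `fuse_to_supp_hw()`, which converts the bin into the bitmask handed to `devm_pm_opp_set_supported_hw()`; OPPs whose `opp-supported-hw` value intersects `1 << bin` stay enabled. The following self-contained sketch reproduces that mapping in userspace C. The `UINT_MAX` fallback (log an error, leave all OPPs enabled) is an assumption based on the surrounding driver code, not part of this patch hunk.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Bengal (SM6115) fuse-to-bin table from the patch. */
static uint32_t a610_get_speed_bin(uint32_t fuse)
{
	switch (fuse) {
	case 0:   return 0;
	case 206: return 1;
	case 200: return 2;
	case 157: return 3;
	case 127: return 4;
	default:  return UINT_MAX; /* unrecognized fuse value */
	}
}

/* Sketch of how the bin becomes an opp-supported-hw bitmask. */
static uint32_t fuse_to_supp_hw(uint32_t fuse)
{
	uint32_t val = a610_get_speed_bin(fuse);

	/*
	 * Assumption: on an unknown fuse the driver logs an error and
	 * returns UINT_MAX (all bits set), leaving every OPP usable.
	 */
	if (val == UINT_MAX)
		return UINT_MAX;

	return 1u << val;
}
```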


Re: [PATCH v8 17/18] drm/msm/a6xx: Add A619_holi speedbin support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:36PM +0200, Konrad Dybcio wrote:
> 
> A619_holi is implemented on at least two SoCs: SM4350 (holi) and SM6375
> (blair). This is what seems to be a first occurrence of this happening,
> but it's easy to overcome by guarding the SoC-specific fuse values with
> of_machine_is_compatible(). Do just that to enable frequency limiting
> on these SoCs.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 31 +++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index ca4ffa44097e..d046af5f6de2 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2110,6 +2110,34 @@ static u32 a618_get_speed_bin(u32 fuse)
>   return UINT_MAX;
>  }
>  
> +static u32 a619_holi_get_speed_bin(u32 fuse)
> +{
> + /*
> +  * There are (at least) two SoCs implementing A619_holi: SM4350 (holi)
> +  * and SM6375 (blair). Limit the fuse matching to the corresponding
> +  * SoC to prevent bogus frequency setting (as improbable as it may be,
> +  * given unexpected fuse values are.. unexpected! But still possible.)
> +  */
> +
> + if (fuse == 0)
> + return 0;
> +
> + if (of_machine_is_compatible("qcom,sm4350")) {
> + if (fuse == 138)
> + return 1;
> + else if (fuse == 92)
> + return 2;
> + } else if (of_machine_is_compatible("qcom,sm6375")) {
> + if (fuse == 190)
> + return 1;
> + else if (fuse == 177)
> + return 2;
> + } else
> + pr_warn("Unknown SoC implementing A619_holi!\n");
> +
> + return UINT_MAX;
> +}
> +
>  static u32 a619_get_speed_bin(u32 fuse)
>  {
>   if (fuse == 0)
> @@ -2170,6 +2198,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_gpu *adreno_gpu, u3
>   if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> + else if (adreno_is_a619_holi(adreno_gpu))
> + val = a619_holi_get_speed_bin(fuse);
> +
>   else if (adreno_is_a619(adreno_gpu))
>   val = a619_get_speed_bin(fuse);
>  
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 16/18] drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:35PM +0200, Konrad Dybcio wrote:
> 
> Before transitioning to using per-SoC and not per-Adreno speedbin
> fuse values (need another patchset to land elsewhere), a good
> improvement/stopgap solution is to use adreno_is_aXYZ macros in
> place of explicit revision matching. Do so to allow differentiating
> between A619 and A619_holi.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 18 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 14 --
>  2 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 5faa85543428..ca4ffa44097e 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2163,23 +2163,23 @@ static u32 adreno_7c3_get_speed_bin(u32 fuse)
>   return UINT_MAX;
>  }
>  
> -static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 
> fuse)
> +static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu 
> *adreno_gpu, u32 fuse)
>  {
>   u32 val = UINT_MAX;
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
> + if (adreno_is_a618(adreno_gpu))
>   val = a618_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
> + else if (adreno_is_a619(adreno_gpu))
>   val = a619_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
> + else if (adreno_is_7c3(adreno_gpu))
>   val = adreno_7c3_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
> + else if (adreno_is_a640(adreno_gpu))
>   val = a640_get_speed_bin(fuse);
>  
> - else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
> + else if (adreno_is_a650(adreno_gpu))
>   val = a650_get_speed_bin(fuse);
>  
>   if (val == UINT_MAX) {
> @@ -2192,7 +2192,7 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
>   return (1 << val);
>  }
>  
> -static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
> +static int a6xx_set_supported_hw(struct device *dev, struct adreno_gpu 
> *adreno_gpu)
>  {
>   u32 supp_hw;
>   u32 speedbin;
> @@ -2211,7 +2211,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
> struct adreno_rev rev)
>   return ret;
>   }
>  
> - supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
> + supp_hw = fuse_to_supp_hw(dev, adreno_gpu, speedbin);
>  
>   ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
>   if (ret)
> @@ -2330,7 +2330,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>  
>   a6xx_llc_slices_init(pdev, a6xx_gpu);
>  
> - ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
> + ret = a6xx_set_supported_hw(&pdev->dev, adreno_gpu);
>   if (ret) {
>   a6xx_destroy(&(a6xx_gpu->base.base));
>   return ERR_PTR(ret);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index 7a5d595d4b99..21513cec038f 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -268,9 +268,9 @@ static inline int adreno_is_a630(struct adreno_gpu *gpu)
>   return gpu->revn == 630;
>  }
>  
> -static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
> +static inline int adreno_is_a640(struct adreno_gpu *gpu)
>  {
> - return (gpu->revn == 640) || (gpu->revn == 680);
> + return gpu->revn == 640;
>  }
>  
>  static inline int adreno_is_a650(struct adreno_gpu *gpu)
> @@ -289,6 +289,11 @@ static inline int adreno_is_a660(struct adreno_gpu *gpu)
>   return gpu->revn == 660;
>  }
>  
> +static inline int adreno_is_a680(struct adreno_gpu *gpu)
> +{
> + return gpu->revn == 680;
> +}
> +
>  /* check for a615, a616, a618, a619 or any derivatives */
>  static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
>  {
> @@ -306,6 +311,11 @@ static inline int adreno_is_a650_family(struct 
> adreno_gpu *gpu)
>   return gpu->revn == 650 || gpu->revn == 620 || 
> adreno_is_a660_family(gpu);
>  }
>  
> +static inline int adreno_is_a640_family(struct adreno_gpu *gpu)
> +{
> + return adreno_is_a640(gpu) || adreno_is_a680(gpu);
> +}
> +
>  u64 adreno_private_address_space_size(struct msm_gpu *gpu);
>  int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
>uint32_t param, uint64_t *value, uint32_t *len);
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 15/18] drm/msm/a6xx: Use "else if" in GPU speedbin rev matching

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:34PM +0200, Konrad Dybcio wrote:
> 
> The GPU can only be one at a time. Turn a series of ifs into if +
> elseifs to save some CPU cycles.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 1a29e7dd9975..5faa85543428 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -2170,16 +2170,16 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
> adreno_rev rev, u32 fuse)
>   if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
>   val = a618_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
>   val = a619_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
>   val = adreno_7c3_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
>   val = a640_get_speed_bin(fuse);
>  
> - if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
> + else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
>   val = a650_get_speed_bin(fuse);
>  
>   if (val == UINT_MAX) {
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 14/18] drm/msm/a6xx: Fix some A619 tunables

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:33PM +0200, Konrad Dybcio wrote:
> 
> Adreno 619 expects some tunables to be set differently. Make up for it.
> 
> Fixes: b7616b5c69e6 ("drm/msm/adreno: Add A619 support")
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c0d5973320d9..1a29e7dd9975 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1198,6 +1198,8 @@ static int hw_init(struct msm_gpu *gpu)
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00200200);
>   else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
> + else if (adreno_is_a619(adreno_gpu))
> + gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00018000);
>   else if (adreno_is_a610(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x0008);
>   else
> @@ -1215,7 +1217,9 @@ static int hw_init(struct msm_gpu *gpu)
>   a6xx_set_ubwc_config(gpu);
>  
>   /* Enable fault detection */
> - if (adreno_is_a610(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu))
> + gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x3f);
> + else if (adreno_is_a610(adreno_gpu))
>   gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x3ffff);
>   else
>   gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
> | 0x1f);

Reviewed-by: Akhil P Oommen 

-Akhil
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 13/18] drm/msm/a6xx: Add A610 support

2023-06-14 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:32PM +0200, Konrad Dybcio wrote:
> 
> A610 is one of (if not the) lowest-tier SKUs in the A6XX family. It
> features no GMU, as it's implemented solely on SoCs with SMD_RPM.
> What's more interesting is that it does not feature a VDDGX line
> either, being powered solely by VDDCX and has an unfortunate hardware
> quirk that makes its reset line broken - after a couple of assert/
> deassert cycles, it will hang for good and will not wake up again.
> 
> This GPU requires mesa changes for proper rendering, and lots of them
> at that. The command streams are quite far away from any other A6XX
> GPU and hence it needs special care. This patch was validated both
> by running an (incomplete) downstream mesa with some hacks (frames
> rendered correctly, though some instructions made the GPU hangcheck
> which is expected - garbage in, garbage out) and by replaying RD
> traces captured with the downstream KGSL driver - no crashes there,
> ever.
> 
> Add support for this GPU on the kernel side, which comes down to
> pretty simply adding A612 HWCG tables, altering a few values and
> adding a special case for handling the reset line.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 101 
> +
>  drivers/gpu/drm/msm/adreno/adreno_device.c |  12 
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h|   8 ++-
>  3 files changed, 108 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index bb04f65e6f68..c0d5973320d9 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -252,6 +252,56 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
> msm_gem_submit *submit)
>   a6xx_flush(gpu, ring);
>  }
>  
> +const struct adreno_reglist a612_hwcg[] = {
> + {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220},
> + {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0081},
> + {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf},
> + {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002},
> + {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001},
> + {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007},
> + {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0120},
> + {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220},
> + {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00},
> + {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022},
> + {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011},
> + {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044},
> + {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422},
> + {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x},
> + {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0222},
> + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002},
> + {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000},
> + {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x},
> + {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200},
> + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
> + {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x},
> + {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004},
> + {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002},
> + {REG_A6XX_RBBM_ISDB_CNT, 0x0182},
> + {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x},
> + {REG_A6XX_RBBM_SP_HYST_CNT, 0x},
> + {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222},
> + {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111},
> + {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555},
> + {},
> +};
> +
>  /* For a615 family (a615, a616, a618 and a619) */
>  const struct adreno_reglist a615_hwcg[] = {
>   {REG_A6XX_RBBM_CLOCK_CNTL_SP0,  0x0222},
> @@ -602,6 +652,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
>  
>   if (adreno_is_a630(adreno_gpu))
>   clock_cntl_on = 0x8aa8aa02;
> + else if (adreno_is_a610(adreno_gpu))
> + clock_cntl_on = 0xaaa8aa82;
>   else
>   clock_cntl_on = 0x8aa8aa82;
>  
> @@ -612,13 +664,15 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool 
> state)
>   return;
>  
>   /* 

Re: [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
> 
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 211 
> 
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 277 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 5ba8cba69383..385ca3a12462 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1437,6 +1437,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1462,10 +1463,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1484,6 +1487,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 58bf405b85d8..0a44762dbb6d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -21,7 +21,7 @@ static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
>   struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>  
>   /* Check that the GMU is idle */
> - if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
> + if (!adreno_has_gmu_wrapper(adreno_gpu) && 
> !a6xx_gmu_isidle(&a6xx_gpu->gmu))
>   return false;
>  
>   /* Check tha 

Re: [PATCH v8 09/18] drm/msm/a6xx: Extend and explain UBWC config

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:28PM +0200, Konrad Dybcio wrote:
> 
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible to other tunables).
> 
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> 
> Reviewed-by: Rob Clark 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 39 
> +++
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index dfde5fb65eed..58bf405b85d8 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,10 +786,25 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> - u32 amsbc = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR 
> bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) 
> and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. This should ideally use DRAM
> +  * type detection.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 2;
> + /* Unknown, introduced with A640/680 */
> + u32 amsbc = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
> @@ -800,25 +815,31 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
> + hbb_lo = 3;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
> + hbb_lo = 1;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> +     gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 
> 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
> 

Reviewed-by: Akhil P Oommen 

-Akhil
> -- 
> 2.40.1
> 
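[Editorial sketch] The HBB comment in the patch above describes a concrete encoding: subtract 13 (the hardware minimum) from the Highest Bank Bit, write the low two bits of the remainder as `hbb_lo` and the next bit as `hbb_hi`, then pack them with the other tunables. The field positions below follow the `REG_A6XX_RB_NC_MODE_CNTL` write in the hunk; this is a self-contained userspace sketch of the arithmetic, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the RB_NC_MODE_CNTL value from a Highest Bank Bit plus the other
 * tunables. Bit positions mirror the patch hunk:
 *   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
 *   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode
 */
static uint32_t rb_nc_mode_cntl(uint32_t hbb, uint32_t rgb565_predicator,
				uint32_t amsbc, uint32_t min_acc_len,
				uint32_t ubwc_mode)
{
	uint32_t rem = hbb - 13;        /* 13 is the hw minimum HBB */
	uint32_t hbb_lo = rem & 3;      /* lowest two bits */
	uint32_t hbb_hi = (rem >> 2) & 1; /* the bit above hbb_lo */

	return rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
	       min_acc_len << 3 | hbb_lo << 1 | ubwc_mode;
}
```

For example, a650/a660 use `hbb_lo = 3` (i.e. HBB 16) with amsbc and the RGB565 predicator enabled, while the default `hbb_lo = 2` corresponds to HBB 15.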


Re: [PATCH v8 08/18] drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init

2023-06-09 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:27PM +0200, Konrad Dybcio wrote:
> 
> Currently we're only deasserting REG_A6XX_RBBM_GBIF_HALT, but we also
> need REG_A6XX_GBIF_HALT to be set to 0.
> 
> This is typically done automatically on successful GX collapse, but in
> case that fails, we should take care of it.
> 
> Also, add a memory barrier to ensure it's gone through before jumping
> to further initialization.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 083ccb5bcb4e..dfde5fb65eed 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1003,8 +1003,12 @@ static int hw_init(struct msm_gpu *gpu)
>   a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
>  
>   /* Clear GBIF halt in case GX domain was not collapsed */
> - if (a6xx_has_gbif(adreno_gpu))
> + if (a6xx_has_gbif(adreno_gpu)) {
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
>   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> + /* Let's make extra sure that the GPU can access the memory.. */
> + mb();
This barrier is unnecessary because writel transactions are ordered and
we don't expect traffic from the GPU immediately after this.

-Akhil
> + }
>  
>   gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
>  
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> 
> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> GPUs and reuse it in a6xx_gmu_force_off().
> 
> This helper, contrary to the original usage in GMU code paths, adds
> a write memory barrier which together with the necessary delay should
> ensure that the reset is never deasserted too quickly due to e.g. OoO
> execution going crazy.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index b86be123ecd0..5ba8cba69383 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>   a6xx_bus_clear_pending_transactions(adreno_gpu, true);
>  
>   /* Reset GPU core blocks */
> - gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> - udelay(100);
> + a6xx_gpu_sw_reset(gpu, true);
>  }
>  
>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu 
> *gmu)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index e3ac3f045665..083ccb5bcb4e 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> adreno_gpu *adreno_gpu, bool gx_
>   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
>  }
>  
> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> +{
> + gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> + /* Add a barrier to avoid bad surprises */
Can you please make this comment a bit more clear? Highlight that we
should ensure the register is posted at hw before polling.

I think this barrier is required only during assert.

-Akhil.
> + mb();
> +
> + /* The reset line needs to be asserted for at least 100 us */
> + if (assert)
> + udelay(100);
> +}
> +
>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index 9580def06d45..aa70390ee1c6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
> *gpu);
>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
>  
>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off);
> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
>  
>  #endif /* __A6XX_GPU_H__ */
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 06/18] drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions()

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:25PM +0200, Konrad Dybcio wrote:
> 
> Unify the indentation and explain the cryptic 0xF value.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 6bb4da70f6a6..e3ac3f045665 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1597,17 +1597,18 @@ static void a6xx_llc_slices_init(struct 
> platform_device *pdev,
>   a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
>  }
>  
> -#define GBIF_CLIENT_HALT_MASK BIT(0)
> -#define GBIF_ARB_HALT_MASKBIT(1)
> +#define GBIF_CLIENT_HALT_MASKBIT(0)
> +#define GBIF_ARB_HALT_MASK   BIT(1)
> +#define VBIF_XIN_HALT_CTRL0_MASK GENMASK(3, 0)
>  
>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off)
>  {
>   struct msm_gpu *gpu = &adreno_gpu->base;
>  
>   if (!a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 
> VBIF_XIN_HALT_CTRL0_MASK);
>   spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> - 0xf) == 0xf);
> + (VBIF_XIN_HALT_CTRL0_MASK)) == 
> VBIF_XIN_HALT_CTRL0_MASK);
>   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
>  
>   return;
> 
> -- 
> 2.40.1
> 


Re: [Freedreno] [PATCH v8 11/18] drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:30PM +0200, Konrad Dybcio wrote:
> 
> A610 and A619_holi don't support the feature. Disable it to make the GPU stop
> crashing after almost each and every submission - the received data on
> the GPU end was simply incomplete or garbled, resulting in almost nothing
> being executed properly. Extend the disablement to adreno_has_gmu_wrapper,
> as none of the GMU wrapper Adrenos supported so far seem to feature it.
> 
> Signed-off-by: Konrad Dybcio 
> ---
Reviewed-by: Akhil P Oommen 

-Akhil
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8cff86e9d35c..b133755a56c4 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -551,7 +551,6 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   config.rev.minor, config.rev.patchid);
>  
>   priv->is_a2xx = config.rev.core == 2;
> - priv->has_cached_coherent = config.rev.core >= 6;
>  
>   gpu = info->init(drm);
>   if (IS_ERR(gpu)) {
> @@ -563,6 +562,10 @@ static int adreno_bind(struct device *dev, struct device 
> *master, void *data)
>   if (ret)
>   return ret;
>  
> + if (config.rev.core >= 6)
> + if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
> + priv->has_cached_coherent = true;
> +
>   return 0;
>  }
>  
> 
> -- 
> 2.40.1
> 


Re: [PATCH v8 05/18] drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:24PM +0200, Konrad Dybcio wrote:
> 
> This function is responsible for telling the GPU to halt transactions
> on all of its relevant buses, drain them and leave them in a predictable
> state, so that the GPU can be e.g. reset cleanly.
> 
> Move the function to a6xx_gpu.c, remove the static keyword and add a
> prototype in a6xx_gpu.h to accommodate the move.
> 
> Signed-off-by: Konrad Dybcio 

Reviewed-by: Akhil P Oommen 

-Akhil
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 37 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 36 ++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
>  3 files changed, 38 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 9421716a2fe5..b86be123ecd0 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -868,43 +868,6 @@ static void a6xx_gmu_rpmh_off(struct a6xx_gmu *gmu)
>   (val & 1), 100, 1000);
>  }
>  
> -#define GBIF_CLIENT_HALT_MASK BIT(0)
> -#define GBIF_ARB_HALT_MASKBIT(1)
> -
> -static void a6xx_bus_clear_pending_transactions(struct adreno_gpu 
> *adreno_gpu,
> - bool gx_off)
> -{
> - struct msm_gpu *gpu = &adreno_gpu->base;
> -
> - if (!a6xx_has_gbif(adreno_gpu)) {
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> - spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> - 0xf) == 0xf);
> - gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
> -
> - return;
> - }
> -
> - if (gx_off) {
> - /* Halt the gx side of GBIF */
> - gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
> - spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
> - }
> -
> - /* Halt new client requests on GBIF */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
> - spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
> - (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
> -
> - /* Halt all AXI requests on GBIF */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
> - spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
> - (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
> -
> - /* The GBIF halt needs to be explicitly cleared */
> - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> -}
> -
>  /* Force the GMU off in case it isn't responsive */
>  static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>  {
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index e34aa15156a4..6bb4da70f6a6 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1597,6 +1597,42 @@ static void a6xx_llc_slices_init(struct 
> platform_device *pdev,
>   a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
>  }
>  
> +#define GBIF_CLIENT_HALT_MASK BIT(0)
> +#define GBIF_ARB_HALT_MASKBIT(1)
> +
> +void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
> gx_off)
> +{
> + struct msm_gpu *gpu = &adreno_gpu->base;
> +
> + if (!a6xx_has_gbif(adreno_gpu)) {
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
> + spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
> + 0xf) == 0xf);
> + gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
> +
> + return;
> + }
> +
> + if (gx_off) {
> + /* Halt the gx side of GBIF */
> + gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
> + spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
> + }
> +
> + /* Halt new client requests on GBIF */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
> + spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
> + (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
> +
> + /* Halt all AXI requests on GBIF */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
> + spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
> + (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
> +
> + /* The GBIF halt needs to be explicitly cleared */
> + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> +}
> +
>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>  {
>   

Re: [PATCH v8 04/18] drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off()

2023-06-06 Thread Akhil P Oommen
On Mon, May 29, 2023 at 03:52:23PM +0200, Konrad Dybcio wrote:
> 
> As pointed out by Akhil during the review process of GMU wrapper
> introduction [1], it makes sense to move this write into the function
> that's responsible for forcibly shutting the GMU off.
> 
> It is also very convenient to move this to GMU-specific code, so that
> it does not have to be guarded by an if-condition to avoid calling it
> on GMU wrapper targets.
> 
> Move the write to the aforementioned a6xx_gmu_force_off() to achieve
> that. No effective functional change.
Reviewed-by: Akhil P Oommen 
-Akhil.
> 
> [1] 
> https://lore.kernel.org/linux-arm-msm/20230501194022.ga18...@akhilpo-linux.qualcomm.com/
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 ++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 --
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 87babbb2a19f..9421716a2fe5 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -912,6 +912,12 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct msm_gpu *gpu = &adreno_gpu->base;
>  
> + /*
> +  * Turn off keep alive that might have been enabled by the hang
> +  * interrupt
> +  */
> + gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
> +
>   /* Flush all the queues */
>   a6xx_hfi_stop(gmu);
>  
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 9fb214f150dd..e34aa15156a4 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1274,12 +1274,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
>   /* Halt SQE first */
>   gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 3);
>  
> - /*
> -  * Turn off keep alive that might have been enabled by the hang
> -  * interrupt
> -  */
> - gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
> -
> - pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);
>  
>   /* active_submit won't change until we make a submission */
> 
> -- 
> 2.40.1
> 


Re: [PATCH v2 2/3] arm64: dts: qcom: sc8280xp: Add GPU related nodes

2023-06-01 Thread Akhil P Oommen
On Tue, May 30, 2023 at 08:35:14AM -0700, Bjorn Andersson wrote:
> 
> On Mon, May 29, 2023 at 02:16:14PM +0530, Manivannan Sadhasivam wrote:
> > On Mon, May 29, 2023 at 09:38:59AM +0200, Konrad Dybcio wrote:
> > > On 28.05.2023 19:07, Manivannan Sadhasivam wrote:
> > > > On Tue, May 23, 2023 at 09:59:53AM +0200, Konrad Dybcio wrote:
> > > >> On 23.05.2023 03:15, Bjorn Andersson wrote:
> > > >>> From: Bjorn Andersson 
> [..]
> > > >>> diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi 
> > > >>> b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> [..]
> > > >>> + gmu: gmu@3d6a000 {
> [..]
> > > >>> + status = "disabled";
> > > >> I've recently discovered that - and I am not 100% sure - all GMUs are
> > > >> cache-coherent. Could you please ask somebody at qc about this?
> > > >>
> > > > 
> > > > AFAIU, GMU's job is controlling the voltage and clock to the GPU.
> > > Not just that, it's only the limited functionality we've implemented
> > > upstream so far.
> > > 
> > 
> > Okay, good to know!
> > 
> > > It doesn't do
> > > > any data transactions on its own.
> > > Of course it does. AP communication is done through MMIO writes and
> > > the GMU talks to RPMh via the GPU RSC directly. Apart from that, some
> > > of the GPU registers (that nota bene don't have anything to do with
> > > the GMU M3 core itself) lay within the GMU address space.
> > > 
> 
> But those aren't shared memory accesses.
> 
> > 
> > That doesn't justify the fact that cache coherency is needed, especially
> > MMIO writes, unless GMU could snoop the MMIO writes to AP caches.
> > 
> 
> In reviewing the downstream state again I noticed that the GPU smmu is
> marked dma-coherent, so I will adjust that in v3.
Bjorn,

Would you mind sharing a perf delta (preferably manhattan offscreen)
you see with and without this dma-coherent property?

-Akhil.
> 
> Regards,
> Bjorn


Re: [PATCH v2 2/3] arm64: dts: qcom: sc8280xp: Add GPU related nodes

2023-06-01 Thread Akhil P Oommen
On Mon, May 29, 2023 at 09:38:59AM +0200, Konrad Dybcio wrote:
> 
> 
> 
> On 28.05.2023 19:07, Manivannan Sadhasivam wrote:
> > On Tue, May 23, 2023 at 09:59:53AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 23.05.2023 03:15, Bjorn Andersson wrote:
> >>> From: Bjorn Andersson 
> >>>
> >>> Add Adreno SMMU, GPU clock controller, GMU and GPU nodes for the
> >>> SC8280XP.
> >>>
> >>> Signed-off-by: Bjorn Andersson 
> >>> Signed-off-by: Bjorn Andersson 
> >>> ---
> >> It does not look like you tested the DTS against bindings. Please run
> >> `make dtbs_check` (see
> >> Documentation/devicetree/bindings/writing-schema.rst for instructions).
> >>
> >>>
> >>> Changes since v1:
> >>> - Dropped gmu_pdc_seq region from , as it shouldn't have been used.
> >>> - Added missing compatible to _smmu.
> >>> - Dropped aoss_qmp clock in  and _smmu.
> >>>  
> >>>  arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 169 +
> >>>  1 file changed, 169 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi 
> >>> b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> index d2a2224d138a..329ec2119ecf 100644
> >>> --- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> +++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> >>> @@ -6,6 +6,7 @@
> >>>  
> >>>  #include 
> >>>  #include 
> >>> +#include 
> >>>  #include 
> >>>  #include 
> >>>  #include 
> >>> @@ -2331,6 +2332,174 @@ tcsr: syscon@1fc {
> >>>   reg = <0x0 0x01fc 0x0 0x3>;
> >>>   };
> >>>  
> >>> + gpu: gpu@3d0 {
> >>> + compatible = "qcom,adreno-690.0", "qcom,adreno";
> >>> +
> >>> + reg = <0 0x03d0 0 0x4>,
> >>> +   <0 0x03d9e000 0 0x1000>,
> >>> +   <0 0x03d61000 0 0x800>;
> >>> + reg-names = "kgsl_3d0_reg_memory",
> >>> + "cx_mem",
> >>> + "cx_dbgc";
> >>> + interrupts = ;
> >>> + iommus = <_smmu 0 0xc00>, <_smmu 1 0xc00>;
> >>> + operating-points-v2 = <_opp_table>;
> >>> +
> >>> + qcom,gmu = <>;
> >>> + interconnects = <_noc MASTER_GFX3D 0 _virt 
> >>> SLAVE_EBI1 0>;
> >>> + interconnect-names = "gfx-mem";
> >>> + #cooling-cells = <2>;
> >>> +
> >>> + status = "disabled";
> >>> +
> >>> + gpu_opp_table: opp-table {
> >>> + compatible = "operating-points-v2";
> >>> +
> >>> + opp-27000 {
> >>> + opp-hz = /bits/ 64 <27000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <451000>;
> >>> + };
> >>> +
> >>> + opp-41000 {
> >>> + opp-hz = /bits/ 64 <41000>;
> >>> + opp-level = ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-5 {
> >>> + opp-hz = /bits/ 64 <5>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-54700 {
> >>> + opp-hz = /bits/ 64 <54700>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <1555000>;
> >>> + };
> >>> +
> >>> + opp-60600 {
> >>> + opp-hz = /bits/ 64 <60600>;
> >>> + opp-level = ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> +
> >>> + opp-64000 {
> >>> + opp-hz = /bits/ 64 <64000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> +
> >>> + opp-69000 {
> >>> + opp-hz = /bits/ 64 <69000>;
> >>> + opp-level = 
> >>> ;
> >>> + opp-peak-kBps = <2736000>;
> >>> + };
> >>> + };
> >>> + };
> >>> +
> >>> + gmu: gmu@3d6a000 {
> >>> + compatible = "qcom,adreno-gmu-690.0", "qcom,adreno-gmu";
> >>> + reg = <0 0x03d6a000 0 0x34000>,
> >>> +   <0 0x03de 0 0x1>,
> >>> +   <0 0x0b29 0 0x1>;
> >>> + reg-names = "gmu", "rscc", "gmu_pdc";
> >>> + interrupts = ,
> >>> + 

Re: [PATCH v3 1/3] drm/msm/adreno: Add Adreno A690 support

2023-06-01 Thread Akhil P Oommen
On Wed, May 31, 2023 at 10:30:09PM +0200, Konrad Dybcio wrote:
> 
> 
> 
> On 31.05.2023 05:09, Bjorn Andersson wrote:
> > From: Bjorn Andersson 
> > 
> > Introduce support for the Adreno A690, found in Qualcomm SC8280XP.
> > 
> > Tested-by: Steev Klimaszewski 
> > Reviewed-by: Konrad Dybcio 
> > Signed-off-by: Bjorn Andersson 
> > Signed-off-by: Bjorn Andersson 
> > ---
> Couple of additional nits that you may or may not incorporate:
> 
> [...]
> 
> > +   {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xF3CF},
> It would be cool if we could stop adding uppercase hex outside preprocessor
> defines..
> 
> 
> [...]
> > +   A6XX_PROTECT_RDONLY(0x0fc00, 0x01fff),
> > +   A6XX_PROTECT_NORDWR(0x11c00, 0x0), /*note: infiite range */
> typo
> 
> 
> 
> -- Questions to Rob that don't really concern this patch --
> 
> > +static void a690_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> Rob, I'll be looking into reworking these into dynamic tables.. would you
> be okay with two more additions (A730, A740) on top of this before I do that?
> The number of these funcs has risen quite a bit and we're abusing the fact
> that so far there's a 1-1 mapping of SoC-Adreno (at the current state of
> mainline, not in general)..

+1. But please leave a618 and 7c3 as they are.

-Akhil

> 
> > +{
> > +   /*
> > +* Send a single "off" entry just to get things running
> > +* TODO: bus scaling
> > +*/
> Also something I'll be looking into in the near future..
> 
> > @@ -531,6 +562,8 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
> > adreno_7c3_build_bw_table();
> > else if (adreno_is_a660(adreno_gpu))
> > a660_build_bw_table();
> > +   else if (adreno_is_a690(adreno_gpu))
> > +   a690_build_bw_table();
> > else
> > a6xx_build_bw_table();
> I think changing the is_adreno_... to switch statements with a gpu_model
> var would make it easier to read.. Should I also rework that?
> 
> Konrad
> 
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 8cff86e9d35c..e5a865024e94 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -355,6 +355,20 @@ static const struct adreno_info gpulist[] = {
> > .init = a6xx_gpu_init,
> > .zapfw = "a640_zap.mdt",
> > .hwcg = a640_hwcg,
> > +   }, {
> > +   .rev = ADRENO_REV(6, 9, 0, ANY_ID),
> > +   .revn = 690,
> > +   .name = "A690",
> > +   .fw = {
> > +   [ADRENO_FW_SQE] = "a660_sqe.fw",
> > +   [ADRENO_FW_GMU] = "a690_gmu.bin",
> > +   },
> > +   .gmem = SZ_4M,
> > +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > +   .init = a6xx_gpu_init,
> > +   .zapfw = "a690_zap.mdt",
> > +   .hwcg = a690_hwcg,
> > +   .address_space_size = SZ_16G,
> > },
> >  };
> >  
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> > b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index f62612a5c70f..ac9c429ca07b 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -55,7 +55,7 @@ struct adreno_reglist {
> > u32 value;
> >  };
> >  
> > -extern const struct adreno_reglist a615_hwcg[], a630_hwcg[], a640_hwcg[], 
> > a650_hwcg[], a660_hwcg[];
> > +extern const struct adreno_reglist a615_hwcg[], a630_hwcg[], a640_hwcg[], 
> > a650_hwcg[], a660_hwcg[], a690_hwcg[];
> >  
> >  struct adreno_info {
> > struct adreno_rev rev;
> > @@ -272,6 +272,11 @@ static inline int adreno_is_a660(struct adreno_gpu 
> > *gpu)
> > return gpu->revn == 660;
> >  }
> >  
> > +static inline int adreno_is_a690(struct adreno_gpu *gpu)
> > +{
> > +   return gpu->revn == 690;
> > +};
> > +
> >  /* check for a615, a616, a618, a619 or any derivatives */
> >  static inline int adreno_is_a615_family(struct adreno_gpu *gpu)
> >  {
> > @@ -280,13 +285,13 @@ static inline int adreno_is_a615_family(struct 
> > adreno_gpu *gpu)
> >  
> >  static inline int adreno_is_a660_family(struct adreno_gpu *gpu)
> >  {
> > -   return adreno_is_a660(gpu) || adreno_is_7c3(gpu);
> > +   return adreno_is_a660(gpu) || adreno_is_a690(gpu) || adreno_is_7c3(gpu);
> >  }
> >  
> >  /* check for a650, a660, or any derivatives */
> >  static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
> >  {
> > -   return gpu->revn == 650 || gpu->revn == 620 || 
> > adreno_is_a660_family(gpu);
> > +   return gpu->revn == 650 || gpu->revn == 620  || 
> > adreno_is_a660_family(gpu);
> >  }
> >  
> >  u64 adreno_private_address_space_size(struct msm_gpu *gpu);


Re: [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-08 Thread Akhil P Oommen
On Mon, May 08, 2023 at 10:59:24AM +0200, Konrad Dybcio wrote:
> 
> 
> On 6.05.2023 16:46, Akhil P Oommen wrote:
> > On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 5.05.2023 10:46, Akhil P Oommen wrote:
> >>> On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> >>>>
> >>>>
> >>>> On 3.05.2023 22:32, Akhil P Oommen wrote:
> >>>>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>>>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>>>>>> but don't implement the associated GMUs. This is due to the fact that
> >>>>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> >>>>>>>> care
> >>>>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>>>>>
> >>>>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> >>>>>>>> (it's
> >>>>>>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>>>>>
> >>>>>>>> This is essentially a register region which is convenient to model
> >>>>>>>> as a device. We'll use it for managing the GDSCs. The register
> >>>>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>>>> << I sent a reply to this patch earlier, but not sure where it went.
> >>>>>>> Still figuring out Mutt... >>
> >>>>>> Answered it here:
> >>>>>>
> >>>>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> >>>>>
> >>>>> Thanks. Will check and respond there if needed.
> >>>>>
> >>>>>>
> >>>>>> I don't think I see any new comments in this "reply revision" (heh), 
> >>>>>> so please
> >>>>>> check that one out.
> >>>>>>
> >>>>>>>
> >>>>>>> Only convenience I found is that we can reuse gmu register ops in a 
> >>>>>>> few
> >>>>>>> places (< 10 I think). If we just model this as another gpu memory
> >>>>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>>>>>> architecture code with clean separation. Also, it looks like we need 
> >>>>>>> to
> >>>>>>> keep a dummy gmu platform device in the devicetree with the current
> >>>>>>> approach. That doesn't sound right.
> >>>>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >>>>>> need additional, gmuwrapper-configuration specific code anyway, as
> >>>>>> OPP & genpd will no longer make use of the default behavior which
> >>>>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> >>>>> Can you please tell me which specific *default behaviour* do you mean 
> >>>>> here?
> >>>>> I am curious to know what I am overlooking here. We can always get a 
> >>>>> cxpd/gxpd device
> >>>>> and vote for the gdscs directly from the driver. Anything related to
> >>>>> OPP?
> >>>> I *believe* this is true:
> >>>>
> >>>> if (ARRAY_SIZE(power-domains) == 1) {
> >>>>  of generic code will enable the power domain at .probe time
> >>> we need to handle the voting directly. I recently shared a patch to
> >>> vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> >>> only cx rail due to this logic you quoted here.
> >>>
> >>> I see that you have handled it mostly correctly 

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Sun, May 07, 2023 at 02:16:36AM +0530, Akhil P Oommen wrote:
> On Sat, May 06, 2023 at 08:16:21PM +0530, Akhil P Oommen wrote:
> > On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> > > 
> > > 
> > > On 5.05.2023 10:46, Akhil P Oommen wrote:
> > > > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> > > >>
> > > >>
> > > >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > > >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> > > >>>>
> > > >>>>
> > > >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > > >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> > > >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX 
> > > >>>>>> GPUs
> > > >>>>>> but don't implement the associated GMUs. This is due to the fact 
> > > >>>>>> that
> > > >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> > > >>>>>> care
> > > >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> > > >>>>>>
> > > >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> > > >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> > > >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> > > >>>>>> (it's
> > > >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> > > >>>>>>
> > > >>>>>> This is essentially a register region which is convenient to model
> > > >>>>>> as a device. We'll use it for managing the GDSCs. The register
> > > >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" 
> > > >>>>>> devices
> > > >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > > >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> > > >>>>> Still figuring out Mutt... >>
> > > >>>> Answered it here:
> > > >>>>
> > > >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > > >>>
> > > >>> Thanks. Will check and respond there if needed.
> > > >>>
> > > >>>>
> > > >>>> I don't think I see any new comments in this "reply revision" (heh), 
> > > >>>> so please
> > > >>>> check that one out.
> > > >>>>
> > > >>>>>
> > > >>>>> Only convenience I found is that we can reuse gmu register ops in a 
> > > >>>>> few
> > > >>>>> places (< 10 I think). If we just model this as another gpu memory
> > > >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > > >>>>> architecture code with clean separation. Also, it looks like we 
> > > >>>>> need to
> > > >>>>> keep a dummy gmu platform device in the devicetree with the current
> > > >>>>> approach. That doesn't sound right.
> > > >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> > > >>>> need additional, gmuwrapper-configuration specific code anyway, as
> > > >>>> OPP & genpd will no longer make use of the default behavior which
> > > >>>> only gets triggered if there's a single power-domains=<> entry, 
> > > >>>> afaicu.
> > > >>> Can you please tell me which specific *default behaviour* do you mean 
> > > >>> here?
> > > >>> I am curious to know what I am overlooking here. We can always get a 
> > > >>> cxpd/gxpd device
> > > >>> and vote for the gdscs directly from the driver. Anything related to
> > > >>> OPP?
> > > >> I *believe* this is true:
> > > >>
> > > >> if (ARRAY_SIZE(power-domains) == 1) {
> > > >>of generic code will enable the power domain at .probe time
> >

Re: [Freedreno] [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Sat, May 06, 2023 at 08:16:21PM +0530, Akhil P Oommen wrote:
> On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> > 
> > 
> > On 5.05.2023 10:46, Akhil P Oommen wrote:
> > > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> > >>
> > >>
> > >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> > >>>>
> > >>>>
> > >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> > >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> > >>>>>> but don't implement the associated GMUs. This is due to the fact that
> > >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take 
> > >>>>>> care
> > >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> > >>>>>>
> > >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> > >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> > >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper 
> > >>>>>> (it's
> > >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> > >>>>>>
> > >>>>>> This is essentially a register region which is convenient to model
> > >>>>>> as a device. We'll use it for managing the GDSCs. The register
> > >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> > >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> > >>>>> Still figuring out Mutt... >>
> > >>>> Answered it here:
> > >>>>
> > >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > >>>
> > >>> Thanks. Will check and respond there if needed.
> > >>>
> > >>>>
> > >>>> I don't think I see any new comments in this "reply revision" (heh), 
> > >>>> so please
> > >>>> check that one out.
> > >>>>
> > >>>>>
> > >>>>> Only convenience I found is that we can reuse gmu register ops in a 
> > >>>>> few
> > >>>>> places (< 10 I think). If we just model this as another gpu memory
> > >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > >>>>> architecture code with clean separation. Also, it looks like we need 
> > >>>>> to
> > >>>>> keep a dummy gmu platform device in the devicetree with the current
> > >>>>> approach. That doesn't sound right.
> > >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> > >>>> need additional, gmuwrapper-configuration specific code anyway, as
> > >>>> OPP & genpd will no longer make use of the default behavior which
> > >>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> > >>> Can you please tell me which specific *default behaviour* do you mean 
> > >>> here?
> > >>> I am curious to know what I am overlooking here. We can always get a 
> > >>> cxpd/gxpd device
> > >>> and vote for the gdscs directly from the driver. Anything related to
> > >>> OPP?
> > >> I *believe* this is true:
> > >>
> > >> if (ARRAY_SIZE(power-domains) == 1) {
> > >>  of generic code will enable the power domain at .probe time
> > > we need to handle the voting directly. I recently shared a patch to
> > > vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> > > only cx rail due to this logic you quoted here.
> > > 
> > > I see that you have handled it mostly correctly from the gpu driver in 
> > > the updated
> > > a6xx_pm_suspend() callback. Just the power domain device ptrs should be 
> > > moved to
> > > gpu from gmu.
> > > 
> > >>
> > >>  opp APIs will default to scaling that domain with required-opps

Re: [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-06 Thread Akhil P Oommen
On Fri, May 05, 2023 at 12:35:18PM +0200, Konrad Dybcio wrote:
> 
> 
> On 5.05.2023 10:46, Akhil P Oommen wrote:
> > On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 3.05.2023 22:32, Akhil P Oommen wrote:
> >>> On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>>>
> >>>>
> >>>> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>>>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>>>> but don't implement the associated GMUs. This is due to the fact that
> >>>>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>>>
> >>>>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>>>
> >>>>>> This is essentially a register region which is convenient to model
> >>>>>> as a device. We'll use it for managing the GDSCs. The register
> >>>>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>>>> << I sent a reply to this patch earlier, but not sure where it went.
> >>>>> Still figuring out Mutt... >>
> >>>> Answered it here:
> >>>>
> >>>> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> >>>
> >>> Thanks. Will check and respond there if needed.
> >>>
> >>>>
> >>>> I don't think I see any new comments in this "reply revision" (heh), so 
> >>>> please
> >>>> check that one out.
> >>>>
> >>>>>
> >>>>> Only convenience I found is that we can reuse gmu register ops in a few
> >>>>> places (< 10 I think). If we just model this as another gpu memory
> >>>>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>>>> architecture code with clean separation. Also, it looks like we need to
> >>>>> keep a dummy gmu platform device in the devicetree with the current
> >>>>> approach. That doesn't sound right.
> >>>> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >>>> need additional, gmuwrapper-configuration specific code anyway, as
> >>>> OPP & genpd will no longer make use of the default behavior which
> >>>> only gets triggered if there's a single power-domains=<> entry, afaicu.
> >>> Can you please tell me which specific *default behaviour* do you mean here?
> >>> I am curious to know what I am overlooking here. We can always get a 
> >>> cxpd/gxpd device
> >>> and vote for the gdscs directly from the driver. Anything related to
> >>> OPP?
> >> I *believe* this is true:
> >>
> >> if (ARRAY_SIZE(power-domains) == 1) {
> >>of generic code will enable the power domain at .probe time
> > we need to handle the voting directly. I recently shared a patch to
> > vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
> > only cx rail due to this logic you quoted here.
> > 
> > I see that you have handled it mostly correctly from the gpu driver in the 
> > updated
> > a6xx_pm_suspend() callback. Just the power domain device ptrs should be 
> > moved to
> > gpu from gmu.
> > 
> >>
> >>opp APIs will default to scaling that domain with required-opps
> > 
> >> }
> >>
> >> and we do need to put GX/CX (with an MX parent to match) there, as the
> >> AP is responsible for voting in this configuration
> > 
> > We should vote to turn ON gx/cx headswitches through genpd from gpu driver. 
> > When you vote for
> > core clk frequency, *clock driver is supposed to scale* all the necessary
> > regulators. At least that is how downstream works. You can refer to the 
> > downstream
> > gpucc clk driver of these SoCs. I am not sure how much of that can be
> > easily converted to upstream.

Re: [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-05 Thread Akhil P Oommen
On Thu, May 04, 2023 at 08:34:07AM +0200, Konrad Dybcio wrote:
> 
> 
> On 3.05.2023 22:32, Akhil P Oommen wrote:
> > On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> >>
> >>
> >> On 2.05.2023 09:49, Akhil P Oommen wrote:
> >>> On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >>>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >>>> but don't implement the associated GMUs. This is due to the fact that
> >>>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >>>> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>>>
> >>>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >>>> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >>>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >>>> the actual name that Qualcomm uses in their downstream kernels).
> >>>>
> >>>> This is essentially a register region which is convenient to model
> >>>> as a device. We'll use it for managing the GDSCs. The register
> >>>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >>>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> >>> << I sent a reply to this patch earlier, but not sure where it went.
> >>> Still figuring out Mutt... >>
> >> Answered it here:
> >>
> >> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/
> > 
> > Thanks. Will check and respond there if needed.
> > 
> >>
> >> I don't think I see any new comments in this "reply revision" (heh), so 
> >> please
> >> check that one out.
> >>
> >>>
> >>> Only convenience I found is that we can reuse gmu register ops in a few
> >>> places (< 10 I think). If we just model this as another gpu memory
> >>> region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> >>> architecture code with clean separation. Also, it looks like we need to
> >>> keep a dummy gmu platform device in the devicetree with the current
> >>> approach. That doesn't sound right.
> >> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> >> need additional, gmuwrapper-configuration specific code anyway, as
> >> OPP & genpd will no longer make use of the default behavior which
> >> only gets triggered if there's a single power-domains=<> entry, afaicu.
> > Can you please tell me which specific *default behaviour* do you mean here?
> > I am curious to know what I am overlooking here. We can always get a 
> > cxpd/gxpd device
> > and vote for the gdscs directly from the driver. Anything related to
> > OPP?
> I *believe* this is true:
> 
> if (ARRAY_SIZE(power-domains) == 1) {
>   of generic code will enable the power domain at .probe time
we need to handle the voting directly. I recently shared a patch to
vote cx gdsc from gpu driver. Maybe we can ignore this when gpu has
only cx rail due to this logic you quoted here.

I see that you have handled it mostly correctly from the gpu driver in the 
updated
a6xx_pm_suspend() callback. Just the power domain device ptrs should be moved to
gpu from gmu.

> 
>   opp APIs will default to scaling that domain with required-opps

> }
> 
> and we do need to put GX/CX (with an MX parent to match) there, as the
> AP is responsible for voting in this configuration

We should vote to turn ON gx/cx headswitches through genpd from gpu driver. 
When you vote for
core clk frequency, *clock driver is supposed to scale* all the necessary
regulators. At least that is how downstream works. You can refer to the downstream
gpucc clk driver of these SoCs. I am not sure how much of that can be easily 
converted to
upstream.

Also, how does having a gmu dt node help in this regard? Feel free to
elaborate, I am not very familiar with clk/regulator implementations.

-Akhil.
> 
> Konrad
> > 
> > -Akhil
> >>
> >> If nothing else, this is a very convenient way to model a part of the
> >> GPU (as that's essentially what GMU_CX is, to my understanding) and
> >> the bindings people didn't shoot me in the head for proposing this, so
> >> I assume it'd be cool to pursue this..
> >>
> >> Konrad
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  

Re: [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-03 Thread Akhil P Oommen
On Tue, May 02, 2023 at 11:40:26AM +0200, Konrad Dybcio wrote:
> 
> 
> On 2.05.2023 09:49, Akhil P Oommen wrote:
> > On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> >> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> >> but don't implement the associated GMUs. This is due to the fact that
> >> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> >> of enabling & scaling power rails, clocks and bandwidth ourselves.
> >>
> >> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> >> A6XX code to facilitate these GPUs. This involves if-ing out lots
> >> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> >> the actual name that Qualcomm uses in their downstream kernels).
> >>
> >> This is essentially a register region which is convenient to model
> >> as a device. We'll use it for managing the GDSCs. The register
> >> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> >> and lets us reuse quite a bit of gmu_read/write/rmw calls.
> > << I sent a reply to this patch earlier, but not sure where it went.
> > Still figuring out Mutt... >>
> Answered it here:
> 
> https://lore.kernel.org/linux-arm-msm/4d3000c1-c3f9-0bfd-3eb3-23393f9a8...@linaro.org/

Thanks. Will check and respond there if needed.

> 
> I don't think I see any new comments in this "reply revision" (heh), so please
> check that one out.
> 
> > 
> > Only convenience I found is that we can reuse gmu register ops in a few
> > places (< 10 I think). If we just model this as another gpu memory
> > region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
> > architecture code with clean separation. Also, it looks like we need to
> > keep a dummy gmu platform device in the devicetree with the current
> > approach. That doesn't sound right.
> That's correct, but.. if we switch away from that, VDD_GX/VDD_CX will
> need additional, gmuwrapper-configuration specific code anyway, as
> OPP & genpd will no longer make use of the default behavior which
> only gets triggered if there's a single power-domains=<> entry, afaicu.
Can you please tell me which specific *default behaviour* do you mean here?
I am curious to know what I am overlooking here. We can always get a cxpd/gxpd 
device
and vote for the gdscs directly from the driver. Anything related to
OPP?

-Akhil
> 
> If nothing else, this is a very convenient way to model a part of the
> GPU (as that's essentially what GMU_CX is, to my understanding) and
> the bindings people didn't shoot me in the head for proposing this, so
> I assume it'd be cool to pursue this..
> 
> Konrad
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 255 +---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
> >>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
> >>  6 files changed, 318 insertions(+), 38 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index 87babbb2a19f..b1acdb027205 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -1469,6 +1469,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> >> struct platform_device *pdev,
> >>  
> >>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>  {
> >> +  struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> >>    struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> >>struct platform_device *pdev = to_platform_device(gmu->dev);
> >>  
> >> @@ -1494,10 +1495,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> >>gmu->mmio = NULL;
> >>gmu->rscc = NULL;
> >>  
> >> -  a6xx_gmu_memory_free(gmu);
> >> +  if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> >> +  a6xx_gmu_memory_free(gmu);
> >>  
> >> -  free_irq(gmu->gmu_irq, gmu);
> >> -  free_irq(gmu->hfi_irq, gmu);
> >> +  free_irq(gmu->gmu_irq, gmu);
> >> +  free_irq(gmu->hfi_irq, gmu);
> >> +  }
> >>  
> >>/* Drop reference taken in of_find_device_by_node */
> >>put_device(gmu->dev);

Re: [PATCH v6 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-02 Thread Akhil P Oommen
On Sat, Apr 01, 2023 at 01:54:43PM +0200, Konrad Dybcio wrote:
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.
<< I sent a reply to this patch earlier, but not sure where it went.
Still figuring out Mutt... >>

Only convenience I found is that we can reuse gmu register ops in a few
places (< 10 I think). If we just model this as another gpu memory
region, I think it will help to keep gmu vs gmu-wrapper/no-gmu
architecture code with clean separation. Also, it looks like we need to
keep a dummy gmu platform device in the devicetree with the current
approach. That doesn't sound right.
> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 255 +---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 318 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 87babbb2a19f..b1acdb027205 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1469,6 +1469,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1494,10 +1495,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1516,6 +1519,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
why set up dma for a device that is not actually present?
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> 

Re: [PATCH v5 06/15] drm/msm/a6xx: Introduce GMU wrapper support

2023-05-01 Thread Akhil P Oommen
On Fri, Mar 31, 2023 at 01:25:20AM +0200, Konrad Dybcio wrote:
> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
> but don't implement the associated GMUs. This is due to the fact that
> the GMU directly pokes at RPMh. Sadly, this means we have to take care
> of enabling & scaling power rails, clocks and bandwidth ourselves.
> 
> Reuse existing Adreno-common code and modify the deeply-GMU-infused
> A6XX code to facilitate these GPUs. This involves if-ing out lots
> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
> the actual name that Qualcomm uses in their downstream kernels).
> 
> This is essentially a register region which is convenient to model
> as a device. We'll use it for managing the GDSCs. The register
> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
> and lets us reuse quite a bit of gmu_read/write/rmw calls.

Commenting here after going through rest of the patch...

Only convenience I see with modeling a dummy gmu is that we can reuse gmu 
read/write routines which I think would be less than 10 instances. If we just 
add a gmu_wrapper region to gpu node, wouldn't that help to create a clean 
separation between gmu-supported vs gmu-wrapper/no-gmu architectures? Also, 
creating a dummy gmu device in device tree doesn't sound right to me.


> 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +++-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 254 +---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>  6 files changed, 317 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 1514b3ed0fcf..c6001e82e03d 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1474,6 +1474,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, 
> struct platform_device *pdev,
>  
>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  {
> + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>   struct platform_device *pdev = to_platform_device(gmu->dev);
>  
> @@ -1499,10 +1500,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>   gmu->mmio = NULL;
>   gmu->rscc = NULL;
>  
> - a6xx_gmu_memory_free(gmu);
> + if (!adreno_has_gmu_wrapper(adreno_gpu)) {
> + a6xx_gmu_memory_free(gmu);
>  
> - free_irq(gmu->gmu_irq, gmu);
> - free_irq(gmu->hfi_irq, gmu);
> + free_irq(gmu->gmu_irq, gmu);
> + free_irq(gmu->hfi_irq, gmu);
> + }
>  
>   /* Drop reference taken in of_find_device_by_node */
>   put_device(gmu->dev);
> @@ -1521,6 +1524,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>   return 0;
>  }
>  
> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node 
> *node)
> +{
> + struct platform_device *pdev = of_find_device_by_node(node);
> + struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
> + int ret;
> +
> + if (!pdev)
> + return -ENODEV;
> +
> + gmu->dev = &pdev->dev;
> +
> + of_dma_configure(gmu->dev, node, true);
If GMU is dummy, why should we configure dma?
> +
> + pm_runtime_enable(gmu->dev);
> +
> + /* Mark legacy for manual SPTPRAC control */
> + gmu->legacy = true;
> +
> + /* Map the GMU registers */
> + gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
> + if (IS_ERR(gmu->mmio)) {
> + ret = PTR_ERR(gmu->mmio);
> + goto err_mmio;
> + }
> +
> + gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
> + if (IS_ERR(gmu->cxpd)) {
> + ret = PTR_ERR(gmu->cxpd);
> + goto err_mmio;
> + }
> +
> + if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
> + ret = -ENODEV;
> + goto detach_cxpd;
> + }
> +
> + init_completion(&gmu->pd_gate);
> + complete_all(&gmu->pd_gate);
> + gmu->pd_nb.notifier_call = cxpd_notifier_cb;
> +
> + /* Get a link to the GX power domain to reset the GPU */
> + gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
> + if (IS_ERR(gmu->gxpd)) {
> + ret = PTR_ERR(gmu->gxpd);
> + goto err_mmio;
> + }
> +
> + gmu->initialized = true;
> +
> + return 0;
> +
> +detach_cxpd:
> + dev_pm_domain_detach(gmu->cxpd, false);
> +
> +err_mmio:
> + iounmap(gmu->mmio);
> +
> + /* Drop reference taken in of_find_device_by_node */
> + put_device(gmu->dev);
> +
> + return ret;
> +}
> +
>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  {
>   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>> On 28.02.2023 21:23, Akhil P Oommen wrote:
>>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>>> Add explanations (wherever possible to other tunables).
>>>>
>>>> Sort the variable definition and assignment alphabetically.
>>> Sorting based on decreasing order of line length is more readable, isn't it?
>> I can do that.
>>
>>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>>> Set default values for all of the tunables to zero, as they should be.
>>>>
>>>> Values were validated against downstream and will be fixed up in
>>>> separate commits so as not to make this one even more messy.
>>>>
>>>> A618 remains untouched (left at hw defaults) in this patch.
>>>>
>>>> Signed-off-by: Konrad Dybcio 
>>>> ---
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ---
>>>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>>  {
>>>>struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>>> -  u32 lower_bit = 2;
>>>> +  /* Unknown, introduced with A640/680 */
>>>>u32 amsbc = 0;
>>>> +  /*
>>>> +   * The Highest Bank Bit value represents the bit of the highest DDR 
>>>> bank.
>>>> +   * We then subtract 13 from it (13 is the minimum value allowed by hw) 
>>>> and
>>>> +   * write the lowest two bits of the remaining value as hbb_lo and the
>>>> +   * one above it as hbb_hi to the hardware. The default values (when HBB 
>>>> is
>>>> +   * not specified) are 0, 0.
>>>> +   */
>>>> +  u32 hbb_hi = 0;
>>>> +  u32 hbb_lo = 0;
>>>> +  /* Whether the minimum access length is 64 bits */
>>>> +  u32 min_acc_len = 0;
>>>> +  /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>>>u32 rgb565_predicator = 0;
>>>> +  /* Unknown, introduced with A650 family */
>>>>u32 uavflagprd_inv = 0;
>>>> +  /* Entirely magic, per-GPU-gen value */
>>>> +  u32 ubwc_mode = 0;
>>>>  
>>>>/* a618 is using the hw default values */
>>>>if (adreno_is_a618(adreno_gpu))
>>>>return;
>>>>  
>>>> -  if (adreno_is_a640_family(adreno_gpu))
>>>> +  if (adreno_is_a619(adreno_gpu)) {
>>>> +  /* HBB = 14 */
>>>> +  hbb_lo = 1;
>>>> +  }
>>>> +
>>>> +  if (adreno_is_a630(adreno_gpu)) {
>>>> +  /* HBB = 15 */
>>>> +  hbb_lo = 2;
>>>> +  }
>>>> +
>>>> +  if (adreno_is_a640_family(adreno_gpu)) {
>>>>amsbc = 1;
>>>> +  /* HBB = 15 */
>>>> +  hbb_lo = 2;
>>>> +  }
>>>>  
>>>>if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>>> -  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> -  lower_bit = 3;
>>>>amsbc = 1;
>>>> +  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>>> +  /* HBB = 16 */
>>>> +  hbb_lo = 3;
>>>>rgb565_predicator = 1;
>>>>uavflagprd_inv = 2;
>>>>}
>>>>  
>>>>if (adreno_is_7c3(adreno_gpu)) {
>>>> -  lower_bit = 1;
>>>>amsbc = 1;
>>>> +  /* HBB is unset in downstream DTS, defaulting to 0 */
>>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>>> configurations were moved to the driver from DT in recent downstream 
>>> kernels.
>> Right, seems to have happened with msm-5.10. Though a random kernel I
>> grabbed seems to suggest it's 15 and not 14?

Re: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>
> On 28.02.2023 21:23, Akhil P Oommen wrote:
>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>> Add explanations (wherever possible to other tunables).
>>>
>>> Sort the variable definition and assignment alphabetically.
>> Sorting based on decreasing order of line length is more readable, isn't it?
> I can do that.
>
>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>> Set default values for all of the tunables to zero, as they should be.
>>>
>>> Values were validated against downstream and will be fixed up in
>>> separate commits so as not to make this one even more messy.
>>>
>>> A618 remains untouched (left at hw defaults) in this patch.
>>>
>>> Signed-off-by: Konrad Dybcio 
>>> ---
>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ---
>>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>  {
>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>> -   u32 lower_bit = 2;
>>> +   /* Unknown, introduced with A640/680 */
>>> u32 amsbc = 0;
>>> +   /*
>>> +* The Highest Bank Bit value represents the bit of the highest DDR 
>>> bank.
>>> +* We then subtract 13 from it (13 is the minimum value allowed by hw) 
>>> and
>>> +* write the lowest two bits of the remaining value as hbb_lo and the
>>> +* one above it as hbb_hi to the hardware. The default values (when HBB 
>>> is
>>> +* not specified) are 0, 0.
>>> +*/
>>> +   u32 hbb_hi = 0;
>>> +   u32 hbb_lo = 0;
>>> +   /* Whether the minimum access length is 64 bits */
>>> +   u32 min_acc_len = 0;
>>> +   /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>> u32 rgb565_predicator = 0;
>>> +   /* Unknown, introduced with A650 family */
>>> u32 uavflagprd_inv = 0;
>>> +   /* Entirely magic, per-GPU-gen value */
>>> +   u32 ubwc_mode = 0;
>>>  
>>> /* a618 is using the hw default values */
>>> if (adreno_is_a618(adreno_gpu))
>>> return;
>>>  
>>> -   if (adreno_is_a640_family(adreno_gpu))
>>> +   if (adreno_is_a619(adreno_gpu)) {
>>> +   /* HBB = 14 */
>>> +   hbb_lo = 1;
>>> +   }
>>> +
>>> +   if (adreno_is_a630(adreno_gpu)) {
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>> +
>>> +   if (adreno_is_a640_family(adreno_gpu)) {
>>> amsbc = 1;
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>>  
>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>> -   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> -   lower_bit = 3;
>>> amsbc = 1;
>>> +   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> +   /* HBB = 16 */
>>> +   hbb_lo = 3;
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>  
>>> if (adreno_is_7c3(adreno_gpu)) {
>>> -   lower_bit = 1;
>>> amsbc = 1;
>>> +   /* HBB is unset in downstream DTS, defaulting to 0 */
>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>> configurations were moved to the driver from DT in recent downstream kernels.
> Right, seems to have happened with msm-5.10. Though a random kernel I
> grabbed seems to suggest it's 15 and not 14?
>
> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 
7c3 is only supported on LP4, we can hardcode 14 here.
In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect 
ddrtype. If we can get something like that in upstream, we should implement a 
similar approach.

Re: [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible to other tunables).
>
> Sort the variable definition and assignment alphabetically.
Sorting based on decreasing order of line length is more readable, isn't it?
>
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> Set default values for all of the tunables to zero, as they should be.
>
> Values were validated against downstream and will be fixed up in
> separate commits so as not to make this one even more messy.
>
> A618 remains untouched (left at hw defaults) in this patch.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 ---
>  1 file changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..bdae341e0a7c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + /* Unknown, introduced with A640/680 */
>   u32 amsbc = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. The default values (when HBB is
> +  * not specified) are 0, 0.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu)) {
> + /* HBB = 14 */
> + hbb_lo = 1;
> + }
> +
> + if (adreno_is_a630(adreno_gpu)) {
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
> +
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
>   amsbc = 1;
> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> + /* HBB = 16 */
> + hbb_lo = 3;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
> + /* HBB is unset in downstream DTS, defaulting to 0 */
This is incorrect. For 7c3, the hbb value is 14, so hbb_lo should be 1. FYI, hbb
configurations were moved to the driver from DT in recent downstream kernels.

-Akhil.
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
>



Re: [PATCH 02/14] drm/msm/a6xx: Extend UBWC config

2023-02-01 Thread Akhil P Oommen
On 1/26/2023 8:46 PM, Konrad Dybcio wrote:
> Port setting min_access_length, ubwc_mode and upper_bit from downstream.
> Values were validated using downstream device trees for SM8[123]50 and
> left default (as per downstream) elsewhere.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 26 ++
>  1 file changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..ad5d791b804c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,17 +786,22 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + u32 lower_bit = 1;
Wouldn't this break a630?

-Akhil.
> + u32 upper_bit = 0;
>   u32 amsbc = 0;
>   u32 rgb565_predicator = 0;
>   u32 uavflagprd_inv = 0;
> + u32 min_acc_len = 0;
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + lower_bit = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> @@ -807,18 +812,23 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | upper_bit << 10 | amsbc << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, upper_bit << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, upper_bit << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | lower_bit << 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)



Re: [Freedreno] [PATCH v2] drm/msm/adreno: Make adreno quirks not overwrite each other

2023-01-02 Thread Akhil P Oommen
On 1/2/2023 3:32 PM, Konrad Dybcio wrote:
> So far the adreno quirks have all been assigned with an OR operator,
> which is problematic, because they were assigned consecutive integer
> values, which makes checking them with an AND operator kind of no bueno..
>
> Switch to using BIT(n) so that only the quirks that the programmer chose
> are taken into account when evaluating info->quirks & ADRENO_QUIRK_...
>
> Fixes: 370063ee427a ("drm/msm/adreno: Add A540 support")
> Reviewed-by: Dmitry Baryshkov 
> Reviewed-by: Marijn Suijten 
> Reviewed-by: Rob Clark 
> Signed-off-by: Konrad Dybcio 
> ---
> v1 -> v2:
> - pick up tags
> - correct the Fixes: tag
>
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index c85857c0a228..5eb254c9832a 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -29,11 +29,9 @@ enum {
>   ADRENO_FW_MAX,
>  };
>  
> -enum adreno_quirks {
> - ADRENO_QUIRK_TWO_PASS_USE_WFI = 1,
> - ADRENO_QUIRK_FAULT_DETECT_MASK = 2,
> - ADRENO_QUIRK_LMLOADKILL_DISABLE = 3,
> -};
> +#define ADRENO_QUIRK_TWO_PASS_USE_WFI BIT(0)
> +#define ADRENO_QUIRK_FAULT_DETECT_MASK   BIT(1)
> +#define ADRENO_QUIRK_LMLOADKILL_DISABLE  BIT(2)
>  
>  struct adreno_rev {
>   uint8_t  core;
> @@ -65,7 +63,7 @@ struct adreno_info {
>   const char *name;
>   const char *fw[ADRENO_FW_MAX];
>   uint32_t gmem;
> - enum adreno_quirks quirks;
> + u64 quirks;
>   struct msm_gpu *(*init)(struct drm_device *dev);
>   const char *zapfw;
>   u32 inactive_period;

Reviewed-by: Akhil P Oommen 


-Akhil.

