Re: [PATCH 0/4] fixes for Adreno A5Xx preemption

2024-07-17 Thread Connor Abbott
On Wed, Jul 17, 2024 at 5:33 PM Vladimir Lypak  wrote:
>
> On Wed, Jul 17, 2024 at 10:40:26AM +0100, Connor Abbott wrote:
> > On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
> >  wrote:
> > >
> > > There are several issues with preemption on Adreno A5XX GPUs which
> > > render system unusable if more than one priority level is used. Those
> > > issues include persistent GPU faults and hangs, full UI lockups with
> > > idling GPU.
> > >
> > > ---
> > > Vladimir Lypak (4):
> > >   drm/msm/a5xx: disable preemption in submits by default
> > >   drm/msm/a5xx: properly clear preemption records on resume
> > >   drm/msm/a5xx: fix races in preemption evaluation stage
> > >   drm/msm/a5xx: workaround early ring-buffer emptiness check
> > >
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 18 ++
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 12 ++---
> > >  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ---
> > >  3 files changed, 47 insertions(+), 13 deletions(-)
> > > ---
> > > base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
> > > --
> > > 2.45.2
> > >
> >
> > Hi Vladimir,
>
> Hi Connor!
>
> >
> > While looking at preemption on a7xx, where the overall logic is pretty
> > much the same, and I've been seeing the same "soft lockups". However
> > even after porting patch 3, it turns out that's not enough because
> > there's a different race. The sequence of events is something like
> > this:
> >
> > 1. Medium-prio app A submits to ring 2.
> > 2. Ring 0 retires its last job and we start a preemption to ring 2.
> > 3. High-prio app B submits to ring 0. It sees the preemption from step
> > 2 ongoing and doesn't write the WTPR register or try to preempt.
> > 4. The preemption finishes and we update WPTR.
> At this point with patch 3 applied it should switch to ring 0 right away
> because of trigger call in the end of a5xx_preempt_irq. Didn't you
> forget it? Downstream has such call too. Even though it makes preemption
> a little more aggressive it doesn't work without it.

Yes, I didn't apply that bit because it didn't seem necessary to fix
the original issue you described and it seemed like just an
optimization to make preemption more aggressive, however it does seem
to fix the issue. I can't verify that the issue you describe (the
retire IRQ arriving before preemption IRQ) is what's actually
happening because adding a tracepoint on retire seems to change the
timing enough so that the lockup doesn't happen, though. So I guess
I'll just have to assume that's what it was.

Given how subtle this is, enough that I missed it, maybe it's worth a
comment and an extra commit.

Also, I forgot to mention that while I was reading this over I found
another (theoretical) race - we could flush a submit in between
calling update_wptr() and set_preempt_state(PREEMPT_NONE) in
a5xx_preempt_irq() and never update wptr. I would fix it by renaming
PREEMPT_ABORT to PREEMPT_FINISH and pulling out the ABORT ->
update_wptr() -> NONE sequence in a5xx_preempt_trigger() into a
separate function that also gets called in a5xx_preempt_irq().

Connor

>
> > 5. App A's submit retires. We try to preempt, but the submit and
> > ring->cur write from step 3 happened on a different CPU and the write
> > hasn't landed yet so we skip it.
>
> I don't think this is possible on modern CPUs. Could it be that retire
> IRQ appeared earlier (while it was switching from 0 to 2) and you are
> looking at msm_gpu_submit_retired trace event which is called from
> retire work later.
>
> >
> > It's a bit tricky because write reordering is involved, but this seems
> > to be what's happening - everything except my speculation about the
> > delayed write to ring->cur being the problem comes straight from a
> > trace of this happening.
> >
> > Rob suggested on IRC that we make retire handling happen on the same
> > workqueue as submissions, so that preempt_trigger is always
> > serialized, which IIUC would also make patch 3 unnecessary. What do
> > you think?
>
> In this patch series i have tried to do least amount of changes so it
> could be easily back-ported. It isn't pretty but it works reliably for
> me. Otherwise it would be fine to just disable preemption like it's done
> on LTS before 5.4 and rework preemption in new kernel releases.
>
> Kind regards,
>
> Vladimir
>
> >
> > Best regards,
> >
> > Connor


Re: [PATCH 0/4] fixes for Adreno A5Xx preemption

2024-07-17 Thread Connor Abbott
On Thu, Jul 11, 2024 at 11:10 AM Vladimir Lypak
 wrote:
>
> There are several issues with preemption on Adreno A5XX GPUs which
> render system unusable if more than one priority level is used. Those
> issues include persistent GPU faults and hangs, full UI lockups with
> idling GPU.
>
> ---
> Vladimir Lypak (4):
>   drm/msm/a5xx: disable preemption in submits by default
>   drm/msm/a5xx: properly clear preemption records on resume
>   drm/msm/a5xx: fix races in preemption evaluation stage
>   drm/msm/a5xx: workaround early ring-buffer emptiness check
>
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 18 ++
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 12 ++---
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 30 ---
>  3 files changed, 47 insertions(+), 13 deletions(-)
> ---
> base-commit: 523b23f0bee3014a7a752c9bb9f5c54f0eddae88
> --
> 2.45.2
>

Hi Vladimir,

While looking at preemption on a7xx, where the overall logic is pretty
much the same, and I've been seeing the same "soft lockups". However
even after porting patch 3, it turns out that's not enough because
there's a different race. The sequence of events is something like
this:

1. Medium-prio app A submits to ring 2.
2. Ring 0 retires its last job and we start a preemption to ring 2.
3. High-prio app B submits to ring 0. It sees the preemption from step
2 ongoing and doesn't write the WTPR register or try to preempt.
4. The preemption finishes and we update WPTR.
5. App A's submit retires. We try to preempt, but the submit and
ring->cur write from step 3 happened on a different CPU and the write
hasn't landed yet so we skip it.

It's a bit tricky because write reordering is involved, but this seems
to be what's happening - everything except my speculation about the
delayed write to ring->cur being the problem comes straight from a
trace of this happening.

Rob suggested on IRC that we make retire handling happen on the same
workqueue as submissions, so that preempt_trigger is always
serialized, which IIUC would also make patch 3 unnecessary. What do
you think?

Best regards,

Connor


[PATCH v2 2/3] drm/msm: Expand UBWC config setting

2024-07-03 Thread Connor Abbott
According to downstream we should be setting RBBM_NC_MODE_CNTL to a
non-default value on a663 and a680, we don't support a663 and on a680
we're leaving it at the wrong (suboptimal) value. Just set it on all
GPUs. Similarly, plumb through level2_swizzling_dis which will be
necessary on a663.

ubwc_mode is expanded and renamed to ubwc_swizzle to match the name on
the display side. Similarly macrotile_mode should match the display
side.

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  4 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 34 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 32 ++-
 3 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c003f970189b..33b0f607f913 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1788,5 +1788,9 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
else
adreno_gpu->ubwc_config.highest_bank_bit = 14;
 
+   /* a5xx only supports UBWC 1.0, these are not configurable */
+   adreno_gpu->ubwc_config.macrotile_mode = 0;
+   adreno_gpu->ubwc_config.ubwc_swizzle = 0x7;
+
return gpu;
 }
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index bcaec86ac67a..7c2fdd1e7684 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -493,24 +493,17 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
 
 static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
 {
-   /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
gpu->ubwc_config.rgb565_predicator = 0;
-   /* Unknown, introduced with A650 family */
gpu->ubwc_config.uavflagprd_inv = 0;
-   /* Whether the minimum access length is 64 bits */
gpu->ubwc_config.min_acc_len = 0;
-   /* Entirely magic, per-GPU-gen value */
-   gpu->ubwc_config.ubwc_mode = 0;
-   /*
-* The Highest Bank Bit value represents the bit of the highest DDR 
bank.
-* This should ideally use DRAM type detection.
-*/
+   gpu->ubwc_config.ubwc_swizzle = 0x6;
+   gpu->ubwc_config.macrotile_mode = 0;
gpu->ubwc_config.highest_bank_bit = 15;
 
if (adreno_is_a610(gpu)) {
gpu->ubwc_config.highest_bank_bit = 13;
gpu->ubwc_config.min_acc_len = 1;
-   gpu->ubwc_config.ubwc_mode = 1;
+   gpu->ubwc_config.ubwc_swizzle = 0x7;
}
 
if (adreno_is_a618(gpu))
@@ -536,6 +529,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.amsbc = 1;
gpu->ubwc_config.rgb565_predicator = 1;
gpu->ubwc_config.uavflagprd_inv = 2;
+   gpu->ubwc_config.macrotile_mode = 1;
}
 
if (adreno_is_7c3(gpu)) {
@@ -543,12 +537,12 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.amsbc = 1;
gpu->ubwc_config.rgb565_predicator = 1;
gpu->ubwc_config.uavflagprd_inv = 2;
+   gpu->ubwc_config.macrotile_mode = 1;
}
 
if (adreno_is_a702(gpu)) {
gpu->ubwc_config.highest_bank_bit = 14;
gpu->ubwc_config.min_acc_len = 1;
-   gpu->ubwc_config.ubwc_mode = 0;
}
 }
 
@@ -564,21 +558,26 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
u32 hbb = adreno_gpu->ubwc_config.highest_bank_bit - 13;
u32 hbb_hi = hbb >> 2;
u32 hbb_lo = hbb & 3;
+   u32 ubwc_mode = adreno_gpu->ubwc_config.ubwc_swizzle & 1;
+   u32 level2_swizzling_dis = !(adreno_gpu->ubwc_config.ubwc_swizzle & 2);
 
gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
+ level2_swizzling_dis << 12 |
  adreno_gpu->ubwc_config.rgb565_predicator << 11 |
  hbb_hi << 10 | adreno_gpu->ubwc_config.amsbc << 4 |
  adreno_gpu->ubwc_config.min_acc_len << 3 |
- hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
+ hbb_lo << 1 | ubwc_mode);
 
-   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
+   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL,
+ level2_swizzling_dis << 6 | hbb_hi << 4 |
  adreno_gpu->ubwc_config.min_acc_len << 3 |
- hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
+ hbb_lo << 1 | ubwc_mode);
 
-   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
+   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
+ level2_swizzling_dis << 

[PATCH v2 3/3] drm/msm: Expose expanded UBWC config uapi

2024-07-03 Thread Connor Abbott
This adds extra parameters that affect UBWC tiling that will be used by
the Mesa implementation of VK_EXT_host_image_copy.

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 ++
 include/uapi/drm/msm_drm.h  | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 1c6626747b98..a4d3bc2de8df 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -379,6 +379,12 @@ int adreno_get_param(struct msm_gpu *gpu, struct 
msm_file_private *ctx,
case MSM_PARAM_RAYTRACING:
*value = adreno_gpu->has_ray_tracing;
return 0;
+   case MSM_PARAM_UBWC_SWIZZLE:
+   *value = adreno_gpu->ubwc_config.ubwc_swizzle;
+   return 0;
+   case MSM_PARAM_MACROTILE_MODE:
+   *value = adreno_gpu->ubwc_config.macrotile_mode;
+   return 0;
default:
DBG("%s: invalid param: %u", gpu->name, param);
return -EINVAL;
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 3fca72f73861..2377147b6af0 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -88,6 +88,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_VA_SIZE0x0f  /* RO: size of valid GPU iova range (bytes) 
*/
 #define MSM_PARAM_HIGHEST_BANK_BIT 0x10 /* RO */
 #define MSM_PARAM_RAYTRACING 0x11 /* RO */
+#define MSM_PARAM_UBWC_SWIZZLE 0x12 /* RO */
+#define MSM_PARAM_MACROTILE_MODE 0x13 /* RO */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #

-- 
2.31.1



[PATCH v2 1/3] drm/msm: Update a6xx register XML

2024-07-03 Thread Connor Abbott
Update to Mesa commit 81fd13913a97 ("freedreno: Fix RBBM_NC_MODE_CNTL
variants").

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1617 -
 1 file changed, 1603 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml 
b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
index 2dfe6913ab4f..53a453228427 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
@@ -1198,6 +1198,1552 @@ to upconvert to 32b float 

[PATCH v2 0/3] drm/msm: Further expose UBWC tiling parameters

2024-07-03 Thread Connor Abbott
After testing, there are more parameters that we're programming which
affect how UBWC tiles are laid out in memory and therefore affect
the Mesa implementation of VK_EXT_host_image_copy [1], which includes a
CPU implementation of tiling and detiling images. In particular we have:

1. ubwc_mode, which is used to enable level 1 bank swizzling to go back
   to UBWC 1.0 when the implementation supports UBWC 2.0. a610 sets
   this.
2. macrotile_mode, which we previously left as default but according to
   downstream we shouldn't for a680.
3. level2_swizzling_dis, which according to downstream has to be set
   differently for a663.

I want as much as possible to avoid problems from people trying to
upstream Mesa/kernel support not knowing what they're doing and blindly
copying things, so let's make this very explicit that you must set the
correct parameters in the kernel and then make sure that Mesa always
gets the right parameters from the "source of truth" in the kernel by
adding two new UAPI parameters. The Mesa MR has already been updated to
use this if available.

A secondary goal is to make the adreno settings look more like the MDSS
settings, by combining ubwc_mode and level2_swizzling_dis into a single
ubwc_swizzle parameter that matches the MDSS one. This will help with
creating a single source of truth for all drivers later. The UAPI also
matches this, and it makes the Mesa tiling and detiling implementation
simpler/more straightforward.

For more information on what all these parameters mean, see the comments
I've added in the first commit and the giant comment in
src/freedreno/fdl/fd6_tiled_memcpy.c I've added in [1].

Testing of the Mesa MR both with and without this series is appreciated,
there are many different SoCs out there with different UBWC
configurations and I cannot test them all.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26578

Signed-off-by: Connor Abbott 
---
Changes in v2:
- Move ubwc_config field descriptions to kerneldoc comments on the struct
- Link to v1: 
https://lore.kernel.org/r/20240702-msm-tiling-config-v1-0-adaa6a6e4...@gmail.com

---
Connor Abbott (3):
  drm/msm: Update a6xx register XML
  drm/msm: Expand UBWC config setting
  drm/msm: Expose expanded UBWC config uapi

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |4 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |   34 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |6 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   32 +-
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1617 -
 include/uapi/drm/msm_drm.h|2 +
 6 files changed, 1664 insertions(+), 31 deletions(-)
---
base-commit: 269b88cb92e62e52718cd44c07b7517265193157
change-id: 20240701-msm-tiling-config-c5f222f5db1c

Best regards,
-- 
Connor Abbott 



Re: [PATCH 2/3] drm/msm: Expand UBWC config setting

2024-07-02 Thread Connor Abbott
On Tue, Jul 2, 2024 at 3:31 PM Rob Clark  wrote:
>
> On Tue, Jul 2, 2024 at 5:56 AM Connor Abbott  wrote:
> >
> > According to downstream we should be setting RBBM_NC_MODE_CNTL to a
> > non-default value on a663 and a680, we don't support a663 and on a680
> > we're leaving it at the wrong (suboptimal) value. Just set it on all
> > GPUs. Similarly, plumb through level2_swizzling_dis which will be
> > necessary on a663.
> >
> > ubwc_mode is expanded and renamed to ubwc_swizzle to match the name on
> > the display side. Similarly macrotile_mode should match the display
> > side.
> >
> > Signed-off-by: Connor Abbott 
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  4 
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 36 
> > -
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 ++-
> >  3 files changed, 33 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index c003f970189b..33b0f607f913 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -1788,5 +1788,9 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
> > else
> > adreno_gpu->ubwc_config.highest_bank_bit = 14;
> >
> > +   /* a5xx only supports UBWC 1.0, these are not configurable */
> > +   adreno_gpu->ubwc_config.macrotile_mode = 0;
> > +   adreno_gpu->ubwc_config.ubwc_swizzle = 0x7;
> > +
> > return gpu;
> >  }
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index c98cdb1e9326..7a3564dd7941 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -499,8 +499,17 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu 
> > *gpu)
> > gpu->ubwc_config.uavflagprd_inv = 0;
> > /* Whether the minimum access length is 64 bits */
> > gpu->ubwc_config.min_acc_len = 0;
> > -   /* Entirely magic, per-GPU-gen value */
> > -   gpu->ubwc_config.ubwc_mode = 0;
> > +   /* Whether to enable level 1, 2 & 3 bank swizzling.
> > +* UBWC 1.0 always enables all three levels.
> > +* UBWC 2.0 removes level 1 bank swizzling, leaving levels 2 & 3.
> > +* UBWC 4.0 adds the optional ability to disable levels 2 & 3.
>
> I guess this is a bitmask for BIT(level_n)?

Yes, I'll add that to the comment. BIT(0) is level 1, BIT(1) is level
2, and BIT(2) is level 3. I got that convention from msm_mdss.c.

>
> > +*/
> > +   gpu->ubwc_config.ubwc_swizzle = 0x6;
> > +   /* Whether to use 4-channel macrotiling mode or the newer 8-channel
> > +* macrotiling mode introduced in UBWC 3.1. 0 is 4-channel and 1 is
> > +* 8-channel.
> > +*/
>
> Can we add these comments as kerneldoc comments in the ubwc_config
> struct?  That would be a more natural place for eventually moving ubwc
> config to a central systemwide table (and perhaps finally properly
> dealing with the setting differences for DDR vs LPDDR)

Sure. These comments started right next to the code setting the
registers and sort of naturally migrated here but I guess that's
better.

FWIW, I think that in a central systemwide table we'd want to define
the config struct slightly differently, by instead storing the UBWC
version and deriving most of these parameters from that as kgsl
downstream and mdss upstream do. There would be a few extra parameters
that remain:

- highest bank bit
- minimum access length
- levels 2 & 3 bank swizzling enable/disable (level 1 can be inferred
from the version, but maybe we still want to have it to keep all the
bank swizzle config in one place?)

Everybody seems to also have macrotile_mode as a separate parameter,
but that can be avoided by adding UBWC 3.1 and then deriving it from
"ubwc_version >= 3.1".

I haven't taken that step here in adreno because I didn't want to
define UBWC versions in UABI yet when it's not necessary, and if we
don't have that then it's not quite necessary to refactor the driver
yet.

Connor

>
> BR,
> -R
>
> > +   gpu->ubwc_config.macrotile_mode = 0;
> > /*
> >  * The Highest Bank Bit value represents the bit of the highest DDR 
> > bank.
> >  * This should ideally use DRAM type detection.
> > @@ -510,7 +519,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu 
> > *gpu)
> > if (adreno_is_a610(gpu)) {
> >

[PATCH 1/3] drm/msm: Update a6xx register XML

2024-07-02 Thread Connor Abbott
Update to Mesa commit 81fd13913a97 ("freedreno: Fix RBBM_NC_MODE_CNTL
variants").

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1617 -
 1 file changed, 1603 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml 
b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
index 2dfe6913ab4f..53a453228427 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
@@ -1198,6 +1198,1552 @@ to upconvert to 32b float 

[PATCH 3/3] drm/msm: Expose expanded UBWC config uapi

2024-07-02 Thread Connor Abbott
This adds extra parameters that affect UBWC tiling that will be used by
the Mesa implementation of VK_EXT_host_image_copy.

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 6 ++
 include/uapi/drm/msm_drm.h  | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 1c6626747b98..a4d3bc2de8df 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -379,6 +379,12 @@ int adreno_get_param(struct msm_gpu *gpu, struct 
msm_file_private *ctx,
case MSM_PARAM_RAYTRACING:
*value = adreno_gpu->has_ray_tracing;
return 0;
+   case MSM_PARAM_UBWC_SWIZZLE:
+   *value = adreno_gpu->ubwc_config.ubwc_swizzle;
+   return 0;
+   case MSM_PARAM_MACROTILE_MODE:
+   *value = adreno_gpu->ubwc_config.macrotile_mode;
+   return 0;
default:
DBG("%s: invalid param: %u", gpu->name, param);
return -EINVAL;
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 3fca72f73861..2377147b6af0 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -88,6 +88,8 @@ struct drm_msm_timespec {
 #define MSM_PARAM_VA_SIZE0x0f  /* RO: size of valid GPU iova range (bytes) 
*/
 #define MSM_PARAM_HIGHEST_BANK_BIT 0x10 /* RO */
 #define MSM_PARAM_RAYTRACING 0x11 /* RO */
+#define MSM_PARAM_UBWC_SWIZZLE 0x12 /* RO */
+#define MSM_PARAM_MACROTILE_MODE 0x13 /* RO */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #

-- 
2.31.1



[PATCH 2/3] drm/msm: Expand UBWC config setting

2024-07-02 Thread Connor Abbott
According to downstream we should be setting RBBM_NC_MODE_CNTL to a
non-default value on a663 and a680, we don't support a663 and on a680
we're leaving it at the wrong (suboptimal) value. Just set it on all
GPUs. Similarly, plumb through level2_swizzling_dis which will be
necessary on a663.

ubwc_mode is expanded and renamed to ubwc_swizzle to match the name on
the display side. Similarly macrotile_mode should match the display
side.

Signed-off-by: Connor Abbott 
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  4 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 36 -
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 ++-
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c003f970189b..33b0f607f913 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1788,5 +1788,9 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
else
adreno_gpu->ubwc_config.highest_bank_bit = 14;
 
+   /* a5xx only supports UBWC 1.0, these are not configurable */
+   adreno_gpu->ubwc_config.macrotile_mode = 0;
+   adreno_gpu->ubwc_config.ubwc_swizzle = 0x7;
+
return gpu;
 }
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c98cdb1e9326..7a3564dd7941 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -499,8 +499,17 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.uavflagprd_inv = 0;
/* Whether the minimum access length is 64 bits */
gpu->ubwc_config.min_acc_len = 0;
-   /* Entirely magic, per-GPU-gen value */
-   gpu->ubwc_config.ubwc_mode = 0;
+   /* Whether to enable level 1, 2 & 3 bank swizzling.
+* UBWC 1.0 always enables all three levels.
+* UBWC 2.0 removes level 1 bank swizzling, leaving levels 2 & 3.
+* UBWC 4.0 adds the optional ability to disable levels 2 & 3.
+*/
+   gpu->ubwc_config.ubwc_swizzle = 0x6;
+   /* Whether to use 4-channel macrotiling mode or the newer 8-channel
+* macrotiling mode introduced in UBWC 3.1. 0 is 4-channel and 1 is
+* 8-channel.
+*/
+   gpu->ubwc_config.macrotile_mode = 0;
/*
 * The Highest Bank Bit value represents the bit of the highest DDR 
bank.
 * This should ideally use DRAM type detection.
@@ -510,7 +519,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
if (adreno_is_a610(gpu)) {
gpu->ubwc_config.highest_bank_bit = 13;
gpu->ubwc_config.min_acc_len = 1;
-   gpu->ubwc_config.ubwc_mode = 1;
+   gpu->ubwc_config.ubwc_swizzle = 0x7;
}
 
if (adreno_is_a618(gpu))
@@ -536,6 +545,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.amsbc = 1;
gpu->ubwc_config.rgb565_predicator = 1;
gpu->ubwc_config.uavflagprd_inv = 2;
+   gpu->ubwc_config.macrotile_mode = 1;
}
 
if (adreno_is_7c3(gpu)) {
@@ -543,12 +553,12 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.amsbc = 1;
gpu->ubwc_config.rgb565_predicator = 1;
gpu->ubwc_config.uavflagprd_inv = 2;
+   gpu->ubwc_config.macrotile_mode = 1;
}
 
if (adreno_is_a702(gpu)) {
gpu->ubwc_config.highest_bank_bit = 14;
gpu->ubwc_config.min_acc_len = 1;
-   gpu->ubwc_config.ubwc_mode = 0;
}
 }
 
@@ -564,21 +574,26 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
u32 hbb = adreno_gpu->ubwc_config.highest_bank_bit - 13;
u32 hbb_hi = hbb >> 2;
u32 hbb_lo = hbb & 3;
+   u32 ubwc_mode = adreno_gpu->ubwc_config.ubwc_swizzle & 1;
+   u32 level2_swizzling_dis = !(adreno_gpu->ubwc_config.ubwc_swizzle & 2);
 
gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
+ level2_swizzling_dis << 12 |
  adreno_gpu->ubwc_config.rgb565_predicator << 11 |
  hbb_hi << 10 | adreno_gpu->ubwc_config.amsbc << 4 |
  adreno_gpu->ubwc_config.min_acc_len << 3 |
- hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
+ hbb_lo << 1 | ubwc_mode);
 
-   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
+   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL,
+ level2_swizzling_dis << 6 | hbb_hi << 4 |
  adreno_gpu->ubwc_config.min_acc_len << 3 |
- hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
+ hb

[PATCH 0/3] drm/msm: Further expose UBWC tiling parameters

2024-07-02 Thread Connor Abbott
After testing, there are more parameters that we're programming which
affect how UBWC tiles are laid out in memory and therefore affect
the Mesa implementation of VK_EXT_host_image_copy [1], which includes a
CPU implementation of tiling and detiling images. In particular we have:

1. ubwc_mode, which is used to enable level 1 bank swizzling to go back
   to UBWC 1.0 when the implementation supports UBWC 2.0. a610 sets
   this.
2. macrotile_mode, which we previously left as default but according to
   downstream we shouldn't for a680.
3. level2_swizzling_dis, which according to downstream has to be set
   differently for a663.

I want as much as possible to avoid problems from people trying to
upstream Mesa/kernel support not knowing what they're doing and blindly
copying things, so let's make this very explicit that you must set the
correct parameters in the kernel and then make sure that Mesa always
gets the right parameters from the "source of truth" in the kernel by
adding two new UAPI parameters. The Mesa MR has already been updated to
use this if available.

A secondary goal is to make the adreno settings look more like the MDSS
settings, by combining ubwc_mode and level2_swizzling_dis into a single
ubwc_swizzle parameter that matches the MDSS one. This will help with
creating a single source of truth for all drivers later. The UAPI also
matches this, and it makes the Mesa tiling and detiling implementation
simpler/more straightforward.

For more information on what all these parameters mean, see the comments
I've added in the first commit and the giant comment in
src/freedreno/fdl/fd6_tiled_memcpy.c I've added in [1].

Testing of the Mesa MR both with and without this series is appreciated,
there are many different SoCs out there with different UBWC
configurations and I cannot test them all.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26578

Signed-off-by: Connor Abbott 
---
Connor Abbott (3):
  drm/msm: Update a6xx register XML
  drm/msm: Expand UBWC config setting
  drm/msm: Expose expanded UBWC config uapi

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |4 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |   36 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |6 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |3 +-
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1617 -
 include/uapi/drm/msm_drm.h|2 +
 6 files changed, 1644 insertions(+), 24 deletions(-)
---
base-commit: c39e6f4f08c40710c15a372f5a29de7b84f47fd9
change-id: 20240701-msm-tiling-config-c5f222f5db1c

Best regards,
-- 
Connor Abbott 



Re: [PATCH] drm/msm/adreno: fix a743 and a740 cx mem init

2024-06-26 Thread Connor Abbott
On Wed, Jun 26, 2024 at 1:04 PM Neil Armstrong
 wrote:
>
> Disable the call to qcom_scm_gpu_init_regs() for a730 and a740
> after init failures on the HDK8550 and HDK8450 platforms:
> msm_dpu ae01000.display-controller: failed to load adreno gpu
> msm_dpu ae01000.display-controller: failed to bind 3d0.gpu (ops a3xx_ops 
> [msm]): -5
> msm_dpu ae01000.display-controller: adev bind failed: -5
>
> While debugging, it happens the call to:
> qcom_scm_gpu_init_regs(QCOM_SCM_GPU_ALWAYS_EN_REQ)
> returns -5 and makes the gpu fail to initialize.
>
> Remove the scm call since it's not done downstream either and
> works fine without.
>
> Fixes: 14b27d5df3ea ("drm/msm/a7xx: Initialize a750 "software fuse"")
> Signed-off-by: Neil Armstrong 
> ---

Reviewed-by: Connor Abbott 


Re: [PATCH v2 5/7] drm/msm/adreno: Add A702 support

2024-05-23 Thread Connor Abbott
On Fri, Feb 23, 2024 at 9:28 PM Konrad Dybcio  wrote:
>
> The A702 is a weird mix of 600 and 700 series.. Perhaps even a
> testing ground for some A7xx features with good ol' A6xx silicon.
> It's basically A610 that's been beefed up with some new registers
> and hw features (like APRIV!), that was then cut back in size,
> memory bus and some other ways.
>
> Add support for it, tested with QCM2290 / RB1.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 92 
> +++---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 18 ++
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h| 16 +-
>  3 files changed, 117 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c9c55e2ea584..2a491a486ca1 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -837,6 +837,65 @@ const struct adreno_reglist a690_hwcg[] = {
> {}
>  };
>
> +const struct adreno_reglist a702_hwcg[] = {
> +   { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0081 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0120 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422 },
> +   { REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x },
> +   { REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0222 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002 },
> +   { REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002 },
> +   { REG_A6XX_RBBM_ISDB_CNT, 0x0182 },
> +   { REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x },
> +   { REG_A6XX_RBBM_SP_HYST_CNT, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111 },
> +   { REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555 },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_FCHE, 0x0222 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_FCHE, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_FCHE, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_GLC, 0x0022 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_GLC, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_GLC, 0x },
> +   { REG_A6XX_RBBM_CLOCK_CNTL_MHUB, 0x0002 },
> +   { REG_A6XX_RBBM_CLOCK_DELAY_MHUB, 0x },
> +   { REG_A6XX_RBBM_CLOCK_HYST_MHUB, 0x },
> +   {}
> +};
> +
>  const struct adreno_reglist a730_hwcg[] = {
> { REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x0222 },
> { REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0202 },
> @@ -968,6 +1027,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool 
> state)
> clock_cntl_on = 0x8aa8aa02;
> else if (adreno_is_a610(adreno_gpu))
> clock_cntl_on = 0xaaa8aa82;
> +   else if (adreno_is_a702(adreno_gpu))
> +   clock_cntl_on = 0xaa82;
> else
> clock_cntl_on = 0x8aa8aa82;
>
> @@ -989,14 +1050,14 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool 
> state)
> return;
>
> /* Disable SP clock before programming HWCG registers */
> -   if 

Re: [PATCH v5 09/18] drm/msm: import A6xx XML display registers database

2024-04-24 Thread Connor Abbott
On Mon, Apr 1, 2024 at 3:52 AM Dmitry Baryshkov
 wrote:
>
> Import Adreno registers database for A6xx from the Mesa, commit
> 639488f924d9 ("freedreno/registers: limit the rules schema").
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 4970 
> +
>  drivers/gpu/drm/msm/registers/adreno/a6xx_gmu.xml |  228 +
>  2 files changed, 5198 insertions(+)
>

FYI, this will conflict with a series I will send out soon based on
the register updates in [1]. Is there any chance to update this before
it lands in msm-next?

Best regards,

Connor

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28883.


Re: drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used

2024-03-26 Thread Connor Abbott
On Tue, Mar 26, 2024 at 7:47 PM Dmitry Baryshkov
 wrote:
>
> On Tue, 26 Mar 2024 at 21:32, Abhinav Kumar  wrote:
> >
> >
> >
> > On 3/26/2024 12:10 PM, Dmitry Baryshkov wrote:
> > > On Tue, 26 Mar 2024 at 20:31, Abhinav Kumar  
> > > wrote:
> > >>
> > >>
> > >>
> > >> On 3/26/2024 11:19 AM, Dmitry Baryshkov wrote:
> > >>> On Tue, 26 Mar 2024 at 20:05, Miguel Ojeda
> > >>>  wrote:
> > 
> >  Hi,
> > 
> >  In today's next, I got:
> > 
> >    drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: 
> >  variable
> >  'out' set but not used [-Werror,-Wunused-but-set-variable]
> > 
> >  `out` seems to be there since commit 64d6255650d4 ("drm/msm: More
> >  fully implement devcoredump for a7xx").
> > 
> >  Untested diff below assuming `dumper->iova` is constant -- if you want
> >  a formal patch, please let me know.
> > >>>
> > >>> Please send a proper patch that we can pick up.
> > >>>
> > >>
> > >> This should be fixed with 
> > >> https://patchwork.freedesktop.org/patch/581853/.
> > >
> > > Is that a correct fix? If you check other usage locations for
> > > CRASHDUMP_READ, you'll see that `out` is the last parameter and it is
> > > being incremented.
> > >
> >
> > Right but in this function out is not the last parameter of CRASHDUMP_READ.
>
> Yes. I think in this case the patch from this email is more correct.

Yes, this patch is more correct than the other one. I tried to fix a
bug with a6xx that I noticed while adding support for a7xx, which I
forgot to split out from "drm/msm: More fully implement devcoredump
for a7xx" into a separate commit, and this hunk was missing. Sorry
about that.

Connor


Re: [PATCH] drm/msm/a6xx: specify UBWC config for sc7180

2024-02-21 Thread Connor Abbott
On Tue, Feb 20, 2024 at 5:12 PM Dmitry Baryshkov
 wrote:
>
> Historically the Adreno driver has not been updating memory
> configuration registers on a618 (SC7180 platform) implying that the
> default configuration is fine. After the rework performed in the commit
> 8814455a0e54 ("drm/msm: Refactor UBWC config setting") the function
> a6xx_calc_ubwc_config() still contained this shortcut and did not
> calculate UBWC configuration. However the function which now actually
> updates hardware registers, a6xx_set_ubwc_config(), doesn't contain such
> check.
>
> Rather than adding the check to a6xx_set_ubwc_config(), fill in the
> UBWC config for a618 (based on readings from SC7180).
>
> Reported-by: Leonard Lausen 
> Link: https://gitlab.freedesktop.org/drm/msm/-/issues/49
> Fixes: 8814455a0e54 ("drm/msm: Refactor UBWC config setting")
> Cc: Connor Abbott 
> Signed-off-by: Dmitry Baryshkov 

Thanks!

Reviewed-by: Connor Abbott 

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c9c55e2ea584..dc80e5940f51 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1292,9 +1292,8 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu 
> *gpu)
> gpu->ubwc_config.ubwc_mode = 1;
> }
>
> -   /* a618 is using the hw default values */
> if (adreno_is_a618(gpu))
> -   return;
> +   gpu->ubwc_config.highest_bank_bit = 14;
>
> if (adreno_is_a619_holi(gpu))
> gpu->ubwc_config.highest_bank_bit = 13;
>
> ---
> base-commit: 41c177cf354126a22443b5c80cec9fdd313e67e1
> change-id: 20240220-fd-sc7180-explicit-ubwc-40953fa55947
>
> Best regards,
> --
> Dmitry Baryshkov 
>


Re: [PATCH] drm/msm/a6xx: skip programming of UBWC registers for a618

2024-02-21 Thread Connor Abbott
On Sun, Feb 18, 2024 at 1:44 PM Dmitry Baryshkov
 wrote:
>
> Historically the Adreno driver has not been updating memory
> configuration registers on a618 (SC7180 platform) implying that the
> default configuration is fine. After the rework performed in the commit
> 8814455a0e54 ("drm/msm: Refactor UBWC config setting") the function
> a6xx_calc_ubwc_config() still contained this shortcut and did not
> calculate UBWC configuration. However the function which now actually
> updates hardware registers, a6xx_set_ubwc_config(), doesn't contain such
> check. Thus it ends up rewriting hardware registers with the default
> (incorrect) values. Add the !a618 check to this function.
>
> Reported-by: Leonard Lausen 
> Link: https://gitlab.freedesktop.org/drm/msm/-/issues/49
> Fixes: 8814455a0e54 ("drm/msm: Refactor UBWC config setting")
> Cc: Connor Abbott 
> Signed-off-by: Dmitry Baryshkov 
> ---
>
> Note, a proper fix would be to incorporate actual values for sc7180
> and drop the a618 shortcuts. However it might take some time to
> materialize and to be properly tested. As such, I propose to merge this
> for 6.8, keeping the existing behaviour.

If we do this then there's a chance that 6.8 will report a broken
value for MSM_PARAM_HIGHEST_BANK_BIT, which is going to require
otherwise unnecessary workarounds in [1] for quite a while once I fix
up a618 support. Can you at least dump what the default is to make
sure that the value we report matches what's being programmed?

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26578

Connor


>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c9c55e2ea584..07d60dfacd23 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1325,6 +1325,11 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu 
> *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +
> +   /* a618 is using the hw default values */
> +   if (adreno_is_a618(gpu))
> +   return;
> +
> /*
>  * We subtract 13 from the highest bank bit (13 is the minimum value
>  * allowed by hw) and write the lowest two bits of the remaining value
> --
> 2.39.2
>