On 7/29/2025 6:31 PM, Konrad Dybcio wrote:
> On 7/24/25 6:54 PM, Akhil P Oommen wrote:
>> On 7/24/2025 5:16 PM, Konrad Dybcio wrote:
>>> On 7/23/25 11:06 PM, Akhil P Oommen wrote:
>>>> On 7/22/2025 8:22 PM, Konrad Dybcio wrote:
>>>>> On 7/22/25 3:39 PM, Dmitry Baryshkov wrote:
>>>>>> On Sun, Jul 20, 2025 at 05:46:08PM +0530, Akhil P Oommen wrote:
>>>>>>> There are some special registers which are accessible even when GX power
>>>>>>> domain is collapsed during an IFPC sleep. Accessing these registers
>>>>>>> wakes up GPU from power collapse and allow programming these registers
>>>>>>> without additional handshake with GMU. This patch adds support for this
>>>>>>> special register write sequence.
>>>>>>>
>>>>>>> Signed-off-by: Akhil P Oommen <akhi...@oss.qualcomm.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63
>>>>>>> ++++++++++++++++++++++++++++++-
>>>>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
>>>>>>> drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 20 +++++-----
>>>>>>> 3 files changed, 73 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>>>> index
>>>>>>> 491fde0083a202bec7c6b3bca88d0e5a717a6560..8c004fc3abd2896d467a9728b34e99e4ed944dc4
>>>>>>> 100644
>>>>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>>>>> @@ -16,6 +16,67 @@
>>>>>>>
>>>>>>> #define GPU_PAS_ID 13
>>>>>>>
>>>>>>> +static bool fence_status_check(struct msm_gpu *gpu, u32 offset, u32
>>>>>>> value, u32 status, u32 mask)
>>>>>>> +{
>>>>>>> + /* Success if !writedropped0/1 */
>>>>>>> + if (!(status & mask))
>>>>>>> + return true;
>>>>>>> +
>>>>>>> + udelay(10);
>>>>>>
>>>>>> Why do we need udelay() here? Why can't we use interval setting inside
>>>>>> gmu_poll_timeout()?
>>>>>
>>>>> Similarly here:
>>>>>
>>>>> [...]
>>>>>
>>>>>>> + if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS,
>>>>>>> status,
>>>>>>> + fence_status_check(gpu, offset, value, status,
>>>>>>> mask), 0, 1000))
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + dev_err_ratelimited(gmu->dev, "delay in fenced register write
>>>>>>> (0x%x)\n",
>>>>>>> + offset);

This print should come after the 2nd polling. Otherwise, the time spent
printing may allow the GPU to go back to IFPC before the retry.

>>>>>>> +
>>>>>>> + /* Try again for another 1ms before failing */
>>>>>>> + gpu_write(gpu, offset, value);
>>>>>>> + if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS,
>>>>>>> status,
>>>>>>> + fence_status_check(gpu, offset, value, status,
>>>>>>> mask), 0, 1000))
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + dev_err_ratelimited(gmu->dev, "fenced register write (0x%x)
>>>>>>> fail\n",
>>>>>>> + offset);
>>>>>
>>>>> We may want to combine the two, so as not to worry the user too much..
>>>>>
>>>>> If it's going to fail, I would assume it's going to fail both checks
>>>>> (unless e.g. the bus is so congested a single write can't go through
>>>>> to a sleepy GPU across 2 miliseconds, but that's another issue)
>>>>
>>>> In case of success, we cannot be sure if the first write went through.
>>>> So we should poll separately.
>>>
>>> You're writing to it 2 (outside fence_status_check) + 2*1000/10 (inside)
>>> == 202 times, it really better go through..
>>
>> For the following sequence:
>> 1. write reg1 <- suppose this is dropped
>> 2. write reg2 <- and this went through
>> 3. Check fence status <- This will show success
>
> What I'm saying is that fence_status_check() does the same write you
> execute inbetween the polling calls

On second thought, I think it is simpler to use a single 2 ms poll and
measure the time taken with ktime, printing a warning if it took more
than 1 ms.

-Akhil.

>
> Konrad
>>
>>>
>>> If it's just about the write reaching the GPU, you can write it once and
>>> read back the register you've written to, this way you're sure that the
>>> GPU can observe the write
>>
>> This is a very unique hw behavior. We can't do posted write.
>>
>> -Akhil
>>
>>>
>>> Konrad
>>
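
To make the single-poll idea above concrete, here is a rough userspace sketch, not the driver change itself. It assumes a simulated clock and a simulated GPU that drops writes until it has woken up; every name in it (sim_now_us, sim_udelay, struct sim_gpu, sim_write, fenced_write, the FENCED_* results) is hypothetical and only stands in for ktime, udelay(), the fence status bit and gpu_write(). The point is just the control flow: one poll loop bounded at 2 ms that retries the write every 10 us, with the elapsed time checked once at the end so nothing is printed while the retry window is still open.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simulated monotonic clock in microseconds (stand-in for ktime). */
static uint64_t sim_now_us;

/* Simulated busy-wait (stand-in for udelay()): just advances the clock. */
static void sim_udelay(uint64_t us)
{
	sim_now_us += us;
}

/* Hypothetical hardware model, not the real GMU/GPU interface. */
struct sim_gpu {
	uint64_t wake_at_us;   /* writes issued before this time are dropped */
	uint32_t reg;          /* last value that actually landed */
	bool write_dropped;    /* models the "write dropped" fence status bit */
	unsigned int writes;   /* number of write attempts, for inspection */
};

/* Attempt a register write; it lands only once the GPU is awake. */
static void sim_write(struct sim_gpu *gpu, uint32_t value)
{
	gpu->writes++;
	if (sim_now_us < gpu->wake_at_us) {
		gpu->write_dropped = true;
	} else {
		gpu->reg = value;
		gpu->write_dropped = false;
	}
}

enum fenced_result { FENCED_OK, FENCED_OK_SLOW, FENCED_TIMEOUT };

#define POLL_INTERVAL_US 10    /* retry period, as in the quoted udelay(10) */
#define WARN_AFTER_US    1000  /* warn if success took longer than 1 ms */
#define TIMEOUT_US       2000  /* give up after 2 ms in total */

/* Single poll of up to 2 ms; the elapsed time decides warn vs. silent. */
static enum fenced_result fenced_write(struct sim_gpu *gpu, uint32_t value)
{
	uint64_t start = sim_now_us;

	sim_write(gpu, value);
	while (gpu->write_dropped) {
		if (sim_now_us - start >= TIMEOUT_US) {
			fprintf(stderr, "fenced register write failed\n");
			return FENCED_TIMEOUT;
		}
		sim_udelay(POLL_INTERVAL_US);
		sim_write(gpu, value); /* retry the same write */
	}

	/* Warn only after the write has landed, never between retries. */
	if (sim_now_us - start > WARN_AFTER_US) {
		fprintf(stderr, "delay in fenced register write (%llu us)\n",
			(unsigned long long)(sim_now_us - start));
		return FENCED_OK_SLOW;
	}
	return FENCED_OK;
}
```

With this model, a GPU that is already awake returns FENCED_OK after one write, one that wakes after 1.5 ms returns FENCED_OK_SLOW with a single warning, and one that never wakes inside the window returns FENCED_TIMEOUT. Whether the real code should keep the separate "delay" message at all is a judgment call for the next revision.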