Re: [PATCH] drm/amd/display: fix if == else warning
Applied. Thanks! Alex On Sun, Apr 24, 2022 at 4:15 PM Liu, Zhan wrote: > > [AMD Official Use Only - General] > > > -Original Message- > > From: Guo Zhengkui > > Sent: 2022/April/24, Sunday 5:06 AM > > To: Wentland, Harry ; Li, Sun peng (Leo) > > ; Siqueira, Rodrigo ; > > Deucher, Alexander ; Koenig, Christian > > ; Pan, Xinhui ; David Airlie > > ; Daniel Vetter ; Liu, Charlene > > ; Lei, Jun ; Guo Zhengkui > > ; Liu, Zhan ; José Expósito > > ; open list:AMD DISPLAY CORE > g...@lists.freedesktop.org>; open list:DRM DRIVERS > de...@lists.freedesktop.org>; open list > > Cc: zhengkui_...@outlook.com > > Subject: [PATCH] drm/amd/display: fix if == else warning > > > > Fix the following coccicheck warning: > > > > drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c:98:8-10: > > WARNING: possible condition with no effect (if == else) > > > > Signed-off-by: Guo Zhengkui > > Thanks a lot for fixing this warning. > > Reviewed-by: Zhan Liu > > > --- > > drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c > > b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c > > index fe22530242d2..05b3fba9ccce 100644 > > --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c > > +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c > > @@ -95,8 +95,6 @@ static void gpu_addr_to_uma(struct dce_hwseq *hwseq, > > } else if (hwseq->fb_offset.quad_part <= addr->quad_part && > > addr->quad_part <= hwseq->uma_top.quad_part) { > > is_in_uma = true; > > - } else if (addr->quad_part == 0) { > > - is_in_uma = false; > > } else { > > is_in_uma = false; > > } > > -- > > 2.20.1 >
Re: [PATCH] drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend
[Public] Acked-by: Alex Deucher From: amd-gfx on behalf of Prike Liang Sent: Monday, April 25, 2022 2:52 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Lazar, Lijo ; Liang, Prike ; Huang, Ray Subject: [PATCH] drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend Without MMHUB clock gating being enabled then MMHUB will not disconnect from DF and will result in DF C-state entry can't be accessed during S2idle suspend, and eventually s0ix entry will be blocked. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index a455e59f41f4..20946bc7fc93 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -1151,6 +1151,16 @@ static int gmc_v10_0_set_clockgating_state(void *handle, int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + /* +* The issue mmhub can't disconnect from DF with MMHUB clock gating being disabled +* is a new problem observed at DF 3.0.3, however with the same suspend sequence not +* seen any issue on the DF 3.0.2 series platform. +*/ + if (adev->in_s0ix && adev->ip_versions[DF_HWIP][0] > IP_VERSION(3, 0, 2)) { + dev_dbg(adev->dev, "keep mmhub clock gating being enabled for s0ix\n"); + return 0; + } + r = adev->mmhub.funcs->set_clockgating(adev, state); if (r) return r; -- 2.25.1
RE: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
[Public] > -Original Message- > From: Demi Marie Obenour > Sent: Sunday, April 24, 2022 7:39 PM > To: Thorsten Leemhuis ; Greg KH > > Cc: amd-gfx@lists.freedesktop.org; regressi...@lists.linux.dev; > Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui > Subject: Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4 > > On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote: > > CCing the amdgpu maintainers > > > > On 24.04.22 08:12, Greg KH wrote: > > > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote: > > >> Two Qubes OS users reported that their AMD GPU systems did not > > >> work on 5.17.4, while 5.16.18 worked fine. Details can be found > > >> on https://github.com/QubesOS/qubes-issues/issues/7462. The > > >> initial report listed the GPU as “Advanced Micro Devices, Inc. > > >> [AMD/ATI] Renoir [1002:1636} (rev d3) (prog-if 00 [VGA > > >> controller])” and the CPU as “AMD Ryzen 5 PRO 4650U with Radeon > > >> Graphics”. > > >> > > >> #regzbot introduced: v5.16.18..v5.17.4 > > > > > > That's a big range, can you use 'git bisect' to track it down? > > > > FWIW, in another mail in this thread and > > https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment- > 11076 > > 81126 it was clarified that the problem was introduced between > > 5.17.3 and 5.17.4, where a few amdgpu changes were applied. Maybe > > they are the reason. > > > > Anyway: Yes, as Gregkh said, a bisection really would help. But > > maybe the amdgfx developers might already have an idea what might be > > wrong here? The QubesOS ticket linked above has some more details. > > The reporter was able to bisect the regression. I am not surprised > that this commit caused problems for Qubes OS, as dom0 in Qubes OS is > technically a guest of Xen. > > #regzbot ^introduced: b818a5d374542ccec73dcfe578a081574029820e Can you please follow up on the bug ticket: https://gitlab.freedesktop.org/drm/amd/-/issues/1985 It will be easier to track everything there. Alex
[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations
[Why] Allow for PSR SMU optimization and PSR multiple display optimization. [How] Add feature flags of PSR smu optimization and PSR multiple display optimiztaion, and set them during init sequence. By default, flags are disabled. Signed-off-by: David Zhang Reviewed-by: Harry Wentland Reviewed-by: Roman Li --- drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 741dae17562a..06f21e9008c6 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK { DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default }; enum DC_DEBUG_MASK { -- 2.25.1
RE: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations
[Public] Reviewed-by: Roman Li > -Original Message- > From: Zhang, Dingchen (David) > Sent: Monday, April 25, 2022 3:25 PM > To: amd-gfx@lists.freedesktop.org > Cc: Wentland, Harry ; Li, Sun peng (Leo) > ; Lakha, Bhawanpreet > ; Siqueira, Rodrigo > ; Pillai, Aurabindo ; > Zhuo, Qingqing (Lillian) ; Li, Roman > ; Lin, Wayne ; Wang, Chao-kai > (Stylon) ; Chiu, Solomon ; > Kotarac, Pavle ; Gutierrez, Agustin > ; Zuo, Jerry > Subject: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and > multi-display optimizations > > [Why] > Allow for PSR SMU optimization and PSR multiple display optimization. > > [How] > Add feature flags of PSR smu optimization and PSR multiple display > optimiztaion, and set them during init sequence. By default, flags are > disabled. > > Signed-off-by: David Zhang > --- > drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/include/amd_shared.h > b/drivers/gpu/drm/amd/include/amd_shared.h > index 741dae17562a..06f21e9008c6 100644 > --- a/drivers/gpu/drm/amd/include/amd_shared.h > +++ b/drivers/gpu/drm/amd/include/amd_shared.h > @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK { > DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by > default > DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default > DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default > + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default > + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by > default > }; > > enum DC_DEBUG_MASK { > -- > 2.25.1
[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations
[Why] Allow for PSR SMU optimization and PSR multiple display optimization. [How] Add feature flags of PSR smu optimization and PSR multiple display optimiztaion, and set them during init sequence. By default, flags are disabled. Signed-off-by: David Zhang --- drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 741dae17562a..06f21e9008c6 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK { DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default }; enum DC_DEBUG_MASK { -- 2.25.1
Re: [PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations
On 2022-04-25 15:10, David Zhang wrote: [Why] Allow for PSR SMU optimization and PSR multiple display optimization. [How] Add feature flags of PSR smu optimization and PSR multiple display optimiztaion, and set them during init sequence. By default, flags are disabled. Signed-off-by: David Zhang Reviewed-by: Harry Wentland Harry --- drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 741dae17562a..06f21e9008c6 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK { DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default }; enum DC_DEBUG_MASK {
[PATCH] drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations
[Why] Allow for PSR SMU optimization and PSR multiple display optimization. [How] Add feature flags of PSR smu optimization and PSR multiple display optimiztaion, and set them during init sequence. By default, flags are disabled. Signed-off-by: David Zhang --- drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 741dae17562a..06f21e9008c6 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -234,6 +234,8 @@ enum DC_FEATURE_MASK { DC_EDP_NO_POWER_SEQUENCING = (1 << 4), //0x10, disabled by default DC_DISABLE_LTTPR_DP1_4A = (1 << 5), //0x20, disabled by default DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default + DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default + DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default }; enum DC_DEBUG_MASK { -- 2.25.1
Re: [PATCH] drm/amdgpu/display: Make dcn31_set_low_power_state static
On 2022-04-25 12:11, Alex Deucher wrote: It's not used outside of dcn31_clk_mgr.c. Reported-by: kernel test robot Signed-off-by: Alex Deucher Reviewed-by: Harry Wentland Harry --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c index 969b40250434..ceb34376decb 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c @@ -615,7 +615,7 @@ static void dcn31_clk_mgr_helper_populate_bw_params(struct clk_mgr_internal *clk } } -void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base) +static void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base) { int display_count; struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base);
Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()
Applied, but please fix your mailer. Also, please prepend patch titles with "drm/amdgpu". Thanks, Alex On Mon, Apr 25, 2022 at 7:03 AM Haohui Mai wrote: > > Thanks for the prompt reviews. Here is the updated patch. > > Signed-off-by: Haohui Mai > --- > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > index 9426e252d8aa..c15549bbe636 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > @@ -3830,8 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct > amdgpu_ring *ring, long timeout) > gpu_addr = adev->wb.gpu_addr + (index * 4); > adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); > memset(&ib, 0, sizeof(ib)); > - r = amdgpu_ib_get(adev, NULL, 16, > - AMDGPU_IB_POOL_DIRECT, &ib); > + r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib); > if (r) > goto err1; > > -- > 2.25.1 > > On Mon, Apr 25, 2022 at 6:52 PM Christian König > wrote: > > > > Am 25.04.22 um 10:56 schrieb Haohui Mai: > > > The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 > > > bytes during the test. The patch sets the size of the allocation to be > > > 4-byte larger to match the actual usage. > > > > > > Signed-off-by: Haohui Mai > > > --- > > > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > > index 9426e252d8aa..b131235826b1 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > > @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct > > > amdgpu_ring *ring, long timeout) > > > gpu_addr = adev->wb.gpu_addr + (index * 4); > > > adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); > > > memset(&ib, 0, sizeof(ib)); > > > - r = amdgpu_ib_get(adev, NULL, 16, > > > + r = amdgpu_ib_get(adev, NULL, 20, > > > AMDGPU_IB_POOL_DIRECT, &ib); > > > > Good catch, but while at it please fix the coding style and move the > > "AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well. > > > > With that done, the patch is Reviewed-by: Christian König > > > > > > > if (r) > > > goto err1; > > > -- > > > 2.25.1 > >
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
I applied the patches, but I had to manually munge them to make them apply since the formatting was all messed up. Please use git send-email in the future. Alex On Mon, Apr 25, 2022 at 12:03 PM Christian König wrote: > > Alex is usually picking up patches like this one here from the mailing list. > > Feel free to add a Reviewed-by: Christian König > to the series. > > Thanks for the help, > Christian. > > Am 25.04.22 um 14:44 schrieb Haohui Mai: > > Your prompt reviews are highly appreciated. Thanks. > > > > A little bit off-topic -- I'm not too familiar with the whole process. > > Just wondering, what else needs to be done in order to ensure the > > patches get picked up in the next available merge window? > > > > Thanks, > > Haohui > > > > On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai wrote: > >> This patch fixes the issue where the driver miscomputes the 64-bit > >> values of the wptr of the SDMA doorbell when initializing the > >> hardware. SDMA engines v4 and later on have full 64-bit registers for > >> wptr thus they should be set properly. > >> > >> Older generation hardwares like CIK / SI have only 16 / 20 / 24bits > >> for the WPTR, where the calls of lower_32_bits() will be removed in a > >> following patch. > >> > >> Signed-off-by: Haohui Mai > >> --- > >> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- > >> drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 > >> drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 > >> 3 files changed, 10 insertions(+), 10 deletions(-) > >> > >> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > >> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > >> index d7e8f7232364..ff86c43b63d1 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > >> @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct > >> amdgpu_ring *ring) > >> > >> DRM_DEBUG("Using doorbell -- " > >> "wptr_offs == 0x%08x " > >> - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > >> - "upper_32_bits(ring->wptr) << 2 == > >> 0x%08x\n", > >> + "lower_32_bits(ring->wptr << 2) == 0x%08x " > >> + "upper_32_bits(ring->wptr << 2) == > >> 0x%08x\n", > >> ring->wptr_offs, > >> lower_32_bits(ring->wptr << 2), > >> upper_32_bits(ring->wptr << 2)); > >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > >> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > >> index a8d49c005f73..627eb1f147c2 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > >> @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct > >> amdgpu_ring *ring) > >> if (ring->use_doorbell) { > >> DRM_DEBUG("Using doorbell -- " > >> "wptr_offs == 0x%08x " > >> - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > >> - "upper_32_bits(ring->wptr) << 2 == > >> 0x%08x\n", > >> + "lower_32_bits(ring->wptr << 2) == 0x%08x " > >> + "upper_32_bits(ring->wptr << 2) == > >> 0x%08x\n", > >> ring->wptr_offs, > >> lower_32_bits(ring->wptr << 2), > >> upper_32_bits(ring->wptr << 2)); > >> @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device > >> *adev) > >> > >> if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use > >> register write for wptr */ > >> WREG32(sdma_v5_0_get_reg_offset(adev, i, > >> mmSDMA0_GFX_RB_WPTR), > >> - lower_32_bits(ring->wptr) << 2); > >> + lower_32_bits(ring->wptr << 2)); > >> WREG32(sdma_v5_0_get_reg_offset(adev, i, > >> mmSDMA0_GFX_RB_WPTR_HI), > >> - upper_32_bits(ring->wptr) << 2); > >> + upper_32_bits(ring->wptr << 2)); > >> } > >> > >> doorbell = RREG32_SOC15_IP(GC, > >> sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); > >> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > >> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > >> index 824eace69884..a5eb82bfeaa8 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > >> @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct > >> amdgpu_ring *ring) > >> if (ring->use_doorbell) { > >> DRM_DEBUG("Using doorbell -- " > >> "wptr_offs == 0x%08x " > >> - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > >> - "upper_32_b
Re: Screen corruption using radeon kernel driver
+ dri-devel On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael wrote: > > Hello! > > After updating my Linux kernel from version 4.19 (Debian 10 version) to > 5.10 (packaged with Debian 11), I've noticed that the image > displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 > AGP video card is severely corrupted in the graphical (Xorg and Wayland) > mode: all kinds of black and white stripes across the screen, some > letters missing, etc. > > I've checked several options (Xorg drivers, Wayland instead of > Xorg, radeon.agpmode=-1 in kernel command line and so on), but the > problem persisted. I've managed to find that the problem was in the > kernel, as everything worked well with 4.19 kernel with everything > else being from Debian 11. > > I have managed to find the culprit of that corruption, that is the > commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. > Reverting this commit and building the kernel with that commit reverted > fixes the problem. Disabling HIMEM also gets rid of that problem. But it > also leaves the system with less that 1G of RAM, which is, of course, > undesirable. > > Apparently this problem is somewhat known, as I can tell after googling > for the commit id, see this link for example: > https://lkml.org/lkml/2020/1/9/518 > > Mageia distro, for example, reverted this commit in the kernel they are > building: > > http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 > > I've reported this bug to Debian bugtracker, checked the recent verion > of the kernel (5.17), bug still persists. Here's a link to the Debian > bug page: > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 > > I'm not sure if reverting this commit is the correct way to go, so if > you need to check any changes/patches that I could apply and test on > the real hardware, I'll be glad to do that (but please keep in mind > that testing could take some time, I don't have access to this computer > 24/7, but I'll do my best to respond ASAP). I would be happy to revert that commit. I attempted to revert it a year or so ago, but Christoph didn't want to. He was going to look further into it. I was not able to repro the issue. It seemed to be related to highmem support. You might try disabling that. Here is the previous thread for reference: https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html Alex
Re: [PATCH] drm/amd/pm: fix the compile warning
On Sun, Apr 24, 2022 at 10:25 PM Evan Quan wrote: > > Fix the compile warning below: > drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/kv_dpm.c:1641 > kv_get_acp_boot_level() warn: always true condition '(table->entries[i]->clk > >= 0) => (0-u32max >= 0)' > > Reported-by: kernel test robot > CC: Alex Deucher > Signed-off-by: Evan Quan > Change-Id: If4985252017023d6711b4d7eb1192a397baff813 > --- > drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c > b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c > index 8b23cc9f098a..cab948118d4b 100644 > --- a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c > +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c > @@ -1623,6 +1623,7 @@ static int kv_update_samu_dpm(struct amdgpu_device > *adev, bool gate) > > static u8 kv_get_acp_boot_level(struct amdgpu_device *adev) > { > +#if 0 > u8 i; > struct amdgpu_clock_voltage_dependency_table *table = > &adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table; > @@ -1636,6 +1637,8 @@ static u8 kv_get_acp_boot_level(struct amdgpu_device > *adev) > i = table->count - 1; > > return i; > +#endif Just drop the code at this point and return 0. Alex > + return 0; > } > > static void kv_update_acp_boot_level(struct amdgpu_device *adev) > -- > 2.29.0 >
Re: [PATCH] drm/radeon: change cac_weights_* to static
Applied. Thanks! Alex On Sat, Apr 23, 2022 at 4:02 PM Tom Rix wrote: > > Sparse reports these issues > si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. > Should it be static? > si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. > Should it be static? > > Both of these variables are only used in si_dpm.c. Single file variables > should be static, so change their storage-class specifiers to static. > > Signed-off-by: Tom Rix > --- > drivers/gpu/drm/radeon/si_dpm.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c > index 3add39c1a689..fbf968e3f6d7 100644 > --- a/drivers/gpu/drm/radeon/si_dpm.c > +++ b/drivers/gpu/drm/radeon/si_dpm.c > @@ -329,7 +329,7 @@ static const struct si_dte_data dte_data_malta = > true > }; > > -struct si_cac_config_reg cac_weights_pitcairn[] = > +static struct si_cac_config_reg cac_weights_pitcairn[] = > { > { 0x0, 0x, 0, 0x8a, SISLANDS_CACCONFIG_CGIND }, > { 0x0, 0x, 16, 0x0, SISLANDS_CACCONFIG_CGIND }, > @@ -1085,7 +1085,7 @@ static const struct si_dte_data dte_data_venus_pro = > true > }; > > -struct si_cac_config_reg cac_weights_oland[] = > +static struct si_cac_config_reg cac_weights_oland[] = > { > { 0x0, 0x, 0, 0x82, SISLANDS_CACCONFIG_CGIND }, > { 0x0, 0x, 16, 0x4F, SISLANDS_CACCONFIG_CGIND }, > -- > 2.27.0 >
Re: [PATCH] drm/radeon: change cik_default_state table from global to static
On Sat, Apr 23, 2022 at 9:44 AM Tom Rix wrote: > > Sparse reports these issues > cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not > declared. Should it be static? > cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not > declared. Should it be static? > > cik_default_state and cik_default_size are only used in cik.c. Single file > symbols > should be static. So move their definitions to cik_blit_shaders.h and change > their > storage-class-specifier to static. > > Remove unneeded cik_blit_shader.c > > Signed-off-by: Tom Rix Applied. Thanks! Alex > --- > drivers/gpu/drm/radeon/Makefile | 2 +- > drivers/gpu/drm/radeon/cik_blit_shaders.c | 246 -- > drivers/gpu/drm/radeon/cik_blit_shaders.h | 219 ++- > 3 files changed, 218 insertions(+), 249 deletions(-) > delete mode 100644 drivers/gpu/drm/radeon/cik_blit_shaders.c > > diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile > index 1045d2c46a76..ea5380e24c3c 100644 > --- a/drivers/gpu/drm/radeon/Makefile > +++ b/drivers/gpu/drm/radeon/Makefile > @@ -44,7 +44,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \ > evergreen.o evergreen_cs.o \ > evergreen_hdmi.o radeon_trace_points.o ni.o \ > atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o > si.o \ > - radeon_prime.o cik.o cik_blit_shaders.o \ > + radeon_prime.o cik.o \ > r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o > rv740_dpm.o \ > rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o > trinity_dpm.o \ > trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \ > diff --git a/drivers/gpu/drm/radeon/cik_blit_shaders.c > b/drivers/gpu/drm/radeon/cik_blit_shaders.c > deleted file mode 100644 > index ff1311806e91.. > --- a/drivers/gpu/drm/radeon/cik_blit_shaders.c > +++ /dev/null > @@ -1,246 +0,0 @@ > -/* > - * Copyright 2012 Advanced Micro Devices, Inc. > - * > - * Permission is hereby granted, free of charge, to any person obtaining a > - * copy of this software and associated documentation files (the "Software"), > - * to deal in the Software without restriction, including without limitation > - * the rights to use, copy, modify, merge, publish, distribute, sublicense, > - * and/or sell copies of the Software, and to permit persons to whom the > - * Software is furnished to do so, subject to the following conditions: > - * > - * The above copyright notice and this permission notice (including the next > - * paragraph) shall be included in all copies or substantial portions of the > - * Software. > - * > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > - * THE COPYRIGHT HOLDER(S) AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, > DAMAGES OR > - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > OTHER > - * DEALINGS IN THE SOFTWARE. > - * > - * Authors: > - * Alex Deucher > - */ > - > -#include > -#include > -#include > - > -const u32 cik_default_state[] = > -{ > - 0xc0066900, > - 0x, > - 0x0060, /* DB_RENDER_CONTROL */ > - 0x, /* DB_COUNT_CONTROL */ > - 0x, /* DB_DEPTH_VIEW */ > - 0x002a, /* DB_RENDER_OVERRIDE */ > - 0x, /* DB_RENDER_OVERRIDE2 */ > - 0x, /* DB_HTILE_DATA_BASE */ > - > - 0xc0046900, > - 0x0008, > - 0x, /* DB_DEPTH_BOUNDS_MIN */ > - 0x, /* DB_DEPTH_BOUNDS_MAX */ > - 0x, /* DB_STENCIL_CLEAR */ > - 0x, /* DB_DEPTH_CLEAR */ > - > - 0xc0036900, > - 0x000f, > - 0x, /* DB_DEPTH_INFO */ > - 0x, /* DB_Z_INFO */ > - 0x, /* DB_STENCIL_INFO */ > - > - 0xc0016900, > - 0x0080, > - 0x, /* PA_SC_WINDOW_OFFSET */ > - > - 0xc00d6900, > - 0x0083, > - 0x, /* PA_SC_CLIPRECT_RULE */ > - 0x, /* PA_SC_CLIPRECT_0_TL */ > - 0x20002000, /* PA_SC_CLIPRECT_0_BR */ > - 0x, > - 0x20002000, > - 0x, > - 0x20002000, > - 0x, > - 0x20002000, > - 0x, /* PA_SC_EDGERULE */ > - 0x, /* PA_SU_HARDWARE_SCREEN_OFFSET */ > - 0x000f, /* CB_TARGET_MASK */ > - 0x000f, /* CB_SHADER_MASK */ > - > - 0xc0226900, > - 0x0094, > - 0x8000, /* PA_SC_VPORT_SCISSOR_0_TL */ > - 0x20002000, /* PA_SC_VPORT_SCISSOR_0_BR */ > - 0x8000, > - 0x20002000, > - 0x8000, > - 0x20002000, > - 0x8000, > - 0x20002000, > - 0x8000, > -
Re: [PATCH] drm/amd/display: fix non-kernel-doc comment warnings
On Fri, Apr 22, 2022 at 9:29 PM Randy Dunlap wrote: > > Fix kernel-doc warnings for a comment that should not use > kernel-doc notation: > > dmub_psr.c:235: warning: This comment starts with '/**', but isn't a > kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst > * Set PSR power optimization flags. > dmub_psr.c:235: warning: missing initial short description on line: > * Set PSR power optimization flags. > > Fixes: e5dfcd272722 ("drm/amd/display: dc_link_set_psr_allow_active > refactoring") > Signed-off-by: Randy Dunlap > Reported-by: kernel test robot > Cc: Robin Chen > Cc: Alex Deucher > Cc: Anthony Koo > Cc: amd-gfx@lists.freedesktop.org > Cc: dri-de...@lists.freedesktop.org > Cc: David Airlie > Cc: Daniel Vetter > Cc: Harry Wentland > Cc: Leo Li > Cc: Rodrigo Siqueira Applied. Thanks! > --- > drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c > +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c > @@ -231,7 +231,7 @@ static void dmub_psr_set_level(struct dm > dc_dmub_srv_wait_idle(dc->dmub_srv); > } > > -/** > +/* > * Set PSR power optimization flags. > */ > static void dmub_psr_set_power_opt(struct dmub_psr *dmub, unsigned int > power_opt, uint8_t panel_inst)
Re: [PATCH v2] drm/amdgpu: replace VM fault error by info logs
Am 2022-04-25 um 11:29 schrieb Felix Kuehling: Am 2022-04-23 um 13:54 schrieb Christian König: Am 23.04.22 um 04:24 schrieb Alex Sierra: This is not a kernel error. These logs are caused by VM faults that could not be handled. Typically, generated by user mode applications. Well it's still a hardware fault which should be logged as an error. At least in ROCm compute applications, a VM fault does not take down the hardware. It only leads to termination of the process that caused the fault. It's very similar to a segfault in an application, which is not considered a HW fault either. Turns out, a segfault also prints an error message in dmesg. So I guess the VM fault can remain an error message as well. In a test that triggers VM faults on purpose, we can add some message to let users know to expect kernel error messages from the test. Regards, Felix Other processes are completely unaffected. The cause of the error is typically in user mode. I think the general policy is, that user mode errors should not spam the kernel logs with error messages. The messages are useful for debugging application issues, so it's good to have them. But IMHO they should not be error messages. Such messages often lead to spurious bug reports against the kernel for things that aren't really kernel issues. Regards, Felix So I'm absolutely not keen about reducing it to just an information. Regards, Christian. Signed-off-by: Alex Sierra --- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 20 ++-- drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 14 +++--- 9 files changed, 52 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c index 6e0ace2fbfab..c226a4803086 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c @@ -79,25 +79,25 @@ gfxhub_v2_0_print_l2_protection_fault_status(struct amdgpu_device *adev, u32 cid = REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, CID); - dev_err(adev->dev, + dev_info(adev->dev, "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n", status); - dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", + dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : gfxhub_client_ids[cid], cid); - dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS)); - dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR)); - dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS)); - dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MAPPING_ERROR)); - dev_err(adev->dev, "\t RW: 0x%lx\n", + dev_info(adev->dev, "\t RW: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, RW)); } diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c index ff738e9725ee..fdcca1477592 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c @@ -82,25 +82,25 @@ gfxhub_v2_1_print_l2_protection_fault_status(struct amdgpu_device *adev, u32 cid = REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, CID); - dev_err(adev->dev, + dev_info(adev->dev, "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n", status); - dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", + dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : gfxhub_client_ids[cid], cid); - dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS)); - dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR
[PATCH] drm/amdgpu/display: Make dcn31_set_low_power_state static
It's not used outside of dcn31_clk_mgr.c. Reported-by: kernel test robot Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c index 969b40250434..ceb34376decb 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c @@ -615,7 +615,7 @@ static void dcn31_clk_mgr_helper_populate_bw_params(struct clk_mgr_internal *clk } } -void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base) +static void dcn31_set_low_power_state(struct clk_mgr *clk_mgr_base) { int display_count; struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base); -- 2.35.1
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Alex is usually picking up patches like this one here from the mailing list. Feel free to add a Reviewed-by: Christian König to the series. Thanks for the help, Christian. Am 25.04.22 um 14:44 schrieb Haohui Mai: Your prompt reviews are highly appreciated. Thanks. A little bit off-topic -- I'm not too familiar with the whole process. Just wondering, what else needs to be done in order to ensure the patches get picked up in the next available merge window? Thanks, Haohui On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai wrote: This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing the hardware. SDMA engines v4 and later on have full 64-bit registers for wptr thus they should be set properly. Older generation hardwares like CIK / SI have only 16 / 20 / 24bits for the WPTR, where the calls of lower_32_bits() will be removed in a following patch. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), - lower_32_bits(ring->wptr) << 2); + lower_32_bits(ring->wptr << 2)); WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), - upper_32_bits(ring->wptr) << 2); + upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 824eace69884..a5eb82bfeaa8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 1); if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ - WREG32(sdm
Re: [PATCH v2] drm/amdgpu: replace VM fault error by info logs
Am 2022-04-23 um 13:54 schrieb Christian König: Am 23.04.22 um 04:24 schrieb Alex Sierra: This is not a kernel error. These logs are caused by VM faults that could not be handled. Typically, generated by user mode applications. Well it's still a hardware fault which should be logged as an error. At least in ROCm compute applications, a VM fault does not take down the hardware. It only leads to termination of the process that caused the fault. It's very similar to a segfault in an application, which is not considered a HW fault either. Other processes are completely unaffected. The cause of the error is typically in user mode. I think the general policy is, that user mode errors should not spam the kernel logs with error messages. The messages are useful for debugging application issues, so it's good to have them. But IMHO they should not be error messages. Such messages often lead to spurious bug reports against the kernel for things that aren't really kernel issues. Regards, Felix So I'm absolutely not keen about reducing it to just an information. Regards, Christian. Signed-off-by: Alex Sierra --- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 8 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 20 ++-- drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 14 +++--- 9 files changed, 52 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c index 6e0ace2fbfab..c226a4803086 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c @@ -79,25 +79,25 @@ gfxhub_v2_0_print_l2_protection_fault_status(struct amdgpu_device *adev, u32 cid = REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, CID); - dev_err(adev->dev, + dev_info(adev->dev, "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n", status); - dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", + dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : gfxhub_client_ids[cid], cid); - dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS)); - dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR)); - dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS)); - dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MAPPING_ERROR)); - dev_err(adev->dev, "\t RW: 0x%lx\n", + dev_info(adev->dev, "\t RW: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, RW)); } diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c index ff738e9725ee..fdcca1477592 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c @@ -82,25 +82,25 @@ gfxhub_v2_1_print_l2_protection_fault_status(struct amdgpu_device *adev, u32 cid = REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, CID); - dev_err(adev->dev, + dev_info(adev->dev, "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n", status); - dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", + dev_info(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n", cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : gfxhub_client_ids[cid], cid); - dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t MORE_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, MORE_FAULTS)); - dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t WALKER_ERROR: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, WALKER_ERROR)); - dev_err(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", + dev_info(adev->dev, "\t PERMISSION_FAULTS: 0x%lx\n", REG_GET_FIELD(status, GCVM_L2_PROTECTION_FAULT_STATUS, PERMISSION_FAULTS)); - dev_err(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", + dev_info(adev->dev, "\t MAPPING_ERROR: 0x%lx\n", RE
Re: [PATCH 1/2] drm/amdkfd: Add SVM range mapped_to_gpu flag
Am 2022-04-19 um 20:47 schrieb Philip Yang: To avoid unnecessary unmap SVM range from GPUs if range is not mapped on GPUs when migrating the range. This flag will also be used to flush TLB when updating the existing mapping on GPUs. It is protected by prange->migrate_mutex and mmap read lock in MMU notifier callback. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 17 - drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 1 + 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 5afe216cf099..6be1695f3a09 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -951,6 +951,7 @@ svm_range_split_adjust(struct svm_range *new, struct svm_range *old, new->prefetch_loc = old->prefetch_loc; new->actual_loc = old->actual_loc; new->granularity = old->granularity; + new->mapped_to_gpu = old->mapped_to_gpu; bitmap_copy(new->bitmap_access, old->bitmap_access, MAX_GPU_INSTANCE); bitmap_copy(new->bitmap_aip, old->bitmap_aip, MAX_GPU_INSTANCE); @@ -1204,6 +1205,17 @@ svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start, uint32_t gpuidx; int r = 0; + if (!prange->mapped_to_gpu) { + pr_debug("prange 0x%p [0x%lx 0x%lx] not mapped to GPU\n", +prange, prange->start, prange->last); + return 0; + } + + if (prange->start == start && prange->last == last) { + pr_debug("unmap svms 0x%p prange 0x%p\n", prange->svms, prange); + prange->mapped_to_gpu = false; + } + bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip, MAX_GPU_INSTANCE); p = container_of(prange->svms, struct kfd_process, svms); @@ -1590,8 +1602,10 @@ static int svm_range_validate_and_map(struct mm_struct *mm, addr = next; } - if (addr == end) + if (addr == end) { prange->validated_once = true; + prange->mapped_to_gpu = true; + } unreserve_out: svm_range_unreserve_bos(&ctx); @@ -1822,6 +1836,7 @@ static struct svm_range *svm_range_clone(struct svm_range *old) new->prefetch_loc = old->prefetch_loc; new->actual_loc = old->actual_loc; new->granularity = old->granularity; + new->mapped_to_gpu = old->mapped_to_gpu; bitmap_copy(new->bitmap_access, old->bitmap_access, MAX_GPU_INSTANCE); bitmap_copy(new->bitmap_aip, old->bitmap_aip, MAX_GPU_INSTANCE); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h index 66c77f00ac3e..2d54147b4dda 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h @@ -133,6 +133,7 @@ struct svm_range { DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE); DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE); boolvalidated_once; + boolmapped_to_gpu; }; static inline void svm_range_lock(struct svm_range *prange)
Re: [PATCH v2 2/2] drm/amdkfd: Update mapping if range attributes changed
Am 2022-04-22 um 18:05 schrieb Felix Kuehling: On 2022-04-22 10:06, Philip Yang wrote: Change SVM range mapping flags or access attributes don't trigger migration, if range is already mapped on GPUs we should update GPU mapping and pass flush_tlb flag true to amdgpu vm. Change SVM range preferred_loc or migration granularity don't need update GPU mapping, skip the validate_and_map. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 46 +++- 1 file changed, 32 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 8a077cd066a1..e740384df9c7 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -686,7 +686,8 @@ svm_range_check_attr(struct kfd_process *p, static void svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, - uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs) + uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs, + bool *update_mapping) { uint32_t i; int gpuidx; @@ -702,6 +703,7 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, case KFD_IOCTL_SVM_ATTR_ACCESS: case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE: case KFD_IOCTL_SVM_ATTR_NO_ACCESS: + *update_mapping = true; gpuidx = kfd_process_gpuidx_from_gpuid(p, attrs[i].value); if (attrs[i].type == KFD_IOCTL_SVM_ATTR_NO_ACCESS) { @@ -716,9 +718,11 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, } break; case KFD_IOCTL_SVM_ATTR_SET_FLAGS: + *update_mapping = true; prange->flags |= attrs[i].value; break; case KFD_IOCTL_SVM_ATTR_CLR_FLAGS: + *update_mapping = true; prange->flags &= ~attrs[i].value; break; case KFD_IOCTL_SVM_ATTR_GRANULARITY: @@ -1254,7 +1258,7 @@ static int svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange, unsigned long offset, unsigned long npages, bool readonly, dma_addr_t *dma_addr, struct amdgpu_device *bo_adev, - struct dma_fence **fence) + struct dma_fence **fence, bool flush_tlb) { struct amdgpu_device *adev = pdd->dev->adev; struct amdgpu_vm *vm = drm_priv_to_vm(pdd->drm_priv); @@ -1292,7 +1296,7 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange, (last_domain == SVM_RANGE_VRAM_DOMAIN) ? 1 : 0, pte_flags); - r = amdgpu_vm_update_range(adev, vm, false, false, false, NULL, + r = amdgpu_vm_update_range(adev, vm, false, false, flush_tlb, NULL, last_start, prange->start + i, pte_flags, last_start - prange->start, @@ -1326,7 +1330,7 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange, static int svm_range_map_to_gpus(struct svm_range *prange, unsigned long offset, unsigned long npages, bool readonly, - unsigned long *bitmap, bool wait) + unsigned long *bitmap, bool wait, bool flush_tlb) { struct kfd_process_device *pdd; struct amdgpu_device *bo_adev; @@ -1361,7 +1365,8 @@ svm_range_map_to_gpus(struct svm_range *prange, unsigned long offset, r = svm_range_map_to_gpu(pdd, prange, offset, npages, readonly, prange->dma_addr[gpuidx], - bo_adev, wait ? &fence : NULL); + bo_adev, wait ? &fence : NULL, + flush_tlb); if (r) break; @@ -1482,8 +1487,8 @@ static void *kfd_svm_page_owner(struct kfd_process *p, int32_t gpuidx) * 5. Release page table (and SVM BO) reservation */ static int svm_range_validate_and_map(struct mm_struct *mm, - struct svm_range *prange, - int32_t gpuidx, bool intr, bool wait) + struct svm_range *prange, int32_t gpuidx, + bool intr, bool wait, bool flush_tlb) { struct svm_validate_context ctx; unsigned long start, end, addr; @@ -1522,8 +1527,12 @@ static int svm_range_validate_and_map(struct mm_struct *mm, prange->bitmap_aip, MAX_GPU_INSTANCE); } - if (bitmap_empty(ctx.bitmap, MAX_GPU_INSTANCE)) - return 0; + if (bitmap_empty(ctx.bitmap, MAX_GPU_INSTANCE)) { + if (!prange->mapped_to_gpu) + return 0; + + bitmap_copy(ctx.bitmap, prange->bitmap_access, MAX_GPU_INSTANCE); + } if (prange->actual_loc && !prange->ttm_res) { /* This should never happen. actual_loc gets set by @@ -1595,7 +1604,7 @@ static int svm_range_validate_and_map(struct mm_
RE: [PATCH 00/13] DC Patches April 20 2022
[Public] Hi all, This week this patchset was tested on the following systems: HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) Lenovo Thinkpad T14s Gen2 with AMD Ryzen 5 5650U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) Sapphire Pulse RX5700XT with the following display types: 4k 60hz (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Reference AMD RX6800 with the following display types: 4k 60hz (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA) Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz and DSC via USB-C to DP DSC Hub with 3x 4k 60hz. Tested on Ubuntu 20.04.3 with Kernel Version 5.16 and ChromeOS Tested-by: Daniel Wheeler Thank you, Dan Wheeler Technologist | AMD SW Display -Original Message- From: amd-gfx On Behalf Of Tom Chung Sent: April 22, 2022 12:45 PM To: amd-gfx@lists.freedesktop.org Cc: Wang, Chao-kai (Stylon) ; Chung, ChiaHsuan (Tom) ; Li, Sun peng (Leo) ; Wentland, Harry ; Zhuo, Qingqing (Lillian) ; Siqueira, Rodrigo ; Li, Roman ; Chiu, Solomon ; Pillai, Aurabindo ; Lin, Wayne ; Lakha, Bhawanpreet ; Gutierrez, Agustin ; Kotarac, Pavle Subject: [PATCH 00/13] DC Patches April 20 2022 This version brings along following fixes: - Keep tracking of DSC packed PPS for future use - Maintain current link settings in link loss interrupt - Remove DDC write and read size check - Read PSR-SU cap DPCD for specific panel - Don't pass HostVM by default on DCN3.1 - Reset cached PSR parameters after hibernate - Add audio readback registers - Update dcn315 clk table read - Fix HDCP QUERY Error for eDP and Tiled - Insert smu busy status before sending another request Aric Cyr (2): drm/amd/display: 3.2.182 drm/amd/display: 3.2.183 David Zhang (1): drm/amd/display: read PSR-SU cap DPCD for specific panel Dillon Varone (1): drm/amd/display: Remove unused integer Dmytro Laktyushkin (1): drm/amd/display: update dcn315 clk table read Evgenii Krasnikov (1): drm/amd/display: Reset cached PSR parameters after hibernate Gary Li (1): drm/amd/display: Maintain current link settings in link loss interrupt Ilya (2): drm/amd/display: Add Audio readback registers drm/amd/display: Keep track of DSC packed PPS Leo (Hanghong) Ma (1): drm/amd/display: Remove ddc write and read size checking Michael Strauss (1): drm/amd/display: Don't pass HostVM by default on DCN3.1 Mustapha Ghaddar (1): drm/amd/display: Fix HDCP QUERY Error for eDP and Tiled Oliver Logush (1): drm/amd/display: Insert smu busy status before sending another request .../display/dc/clk_mgr/dcn301/dcn301_smu.c| 2 + .../dc/clk_mgr/dcn315/dcn315_clk_mgr.c| 114 +- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 15 +- .../gpu/drm/amd/display/dc/core/dc_link_ddc.c | 6 - .../gpu/drm/amd/display/dc/core/dc_link_dp.c | 15 ++ drivers/gpu/drm/amd/display/dc/dc.h | 10 +- drivers/gpu/drm/amd/display/dc/dc_stream.h| 2 +- .../display/dc/dcn10/dcn10_stream_encoder.c | 1 + .../display/dc/dcn10/dcn10_stream_encoder.h | 8 + .../dc/dcn30/dcn30_dio_stream_encoder.h | 4 + .../drm/amd/display/dc/dcn31/dcn31_resource.c | 9 +- .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 145 -- .../gpu/drm/amd/display/dc/inc/hw/clk_mgr.h | 1 + .../amd/display/include/ddc_service_types.h | 2 + 14 files changed, 172 insertions(+), 162 deletions(-) -- 2.25.1
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Your prompt reviews are highly appreciated. Thanks. A little bit off-topic -- I'm not too familiar with the whole process. Just wondering, what else needs to be done in order to ensure the patches get picked up in the next available merge window? Thanks, Haohui On Mon, Apr 25, 2022 at 8:41 PM Haohui Mai wrote: > > This patch fixes the issue where the driver miscomputes the 64-bit > values of the wptr of the SDMA doorbell when initializing the > hardware. SDMA engines v4 and later on have full 64-bit registers for > wptr thus they should be set properly. > > Older generation hardwares like CIK / SI have only 16 / 20 / 24bits > for the WPTR, where the calls of lower_32_bits() will be removed in a > following patch. > > Signed-off-by: Haohui Mai > --- > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- > drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 > drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 > 3 files changed, 10 insertions(+), 10 deletions(-) > > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > index d7e8f7232364..ff86c43b63d1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c > @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct > amdgpu_ring *ring) > > DRM_DEBUG("Using doorbell -- " > "wptr_offs == 0x%08x " > - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", > + "lower_32_bits(ring->wptr << 2) == 0x%08x " > + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", > ring->wptr_offs, > lower_32_bits(ring->wptr << 2), > upper_32_bits(ring->wptr << 2)); > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > index a8d49c005f73..627eb1f147c2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct > amdgpu_ring *ring) > if (ring->use_doorbell) { > DRM_DEBUG("Using doorbell -- " > "wptr_offs == 0x%08x " > - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", > + "lower_32_bits(ring->wptr << 2) == 0x%08x " > + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", > ring->wptr_offs, > lower_32_bits(ring->wptr << 2), > upper_32_bits(ring->wptr << 2)); > @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device > *adev) > > if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use > register write for wptr */ > WREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_GFX_RB_WPTR), > - lower_32_bits(ring->wptr) << 2); > + lower_32_bits(ring->wptr << 2)); > WREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_GFX_RB_WPTR_HI), > - upper_32_bits(ring->wptr) << 2); > + upper_32_bits(ring->wptr << 2)); > } > > doorbell = RREG32_SOC15_IP(GC, > sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > index 824eace69884..a5eb82bfeaa8 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct > amdgpu_ring *ring) > if (ring->use_doorbell) { > DRM_DEBUG("Using doorbell -- " > "wptr_offs == 0x%08x " > - "lower_32_bits(ring->wptr) << 2 == 0x%08x " > - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", > + "lower_32_bits(ring->wptr << 2) == 0x%08x " > + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", > ring->wptr_offs, > lower_32_bits(ring->wptr << 2), > upper_32_bits(ring->wptr << 2)); > @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device > *adev) > WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_GFX_MINOR_PTR_UPDATE), 1); > > if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use > register write for wptr */ > - WREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_GFX_RB_WPTR), lower_32_bits
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing the hardware. SDMA engines v4 and later on have full 64-bit registers for wptr thus they should be set properly. Older generation hardwares like CIK / SI have only 16 / 20 / 24bits for the WPTR, where the calls of lower_32_bits() will be removed in a following patch. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), - lower_32_bits(ring->wptr) << 2); + lower_32_bits(ring->wptr << 2)); WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), - upper_32_bits(ring->wptr) << 2); + upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 824eace69884..a5eb82bfeaa8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 1); if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DO
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Am 25.04.22 um 14:19 schrieb Haohui Mai: Dropped the changes of older generations. Signed-off-by: Haohui Mai Please update the commit messages to include all the background we just discussed. With that done the series is Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), - lower_32_bits(ring->wptr) << 2); + lower_32_bits(ring->wptr << 2)); WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), - upper_32_bits(ring->wptr) << 2); + upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 824eace69884..a5eb82bfeaa8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 1); if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); -- 2.25.1 On Mon, Apr 25, 2022 at 7:52 PM Christian König wrote: Am 25.04.22 um 13:47 schrieb Haohui Mai: Updated t
Re: [PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell
Sounds good to me. Updated the patch for the CIK / SI hardware. I kept the clamping code to be safe. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 5 ++--- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++--- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..cf99f6d07b49 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr << 2) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..84b57b06b20c 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 4ef4feff5649..c86f181623f1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2); + WRITE_ONCE(*wb, ring->wptr << 2); + WDOORBELL32(ring->doorbell_index, ring->wptr << 2); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); + WRITE_ONCE(*wb, ring->wptr << 2); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } } diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c index 195b45bcb8ad..2f95235bbfb3 100644 --- a/drivers/gpu/drm/amd/amdgpu/si_dma.c +++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c @@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1; - WREG32(DMA_RB_WPTR + sdma_offsets[me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); + WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc); } static void si_dma_ring_emit_ib(struct amdgpu_ring *ring, @@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev) WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl); ring->wptr = 0; - WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2); WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE); ring->sched.ready = true; -- 2.25.1
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Dropped the changes of older generations. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), - lower_32_bits(ring->wptr) << 2); + lower_32_bits(ring->wptr << 2)); WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), - upper_32_bits(ring->wptr) << 2); + upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 824eace69884..a5eb82bfeaa8 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 1); if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */ - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); } doorbell = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL)); -- 2.25.1 On Mon, Apr 25, 2022 at 7:52 PM Christian König wrote: > > Am 25.04.22 um 13:47 schrieb Haohui Mai: > > Updated the commit messages based on the previous discussion. > > Please drop all the changes for pre SDMA v4 hardware (e.g. the ones with > only a 32bit register), so that we only have the changes for the 64bit > hw versions i
Re: [PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell
Am 25.04.22 um 13:49 schrieb Haohui Mai: I kept the original clamping for CIK / SI in this patch. I don't really care much about them. Keep it as you like, it's only for the older hw generations anyway. Regards, Christian. Please let me know if you want to remove them. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 5 ++--- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++--- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index df863d346995..cf99f6d07b49 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr << 2)) & 0x3fffc); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr << 2) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index b83fd00466fe..84b57b06b20c 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 557a7d5174b0..c86f181623f1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); + WRITE_ONCE(*wb, ring->wptr << 2); + WDOORBELL32(ring->doorbell_index, ring->wptr << 2); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WRITE_ONCE(*wb, ring->wptr << 2); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } } diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c index 0af11d3b00e7..2f95235bbfb3 100644 --- a/drivers/gpu/drm/amd/amdgpu/si_dma.c +++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c @@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1; - WREG32(DMA_RB_WPTR + sdma_offsets[me], -(lower_32_bits(ring->wptr << 2)) & 0x3fffc); + WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc); } static void si_dma_ring_emit_ib(struct amdgpu_ring *ring, @@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev) WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl); ring->wptr = 0; - WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2); WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE); ring->sched.ready = true; -- 2.25.1
Re: [PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Am 25.04.22 um 13:47 schrieb Haohui Mai: Updated the commit messages based on the previous discussion. Please drop all the changes for pre SDMA v4 hardware (e.g. the ones with only a 32bit register), so that we only have the changes for the 64bit hw versions in here. Apart from that looks good to me. Thanks, Christian. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- 7 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..df863d346995 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..b83fd00466fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 4ef4feff5649..557a7d5174b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_
[PATCH 2/2] Remove redundant lower_32_bits() calls when settings SDMA doorbell
I kept the original clamping for CIK / SI in this patch. Please let me know if you want to remove them. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 5 ++--- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 5 ++--- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index df863d346995..cf99f6d07b49 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -194,8 +194,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr << 2)) & 0x3fffc); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], (ring->wptr << 2) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +486,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index b83fd00466fe..84b57b06b20c 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], ring->wptr << 2); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 557a7d5174b0..c86f181623f1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); + WRITE_ONCE(*wb, ring->wptr << 2); + WDOORBELL32(ring->doorbell_index, ring->wptr << 2); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WRITE_ONCE(*wb, ring->wptr << 2); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], ring->wptr << 2); } } diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c index 0af11d3b00e7..2f95235bbfb3 100644 --- a/drivers/gpu/drm/amd/amdgpu/si_dma.c +++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c @@ -56,8 +56,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; u32 me = (ring == &adev->sdma.instance[0].ring) ? 0 : 1; - WREG32(DMA_RB_WPTR + sdma_offsets[me], -(lower_32_bits(ring->wptr << 2)) & 0x3fffc); + WREG32(DMA_RB_WPTR + sdma_offsets[me], (ring->wptr << 2) & 0x3fffc); } static void si_dma_ring_emit_ib(struct amdgpu_ring *ring, @@ -175,7 +174,7 @@ static int si_dma_start(struct amdgpu_device *adev) WREG32(DMA_CNTL + sdma_offsets[i], dma_cntl); ring->wptr = 0; - WREG32(DMA_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); + WREG32(DMA_RB_WPTR + sdma_offsets[i], ring->wptr << 2); WREG32(DMA_RB_CNTL + sdma_offsets[i], rb_cntl | DMA_RB_ENABLE); ring->sched.ready = true; -- 2.25.1
[PATCH 1/2] Fix incorrect calculations of the wptr of the doorbells
Updated the commit messages based on the previous discussion. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- 7 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..df863d346995 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..b83fd00466fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 4ef4feff5649..557a7d5174b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); @@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) if (!am
Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly
Am 25.04.22 um 13:14 schrieb Haohui Mai: Sorry for the confusion. I misread the code, but it still seems to me it is a valid issue. What the patch tries to do is to fix the following pattern: - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); Ah, yes that's indeed a bug. I agree with you that ring->wptr is an offset to the ring. Just looking at the above lines it seems that they are incorrect when ring->wptr is larger than 1GB. As you pointed out that ring->wptr cannot be larger than (1 << 24), it can be resolved via either (1) fixing the patterns in the provided patch, or (2) clamping the results to (1 << 24) - 1 and getting rid of lower_32_bits() / higher_32_bits() at all. What's your suggestion of moving forward? It depends on the hardware generation. We used to have a bunch of older hardware where the wptr was one 32bit register where actually only 16,20 or 24bits where used. Then we have the newer hardware which has two 32bit registers for the WPTR and those really have 64bits in them. Clamping the value actually doesn't much sense because the hardware will do it anyway and if the value is larger than supported it won't work either way. So my suggestion is that you generate one patch where you fix all the lower/upper and shift mixup for the 64bit cases. And then maybe another patch where all the lower+masking for the 32bit cases is removed as a cleanup. Regards, Christian. Thanks, Haohui On Mon, Apr 25, 2022 at 7:02 PM Christian König wrote: Am 25.04.22 um 11:15 schrieb Haohui Mai: Computing the address of the doorbell should be done before instead of after separating the 64-bit address into the higher and lower half. The current code sets the MMIO registers incorrectly if the address of the doorbell is above 1G. That doesn't make any sense at all. The address of the doorbell is completely irrelevant to the value you write into it. What we could do is to stop using the lower_32_bits() function, since the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw generation anyway. Regards, Christian. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- 7 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..df863d346995 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..b83fd00466fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ rb_cntl = REG_
Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly
Sorry for the confusion. I misread the code, but it still seems to me it is a valid issue. What the patch tries to do is to fix the following pattern: - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2); - WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); I agree with you that ring->wptr is an offset to the ring. Just looking at the above lines it seems that they are incorrect when ring->wptr is larger than 1GB. As you pointed out that ring->wptr cannot be larger than (1 << 24), it can be resolved via either (1) fixing the patterns in the provided patch, or (2) clamping the results to (1 << 24) - 1 and getting rid of lower_32_bits() / higher_32_bits() at all. What's your suggestion of moving forward? Thanks, Haohui On Mon, Apr 25, 2022 at 7:02 PM Christian König wrote: > > Am 25.04.22 um 11:15 schrieb Haohui Mai: > > Computing the address of the doorbell should be done before instead of after > > separating the 64-bit address into the higher and lower half. The > > current code sets the MMIO registers incorrectly if the address of the > > doorbell is above 1G. > > That doesn't make any sense at all. The address of the doorbell is > completely irrelevant to the value you write into it. > > What we could do is to stop using the lower_32_bits() function, since > the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw > generation anyway. > > Regards, > Christian. > > > > > Signed-off-by: Haohui Mai > > --- > > drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- > > drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- > > drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 > > drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- > > drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 > > drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 > > drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- > > 7 files changed, 20 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c > > b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c > > index c8ebd108548d..df863d346995 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c > > +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c > > @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring > > *ring) > >struct amdgpu_device *adev = ring->adev; > > > >WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], > > -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); > > +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); > > } > > > > static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t > > count) > > @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device > > *adev) > >WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); > > > >ring->wptr = 0; > > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) > > << 2); > > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << > > 2)); > > > >/* enable DMA RB */ > >WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c > > b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c > > index 1d8bbcbd7a37..b83fd00466fe 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c > > @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct > > amdgpu_ring *ring) > > { > >struct amdgpu_device *adev = ring->adev; > > > > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], > > lower_32_bits(ring->wptr) << 2); > > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], > > lower_32_bits(ring->wptr << 2)); > > } > > > > static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t > > count) > > @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device > > *adev) > >WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); > > > >ring->wptr = 0; > > - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) > > << 2); > > + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << > > 2)); > > > >/* enable DMA RB */ > >rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > > b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > > index 4ef4feff5649..557a7d5174b0 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c > > @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct > > amdgpu_ring *ring) > >if (ring->use_doorbell) { > >u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; > >
Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()
Thanks for the prompt reviews. Here is the updated patch. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 9426e252d8aa..c15549bbe636 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3830,8 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) gpu_addr = adev->wb.gpu_addr + (index * 4); adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); memset(&ib, 0, sizeof(ib)); - r = amdgpu_ib_get(adev, NULL, 16, - AMDGPU_IB_POOL_DIRECT, &ib); + r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib); if (r) goto err1; -- 2.25.1 On Mon, Apr 25, 2022 at 6:52 PM Christian König wrote: > > Am 25.04.22 um 10:56 schrieb Haohui Mai: > > The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 > > bytes during the test. The patch sets the size of the allocation to be > > 4-byte larger to match the actual usage. > > > > Signed-off-by: Haohui Mai > > --- > > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > index 9426e252d8aa..b131235826b1 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c > > @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct > > amdgpu_ring *ring, long timeout) > > gpu_addr = adev->wb.gpu_addr + (index * 4); > > adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); > > memset(&ib, 0, sizeof(ib)); > > - r = amdgpu_ib_get(adev, NULL, 16, > > + r = amdgpu_ib_get(adev, NULL, 20, > > AMDGPU_IB_POOL_DIRECT, &ib); > > Good catch, but while at it please fix the coding style and move the > "AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well. > > With that done, the patch is Reviewed-by: Christian König > > > > if (r) > > goto err1; > > -- > > 2.25.1 >
Re: [PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly
Am 25.04.22 um 11:15 schrieb Haohui Mai: Computing the address of the doorbell should be done before instead of after separating the 64-bit address into the higher and lower half. The current code sets the MMIO registers incorrectly if the address of the doorbell is above 1G. That doesn't make any sense at all. The address of the doorbell is completely irrelevant to the value you write into it. What we could do is to stop using the lower_32_bits() function, since the WPTR can't handle more than 16, 20 or 24 bits (IIRC) depending on hw generation anyway. Regards, Christian. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- 7 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..df863d346995 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..b83fd00466fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 4ef4feff5649..557a7d5174b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/driv
Re: [PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()
Am 25.04.22 um 10:56 schrieb Haohui Mai: The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 bytes during the test. The patch sets the size of the allocation to be 4-byte larger to match the actual usage. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 9426e252d8aa..b131235826b1 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) gpu_addr = adev->wb.gpu_addr + (index * 4); adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); memset(&ib, 0, sizeof(ib)); - r = amdgpu_ib_get(adev, NULL, 16, + r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib); Good catch, but while at it please fix the coding style and move the "AMDGPU_IB_POOL_DIRECT, &ib);" on the same line as well. With that done, the patch is Reviewed-by: Christian König if (r) goto err1; -- 2.25.1
[PATCH] Set the 64-bit address of the wptr of SDMA doorbell properly
Computing the address of the doorbell should be done before instead of after separating the 64-bit address into the higher and lower half. The current code sets the MMIO registers incorrectly if the address of the doorbell is above 1G. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 drivers/gpu/drm/amd/amdgpu/si_dma.c| 4 ++-- 7 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c index c8ebd108548d..df863d346995 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c @@ -195,7 +195,7 @@ static void cik_sdma_ring_set_wptr(struct amdgpu_ring *ring) struct amdgpu_device *adev = ring->adev; WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], -(lower_32_bits(ring->wptr) << 2) & 0x3fffc); +(lower_32_bits(ring->wptr << 2)) & 0x3fffc); } static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -487,7 +487,7 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c index 1d8bbcbd7a37..b83fd00466fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c @@ -223,7 +223,7 @@ static void sdma_v2_4_ring_set_wptr(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) @@ -465,7 +465,7 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev) WREG32(mmSDMA0_GFX_RB_BASE_HI + sdma_offsets[i], ring->gpu_addr >> 40); ring->wptr = 0; - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[i], lower_32_bits(ring->wptr << 2)); /* enable DMA RB */ rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c index 4ef4feff5649..557a7d5174b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c @@ -389,14 +389,14 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; /* XXX check if swapping is necessary on BE */ - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); - WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr) << 2); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); + WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr << 2)); } else if (ring->use_pollmem) { u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs]; - WRITE_ONCE(*wb, (lower_32_bits(ring->wptr) << 2)); + WRITE_ONCE(*wb, (lower_32_bits(ring->wptr << 2))); } else { - WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr) << 2); + WREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me], lower_32_bits(ring->wptr << 2)); } } diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c index d7e8f7232364..ff86c43b63d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring) DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ring->wptr_offs, lower_32_bits(ring->wptr << 2), upper_32_bits(ring->wptr << 2)); diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index a8d49c005f73..627eb1f147c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring) if (ring->use_doorbell) { DRM_DEBUG("Using doorbell -- " "wptr_offs == 0x%08x " - "lower_32_bits(ring->wptr) << 2 == 0x%08x " - "upper_32_bits(ring->wptr) << 2 == 0x%08x\n", + "lower_32_bits(ring->wptr << 2) == 0x%08x " + "upper_32_bits(ring->wptr << 2) == 0x%08x\n", ri
[PATCH] Fix out-of-bound access for gfx_v10_0_ring_test_ib()
The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 bytes during the test. The patch sets the size of the allocation to be 4-byte larger to match the actual usage. Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 9426e252d8aa..b131235826b1 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3830,7 +3830,7 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout) gpu_addr = adev->wb.gpu_addr + (index * 4); adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD); memset(&ib, 0, sizeof(ib)); - r = amdgpu_ib_get(adev, NULL, 16, + r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib); if (r) goto err1; -- 2.25.1
[REGRESSION] AMD GPU regression in 5.16.18..5.17.4
Two Qubes OS users reported that their AMD GPU systems did not work on 5.17.4, while 5.16.18 worked fine. Details can be found on https://github.com/QubesOS/qubes-issues/issues/7462. The initial report listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. #regzbot introduced: v5.16.18..v5.17.4 -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
#regzbot introduced: v5.17.3..v5.17.4 (as per https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107679532) -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
AMD GPU regression in 5.16.18..5.17.4
Two Qubes OS users reported that their AMD GPU systems did not work on 5.17.4, while 5.16.18 worked fine. Details can be found on https://github.com/QubesOS/qubes-issues/issues/7462. The initial report listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. #regzbot introduced: v5.16.18..v5.17.4 -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
[PATCH] drm/amd/display: fix if == else warning
Fix the following coccicheck warning: drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c:98:8-10: WARNING: possible condition with no effect (if == else) Signed-off-by: Guo Zhengkui --- drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c index fe22530242d2..05b3fba9ccce 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c @@ -95,8 +95,6 @@ static void gpu_addr_to_uma(struct dce_hwseq *hwseq, } else if (hwseq->fb_offset.quad_part <= addr->quad_part && addr->quad_part <= hwseq->uma_top.quad_part) { is_in_uma = true; - } else if (addr->quad_part == 0) { - is_in_uma = false; } else { is_in_uma = false; } -- 2.20.1
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote: > CCing the amdgpu maintainers Also CCing Marek Marczykowski-Górecki as Qubes OS Project Lead. > On 24.04.22 08:12, Greg KH wrote: > > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote: > >> Two Qubes OS users reported that their AMD GPU systems did not work on > >> 5.17.4, while 5.16.18 worked fine. Details can be found on > >> https://github.com/QubesOS/qubes-issues/issues/7462. The initial report > >> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir > >> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as > >> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. > >> > >> #regzbot introduced: v5.16.18..v5.17.4 > > > > That's a big range, can you use 'git bisect' to track it down? > > FWIW, in another mail in this thread and > https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126 > it was clarified that the problem was introduced between 5.17.3 and > 5.17.4, where a few amdgpu changes were applied. Maybe they are the reason. I suspect they are, but I am not certain. It could also be an interaction between these changes and running as a PV dom0 under Xen. > Anyway: Yes, as Gregkh said, a bisection really would help. But maybe > the amdgfx developers might already have an idea what might be wrong > here? The QubesOS ticket linked above has some more details. I can’t provide a bisection myself as I don’t have the hardware. I asked the user who reported the bug if they are comfortable doing a bisection. -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
Screen corruption using radeon kernel driver
Hello! After updating my Linux kernel from version 4.19 (Debian 10 version) to 5.10 (packaged with Debian 11), I've noticed that the image displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 AGP video card is severely corrupted in the graphical (Xorg and Wayland) mode: all kinds of black and white stripes across the screen, some letters missing, etc. I've checked several options (Xorg drivers, Wayland instead of Xorg, radeon.agpmode=-1 in kernel command line and so on), but the problem persisted. I've managed to find that the problem was in the kernel, as everything worked well with 4.19 kernel with everything else being from Debian 11. I have managed to find the culprit of that corruption, that is the commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. Reverting this commit and building the kernel with that commit reverted fixes the problem. Disabling HIMEM also gets rid of that problem. But it also leaves the system with less that 1G of RAM, which is, of course, undesirable. Apparently this problem is somewhat known, as I can tell after googling for the commit id, see this link for example: https://lkml.org/lkml/2020/1/9/518 Mageia distro, for example, reverted this commit in the kernel they are building: http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 I've reported this bug to Debian bugtracker, checked the recent verion of the kernel (5.17), bug still persists. Here's a link to the Debian bug page: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 I'm not sure if reverting this commit is the correct way to go, so if you need to check any changes/patches that I could apply and test on the real hardware, I'll be glad to do that (but please keep in mind that testing could take some time, I don't have access to this computer 24/7, but I'll do my best to respond ASAP). Thanks in advance! pgpEuvmQuSj20.pgp Description: OpenPGP digital signature
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote: > CCing the amdgpu maintainers > > On 24.04.22 08:12, Greg KH wrote: > > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote: > >> Two Qubes OS users reported that their AMD GPU systems did not work on > >> 5.17.4, while 5.16.18 worked fine. Details can be found on > >> https://github.com/QubesOS/qubes-issues/issues/7462. The initial report > >> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir > >> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as > >> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. > >> > >> #regzbot introduced: v5.16.18..v5.17.4 > > > > That's a big range, can you use 'git bisect' to track it down? > > FWIW, in another mail in this thread and > https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126 > it was clarified that the problem was introduced between 5.17.3 and > 5.17.4, where a few amdgpu changes were applied. Maybe they are the reason. > > Anyway: Yes, as Gregkh said, a bisection really would help. But maybe > the amdgfx developers might already have an idea what might be wrong > here? The QubesOS ticket linked above has some more details. The reporter was able to bisect the regression. I am not surprised that this commit caused problems for Qubes OS, as dom0 in Qubes OS is technically a guest of Xen. #regzbot ^introduced: b818a5d374542ccec73dcfe578a081574029820e -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature
[PATCH] drm/amd/display: fix non-kernel-doc comment warnings
Fix kernel-doc warnings for a comment that should not use kernel-doc notation: dmub_psr.c:235: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Set PSR power optimization flags. dmub_psr.c:235: warning: missing initial short description on line: * Set PSR power optimization flags. Fixes: e5dfcd272722 ("drm/amd/display: dc_link_set_psr_allow_active refactoring") Signed-off-by: Randy Dunlap Reported-by: kernel test robot Cc: Robin Chen Cc: Alex Deucher Cc: Anthony Koo Cc: amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Cc: David Airlie Cc: Daniel Vetter Cc: Harry Wentland Cc: Leo Li Cc: Rodrigo Siqueira --- drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c @@ -231,7 +231,7 @@ static void dmub_psr_set_level(struct dm dc_dmub_srv_wait_idle(dc->dmub_srv); } -/** +/* * Set PSR power optimization flags. */ static void dmub_psr_set_power_opt(struct dmub_psr *dmub, unsigned int power_opt, uint8_t panel_inst)
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote: > Two Qubes OS users reported that their AMD GPU systems did not work on > 5.17.4, while 5.16.18 worked fine. Details can be found on > https://github.com/QubesOS/qubes-issues/issues/7462. The initial report > listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir > [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as > “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. > > #regzbot introduced: v5.16.18..v5.17.4 That's a big range, can you use 'git bisect' to track it down? thanks, greg k-h
[PATCH] drm/radeon: change cac_weights_* to static
Sparse reports these issues si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. Should it be static? si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. Should it be static? Both of these variables are only used in si_dpm.c. Single file variables should be static, so change their storage-class specifiers to static. Signed-off-by: Tom Rix --- drivers/gpu/drm/radeon/si_dpm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c index 3add39c1a689..fbf968e3f6d7 100644 --- a/drivers/gpu/drm/radeon/si_dpm.c +++ b/drivers/gpu/drm/radeon/si_dpm.c @@ -329,7 +329,7 @@ static const struct si_dte_data dte_data_malta = true }; -struct si_cac_config_reg cac_weights_pitcairn[] = +static struct si_cac_config_reg cac_weights_pitcairn[] = { { 0x0, 0x, 0, 0x8a, SISLANDS_CACCONFIG_CGIND }, { 0x0, 0x, 16, 0x0, SISLANDS_CACCONFIG_CGIND }, @@ -1085,7 +1085,7 @@ static const struct si_dte_data dte_data_venus_pro = true }; -struct si_cac_config_reg cac_weights_oland[] = +static struct si_cac_config_reg cac_weights_oland[] = { { 0x0, 0x, 0, 0x82, SISLANDS_CACCONFIG_CGIND }, { 0x0, 0x, 16, 0x4F, SISLANDS_CACCONFIG_CGIND }, -- 2.27.0
[PATCH] drm/radeon: change cik_default_state table from global to static
Sparse reports these issues cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not declared. Should it be static? cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not declared. Should it be static? cik_default_state and cik_default_size are only used in cik.c. Single file symbols should be static. So move their definitions to cik_blit_shaders.h and change their storage-class-specifier to static. Remove unneeded cik_blit_shader.c Signed-off-by: Tom Rix --- drivers/gpu/drm/radeon/Makefile | 2 +- drivers/gpu/drm/radeon/cik_blit_shaders.c | 246 -- drivers/gpu/drm/radeon/cik_blit_shaders.h | 219 ++- 3 files changed, 218 insertions(+), 249 deletions(-) delete mode 100644 drivers/gpu/drm/radeon/cik_blit_shaders.c diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile index 1045d2c46a76..ea5380e24c3c 100644 --- a/drivers/gpu/drm/radeon/Makefile +++ b/drivers/gpu/drm/radeon/Makefile @@ -44,7 +44,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \ evergreen.o evergreen_cs.o \ evergreen_hdmi.o radeon_trace_points.o ni.o \ atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o si.o \ - radeon_prime.o cik.o cik_blit_shaders.o \ + radeon_prime.o cik.o \ r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \ rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o \ trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \ diff --git a/drivers/gpu/drm/radeon/cik_blit_shaders.c b/drivers/gpu/drm/radeon/cik_blit_shaders.c deleted file mode 100644 index ff1311806e91.. --- a/drivers/gpu/drm/radeon/cik_blit_shaders.c +++ /dev/null @@ -1,246 +0,0 @@ -/* - * Copyright 2012 Advanced Micro Devices, Inc. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER - * DEALINGS IN THE SOFTWARE. - * - * Authors: - * Alex Deucher - */ - -#include -#include -#include - -const u32 cik_default_state[] = -{ - 0xc0066900, - 0x, - 0x0060, /* DB_RENDER_CONTROL */ - 0x, /* DB_COUNT_CONTROL */ - 0x, /* DB_DEPTH_VIEW */ - 0x002a, /* DB_RENDER_OVERRIDE */ - 0x, /* DB_RENDER_OVERRIDE2 */ - 0x, /* DB_HTILE_DATA_BASE */ - - 0xc0046900, - 0x0008, - 0x, /* DB_DEPTH_BOUNDS_MIN */ - 0x, /* DB_DEPTH_BOUNDS_MAX */ - 0x, /* DB_STENCIL_CLEAR */ - 0x, /* DB_DEPTH_CLEAR */ - - 0xc0036900, - 0x000f, - 0x, /* DB_DEPTH_INFO */ - 0x, /* DB_Z_INFO */ - 0x, /* DB_STENCIL_INFO */ - - 0xc0016900, - 0x0080, - 0x, /* PA_SC_WINDOW_OFFSET */ - - 0xc00d6900, - 0x0083, - 0x, /* PA_SC_CLIPRECT_RULE */ - 0x, /* PA_SC_CLIPRECT_0_TL */ - 0x20002000, /* PA_SC_CLIPRECT_0_BR */ - 0x, - 0x20002000, - 0x, - 0x20002000, - 0x, - 0x20002000, - 0x, /* PA_SC_EDGERULE */ - 0x, /* PA_SU_HARDWARE_SCREEN_OFFSET */ - 0x000f, /* CB_TARGET_MASK */ - 0x000f, /* CB_SHADER_MASK */ - - 0xc0226900, - 0x0094, - 0x8000, /* PA_SC_VPORT_SCISSOR_0_TL */ - 0x20002000, /* PA_SC_VPORT_SCISSOR_0_BR */ - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, - 0x8000, - 0x20002000, -
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
CCing the amdgpu maintainers On 24.04.22 08:12, Greg KH wrote: > On Sat, Apr 23, 2022 at 12:06:33PM -0400, Demi Marie Obenour wrote: >> Two Qubes OS users reported that their AMD GPU systems did not work on >> 5.17.4, while 5.16.18 worked fine. Details can be found on >> https://github.com/QubesOS/qubes-issues/issues/7462. The initial report >> listed the GPU as “Advanced Micro Devices, Inc. [AMD/ATI] Renoir >> [1002:1636} (rev d3) (prog-if 00 [VGA controller])” and the CPU as >> “AMD Ryzen 5 PRO 4650U with Radeon Graphics”. >> >> #regzbot introduced: v5.16.18..v5.17.4 > > That's a big range, can you use 'git bisect' to track it down? FWIW, in another mail in this thread and https://github.com/QubesOS/qubes-issues/issues/7462#issuecomment-1107681126 it was clarified that the problem was introduced between 5.17.3 and 5.17.4, where a few amdgpu changes were applied. Maybe they are the reason. Anyway: Yes, as Gregkh said, a bisection really would help. But maybe the amdgfx developers might already have an idea what might be wrong here? The QubesOS ticket linked above has some more details. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
Re: [REGRESSION] AMD GPU regression in 5.16.18..5.17.4
On Sun, Apr 24, 2022 at 11:52:05AM -0400, Demi Marie Obenour wrote: > On Sun, Apr 24, 2022 at 11:02:43AM +0200, Thorsten Leemhuis wrote: > > CCing the amdgpu maintainers > > Also CCing Marek Marczykowski-Górecki as Qubes OS Project Lead. For real this time (whoops!). -- Sincerely, Demi Marie Obenour (she/her/hers) Invisible Things Lab signature.asc Description: PGP signature