Re: [PATCH] drm/amdgpu: Fix minmax error
Am 25.11.22 um 08:56 schrieb Luben Tuikov:
> On 2022-11-25 02:45, Christian König wrote:
>> Am 24.11.22 um 22:19 schrieb Luben Tuikov:
>>> Fix minmax compilation error by using min_t()/max_t(), of the assignment
>>> type.
>>>
>>> Cc: James Zhu
>>> Cc: Felix Kuehling
>>> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
>>> Signed-off-by: Luben Tuikov
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>>>   	hmm_range->dev_private_owner = owner;
>>>
>>>   	do {
>>> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
>>> +		hmm_range->end = min_t(typeof(hmm_range->end),
>>> +				       hmm_range->start + MAX_WALK_BYTE,
>>> +				       end);
>> Since end is a local variable I would strongly prefer to just have it
>> use the correct type for it.
>>
>> Otherwise we might end up using something which doesn't work on all
>> architectures.
> They all appear to be "unsigned long". I thought, since we assign to
> hmm_range->end, we use that type.

Mhm, then why does the compiler complain here?

As far as I can see "unsigned long" is correct here, but if we somehow
have a typecast then something is not working as expected.

Is MAX_WALK_BYTE maybe of signed type?

> Would you prefer at the top of the function to define "timeout" and
> "end" as,
> 	typeof(hmm_range->end) end, timeout;

Well for end that might make sense, but timeout is independent of the
hmm range.

Regards,
Christian.

> Regards,
> Luben
Re: [PATCH] drm/amdgpu: Fix minmax error
On 2022-11-25 02:45, Christian König wrote:
> Am 24.11.22 um 22:19 schrieb Luben Tuikov:
>> Fix minmax compilation error by using min_t()/max_t(), of the assignment
>> type.
>>
>> Cc: James Zhu
>> Cc: Felix Kuehling
>> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
>> Signed-off-by: Luben Tuikov
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>>   	hmm_range->dev_private_owner = owner;
>>
>>   	do {
>> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
>> +		hmm_range->end = min_t(typeof(hmm_range->end),
>> +				       hmm_range->start + MAX_WALK_BYTE,
>> +				       end);
>
> Since end is a local variable I would strongly prefer to just have it
> use the correct type for it.
>
> Otherwise we might end up using something which doesn't work on all
> architectures.

They all appear to be "unsigned long". I thought, since we assign to
hmm_range->end, we use that type.

Would you prefer at the top of the function to define "timeout" and "end" as,

	typeof(hmm_range->end) end, timeout;

Regards,
Luben
Re: [PATCH] drm/amdgpu: Fix minmax error
Am 24.11.22 um 22:19 schrieb Luben Tuikov:
> Fix minmax compilation error by using min_t()/max_t(), of the assignment
> type.
>
> Cc: James Zhu
> Cc: Felix Kuehling
> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
> Signed-off-by: Luben Tuikov
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>   	hmm_range->dev_private_owner = owner;
>
>   	do {
> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
> +		hmm_range->end = min_t(typeof(hmm_range->end),
> +				       hmm_range->start + MAX_WALK_BYTE,
> +				       end);

Since end is a local variable I would strongly prefer to just have it
use the correct type for it.

Otherwise we might end up using something which doesn't work on all
architectures.

Regards,
Christian.

>   		pr_debug("hmm range: start = 0x%lx, end = 0x%lx",
>   			 hmm_range->start, hmm_range->end);
>
>   		/* Assuming 512MB takes maxmium 1 second to fault page address */
> -		timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) *
> -			  HMM_RANGE_DEFAULT_TIMEOUT;
> +		timeout = max_t(typeof(timeout),
> +				(hmm_range->end - hmm_range->start) >> 29,
> +				1ULL);
> +		timeout *= HMM_RANGE_DEFAULT_TIMEOUT;
>   		timeout = jiffies + msecs_to_jiffies(timeout);
>
> retry:
>
> base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e
> prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486
RE: [PATCH] drm/amdgpu: fix for set df cstate failed under sriov
[AMD Official Use Only - General]

Hi Shikang,

Your patch will impact all amdgpu devices. May I know which device/asic has
this issue? In other words, you'd better move your changes into the SMU side
as well (like smu_v13_xx_ppt.c).

Best Regards,
Kevin

-----Original Message-----
From: amd-gfx On Behalf Of Shikang Fan
Sent: November 25, 2022 15:15
To: amd-gfx@lists.freedesktop.org
Cc: Fan, Shikang
Subject: [PATCH] drm/amdgpu: fix for set df cstate failed under sriov

- set_df_cstate won't be called under sriov case.

Signed-off-by: Shikang Fan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 818fa72c670d..c557585aa46c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2943,13 +2943,15 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-	/*
-	 * Per PMFW team's suggestion, driver needs to handle gfxoff
-	 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
-	 * scenario. Add the missing df cstate disablement here.
-	 */
-	if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
-		dev_warn(adev->dev, "Failed to disallow df cstate");
+	if (!amdgpu_sriov_vf(adev)) {
+		/*
+		 * Per PMFW team's suggestion, driver needs to handle gfxoff
+		 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
+		 * scenario. Add the missing df cstate disablement here.
+		 */
+		if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+			dev_warn(adev->dev, "Failed to disallow df cstate");
+	}
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.valid)
-- 
2.25.1
[PATCH] drm/amdgpu: fix for set df cstate failed under sriov
- set_df_cstate won't be called under sriov case.

Signed-off-by: Shikang Fan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 818fa72c670d..c557585aa46c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2943,13 +2943,15 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-	/*
-	 * Per PMFW team's suggestion, driver needs to handle gfxoff
-	 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
-	 * scenario. Add the missing df cstate disablement here.
-	 */
-	if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
-		dev_warn(adev->dev, "Failed to disallow df cstate");
+	if (!amdgpu_sriov_vf(adev)) {
+		/*
+		 * Per PMFW team's suggestion, driver needs to handle gfxoff
+		 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
+		 * scenario. Add the missing df cstate disablement here.
+		 */
+		if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+			dev_warn(adev->dev, "Failed to disallow df cstate");
+	}
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.valid)
-- 
2.25.1
Re: [PATCH] drm/amdgpu: add mb for si
On 11/25/2022 7:43 AM, Quan, Evan wrote:
> [AMD Official Use Only - General]
>
>> -----Original Message-----
>> From: Lazar, Lijo
>> Sent: Thursday, November 24, 2022 6:49 PM
>> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>
>> On 11/24/2022 4:11 PM, Lazar, Lijo wrote:
>>> On 11/24/2022 3:34 PM, Quan, Evan wrote:
>>>> [AMD Official Use Only - General]
>>>>
>>>> Could the attached patch help?
>>>>
>>>> Evan
>>>>> -----Original Message-----
>>>>> From: amd-gfx On Behalf Of 李真能
>>>>> Sent: Friday, November 18, 2022 5:25 PM
>>>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>>>
>>>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>>>> On 11/18/22 09:01, Christian König wrote:
>>>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>>>> add this mb in smc.
>>>>>>>>
>>>>>>>> The error messages are as follows:
>>>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>>>> late_init of IP block failed -22
>>>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>>> The issue is happening in late_init() which eventually does
>>>
>>> 	ret = si_thermal_enable_alert(adev, false);
>>>
>>> Just before this, si_thermal_start_thermal_controller is called in
>>> hw_init and that enables thermal alert.
>>>
>>> Maybe the issue is with enable/disable of thermal alerts in quick
>>> succession. Adding a delay inside si_thermal_start_thermal_controller
>>> might help.
>> On a second look, temperature range is already set as part of
>> si_thermal_start_thermal_controller in hw_init
>> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780
>>
>> There is no need to set it again here -
>> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635
>>
>> I think it is safe to remove the call from late_init altogether. Alex/Evan?
> [Quan, Evan] Yes, it makes sense to me. But I'm not sure whether that's
> related with the issue here. Since per my understandings, if the issue is
> caused by double calling of thermal_alert enablement, it will fail every
> time. That cannot explain why adding some delays or a mb() calling can
> help. The side effect of the patch is just some random delay introduced
> for every SMC message.

The issue happens in late_init(). Between late_init() and dpm enablement,
there are many smc messages sent which don't have this issue. So I think
the issue is not with FW not running. Thus the only case I see is
enable/disable of thermal alert in random succession.

Thanks,
Lijo

> BR
> Evan
>> Thanks,
>> Lijo
>>> Thanks,
>>> Lijo
>>>>>>>> amdgpu_device_ip_late_init failed
>>>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>>>> you need to give a detailed explanation why this is necessary.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> Signed-off-by: Zhenneng Li
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>>>> +	mb();
>>>>>>>> +
>>>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>>>  		return true;
>>>>>> In particular, it makes no sense in this specific place, since it
>>>>>> cannot directly affect the values of rst & clk.
>>>>>
>>>>> I think so too.
>>>>>
>>>>> But when I do reboot test using nine desktop machines, there may be
>>>>> this error reported on one or two machines after hundreds or thousands
>>>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>>>> these two methods both work, but I don't know what the root cause is.
>>>>>
>>>>> I used this method on another vendor's oland card, and this error
>>>>> message was reported again.
>>>>>
>>>>> What could be the root reason?
>>>>>
>>>>> test environment:
>>>>>
>>>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>>>
>>>>> driver: amdgpu
>>>>>
>>>>> os: ubuntu 20.04
>>>>>
>>>>> platform: arm64
>>>>>
>>>>> kernel: 5.4.18
Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
[AMD Official Use Only - General]

-----Original Message-----
From: Paneer Selvam, Arunpravin
Sent: November 24, 2022 22:57
To: Christian König ; Wang, Yang(Kevin) ; Koenig, Christian ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhang, Hawking
Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm

On 11/24/2022 1:17 PM, Christian König wrote:
> Am 24.11.22 um 08:45 schrieb Wang, Yang(Kevin):
>> [AMD Official Use Only - General]
>>
>> -----Original Message-----
>> From: Koenig, Christian
>> Sent: Thursday, November 24, 2022 3:25 PM
>> To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org; Paneer Selvam, Arunpravin
>> Cc: Zhang, Hawking ; Deucher, Alexander
>> Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
>>
>> Am 24.11.22 um 06:49 schrieb Yang Wang:
>>> add vram usage information into dri debugfs amdgpu_vram_mm node.
>>>
>>> Background:
>>> when amdgpu driver introduces drm buddy allocator, the kernel driver
>>> (and developer) is difficult to track the vram usage information.
>>>
>>> Field:
>>> 0x-0x: vram used range.
>>> type: kernel, device, sg
>>> usage: normal, vm, user
>>> domain: C-CPU, G-GTT, V-VRAM, P-PRIV
>>> @x: the address of "amdgpu_bo" object in kernel space.
>>> 4096: vram range size.
>>>
>>> Example:
>>> 0x0003fea68000-0x0003fea68fff: (type:kernel usage:vm domain:--V- --V-) @1d33dfee 4096 bytes
>>> 0x0003fea69000-0x0003fea69fff: (type:kernel usage:vm domain:--V- --V-) @a79155b5 4096 bytes
>>> 0x0003fea6b000-0x0003fea6bfff: (type:kernel usage:vm domain:--V- --V-) @38ad633b 4096 bytes
>>> 0x0003fea6c000-0x0003fea6cfff: (type:device usage:user domain:--V- --V-) @e302f90b 4096 bytes
>>> 0x0003fea6d000-0x0003fea6dfff: (type:device usage:user domain:--V- --V-) @e664c172 4096 bytes
>>> 0x0003fea6e000-0x0003fea6efff: (type:kernel usage:vm domain:--V- --V-) @4528cb2f 4096 bytes
>>> 0x0003fea6f000-0x0003fea6: (type:kernel usage:vm domain:--V- --V-) @a446bdbf 4096 bytes
>>> 0x0003fea7-0x0003fea7: (type:device usage:user domain:--V- --V-) @78fae42f 65536 bytes
>>> 0x0003fead8000-0x0003feadbfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 16384 bytes
>>> 0x0003feadc000-0x0003feadcfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 4096 bytes
>>> 0x0003feadd000-0x0003feaddfff: (type:kernel usage:normal domain:--V- --V-) @b9706fc1 4096 bytes
>>> 0x0003feade000-0x0003feadefff: (type:kernel usage:vm domain:--V- --V-) @71a25571 4096 bytes
>>>
>>> Note:
>>> although some vram ranges can be merged in the example above,
>>> this reflects the actual distribution of the drm buddy allocator.
>> Well, completely NAK. This is way too much complexity for simple
>> debugging.
>>
>> Question is what are your requirements here? E.g. what information do
>> you want and why doesn't the buddy allocator already expose this?
>>
>> Regards,
>> Christian.
>>
>> [Kevin]:
>>
>> For KMD debug.
>> The DRM buddy interface doesn't provide an interface to query which
>> ranges of ram (resource) are used.
>> It is not easy to debug on the KMD side if the driver fails to create
>> a BO at a specific location.
>> And from the view of KMD, the VRAM at some locations has special
>> purposes.
>> With this patch, we can know which ranges of vram are actually used.
> Well that's not a good reason to add this complexity. Debugging
> doesn't justify that.
>
> Please work with Arun to add the necessary information to the buddy
> allocator interface.
>
> Regards,
> Christian.

Hi Kevin,

I will check and list down some of the necessary information that we can
add to the buddy allocator interface.

Regards,
Arun.

[kevin]:
Thanks, but some information is AMDGPU specific, so I hope we can add some
flexible interfaces to adapt the different drivers' extensions.

Best Regards,
Kevin

>> Best Regards,
>> Kevin
>>> Signed-off-by: Yang Wang
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |   6 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h   |   3 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h |   1 +
>>>   4 files changed, 136 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 90eb07106609..117c754409b3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -53,7 +53,7 @@
>>>    *
>>>    */
>>>
>>> -static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>> +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>   {
>>>   	struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
>>>
>>> @@ -66,7 +66,7 @@ static voi
Re: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
Applied!

Thanks,
Luben

On 2022-11-24 05:10, Quan, Evan wrote:
> [AMD Official Use Only - General]
>
> Reviewed-by: Evan Quan
>
>> -----Original Message-----
>> From: amd-gfx On Behalf Of Yu Songping
>> Sent: Thursday, November 24, 2022 9:53 AM
>> To: airl...@gmail.com; dan...@ffwll.ch
>> Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; linux-ker...@vger.kernel.org
>> Subject: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
>>
>> The logical operator '&&' will make
>> smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when ppt_funcs is
>> NULL.
>>
>> Signed-off-by: Yu Songping
>> ---
>>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> index b880f4d7d67e..1cb728b0b7ee 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context *smu,
>>
>>  int smu_set_gfx_power_up_by_imu(struct smu_context *smu)
>>  {
>> -	if (!smu->ppt_funcs && !smu->ppt_funcs->set_gfx_power_up_by_imu)
>> +	if (!smu->ppt_funcs || !smu->ppt_funcs->set_gfx_power_up_by_imu)
>>  		return -EOPNOTSUPP;
>>
>>  	return smu->ppt_funcs->set_gfx_power_up_by_imu(smu);
>> --
>> 2.17.1
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General]

> -----Original Message-----
> From: Lazar, Lijo
> Sent: Thursday, November 24, 2022 6:49 PM
> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
> On 11/24/2022 4:11 PM, Lazar, Lijo wrote:
>> On 11/24/2022 3:34 PM, Quan, Evan wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Could the attached patch help?
>>>
>>> Evan
>>>> -----Original Message-----
>>>> From: amd-gfx On Behalf Of 李真能
>>>> Sent: Friday, November 18, 2022 5:25 PM
>>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>>
>>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>>> On 11/18/22 09:01, Christian König wrote:
>>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>>> add this mb in smc.
>>>>>>>
>>>>>>> The error messages are as follows:
>>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>>> late_init of IP block failed -22
>>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>> The issue is happening in late_init() which eventually does
>>
>> 	ret = si_thermal_enable_alert(adev, false);
>>
>> Just before this, si_thermal_start_thermal_controller is called in
>> hw_init and that enables thermal alert.
>>
>> Maybe the issue is with enable/disable of thermal alerts in quick
>> succession. Adding a delay inside si_thermal_start_thermal_controller
>> might help.
>
> On a second look, temperature range is already set as part of
> si_thermal_start_thermal_controller in hw_init
> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780
>
> There is no need to set it again here -
> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635
>
> I think it is safe to remove the call from late_init altogether. Alex/Evan?

[Quan, Evan] Yes, it makes sense to me. But I'm not sure whether that's
related with the issue here. Since per my understandings, if the issue is
caused by double calling of thermal_alert enablement, it will fail every
time. That cannot explain why adding some delays or a mb() calling can
help.

BR
Evan

> Thanks,
> Lijo
>
>> Thanks,
>> Lijo
>>
>>>>>>> amdgpu_device_ip_late_init failed
>>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>>> you need to give a detailed explanation why this is necessary.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Zhenneng Li
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>>> +	mb();
>>>>>>> +
>>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>>  		return true;
>>>>> In particular, it makes no sense in this specific place, since it
>>>>> cannot directly affect the values of rst & clk.
>>>>
>>>> I think so too.
>>>>
>>>> But when I do reboot test using nine desktop machines, there may be
>>>> this error reported on one or two machines after hundreds or thousands
>>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>>> these two methods both work, but I don't know what the root cause is.
>>>>
>>>> I used this method on another vendor's oland card, and this error
>>>> message was reported again.
>>>>
>>>> What could be the root reason?
>>>>
>>>> test environment:
>>>>
>>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>>
>>>> driver: amdgpu
>>>>
>>>> os: ubuntu 20.04
>>>>
>>>> platform: arm64
>>>>
>>>> kernel: 5.4.18
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General]

Did you see that? It's a patch which I created by git-format-patch.
Anyway, I will paste the changes below.
I was suspecting maybe we need some waits for smu running.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 49c398ec0aaf..9f308a021b2d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -6814,6 +6814,7 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	struct si_power_info *si_pi = si_get_pi(adev);
 	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
 	int ret;
+	int i;
 
 	if (amdgpu_si_is_smc_running(adev))
 		return -EINVAL;
@@ -6909,6 +6910,17 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	si_program_response_times(adev);
 	si_program_ds_registers(adev);
 	si_dpm_start_smc(adev);
+	/* Waiting for smc alive */
+	for (i = 0; i < adev->usec_timeout; i++) {
+		if (amdgpu_si_is_smc_running(adev))
+			break;
+		udelay(1);
+	}
+	if (i >= adev->usec_timeout) {
+		DRM_ERROR("Timedout on waiting for smu running\n");
+		return -EINVAL;
+	}
+
 	ret = si_notify_smc_display_change(adev, false);
 	if (ret) {
 		DRM_ERROR("si_notify_smc_display_change failed\n");

BR
Evan

> -----Original Message-----
> From: Christian König
> Sent: Thursday, November 24, 2022 6:06 PM
> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
> Cc: dri-de...@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
> That's not a patch but some binary file?
>
> Christian.
>
> Am 24.11.22 um 11:04 schrieb Quan, Evan:
>> [AMD Official Use Only - General]
>>
>> Could the attached patch help?
>>
>> Evan
>>> -----Original Message-----
>>> From: amd-gfx On Behalf Of 李真能
>>> Sent: Friday, November 18, 2022 5:25 PM
>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>
>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>> On 11/18/22 09:01, Christian König wrote:
>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>> add this mb in smc.
>>>>>>
>>>>>> The error messages are as follows:
>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>> late_init of IP block failed -22
>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>>>>>> amdgpu_device_ip_late_init failed
>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>> you need to give a detailed explanation why this is necessary.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Zhenneng Li
>>>>>> ---
>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>  1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>> +	mb();
>>>>>> +
>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>  		return true;
>>>> In particular, it makes no sense in this specific place, since it
>>>> cannot directly affect the values of rst & clk.
>>>
>>> I think so too.
>>>
>>> But when I do reboot test using nine desktop machines, there may be
>>> this error reported on one or two machines after hundreds or thousands
>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>> these two methods both work, but I don't know what the root cause is.
>>>
>>> I used this method on another vendor's oland card, and this error
>>> message was reported again.
>>>
>>> What could be the root reason?
>>>
>>> test environment:
>>>
>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>
>>> driver: amdgpu
>>>
>>> os: ubuntu 20.04
>>>
>>> platform: arm64
>>>
>>> kernel: 5.4.18
Re: [PATCH] drm/amdgpu: Fix minmax error
This patch is Reviewed-by: James Zhu

On 2022-11-24 16:19, Luben Tuikov wrote:
> Fix minmax compilation error by using min_t()/max_t(), of the assignment
> type.
>
> Cc: James Zhu
> Cc: Felix Kuehling
> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
> Signed-off-by: Luben Tuikov
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>   	hmm_range->dev_private_owner = owner;
>
>   	do {
> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
> +		hmm_range->end = min_t(typeof(hmm_range->end),
> +				       hmm_range->start + MAX_WALK_BYTE,
> +				       end);
>
>   		pr_debug("hmm range: start = 0x%lx, end = 0x%lx",
>   			 hmm_range->start, hmm_range->end);
>
>   		/* Assuming 512MB takes maxmium 1 second to fault page address */
> -		timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) *
> -			  HMM_RANGE_DEFAULT_TIMEOUT;
> +		timeout = max_t(typeof(timeout),
> +				(hmm_range->end - hmm_range->start) >> 29,
> +				1ULL);
> +		timeout *= HMM_RANGE_DEFAULT_TIMEOUT;
>   		timeout = jiffies + msecs_to_jiffies(timeout);
>
> retry:
>
> base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e
> prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486
Re: [PATCH 19/29] drm/amdkfd: add debug set exceptions enabled operation
Am 2022-10-31 um 12:23 schrieb Jonathan Kim:
> The debugger subscribes to notification for requested exceptions on attach.
> Allow the debugger to change its subscription later on.
>
> Signed-off-by: Jonathan Kim

Reviewed-by: Felix Kuehling

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  3 ++
>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 36
>   drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  2 ++
>   3 files changed, 41 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 27cd5af72521..61612b9bdf8c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -2887,6 +2887,9 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
>   			args->send_runtime_event.exception_mask);
>   		break;
>   	case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
> +		kfd_dbg_set_enabled_debug_exception_mask(target,
> +				args->set_exceptions_enabled.exception_mask);
> +		break;
>   	case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
>   	case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
>   	case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> index 3d304e8c286e..594ccca25cae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> @@ -441,3 +441,39 @@ int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd,
>
>   	return r;
>   }
> +
> +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
> +					uint64_t exception_set_mask)
> +{
> +	uint64_t found_mask = 0;
> +	struct process_queue_manager *pqm;
> +	struct process_queue_node *pqn;
> +	static const char write_data = '.';
> +	loff_t pos = 0;
> +	int i;
> +
> +	mutex_lock(&target->event_mutex);
> +
> +	found_mask |= target->exception_status;
> +
> +	pqm = &target->pqm;
> +	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
> +		if (!pqn)
> +			continue;
> +
> +		found_mask |= pqn->q->properties.exception_status;
> +	}
> +
> +	for (i = 0; i < target->n_pdds; i++) {
> +		struct kfd_process_device *pdd = target->pdds[i];
> +
> +		found_mask |= pdd->exception_status;
> +	}
> +
> +	if (exception_set_mask & found_mask)
> +		kernel_write(target->dbg_ev_file, &write_data, 1, &pos);
> +
> +	target->exception_enable_mask = exception_set_mask;
> +
> +	mutex_unlock(&target->event_mutex);
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> index 5270d5749828..837e09491a76 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> @@ -58,6 +58,8 @@ static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_dev *dev)
>
>   void debug_event_write_work_handler(struct work_struct *work);
>
> +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
> +					uint64_t exception_set_mask);
>   /*
>    * If GFX off is enabled, chips that do not support RLC restore for the debug
>    * registers will disable GFX off temporarily for the entire debug session.
[PATCH] drm/amdgpu: Fix minmax error
Fix minmax compilation error by using min_t()/max_t(), of the assignment type. Cc: James Zhu Cc: Felix Kuehling Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c index 8a2e5716d8dba2..d22d14b0ef0c84 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier, hmm_range->dev_private_owner = owner; do { - hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end); + hmm_range->end = min_t(typeof(hmm_range->end), + hmm_range->start + MAX_WALK_BYTE, + end); pr_debug("hmm range: start = 0x%lx, end = 0x%lx", hmm_range->start, hmm_range->end); /* Assuming 512MB takes maxmium 1 second to fault page address */ - timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) * - HMM_RANGE_DEFAULT_TIMEOUT; + timeout = max_t(typeof(timeout), + (hmm_range->end - hmm_range->start) >> 29, + 1ULL); + timeout *= HMM_RANGE_DEFAULT_TIMEOUT; timeout = jiffies + msecs_to_jiffies(timeout); retry: base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486 -- 2.39.0.rc0
[PATCH 09/12] drm/amd/display: Don't overwrite subvp pipe info in fast updates
From: Alvin Lee [Description] - This is a workaround to avoid concurrency issues -- a fast update creates a shallow copy of the dc current_state, and removes all subvp/phantom related flags. - We want to prevent the fast update thread from removing those flags in case there's another thread running that requires the info for proper programming Reviewed-by: Jun Lei Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +- .../drm/amd/display/dc/dcn32/dcn32_resource.c | 64 +++ .../drm/amd/display/dc/dcn32/dcn32_resource.h | 2 +- .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 4 +- .../gpu/drm/amd/display/dc/inc/core_types.h | 2 +- 5 files changed, 44 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index a7bfe0b6a5f3..87994ae0a397 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3061,7 +3061,7 @@ static bool update_planes_and_stream_state(struct dc *dc, * Ensures that we have enough pipes for newly added MPO planes */ if (dc->res_pool->funcs->remove_phantom_pipes) - dc->res_pool->funcs->remove_phantom_pipes(dc, context); + dc->res_pool->funcs->remove_phantom_pipes(dc, context, false); /*remove old surfaces from context */ if (!dc_rem_all_planes_for_stream(dc, stream, context)) { diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c index 06489df85ac1..e4dbc8353ea3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c @@ -1743,7 +1743,7 @@ void dcn32_retain_phantom_pipes(struct dc *dc, struct dc_state *context) } // return true if removed piped from ctx, false otherwise -bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context) +bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context, bool fast_update) { int i; bool 
removed_pipe = false; @@ -1770,14 +1770,23 @@ bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context) removed_pipe = true; } - // Clear all phantom stream info - if (pipe->stream) { - pipe->stream->mall_stream_config.type = SUBVP_NONE; - pipe->stream->mall_stream_config.paired_stream = NULL; - } + /* For non-full updates, a shallow copy of the current state +* is created. In this case we don't want to erase the current +* state (there can be 2 HIRQL threads, one in flip, and one in +* checkMPO) that can cause a race condition. +* +* This is just a workaround, needs a proper fix. +*/ + if (!fast_update) { + // Clear all phantom stream info + if (pipe->stream) { + pipe->stream->mall_stream_config.type = SUBVP_NONE; + pipe->stream->mall_stream_config.paired_stream = NULL; + } - if (pipe->plane_state) { - pipe->plane_state->is_phantom = false; + if (pipe->plane_state) { + pipe->plane_state->is_phantom = false; + } } } return removed_pipe; @@ -1950,23 +1959,28 @@ int dcn32_populate_dml_pipes_from_context( pipes[pipe_cnt].pipe.src.unbounded_req_mode = false; pipes[pipe_cnt].pipe.scale_ratio_depth.lb_depth = dm_lb_19; - switch (pipe->stream->mall_stream_config.type) { - case SUBVP_MAIN: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_sub_viewport; - subvp_in_use = true; - break; - case SUBVP_PHANTOM: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_phantom_pipe; - pipes[pipe_cnt].pipe.src.use_mall_for_static_screen = dm_use_mall_static_screen_disable; - // Disallow unbounded req for SubVP according to DCHUB programming guide - pipes[pipe_cnt].pipe.src.unbounded_req_mode = false; - break; - case SUBVP_NONE: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_disable; - pipes[pipe_cnt].pipe.src.use_mall_for_static_screen = dm_use_mall_static_screen_disable; - break; - default: -
[PATCH 08/12] drm/amd/display: program output tf when required
From: Dillon Varone [Description] Output transfer function must be programmed per pipe as part of a front end update when the plane changes, or output transfer function changes for a given plane. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c index db57b17061ae..bc4a303cd864 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1741,7 +1741,10 @@ static void dcn20_program_pipe( * only do gamma programming for powering on, internal memcmp to avoid * updating on slave planes */ - if (pipe_ctx->update_flags.bits.enable || pipe_ctx->stream->update_flags.bits.out_tf) + if (pipe_ctx->update_flags.bits.enable || + pipe_ctx->update_flags.bits.plane_changed || + pipe_ctx->stream->update_flags.bits.out_tf || + pipe_ctx->plane_state->update_flags.bits.output_tf_change) hws->funcs.set_output_transfer_func(dc, pipe_ctx, pipe_ctx->stream); /* If the pipe has been enabled or has a different opp, we -- 2.34.1
[PATCH 12/12] drm/amd/display: 3.2.214
From: Aric Cyr This version brings along the following fixes: -Program output transfer function when required -Fix arithmetic error in MALL size calculations for subvp -DCC Meta pitch used for MALL allocation -Debugfs entry to tell if connector is DPIA link -Use largest vready_offset in pipe group -Fixes race condition in DPIA Aux transfer Reviewed-by: Rodrigo Siqueira Acked-by: Jasdeep Dhillon Signed-off-by: Aric Cyr --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index a19a890f1d76..4a7c0356d9c7 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -47,7 +47,7 @@ struct aux_payload; struct set_config_cmd_payload; struct dmub_notification; -#define DC_VER "3.2.213" +#define DC_VER "3.2.214" #define MAX_SURFACES 3 #define MAX_PLANES 6 -- 2.34.1
[PATCH 06/12] drm/amd/display: Retain phantom pipes when min transition into subvp
From: Alvin Lee [Description] - When entering into a SubVP config that requires a minimal transition we need to retain phantom pipes and also restore the mall config - This is because the min transition will remove phantom pipes from the context (shallow copy) and not restore its original state - This is just a workaround, and needs a proper fix Reviewed-by: Jun Lei Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- drivers/gpu/drm/amd/display/dc/core/dc.c | 21 ++- drivers/gpu/drm/amd/display/dc/dc_stream.h| 11 ++ .../drm/amd/display/dc/dcn32/dcn32_resource.c | 2 ++ .../drm/amd/display/dc/dcn32/dcn32_resource.h | 11 -- .../amd/display/dc/dcn321/dcn321_resource.c | 2 ++ .../gpu/drm/amd/display/dc/inc/core_types.h | 2 ++ 6 files changed, 37 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index f9b8b6f6fd31..a7bfe0b6a5f3 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3954,6 +3954,7 @@ bool dc_update_planes_and_stream(struct dc *dc, struct dc_state *context; enum surface_update_type update_type; int i; + struct mall_temp_config mall_temp_config; /* In cases where MPO and split or ODM are used transitions can * cause underflow. Apply stream configuration with minimal pipe @@ -3985,11 +3986,29 @@ bool dc_update_planes_and_stream(struct dc *dc, /* on plane removal, minimal state is the new one */ if (force_minimal_pipe_splitting && !is_plane_addition) { + /* Since all phantom pipes are removed in full validation, +* we have to save and restore the subvp/mall config when +* we do a minimal transition since the flags marking the +* pipe as subvp/phantom will be cleared (dc copy constructor +* creates a shallow copy). 
+*/ + if (dc->res_pool->funcs->save_mall_state) + dc->res_pool->funcs->save_mall_state(dc, context, &mall_temp_config); if (!commit_minimal_transition_state(dc, context)) { dc_release_state(context); return false; } - + if (dc->res_pool->funcs->restore_mall_state) + dc->res_pool->funcs->restore_mall_state(dc, context, &mall_temp_config); + + /* If we do a minimal transition with plane removal and the context +* has subvp we also have to retain back the phantom stream / planes +* since the refcount is decremented as part of the min transition +* (we commit a state with no subvp, so the phantom streams / planes +* had to be removed). +*/ + if (dc->res_pool->funcs->retain_phantom_pipes) + dc->res_pool->funcs->retain_phantom_pipes(dc, context); update_type = UPDATE_TYPE_FULL; } diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h index e0cee9666c48..dfd3df1d2f7e 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_stream.h +++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h @@ -160,6 +160,17 @@ struct mall_stream_config { struct dc_stream_state *paired_stream; // master / slave stream }; +/* Temp struct used to save and restore MALL config + * during validation. + * + * TODO: Move MALL config into dc_state instead of stream struct + * to avoid needing to save/restore. 
+ */ +struct mall_temp_config { + struct mall_stream_config mall_stream_config[MAX_PIPES]; + bool is_phantom_plane[MAX_PIPES]; +}; + struct dc_stream_state { // sink is deprecated, new code should not reference // this pointer diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c index 99ddd2232322..06489df85ac1 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c @@ -2055,6 +2055,8 @@ static struct resource_funcs dcn32_res_pool_funcs = { .add_phantom_pipes = dcn32_add_phantom_pipes, .remove_phantom_pipes = dcn32_remove_phantom_pipes, .retain_phantom_pipes = dcn32_retain_phantom_pipes, + .save_mall_state = dcn32_save_mall_state, + .restore_mall_state = dcn32_restore_mall_state, }; diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h index c50bb34b515f..4e6b71832187 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h @@ -45,17 +45,6 @@ extern struct _vcs_dpi_ip_params_st dcn3_2_ip; extern struct _vcs_dpi_soc_bounding_box_st dcn3_2_soc; -/* Temp struct used to
[PATCH 10/12] drm/amd/display: Fix DTBCLK disable requests and SRC_SEL programming
From: Alvin Lee [Description] - When transitioning FRL / DP2 is not required, we will always request DTBCLK = 0 MHz, but PMFW returns the min freq - This causes us to make DTBCLK requests every time we call optimize after transitioning from FRL to non-FRL - If DTBCLK is not required, request the min instead (then we only need to make 1 extra request at boot time) - Also when programming PIPE_DTO_SRC_SEL, don't program for DP first, just program once for the required selection (programming DP on an HDMI connection then switching back causes corruption) Reviewed-by: Dillon Varone Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- .../gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c| 2 +- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c | 6 +- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c index 6f77d8e538ab..9eb9fe5b8d2c 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c @@ -438,7 +438,7 @@ static void dcn32_update_clocks(struct clk_mgr *clk_mgr_base, } if (!new_clocks->dtbclk_en) { - new_clocks->ref_dtbclk_khz = 0; + new_clocks->ref_dtbclk_khz = clk_mgr_base->bw_params->clk_table.entries[0].dtbclk_mhz * 1000; } /* clock limits are received with MHz precision, divide by 1000 to prevent setting clocks at every call */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c index df4f25119142..e4472c6be6c3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c @@ -225,11 +225,7 @@ static void dccg32_set_dtbclk_dto( } else { REG_UPDATE_2(OTG_PIXEL_RATE_CNTL[params->otg_inst], DTBCLK_DTO_ENABLE[params->otg_inst], 0, - PIPE_DTO_SRC_SEL[params->otg_inst], 1); - if (params->is_hdmi) - 
REG_UPDATE(OTG_PIXEL_RATE_CNTL[params->otg_inst], - PIPE_DTO_SRC_SEL[params->otg_inst], 0); - + PIPE_DTO_SRC_SEL[params->otg_inst], params->is_hdmi ? 0 : 1); REG_WRITE(DTBCLK_DTO_MODULO[params->otg_inst], 0); REG_WRITE(DTBCLK_DTO_PHASE[params->otg_inst], 0); } -- 2.34.1
[PATCH 11/12] drm/amd/display: set per pipe dppclk to 0 when dpp is off
From: Dmytro Laktyushkin The 'commit 52e4fdf09ebc ("drm/amd/display: use low clocks for no plane configs")' introduced a change that set low clock values for DCN31 and DCN32. As a result of these changes, DC started to spam the log with the following warning: [ cut here ] WARNING: CPU: 8 PID: 1486 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_dccg.c:58 dccg2_update_dpp_dto+0x3f/0xf0 [amdgpu] [..] CPU: 8 PID: 1486 Comm: kms_atomic Tainted: G W 5.18.0+ #1 RIP: 0010:dccg2_update_dpp_dto+0x3f/0xf0 [amdgpu] RSP: 0018:bbd8025334d0 EFLAGS: 00010206 RAX: 01ee RBX: a02c87dd3de0 RCX: 000a7f80 RDX: 0007dec3 RSI: RDI: a02c87dd3de0 RBP: bbd8025334e8 R08: 0001 R09: 0005 R10: 000331a0 R11: c0b03d80 R12: a02ca576d000 R13: a02cd02c R14: 001453bc R15: a02cdc28 [..] dcn20_update_clocks_update_dpp_dto+0x4e/0xa0 [amdgpu] dcn32_update_clocks+0x5d9/0x650 [amdgpu] dcn20_prepare_bandwidth+0x49/0x100 [amdgpu] dcn30_prepare_bandwidth+0x63/0x80 [amdgpu] dc_commit_state_no_check+0x39d/0x13e0 [amdgpu] dc_commit_streams+0x1f9/0x3b0 [amdgpu] dc_commit_state+0x37/0x120 [amdgpu] amdgpu_dm_atomic_commit_tail+0x5e5/0x2520 [amdgpu] ? _raw_spin_unlock_irqrestore+0x1f/0x40 ? down_trylock+0x2c/0x40 ? vprintk_emit+0x186/0x2c0 ? vprintk_default+0x1d/0x20 ? vprintk+0x4e/0x60 We can easily trigger this issue by using a 4k@120 or a 2k@165 and running some of the kms_atomic tests. This warning is triggered because the per-pipe clock update is not happening; this commit fixes this issue by ensuring that DPPCLK is updated when calculating the watermark and dlg is invoked. 
Fixes: 52e4fdf09ebc ("drm/amd/display: use low clocks for no plane configs") Reported-by: Mark Broadworth Reviewed-by: Rodrigo Siqueira Signed-off-by: Dmytro Laktyushkin --- drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 3 +++ drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 5 - 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c index 12b23bd50e19..b37d14369a62 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -559,6 +559,9 @@ void dcn31_calculate_wm_and_dlg_fp( context->bw_ctx.bw.dcn.clk.dramclk_khz = 0; context->bw_ctx.bw.dcn.clk.fclk_khz = 0; context->bw_ctx.bw.dcn.clk.p_state_change_support = true; + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (context->res_ctx.pipe_ctx[i].stream) + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0; } } diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c index 5a4cdb559d4e..f94abd124021 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c @@ -1320,7 +1320,10 @@ static void dcn32_calculate_dlg_params(struct dc *dc, struct dc_state *context, if (context->bw_ctx.bw.dcn.clk.dppclk_khz < pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000) context->bw_ctx.bw.dcn.clk.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; - context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; + if (context->res_ctx.pipe_ctx[i].plane_state) + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; + else + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0; context->res_ctx.pipe_ctx[i].pipe_dlg_param = pipes[pipe_idx].pipe.dest; pipe_idx++; } -- 2.34.1
[PATCH 05/12] drm/amd/display: Use DCC meta pitch for MALL allocation requirements
From: Dillon Varone [Description] Calculations for determining DCC meta size should be pitch*height*bpp/256. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c| 6 +++--- .../drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 11 --- 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index 76548b4b822c..c9b2343947be 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -262,11 +262,11 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *c num_mblks = ((mall_alloc_width_blk_aligned + mblk_width - 1) / mblk_width) * ((mall_alloc_height_blk_aligned + mblk_height - 1) / mblk_height); - /* For DCC: -* meta_num_mblk = CEILING(full_mblk_width_ub_l*full_mblk_height_ub_l*Bpe/256/mblk_bytes, 1) + /*For DCC: +* meta_num_mblk = CEILING(meta_pitch*full_vp_height*Bpe/256/mblk_bytes, 1) */ if (pipe->plane_state->dcc.enable) - num_mblks += (mall_alloc_width_blk_aligned * mall_alloc_width_blk_aligned * bytes_per_pixel + + num_mblks += (pipe->plane_state->dcc.meta_pitch * pipe->plane_res.scl_data.viewport.height * bytes_per_pixel + (256 * DCN3_2_MALL_MBLK_SIZE_BYTES) - 1) / (256 * DCN3_2_MALL_MBLK_SIZE_BYTES); bytes_in_mall = num_mblks * DCN3_2_MALL_MBLK_SIZE_BYTES; diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c index fa3778849db1..94fd125daa6b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c @@ -121,14 +121,19 @@ uint32_t dcn32_helper_calculate_num_ways_for_subvp(struct dc *dc, struct dc_stat */ num_mblks = ((mall_alloc_width_blk_aligned + mblk_width - 1) / mblk_width) * ((mall_alloc_height_blk_aligned + 
mblk_height - 1) / mblk_height); + + /*For DCC: +* meta_num_mblk = CEILING(meta_pitch*full_vp_height*Bpe/256/mblk_bytes, 1) +*/ + if (pipe->plane_state->dcc.enable) + num_mblks += (pipe->plane_state->dcc.meta_pitch * pipe->plane_res.scl_data.viewport.height * bytes_per_pixel + + (256 * DCN3_2_MALL_MBLK_SIZE_BYTES) - 1) / (256 * DCN3_2_MALL_MBLK_SIZE_BYTES); + bytes_in_mall = num_mblks * DCN3_2_MALL_MBLK_SIZE_BYTES; // cache lines used is total bytes / cache_line size. Add +2 for worst case alignment // (MALL is 64-byte aligned) cache_lines_per_plane = bytes_in_mall / dc->caps.cache_line_size + 2; - /* For DCC divide by 256 */ - if (pipe->plane_state->dcc.enable) - cache_lines_per_plane = cache_lines_per_plane + (cache_lines_per_plane / 256) + 1; cache_lines_used += cache_lines_per_plane; } } -- 2.34.1
[PATCH 07/12] drm/amd/display: Fix arithmetic error in MALL size calculations for subvp
From: Dillon Varone [Description] Need to subtract unused section of the viewport when calculating required space in MALL for subvp instead of adding, to prevent over allocation. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c index 94fd125daa6b..783935c4e664 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c @@ -97,14 +97,14 @@ uint32_t dcn32_helper_calculate_num_ways_for_subvp(struct dc *dc, struct dc_stat * FLOOR(vp_x_start, blk_width) */ full_vp_width_blk_aligned = ((pipe->plane_res.scl_data.viewport.x + - pipe->plane_res.scl_data.viewport.width + mblk_width - 1) / mblk_width * mblk_width) + + pipe->plane_res.scl_data.viewport.width + mblk_width - 1) / mblk_width * mblk_width) - (pipe->plane_res.scl_data.viewport.x / mblk_width * mblk_width); /* full_vp_height_blk_aligned = FLOOR(vp_y_start + full_vp_height + blk_height - 1, blk_height) - * FLOOR(vp_y_start, blk_height) */ full_vp_height_blk_aligned = ((pipe->plane_res.scl_data.viewport.y + - full_vp_height + mblk_height - 1) / mblk_height * mblk_height) + + full_vp_height + mblk_height - 1) / mblk_height * mblk_height) - (pipe->plane_res.scl_data.viewport.y / mblk_height * mblk_height); /* mall_alloc_width_blk_aligned_l/c = full_vp_width_blk_aligned_l/c */ -- 2.34.1
[PATCH 01/12] drm/amd/display: Fix race condition in DPIA AUX transfer
From: Stylon Wang [Why] This fix was intended for improving on coding style but in the process uncovers a race condition, which explains why we are getting incorrect length in DPIA AUX replies. Due to the call path of DPIA AUX going from DC back to DM layer then again into DC and the added complexities on top of current DC AUX implementation, a proper fix to rely on current dc_lock to address the race condition is difficult without a major overhaul on how DPIA AUX is implemented. [How] - Add a mutex dpia_aux_lock to protect DPIA AUX transfers - Remove DMUB_ASYNC_TO_SYNC_ACCESS_* codes and rely solely on aux_return_code_type for error reporting and handling - Separate SET_CONFIG from DPIA AUX transfer because they have quite different processing logic - Remove unnecessary type casting to and from void * type Reviewed-by: Nicholas Kazlauskas Acked-by: Jasdeep Dhillon Signed-off-by: Stylon Wang --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 151 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 17 +- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 10 +- 3 files changed, 91 insertions(+), 87 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 4fe7971e3e58..da1be67831d6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -147,14 +147,6 @@ MODULE_FIRMWARE(FIRMWARE_NAVI12_DMCU); /* Number of bytes in PSP footer for firmware. 
*/ #define PSP_FOOTER_BYTES 0x100 -/* - * DMUB Async to Sync Mechanism Status - */ -#define DMUB_ASYNC_TO_SYNC_ACCESS_FAIL 1 -#define DMUB_ASYNC_TO_SYNC_ACCESS_TIMEOUT 2 -#define DMUB_ASYNC_TO_SYNC_ACCESS_SUCCESS 3 -#define DMUB_ASYNC_TO_SYNC_ACCESS_INVALID 4 - /** * DOC: overview * @@ -1442,6 +1434,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) memset(&init_params, 0, sizeof(init_params)); #endif + mutex_init(&adev->dm.dpia_aux_lock); mutex_init(&adev->dm.dc_lock); mutex_init(&adev->dm.audio_lock); @@ -1806,6 +1799,7 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev) mutex_destroy(&adev->dm.audio_lock); mutex_destroy(&adev->dm.dc_lock); + mutex_destroy(&adev->dm.dpia_aux_lock); return; } @@ -10211,91 +10205,92 @@ uint32_t dm_read_reg_func(const struct dc_context *ctx, uint32_t address, return value; } -static int amdgpu_dm_set_dmub_async_sync_status(bool is_cmd_aux, - struct dc_context *ctx, - uint8_t status_type, - uint32_t *operation_result) +int amdgpu_dm_process_dmub_aux_transfer_sync( + struct dc_context *ctx, + unsigned int link_index, + struct aux_payload *payload, + enum aux_return_code_type *operation_result) { struct amdgpu_device *adev = ctx->driver_context; - int return_status = -1; struct dmub_notification *p_notify = adev->dm.dmub_notify; + int ret = -1; - if (is_cmd_aux) { - if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_SUCCESS) { - return_status = p_notify->aux_reply.length; - *operation_result = p_notify->result; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_TIMEOUT) { - *operation_result = AUX_RET_ERROR_TIMEOUT; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_FAIL) { - *operation_result = AUX_RET_ERROR_ENGINE_ACQUIRE; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_INVALID) { - *operation_result = AUX_RET_ERROR_INVALID_REPLY; - } else { - *operation_result = AUX_RET_ERROR_UNKNOWN; + mutex_lock(&adev->dm.dpia_aux_lock); + if (!dc_process_dmub_aux_transfer_async(ctx->dc, link_index, payload)) { + 
*operation_result = AUX_RET_ERROR_ENGINE_ACQUIRE; + goto out; + } + + if (!wait_for_completion_timeout(&adev->dm.dmub_aux_transfer_done, 10 * HZ)) { + DRM_ERROR("wait_for_completion_timeout timeout!"); + *operation_result = AUX_RET_ERROR_TIMEOUT; + goto out; + } + + if (p_notify->result != AUX_RET_SUCCESS) { + /* +* Transient states before tunneling is enabled could +* lead to this error. We can ignore this for now. +*/ + if (p_notify->result != AUX_RET_ERROR_PROTOCOL_ERROR) { + DRM_WARN("DPIA AUX failed on 0x%x(%d), error %d\n", + payload->address, payload->length, + p_notify->result); } - } else { - if (status_type == DMUB_ASYNC_TO
[PATCH 04/12] drm/amd/display: MALL SS calculations should iterate over all pipes for cursor
From: Dillon Varone [Description] MALL SS allocation calculations should iterate over all pipes to determine the allocation size required for HW cursor. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index 2f19f711d8be..76548b4b822c 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -316,8 +316,8 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *c cache_lines_used += (((cursor_size + DCN3_2_MALL_MBLK_SIZE_BYTES - 1) / DCN3_2_MALL_MBLK_SIZE_BYTES) * DCN3_2_MALL_MBLK_SIZE_BYTES) / dc->caps.cache_line_size + 2; + break; } - break; } } -- 2.34.1
[PATCH 03/12] drm/amd/display: Create debugfs to tell if connector is DPIA link
From: Stylon Wang [Why] Tests need to tell if display is connected via USB4 DPIA link. Currently this is only possible via analyzing dmesg logs. [How] Create a per-connector debugfs entry to report if the link is tunneled via USB4 DPIA. Reviewed-by: Wayne Lin Acked-by: Jasdeep Dhillon Signed-off-by: Stylon Wang --- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 2c43cdd2e707..461037a3dd75 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -2639,6 +2639,25 @@ static int dp_mst_progress_status_show(struct seq_file *m, void *unused) return 0; } +/* + * Reports whether the connected display is a USB4 DPIA tunneled display + * Example usage: cat /sys/kernel/debug/dri/0/DP-8/is_dpia_link + */ +static int is_dpia_link_show(struct seq_file *m, void *data) +{ + struct drm_connector *connector = m->private; + struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector); + struct dc_link *link = aconnector->dc_link; + + if (connector->status != connector_status_connected) + return -ENODEV; + + seq_printf(m, "%s\n", (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) ? "yes" : + (link->ep_type == DISPLAY_ENDPOINT_PHY) ? 
"no" : "unknown"); + + return 0; +} + DEFINE_SHOW_ATTRIBUTE(dp_dsc_fec_support); DEFINE_SHOW_ATTRIBUTE(dmub_fw_state); DEFINE_SHOW_ATTRIBUTE(dmub_tracebuffer); @@ -2650,6 +2669,7 @@ DEFINE_SHOW_ATTRIBUTE(internal_display); DEFINE_SHOW_ATTRIBUTE(psr_capability); DEFINE_SHOW_ATTRIBUTE(dp_is_mst_connector); DEFINE_SHOW_ATTRIBUTE(dp_mst_progress_status); +DEFINE_SHOW_ATTRIBUTE(is_dpia_link); static const struct file_operations dp_dsc_clock_en_debugfs_fops = { .owner = THIS_MODULE, @@ -2794,7 +2814,8 @@ static const struct { {"max_bpc", &dp_max_bpc_debugfs_fops}, {"dsc_disable_passthrough", &dp_dsc_disable_passthrough_debugfs_fops}, {"is_mst_connector", &dp_is_mst_connector_fops}, - {"mst_progress_status", &dp_mst_progress_status_fops} + {"mst_progress_status", &dp_mst_progress_status_fops}, + {"is_dpia_link", &is_dpia_link_fops} }; #ifdef CONFIG_DRM_AMD_DC_HDCP -- 2.34.1
[PATCH 00/12] DC Patches November 28 2022
This DC patchset brings improvements in multiple areas. In summary, we have: * Program output transfer function when required * Fix arithmetic error in MALL size calculations for subvp * DCC Meta pitch used for MALL allocation * Debugfs entry to tell if connector is DPIA link * Use largest vready_offset in pipe group * Fixes race condition in DPIA Aux transfer Cc: Daniel Wheeler Alvin Lee (3): drm/amd/display: Retain phantom pipes when min transition into subvp drm/amd/display: Don't overwrite subvp pipe info in fast updates drm/amd/display: Fix DTBCLK disable requests and SRC_SEL programming Aric Cyr (1): drm/amd/display: 3.2.214 Dillon Varone (4): drm/amd/display: MALL SS calculations should iterate over all pipes for cursor drm/amd/display: Use DCC meta pitch for MALL allocation requirements drm/amd/display: Fix arithmetic error in MALL size calculations for subvp drm/amd/display: program output tf when required Dmytro Laktyushkin (1): drm/amd/display: set per pipe dppclk to 0 when dpp is off Stylon Wang (2): drm/amd/display: Fix race condition in DPIA AUX transfer drm/amd/display: Create debugfs to tell if connector is DPIA link Wesley Chalmers (1): drm/amd/display: Use the largest vready_offset in pipe group .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 151 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 17 +- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 23 ++- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 10 +- .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c | 2 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 23 ++- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- drivers/gpu/drm/amd/display/dc/dc_stream.h| 11 ++ .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 30 +++- .../drm/amd/display/dc/dcn20/dcn20_hwseq.c| 34 +++- .../gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c | 6 +- .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 8 +- .../drm/amd/display/dc/dcn32/dcn32_resource.c | 66 +--- .../drm/amd/display/dc/dcn32/dcn32_resource.h | 13 +-
.../display/dc/dcn32/dcn32_resource_helpers.c | 15 +- .../amd/display/dc/dcn321/dcn321_resource.c | 2 + .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 3 + .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 9 +- .../gpu/drm/amd/display/dc/inc/core_types.h | 4 +- 19 files changed, 277 insertions(+), 152 deletions(-) -- 2.34.1
[PATCH 02/12] drm/amd/display: Use the largest vready_offset in pipe group
From: Wesley Chalmers

[WHY]
Corruption can occur in LB if vready_offset is not large enough.
DML calculates vready_offset for each pipe, but we currently select the
top pipe's vready_offset, which is not necessarily enough for all pipes
in the group.

[HOW]
Wherever program_global_sync is currently called, iterate through the
entire pipe group and find the highest vready_offset.

Reviewed-by: Dillon Varone
Acked-by: Jasdeep Dhillon
Signed-off-by: Wesley Chalmers
---
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 30 +-
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c    | 29 +-
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 0db02e76dcc5..355ffed7380b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -869,6 +869,32 @@ static void false_optc_underflow_wa(
 		tg->funcs->clear_optc_underflow(tg);
 }
 
+static int calculate_vready_offset_for_group(struct pipe_ctx *pipe)
+{
+	struct pipe_ctx *other_pipe;
+	int vready_offset = pipe->pipe_dlg_param.vready_offset;
+
+	/* Always use the largest vready_offset of all connected pipes */
+	for (other_pipe = pipe->bottom_pipe; other_pipe != NULL; other_pipe = other_pipe->bottom_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->top_pipe; other_pipe != NULL; other_pipe = other_pipe->top_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->next_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->next_odm_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->prev_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->prev_odm_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+
+	return vready_offset;
+}
+
 enum dc_status dcn10_enable_stream_timing(
 	struct pipe_ctx *pipe_ctx,
 	struct dc_state *context,
@@ -912,7 +938,7 @@ enum dc_status dcn10_enable_stream_timing(
 	pipe_ctx->stream_res.tg->funcs->program_timing(
 			pipe_ctx->stream_res.tg,
 			&stream->timing,
-			pipe_ctx->pipe_dlg_param.vready_offset,
+			calculate_vready_offset_for_group(pipe_ctx),
 			pipe_ctx->pipe_dlg_param.vstartup_start,
 			pipe_ctx->pipe_dlg_param.vupdate_offset,
 			pipe_ctx->pipe_dlg_param.vupdate_width,
@@ -2908,7 +2934,7 @@ void dcn10_program_pipe(
 
 		pipe_ctx->stream_res.tg->funcs->program_global_sync(
 				pipe_ctx->stream_res.tg,
-				pipe_ctx->pipe_dlg_param.vready_offset,
+				calculate_vready_offset_for_group(pipe_ctx),
 				pipe_ctx->pipe_dlg_param.vstartup_start,
 				pipe_ctx->pipe_dlg_param.vupdate_offset,
 				pipe_ctx->pipe_dlg_param.vupdate_width);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 3f3d4daa6294..db57b17061ae 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1652,6 +1652,31 @@ static void dcn20_update_dchubp_dpp(
 		hubp->funcs->phantom_hubp_post_enable(hubp);
 }
 
+static int calculate_vready_offset_for_group(struct pipe_ctx *pipe)
+{
+	struct pipe_ctx *other_pipe;
+	int vready_offset = pipe->pipe_dlg_param.vready_offset;
+
+	/* Always use the largest vready_offset of all connected pipes */
+	for (other_pipe = pipe->bottom_pipe; other_pipe != NULL; other_pipe = other_pipe->bottom_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->top_pipe; other_pipe != NULL; other_pipe = other_pipe->top_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->next_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->next_odm_pipe) {
+
Re: [PATCH] drm/amdgpu: remove redundant NULL check
Applied.

Thanks,
Luben

On 2022-11-23 02:44, zys.zlj...@gmail.com wrote:
> From: Yushan Zhou
>
> release_firmware() checks whether the firmware pointer is NULL.
> Remove the redundant NULL checks in psp_sw_fini().
>
> Signed-off-by: Yushan Zhou
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 36 +++--
>  1 file changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index effa7df3ddbf..77b966ab5439 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -484,26 +484,22 @@ static int psp_sw_fini(void *handle)
>  	struct psp_gfx_cmd_resp *cmd = psp->cmd;
>  
>  	psp_memory_training_fini(psp);
> -	if (psp->sos_fw) {
> -		release_firmware(psp->sos_fw);
> -		psp->sos_fw = NULL;
> -	}
> -	if (psp->asd_fw) {
> -		release_firmware(psp->asd_fw);
> -		psp->asd_fw = NULL;
> -	}
> -	if (psp->ta_fw) {
> -		release_firmware(psp->ta_fw);
> -		psp->ta_fw = NULL;
> -	}
> -	if (psp->cap_fw) {
> -		release_firmware(psp->cap_fw);
> -		psp->cap_fw = NULL;
> -	}
> -	if (psp->toc_fw) {
> -		release_firmware(psp->toc_fw);
> -		psp->toc_fw = NULL;
> -	}
> +
> +	release_firmware(psp->sos_fw);
> +	psp->sos_fw = NULL;
> +
> +	release_firmware(psp->asd_fw);
> +	psp->asd_fw = NULL;
> +
> +	release_firmware(psp->ta_fw);
> +	psp->ta_fw = NULL;
> +
> +	release_firmware(psp->cap_fw);
> +	psp->cap_fw = NULL;
> +
> +	release_firmware(psp->toc_fw);
> +	psp->toc_fw = NULL;
> +
>  	if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||
>  	    adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7))
>  		psp_sysfs_fini(adev);
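The rationale for the cleanup above is that release_firmware(), like kfree(), already tolerates a NULL pointer, so caller-side guards add nothing. A minimal user-space model of that convention, with names invented for the sketch (this is not the real firmware API):

```c
#include <stdlib.h>

/* Illustrative firmware handle; the release function itself tolerates
 * NULL, so callers need no guard. */
struct fw {
	char *data;
};

static int released;	/* count real releases, for demonstration */

static void release_fw(struct fw *fw)
{
	if (!fw)	/* the NULL check lives here, once */
		return;
	free(fw->data);
	free(fw);
	released++;
}
```

With the check centralized, every call site shrinks to `release_fw(p); p = NULL;`, which is exactly the shape the patch gives psp_sw_fini().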
RE: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
[AMD Official Use Only - General]

> -----Original Message-----
> From: Kuehling, Felix
> Sent: November 24, 2022 11:24 AM
> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
>
> Am 2022-11-24 um 09:51 schrieb Kim, Jonathan:
> > [Public]
> >
> >> -----Original Message-----
> >> From: Kuehling, Felix
> >> Sent: November 22, 2022 7:45 PM
> >> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
> >> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
> >>
> >>
> >> On 2022-10-31 12:23, Jonathan Kim wrote:
> >>> From: Jay Cornwall
> >>>
> >>> Trap handler behavior will differ when a debugger is attached.
> >>>
> >>> Make the debug trap flag available in the trap handler TMA.
> >>> Update it when the debug trap ioctl is invoked.
> >>>
> >>> v3: Rebase for upstream
> >>>
> >>> v2: Add missing debug flag setup on APUs
> >>>
> >>> Signed-off-by: Jay Cornwall
> >>> Reviewed-by: Felix Kuehling
> >>> Signed-off-by: Jonathan Kim
> >>> ---
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |  4 ++++
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 ++++++++++++++++
> >>>   3 files changed, 22 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> index ae6e701a2656..d4f87f2adada 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind
> >>>   		if (unwind && count == unwind_count)
> >>>   			break;
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, false);
> >>> +
> >>>   		/* GFX off is already disabled by debug activate if not RLC restore supported. */
> >>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
> >>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
> >>> @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process *target)
> >>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
> >>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, true);
> >>> +
> >>>   		r = debug_refresh_runlist(pdd->dev->dqm);
> >>>   		if (r) {
> >>>   			target->runtime_info.runtime_state =
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> index 9690a2adb9ed..82b28588ab72 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process *process);
> >>>   void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
> >>>   				  uint64_t tba_addr,
> >>>   				  uint64_t tma_addr);
> >>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
> >>> +				     bool enabled);
> >>>
> >>>   /* CWSR initialization */
> >>>   int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep);
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> index 59c4c38833b6..d62e0c62df76 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
> >>>   		memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled);
> >>> +
> >>>   		qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
> >>>   		pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
> >>>   			 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
> >>> @@ -1288,6 +1290,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
> >>>   	memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
> >>>
> >>> + 	kfd_process_set_trap_debug_flag(&pdd->qpd,
> >>> + 					pdd->process->debug_trap_enabled);
> >>> +
> >>>   	qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
> >>>   	pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
> >>>   		 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
> >>> @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool supported)
> >>>   	return true;
> >>>   }
> >>>
> >>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
> >>> +				     bool enabled)
> >>> +{
> >>> +	/* If TMA doesn't exist then f
[linux-next:master] BUILD REGRESSION c35bd4e428856ed8c1fae7f7dfa08a9141c153d1
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: c35bd4e428856ed8c1fae7f7dfa08a9141c153d1  Add linux-next specific files for 20221124

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202211090634.ryfkk0ws-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211241736.k6437e7j-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242021.fdzrfna8-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242120.mzzvguln-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211250227.grpjpxyn-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

arch/arm/mach-s3c/devs.c:32:10: fatal error: linux/platform_data/dma-s3c24xx.h: No such file or directory
arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't find starting instruction
arch/powerpc/kernel/optprobes_head.o: warning: objtool: optprobe_template_end(): can't find starting instruction
drivers/clk/clk.c:1022:5: error: redefinition of 'clk_prepare'
drivers/clk/clk.c:1268:6: error: redefinition of 'clk_is_enabled_when_prepared'
drivers/clk/clk.c:941:6: error: redefinition of 'clk_unprepare'
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4968: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:5075:24: warning: implicit conversion from 'enum ' to 'enum dc_status' [-Wenum-conversion]
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: warning: no previous prototype for 'to_dal_irq_source_dcn201' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for function 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for function 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for function 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for function 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c:221:21: warning: variable 'loc' set but not used [-Wunused-but-set-variable]
drivers/iio/addac/ad74115.c:320:27: warning: 'ad74115_dac_slew_rate_hz_tbl' defined but not used [-Wunused-const-variable=]
mm/vmscan.c:4090:30: error: implicit declaration of function 'pmd_young'; did you mean 'pte_young'? [-Werror=implicit-function-declaration]
net/netfilter/nf_conntrack_netlink.c:2674:6: warning: unused variable 'mark' [-Wunused-variable]
vmlinux.o: warning: objtool: __btrfs_map_block+0x1d77: unreachable instruction

Unverified Error/Warning (likely false positive, please contact us if interested):

drivers/usb/fotg210/fotg210-udc.c:632:17: sparse: sparse: restricted __le16 degrades to integer

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-fifo-gf100.c:warning:no-previous-prototype-for-gf100_fifo_nonstall_block
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-fifo-runl.c:warning:no-previous-prototype-for-nvkm_engn_cgrp_get
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-gr-tu102.c:warning:no-previous-prototype-for-tu102_gr_load
|   |-- drivers-gpu-drm-nouveau-nvkm-nvfw-acr.c:warning:no-previous-prototype-for-wpr_generic_header_dump
|   `-- drivers-gpu-drm-nouveau-nvkm-subdev-acr-lsfw.c:warning:variable-loc-set-but-not-used
|-- alpha-randconfig-r005-20221124
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-
Re: [PATCH AUTOSEL 6.0 38/44] drm/amdgpu: Unlock bo_list_mutex after error handling
On Mon, Nov 21, 2022 at 12:07:40PM +0100, Christian König wrote:
> Am 21.11.22 um 10:57 schrieb Michel Dänzer:
> > On 11/19/22 03:11, Sasha Levin wrote:
> > > From: Philip Yang
> > >
> > > [ Upstream commit 64f65135c41a75f933d3bca236417ad8e9eb75de ]
> > >
> > > Get below kernel WARNING backtrace when pressing ctrl-C to kill the
> > > kfdtest application.
> > >
> > > If amdgpu_cs_parser_bos returns an error after taking bo_list_mutex,
> > > the caller amdgpu_cs_ioctl will not unlock bo_list_mutex, which
> > > generates the kernel WARNING. Add an unlock of bo_list_mutex after
> > > the amdgpu_cs_parser_bos error handling to clean up the bo_list
> > > userptr bo.
> > >
> > > WARNING: kfdtest/2930 still has locks held!
> > > 1 lock held by kfdtest/2930:
> > > (&list->bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 [amdgpu]
> > > stack backtrace:
> > >   dump_stack_lvl+0x44/0x57
> > >   get_signal+0x79f/0xd00
> > >   arch_do_signal_or_restart+0x36/0x7b0
> > >   exit_to_user_mode_prepare+0xfd/0x1b0
> > >   syscall_exit_to_user_mode+0x19/0x40
> > >   do_syscall_64+0x40/0x80
> > >
> > > Signed-off-by: Philip Yang
> > > Reviewed-by: Christian König
> > > Signed-off-by: Alex Deucher
> > > Signed-off-by: Sasha Levin
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index b7bae833c804..9d59f83c8faa 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -655,6 +655,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > >  		}
> > >  		mutex_unlock(&p->bo_list->bo_list_mutex);
> > >  	}
> > > +	mutex_unlock(&p->bo_list->bo_list_mutex);
> > >  	return r;
> > >  }
> >
> > This looks doubtful as a backport: there is an identical
> > mutex_unlock call just above the added one.
>
> Oh, yes, good point. This patch doesn't need to be backported at all,
> because it just fixes a problem introduced in the same cycle.

Dropping it, thanks!

-- 
Thanks,
Sasha
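The underlying bug being backported here is a lock leaked on an error return: the function could return with bo_list_mutex still held. A toy model of the fixed shape, with an invented lock counter standing in for the real mutex:

```c
#include <errno.h>

/* Toy lock to mirror the bug: the error path originally returned with
 * bo_list_mutex still held. Balancing lock/unlock on every return path
 * is the whole fix. */
static int lock_depth;

static void lock(void)   { lock_depth++; }
static void unlock(void) { lock_depth--; }

/* Sketch of the fixed shape: unlock before every return, including the
 * error path. @fail selects the error path for demonstration. */
static int parse_bos(int fail)
{
	int r = 0;

	lock();
	if (fail) {
		r = -EINVAL;
		goto out;	/* the error path must still unlock */
	}
	/* ... normal processing under the lock ... */
out:
	unlock();
	return r;
}
```

Lockdep catches the original bug exactly as the quoted WARNING shows: the task exits a syscall while `lock_depth` (in the toy) would still be nonzero.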
Re: [PATCH v2] drm/amdgpu: Fix potential double free and null pointer dereference
Applied.

Regards,
Luben

On 2022-11-22 19:10, Luben Tuikov wrote:
> amdgpu_xgmi_hive_type does provide a release method which frees the
> allocated "hive", so we don't need a kfree() after a kobject_put().
>
> Reviewed-by: Luben Tuikov
>
> Regards,
> Luben
>
> On 2022-11-21 23:28, Liang He wrote:
>> In amdgpu_get_xgmi_hive(), we should not call kfree() after
>> kobject_put() as the PUT will call kfree().
>>
>> In amdgpu_device_ip_init(), we need to check the returned *hive*
>> which can be NULL before we dereference it.
>>
>> Signed-off-by: Liang He
>> ---
>> v1->v2: we need the extra GET to keep *hive* alive; it was
>> my fault to remove the GET in v1.
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 2 --
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index f1e9663b4051..00976e15b698 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2462,6 +2462,11 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>>  	if (!amdgpu_sriov_vf(adev)) {
>>  		struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>>
>> +		if (WARN_ON(!hive)) {
>> +			r = -ENOENT;
>> +			goto init_failed;
>> +		}
>> +
>>  		if (!hive->reset_domain ||
>>  		    !amdgpu_reset_get_reset_domain(hive->reset_domain)) {
>>  			r = -ENOENT;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index 47159e9a0884..4b9e7b050ccd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -386,7 +386,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>  	if (ret) {
>>  		dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi hive\n");
>>  		kobject_put(&hive->kobj);
>> -		kfree(hive);
>>  		hive = NULL;
>>  		goto pro_end;
>>  	}
>> @@ -410,7 +409,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>  			dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
>>  			ret = -ENOMEM;
>>  			kobject_put(&hive->kobj);
>> -			kfree(hive);
>>  			hive = NULL;
>>  			goto pro_end;
>>  		}
>
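The reason the kfree() is redundant, and in fact harmful, is that the kobject's release method already frees the hive on the final put. A small user-space model of a refcounted object with a release callback (not the real kobject API; names invented for the sketch):

```c
#include <stdlib.h>

/* Minimal refcounted object with a release callback, modelling why a
 * kfree() after kobject_put() is a double free: the final put already
 * runs the release method, and the release method frees the object. */
struct obj {
	int refcount;
	void (*release)(struct obj *obj);
};

static int release_calls;

static void obj_release(struct obj *obj)
{
	release_calls++;
	free(obj);	/* the release method owns the final free */
}

static struct obj *obj_create(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->release = obj_release;
	return o;
}

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->release(o);	/* after this, 'o' is gone: no free() allowed */
}
```

A call like `obj_put(o); free(o);` would free the same allocation twice, which is exactly the bug the patch removes from the two error paths in amdgpu_get_xgmi_hive().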
Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
Am 2022-11-24 um 09:58 schrieb Kim, Jonathan:
> [AMD Official Use Only - General]
>
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: November 22, 2022 6:59 PM
>> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
>> Subject: Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
>>
>> On 2022-10-31 12:23, Jonathan Kim wrote:
>>> On GFX9.4.1, the implicit wait count instruction on s_barrier is
>>> disabled by default in the driver during normal operation for
>>> performance requirements.
>>>
>>> There is a hardware bug in GFX9.4.1 where, if the implicit wait count
>>> instruction after an s_barrier instruction is disabled, any wave that
>>> hits an exception may step over the s_barrier when returning from the
>>> trap handler, with the barrier logic having no ability to be aware of
>>> this, thereby causing other waves to wait at the barrier indefinitely,
>>> resulting in a shader hang. This bug has been corrected for GFX9.4.2
>>> and onward.
>>>
>>> Since the debugger subscribes to hardware exceptions, in order to
>>> avoid this bug, the debugger must enable implicit wait count on
>>> s_barrier for a debug session and disable it on detach.
>>>
>>> In order to change this setting in the device global SQ_CONFIG
>>> register, the GFX pipeline must be idle. GFX9.4.1 as a compute device
>>> will either dispatch work through the compute ring buffers used for
>>> image post processing or through the hardware scheduler by the KFD.
>>>
>>> Have the KGD suspend and drain the compute ring buffer, then suspend
>>> the hardware scheduler and block any future KFD process job requests
>>> before changing the implicit wait count setting. Once set, resume all
>>> work.
>>>
>>> Signed-off-by: Jonathan Kim
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   3 +
>>>  .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   | 105 +-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c         |   4 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c      |   2 +-
>>>  4 files changed, 110 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 0e6ddf05c23c..9f2499f52d2c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1034,6 +1034,9 @@ struct amdgpu_device {
>>>  	struct pci_saved_state          *pci_state;
>>>  	pci_channel_state_t             pci_channel_state;
>>>
>>> +	/* Track auto wait count on s_barrier settings */
>>> +	bool                            barrier_has_auto_waitcnt;
>>> +
>>>  	struct amdgpu_reset_control     *reset_cntl;
>>>  	uint32_t                        ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE];
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> index 4191af5a3f13..13f02a0aa828 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> @@ -26,6 +26,7 @@
>>>  #include "amdgpu.h"
>>>  #include "amdgpu_amdkfd.h"
>>>  #include "amdgpu_amdkfd_arcturus.h"
>>> +#include "amdgpu_reset.h"
>>>  #include "sdma0/sdma0_4_2_2_offset.h"
>>>  #include "sdma0/sdma0_4_2_2_sh_mask.h"
>>>  #include "sdma1/sdma1_4_2_2_offset.h"
>>> @@ -48,6 +49,8 @@
>>>  #include "amdgpu_amdkfd_gfx_v9.h"
>>>  #include "gfxhub_v1_0.h"
>>>  #include "mmhub_v9_4.h"
>>> +#include "gc/gc_9_0_offset.h"
>>> +#include "gc/gc_9_0_sh_mask.h"
>>>
>>>  #define HQD_N_REGS 56
>>>  #define DUMP_REG(addr) do { \
>>> @@ -276,6 +279,104 @@ int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device *adev, void *mqd,
>>>  	return 0;
>>>  }
>>>
>>> +/*
>>> + * Helper used to suspend/resume gfx pipe for image post process work to set
>>> + * barrier behaviour.
>>> + */
>>> +static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool suspend)
>>> +{
>>> +	int i, r = 0;
>>> +
>>> +	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>>> +		struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
>>> +
>>> +		if (!(ring && ring->sched.thread))
>>> +			continue;
>>> +
>>> +		/* stop scheduler and drain ring. */
>>> +		if (suspend) {
>>> +			drm_sched_stop(&ring->sched, NULL);
>>> +			r = amdgpu_fence_wait_empty(ring);
>>> +			if (r)
>>> +				goto out;
>>> +		} else {
>>> +			drm_sched_start(&ring->sched, false);
>>> +		}
>>> +	}
>>> +
>>> +out:
>>> +	/* return on resume or failure to drain rings. */
>>> +	if (!suspend || r)
>>> +		return r;
>>> +
>>> +	return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP);
>>> +}
>>> +
>>> +static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool enable_waitcnt)
>>> +{
>>> +	uint32_t data;
>>> +
>>> +	WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt);
>>> +
>>> +	if (!down_read_trylock(&adev->reset_domain->sem))
>>> +		return;
>>> +
>>> +	amdgpu_amdkfd_suspend(adev, false);
>>> +
>>> +	if (suspend_resume_compute_scheduler(adev, true))
>>> +		goto out;
>>> +
>>> +	data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG));
>>> +	data = REG_SET_FIELD(data, SQ_CONFIG, DISABLE_BARRIER_WAITCNT,
>>> +			     enable_waitcnt ? 0 : 1);
>>> +	WREG32(
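The SQ_CONFIG update in the patch is a standard read-modify-write of a single register field. A self-contained model of the REG_SET_FIELD pattern, with an invented mask/shift pair and a plain variable standing in for the MMIO register (the real offsets come from the generated gc_9_0 headers):

```c
#include <stdint.h>

/* Invented field position for the illustration; the real
 * DISABLE_BARRIER_WAITCNT mask/shift live in gc_9_0_sh_mask.h. */
#define WAITCNT_MASK  0x00000100u
#define WAITCNT_SHIFT 8

static uint32_t fake_sq_config;	/* stands in for the MMIO register */

static uint32_t rreg32(void)        { return fake_sq_config; }
static void     wreg32(uint32_t v)  { fake_sq_config = v; }

/* Read-modify-write one field, as REG_SET_FIELD + WREG32 do: enabling
 * the implicit waitcnt means clearing the DISABLE bit, and vice versa. */
static void set_barrier_waitcnt(int enable_waitcnt)
{
	uint32_t data = rreg32();

	data &= ~WAITCNT_MASK;	/* clear the field */
	data |= ((enable_waitcnt ? 0u : 1u) << WAITCNT_SHIFT) & WAITCNT_MASK;
	wreg32(data);		/* write the merged value back */
}
```

The point of the surrounding suspend/drain machinery in the patch is that this read-modify-write is only safe while nothing else is dispatching work that depends on the old setting.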
Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
Am 2022-11-24 um 09:51 schrieb Kim, Jonathan:
> [Public]
>
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: November 22, 2022 7:45 PM
>> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
>> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
>>
>> On 2022-10-31 12:23, Jonathan Kim wrote:
>>> From: Jay Cornwall
>>>
>>> Trap handler behavior will differ when a debugger is attached.
>>>
>>> Make the debug trap flag available in the trap handler TMA.
>>> Update it when the debug trap ioctl is invoked.
>>>
>>> v3: Rebase for upstream
>>>
>>> v2: Add missing debug flag setup on APUs
>>>
>>> Signed-off-by: Jay Cornwall
>>> Reviewed-by: Felix Kuehling
>>> Signed-off-by: Jonathan Kim
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |  4 ++++
>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 ++++++++++++++++
>>>   3 files changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> index ae6e701a2656..d4f87f2adada 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind
>>>   		if (unwind && count == unwind_count)
>>>   			break;
>>>
>>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, false);
>>> +
>>>   		/* GFX off is already disabled by debug activate if not RLC restore supported. */
>>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
>>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
>>> @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process *target)
>>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
>>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
>>>
>>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, true);
>>> +
>>>   		r = debug_refresh_runlist(pdd->dev->dqm);
>>>   		if (r) {
>>>   			target->runtime_info.runtime_state =
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index 9690a2adb9ed..82b28588ab72 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process *process);
>>>   void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
>>>   				  uint64_t tba_addr,
>>>   				  uint64_t tma_addr);
>>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
>>> +				     bool enabled);
>>>
>>>   /* CWSR initialization */
>>>   int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index 59c4c38833b6..d62e0c62df76 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
>>>   		memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
>>>
>>> + 		kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled);
>>> +
>>>   		qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
>>>   		pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
>>>   			 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
>>> @@ -1288,6 +1290,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
>>>   	memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
>>>
>>> + 	kfd_process_set_trap_debug_flag(&pdd->qpd,
>>> + 					pdd->process->debug_trap_enabled);
>>> +
>>>   	qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
>>>   	pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
>>>   		 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
>>> @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool supported)
>>>   	return true;
>>>   }
>>>
>>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
>>> +				     bool enabled)
>>> +{
>>> +	/* If TMA doesn't exist then flag will be set during allocation. */
>>
>> I would expect a change to the TMA allocation function, but that isn't
>> in this patch?
>
> The TMA is allocated under kfd_process_init_cwsr_* and CWSR enabled is a
> pre-condition for the 1st level trap handler loading. The lack of
> context in the patch for those functions may be hiding that fact.
> Is the placement of this comment misleading? Maybe it should go in
> kfd_dbg_trap_activate when kfd_process_set_trap_debug_flag is called?
> Or should it just be removed, since the combined calls within
> initialization of CWSR + debug enable seem complete for enablement?

I think the comment is fine. I was sort of expecting to see the
corresponding change in the TMA allocation in the same patch. So my
question is just lack of context. If that change in the TMA allocation
got squashed into another patch in the series, maybe it would make se
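The mechanism under discussion is publishing a flag word at a known offset of the trap memory area (TMA) so the trap handler can read it. A hedged sketch, with an invented offset and a plain buffer standing in for the real mapped TMA:

```c
#include <stdint.h>
#include <string.h>

/* Invented layout for illustration; the real TMA offsets are defined by
 * the CWSR trap handler ABI, not by this sketch. */
#define TMA_DEBUG_FLAG_OFFSET 0x10

/* Publish the debug-trap flag into the TMA. If the TMA hasn't been
 * allocated yet, do nothing: as the patch comment says, the flag is
 * then set during allocation instead. */
static void set_trap_debug_flag(uint8_t *tma, int enabled)
{
	uint32_t flag = enabled ? 1u : 0u;

	if (!tma)
		return;
	memcpy(tma + TMA_DEBUG_FLAG_OFFSET, &flag, sizeof(flag));
}
```

This is why the patch needs both call sites: one in the debug attach/detach paths for processes whose TMA already exists, and one in the CWSR init paths for processes whose TMA is allocated after the debugger attached.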
RE: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
[AMD Official Use Only - General] > -Original Message- > From: Kuehling, Felix > Sent: November 22, 2022 6:59 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode > enable and disable calls > > > On 2022-10-31 12:23, Jonathan Kim wrote: > > On GFX9.4.1, the implicit wait count instruction on s_barrier is > > disabled by default in the driver during normal operation for > > performance requirements. > > > > There is a hardware bug in GFX9.4.1 where if the implicit wait count > > instruction after an s_barrier instruction is disabled, any wave that > > hits an exception may step over the s_barrier when returning from the > > trap handler with the barrier logic having no ability to be > > aware of this, thereby causing other waves to wait at the barrier > > indefinitely resulting in a shader hang. This bug has been corrected > > for GFX9.4.2 and onward. > > > > Since the debugger subscribes to hardware exceptions, in order to avoid > > this bug, the debugger must enable implicit wait count on s_barrier > > for a debug session and disable it on detach. > > > > In order to change this setting in the in the device global SQ_CONFIG > > register, the GFX pipeline must be idle. GFX9.4.1 as a compute device > > will either dispatch work through the compute ring buffers used for > > image post processing or through the hardware scheduler by the KFD. > > > > Have the KGD suspend and drain the compute ring buffer, then suspend > the > > hardware scheduler and block any future KFD process job requests before > > changing the implicit wait count setting. Once set, resume all work. 
> > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 + > > .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 105 > +- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +- > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +- > > 4 files changed, 110 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > index 0e6ddf05c23c..9f2499f52d2c 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > @@ -1034,6 +1034,9 @@ struct amdgpu_device { > > struct pci_saved_state *pci_state; > > pci_channel_state_t pci_channel_state; > > > > + /* Track auto wait count on s_barrier settings */ > > + boolbarrier_has_auto_waitcnt; > > + > > struct amdgpu_reset_control *reset_cntl; > > uint32_t > ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE]; > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > index 4191af5a3f13..13f02a0aa828 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > @@ -26,6 +26,7 @@ > > #include "amdgpu.h" > > #include "amdgpu_amdkfd.h" > > #include "amdgpu_amdkfd_arcturus.h" > > +#include "amdgpu_reset.h" > > #include "sdma0/sdma0_4_2_2_offset.h" > > #include "sdma0/sdma0_4_2_2_sh_mask.h" > > #include "sdma1/sdma1_4_2_2_offset.h" > > @@ -48,6 +49,8 @@ > > #include "amdgpu_amdkfd_gfx_v9.h" > > #include "gfxhub_v1_0.h" > > #include "mmhub_v9_4.h" > > +#include "gc/gc_9_0_offset.h" > > +#include "gc/gc_9_0_sh_mask.h" > > > > #define HQD_N_REGS 56 > > #define DUMP_REG(addr) do { \ > > @@ -276,6 +279,104 @@ int kgd_arcturus_hqd_sdma_destroy(struct > amdgpu_device *adev, void *mqd, > > return 0; > > } > > > > +/* > > + * Helper used to suspend/resume gfx pipe for image post process work > to set > > + * barrier behaviour. 
> > + */ > > +static int suspend_resume_compute_scheduler(struct amdgpu_device > *adev, bool suspend) > > +{ > > + int i, r = 0; > > + > > + for (i = 0; i < adev->gfx.num_compute_rings; i++) { > > + struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; > > + > > + if (!(ring && ring->sched.thread)) > > + continue; > > + > > + /* stop scheduler and drain ring. */ > > + if (suspend) { > > + drm_sched_stop(&ring->sched, NULL); > > + r = amdgpu_fence_wait_empty(ring); > > + if (r) > > + goto out; > > + } else { > > + drm_sched_start(&ring->sched, false); > > + } > > + } > > + > > +out: > > + /* return on resume or failure to drain rings. */ > > + if (!suspend || r) > > + return r; > > + > > + return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP); > > +} > > + > > +static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool > enable_waitcnt) > > +{ > > + uint32_t data; > > + > > + WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt); > > + > > + if (!down_rea
Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
On 11/24/2022 1:17 PM, Christian König wrote: Am 24.11.22 um 08:45 schrieb Wang, Yang(Kevin): [AMD Official Use Only - General] -Original Message- From: Koenig, Christian Sent: Thursday, November 24, 2022 3:25 PM To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org; Paneer Selvam, Arunpravin Cc: Zhang, Hawking ; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm Am 24.11.22 um 06:49 schrieb Yang Wang: add vram usage information into dri debugfs amdgpu_vram_mm node. Background: since the amdgpu driver introduced the drm buddy allocator, it is difficult for the kernel driver (and developers) to track vram usage information. Field: 0x-0x: vram usage range. type: kernel, device, sg usage: normal, vm, user domain: C-CPU, G-GTT, V-VRAM, P-PRIV @x: the address of "amdgpu_bo" object in kernel space. 4096: vram range size. Example: 0x0003fea68000-0x0003fea68fff: (type:kernel usage:vm domain:--V- --V-) @1d33dfee 4096 bytes 0x0003fea69000-0x0003fea69fff: (type:kernel usage:vm domain:--V- --V-) @a79155b5 4096 bytes 0x0003fea6b000-0x0003fea6bfff: (type:kernel usage:vm domain:--V- --V-) @38ad633b 4096 bytes 0x0003fea6c000-0x0003fea6cfff: (type:device usage:user domain:--V- --V-) @e302f90b 4096 bytes 0x0003fea6d000-0x0003fea6dfff: (type:device usage:user domain:--V- --V-) @e664c172 4096 bytes 0x0003fea6e000-0x0003fea6efff: (type:kernel usage:vm domain:--V- --V-) @4528cb2f 4096 bytes 0x0003fea6f000-0x0003fea6: (type:kernel usage:vm domain:--V- --V-) @a446bdbf 4096 bytes 0x0003fea7-0x0003fea7: (type:device usage:user domain:--V- --V-) @78fae42f 65536 bytes 0x0003fead8000-0x0003feadbfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 16384 bytes 0x0003feadc000-0x0003feadcfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 4096 bytes 0x0003feadd000-0x0003feaddfff: (type:kernel usage:normal domain:--V- --V-) @b9706fc1 4096 bytes 0x0003feade000-0x0003feadefff: (type:kernel usage:vm domain:--V- --V-) @71a25571 4096 bytes Note: 
Although some vram ranges could be merged in the example above, this reflects the actual distribution of the drm buddy allocator. Well completely NAK. This is way too much complexity for simple debugging. Question is what are your requirements here? E.g. what information do you want and why doesn't the buddy allocator already expose this? Regards, Christian. [Kevin]: For KMD debug. The DRM buddy interface doesn't provide an interface to query which ranges of ram (resource) are used. It is not easy to debug on the KMD side if the driver fails to create a BO at a specific location. And from the view of KMD, the VRAM at some locations has special purposes. With this patch, we can know which ranges of vram are actually used. Well that's not a good reason to add this complexity. Debugging doesn't justify that. Please work with Arun to add the necessary information to the buddy allocator interface. Regards, Christian. Hi Kevin, I will check and list down some of the necessary information that we can add to the buddy allocator interface. Regards, Arun. 
Best Regards, Kevin Signed-off-by: Yang Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 1 + 4 files changed, 136 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 90eb07106609..117c754409b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -53,7 +53,7 @@ * */ -static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); @@ -66,7 +66,7 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) kvfree(bo); } -static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); struct amdgpu_bo_user *ubo; @@ -76,7 +76,7 @@ static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) amdgpu_bo_destroy(tbo); } -static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev); struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h index 147b79c10c
RE: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
[Public] > -Original Message- > From: Kuehling, Felix > Sent: November 22, 2022 7:45 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to > TMA > > > On 2022-10-31 12:23, Jonathan Kim wrote: > > From: Jay Cornwall > > > > Trap handler behavior will differ when a debugger is attached. > > > > Make the debug trap flag available in the trap handler TMA. > > Update it when the debug trap ioctl is invoked. > > > > v3: Rebase for upstream > > > > v2: > > Add missing debug flag setup on APUs > > > > Signed-off-by: Jay Cornwall > > Reviewed-by: Felix Kuehling > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 4 > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++ > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 > > 3 files changed, 22 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > index ae6e701a2656..d4f87f2adada 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process > *target, bool unwind, int unwind > > if (unwind && count == unwind_count) > > break; > > > > + kfd_process_set_trap_debug_flag(&pdd->qpd, false); > > + > > /* GFX off is already disabled by debug activate if not RLC > restore supported. 
*/ > > if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) > > amdgpu_gfx_off_ctrl(pdd->dev->adev, false); > > @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process > *target) > > if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) > > amdgpu_gfx_off_ctrl(pdd->dev->adev, true); > > > > + kfd_process_set_trap_debug_flag(&pdd->qpd, true); > > + > > r = debug_refresh_runlist(pdd->dev->dqm); > > if (r) { > > target->runtime_info.runtime_state = > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > index 9690a2adb9ed..82b28588ab72 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process > *process); > > void kfd_process_set_trap_handler(struct qcm_process_device *qpd, > > uint64_t tba_addr, > > uint64_t tma_addr); > > +void kfd_process_set_trap_debug_flag(struct qcm_process_device > *qpd, > > +bool enabled); > > > > /* CWSR initialization */ > > int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file > *filep); > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > index 59c4c38833b6..d62e0c62df76 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct > kfd_process *p, struct file *filep) > > > > memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev- > >cwsr_isa_size); > > > > + kfd_process_set_trap_debug_flag(qpd, p- > >debug_trap_enabled); > > + > > qpd->tma_addr = qpd->tba_addr + > KFD_CWSR_TMA_OFFSET; > > pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for > pqm.\n", > > qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); > > @@ -1288,6 +1290,9 @@ static int > kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd) > > > > memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size); > > > > + 
kfd_process_set_trap_debug_flag(&pdd->qpd, > > + pdd->process- > >debug_trap_enabled); > > + > > qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET; > > pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n", > > qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); > > @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct > kfd_process *p, bool supported) > > return true; > > } > > > > +void kfd_process_set_trap_debug_flag(struct qcm_process_device > *qpd, > > +bool enabled) > > +{ > > + /* If TMA doesn't exist then flag will be set during allocation. */ > > I would expect a change to the TMA allocation function, but that isn't > in this patch? The TMA is allocated under kfd_process_init_cwsr_* and CWSR enabled is a pre-condition for the 1st level trap handler loading. The lack of context in the patch for those functions may be hiding that fact. Is the placement of this comment misleading? Maybe it should go in kfd_dbg_trap_activate when kfd_process_set_trap_debug_flag is called? Or should it just be removed since the combined calls within initialization of CWSR + debug enable seem complete for en
Re: [PATCH] drm/amdgpu: add mb for si
On 11/24/2022 4:11 PM, Lazar, Lijo wrote: On 11/24/2022 3:34 PM, Quan, Evan wrote: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: The issue is happening in late_init() which eventually does ret = si_thermal_enable_alert(adev, false); Just before this, si_thermal_start_thermal_controller is called in hw_init and that enables thermal alert. Maybe the issue is with enable/disable of thermal alerts in quick succession. Adding a delay inside si_thermal_start_thermal_controller might help. On a second look, temperature range is already set as part of si_thermal_start_thermal_controller in hw_init https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780 There is no need to set it again here - https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635 I think it is safe to remove the call from late_init altogether. Alex/Evan? Thanks, Lijo Thanks, Lijo amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. 
Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); + mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
Re: [PATCH] drm/amdgpu: add mb for si
On 11/24/2022 3:34 PM, Quan, Evan wrote: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: The issue is happening in late_init() which eventually does ret = si_thermal_enable_alert(adev, false); Just before this, si_thermal_start_thermal_controller is called in hw_init and that enables thermal alert. Maybe the issue is with enable/disable of thermal alerts in quick succession. Adding a delay inside si_thermal_start_thermal_controller might help. Thanks, Lijo amdgpu_device_ip_late_init failed [7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. 
Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); +mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
Re: [PATCH 1/3] drm/ttm: remove ttm_bo_(un)lock_delayed_workqueue
On Thu, 24 Nov 2022 at 10:03, Christian König wrote: > > Those functions never worked correctly since it is still perfectly > possible that a buffer object is released and the background worker > restarted even after calling them. > > Signed-off-by: Christian König I know you usually do, but just a friendly reminder to Cc: intel-gfx on the next revision or before merging, just so our CI can give the series a quick test. Thanks.
RE: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
[AMD Official Use Only - General] Reviewed-by: Evan Quan > -Original Message- > From: amd-gfx On Behalf Of Yu > Songping > Sent: Thursday, November 24, 2022 9:53 AM > To: airl...@gmail.com; dan...@ffwll.ch > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; linux- > ker...@vger.kernel.org > Subject: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition > > The logical operator '&&' will make > smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when > smu->ppt_funcs is NULL. > > Signed-off-by: Yu Songping > --- > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > index b880f4d7d67e..1cb728b0b7ee 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context > *smu, > > int smu_set_gfx_power_up_by_imu(struct smu_context *smu) { > - if (!smu->ppt_funcs && !smu->ppt_funcs- > >set_gfx_power_up_by_imu) > + if (!smu->ppt_funcs || !smu->ppt_funcs- > >set_gfx_power_up_by_imu) > return -EOPNOTSUPP; > > return smu->ppt_funcs->set_gfx_power_up_by_imu(smu); > -- > 2.17.1 <>
Re: [PATCH] drm/amdgpu: add mb for si
That's not a patch but some binary file? Christian. Am 24.11.22 um 11:04 schrieb Quan, Evan: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); + mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. 
But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General] Could the attached patch help? Evan > -Original Message- > From: amd-gfx On Behalf Of ??? > Sent: Friday, November 18, 2022 5:25 PM > To: Michel Dänzer ; Koenig, Christian > ; Deucher, Alexander > > Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; > linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: add mb for si > > > 在 2022/11/18 17:18, Michel Dänzer 写道: > > On 11/18/22 09:01, Christian König wrote: > >> Am 18.11.22 um 08:48 schrieb Zhenneng Li: > >>> During reboot test on arm64 platform, it may fail on boot, so add > >>> this mb in smc. > >>> > >>> The error messages are as follows: > >>> [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init > >>> [amdgpu]] *ERROR* > >>> late_init of IP block failed -22 [ > >>> 7.006919][ 7] [ T295] amdgpu :04:00.0: > >>> amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu > >>> :04:00.0: Fatal error during GPU init > >> Memory barriers are not supposed to be sprinkled around like this, you > need to give a detailed explanation why this is necessary. > >> > >> Regards, > >> Christian. > >>> Signed-off-by: Zhenneng Li > >>> --- > >>> drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ > >>> 1 file changed, 2 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> index 8f994ffa9cd1..c7656f22278d 100644 > >>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct > >>> amdgpu_device *adev) > >>> u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); > >>> u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); > >>> + mb(); > >>> + > >>> if (!(rst & RST_REG) && !(clk & CK_DISABLE)) > >>> return true; > > In particular, it makes no sense in this specific place, since it cannot > > directly > affect the values of rst & clk. > > I think so too. 
> > But when I run reboot tests using nine desktop machines, this error may be > reported on one or two machines after hundreds or thousands of > reboots. At first I used msleep() instead of mb(); both > methods work, but I don't know what the root cause is. > > I used this method on another vendor's oland card, where this error message > was reported again. > > What could be the root reason? > > Test environment: > > graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 > > driver: amdgpu > > os: ubuntu 2004 > > platform: arm64 > > kernel: 5.4.18 > > > <>
[PATCH 2/3] drm/ttm: use per BO cleanup workers
Instead of a single worker going over the list of delete BOs in regular intervals use a per BO worker which blocks for the resv object and locking of the BO. This not only simplifies the handling massively, but also results in much better response time when cleaning up buffers. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/i915/i915_gem.c| 2 +- drivers/gpu/drm/i915/intel_region_ttm.c| 2 +- drivers/gpu/drm/ttm/ttm_bo.c | 112 - drivers/gpu/drm/ttm/ttm_bo_util.c | 1 - drivers/gpu/drm/ttm/ttm_device.c | 24 ++--- include/drm/ttm/ttm_bo_api.h | 18 +--- include/drm/ttm/ttm_device.h | 7 +- 8 files changed, 57 insertions(+), 111 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ff2ae0be2c28..05f458d67eb8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3966,7 +3966,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_fence_driver_hw_fini(adev); if (adev->mman.initialized) - flush_delayed_work(&adev->mman.bdev.wq); + drain_workqueue(adev->mman.bdev.wq); if (adev->pm_sysfs_en) amdgpu_pm_sysfs_fini(adev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 299f94a9fb87..413c9dbe5be1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1099,7 +1099,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915) { while (atomic_read(&i915->mm.free_count)) { flush_work(&i915->mm.free_work); - flush_delayed_work(&i915->bdev.wq); + drain_workqueue(i915->bdev.wq); rcu_barrier(); } } diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c index cf89d0c2a2d9..657bbc16a48a 100644 --- a/drivers/gpu/drm/i915/intel_region_ttm.c +++ b/drivers/gpu/drm/i915/intel_region_ttm.c @@ -132,7 +132,7 @@ int intel_region_ttm_fini(struct intel_memory_region *mem) break; msleep(20); - 
flush_delayed_work(&mem->i915->bdev.wq); + drain_workqueue(mem->i915->bdev.wq); } /* If we leaked objects, Don't free the region causing use after free */ diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index b77262a623e0..4749b65bedc4 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -280,14 +280,13 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, ret = 0; } - if (ret || unlikely(list_empty(&bo->ddestroy))) { + if (ret) { if (unlock_resv) dma_resv_unlock(bo->base.resv); spin_unlock(&bo->bdev->lru_lock); return ret; } - list_del_init(&bo->ddestroy); spin_unlock(&bo->bdev->lru_lock); ttm_bo_cleanup_memtype_use(bo); @@ -300,47 +299,21 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, } /* - * Traverse the delayed list, and call ttm_bo_cleanup_refs on all - * encountered buffers. + * Block for the dma_resv object to become idle, lock the buffer and clean up + * the resource and tt object. */ -bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all) +static void ttm_bo_delayed_delete(struct work_struct *work) { - struct list_head removed; - bool empty; - - INIT_LIST_HEAD(&removed); - - spin_lock(&bdev->lru_lock); - while (!list_empty(&bdev->ddestroy)) { - struct ttm_buffer_object *bo; - - bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object, - ddestroy); - list_move_tail(&bo->ddestroy, &removed); - if (!ttm_bo_get_unless_zero(bo)) - continue; - - if (remove_all || bo->base.resv != &bo->base._resv) { - spin_unlock(&bdev->lru_lock); - dma_resv_lock(bo->base.resv, NULL); - - spin_lock(&bdev->lru_lock); - ttm_bo_cleanup_refs(bo, false, !remove_all, true); - - } else if (dma_resv_trylock(bo->base.resv)) { - ttm_bo_cleanup_refs(bo, false, !remove_all, true); - } else { - spin_unlock(&bdev->lru_lock); - } + struct ttm_buffer_object *bo; - ttm_bo_put(bo); - spin_lock(&bdev->lru_lock); - } - list_splice_tail(&removed, &bdev->ddestroy); - empty = 
list_empty(&bdev->ddestroy); - spin_unlock(&bdev->lru_lock); + bo = container_of(work, typeof(*bo), delayed_delete); -
[PATCH 3/3] drm/amdgpu: generally allow over-commit during BO allocation
We already fallback to a dummy BO with no backing store when we allocate GDS,GWS and OA resources and to GTT when we allocate VRAM. Drop all those workarounds and generalize this for GTT as well. This fixes ENOMEM issues with runaway applications which try to allocate/free GTT in a loop and are otherwise only limited by the CPU speed. The CS will wait for the cleanup of freed up BOs to satisfy the various domain specific limits and so effectively throttle those buggy applications down to a sane allocation behavior again. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 16 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +- 2 files changed, 4 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 8ef31d687ef3..286509e5e17e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -112,7 +112,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, bp.resv = resv; bp.preferred_domain = initial_domain; bp.flags = flags; - bp.domain = initial_domain; + bp.domain = initial_domain | AMDGPU_GEM_DOMAIN_CPU; bp.bo_ptr_size = sizeof(struct amdgpu_bo); r = amdgpu_bo_create_user(adev, &bp, &ubo); @@ -331,20 +331,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, } initial_domain = (u32)(0x & args->in.domains); -retry: r = amdgpu_gem_object_create(adev, size, args->in.alignment, -initial_domain, -flags, ttm_bo_type_device, resv, &gobj); +initial_domain, flags, ttm_bo_type_device, +resv, &gobj); if (r && r != -ERESTARTSYS) { - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { - flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; - goto retry; - } - - if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { - initial_domain |= AMDGPU_GEM_DOMAIN_GTT; - goto retry; - } DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", size, initial_domain, args->in.alignment, r); } diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 974e85d8b6cc..919bbea2e3ac 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -581,11 +581,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev, bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; bo->tbo.bdev = &adev->mman.bdev; - if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA | - AMDGPU_GEM_DOMAIN_GDS)) - amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU); - else - amdgpu_bo_placement_from_domain(bo, bp->domain); + amdgpu_bo_placement_from_domain(bo, bp->domain); if (bp->type == ttm_bo_type_kernel) bo->tbo.priority = 1; -- 2.34.1
[PATCH 1/3] drm/ttm: remove ttm_bo_(un)lock_delayed_workqueue
Those functions never worked correctly since it is still perfectly possible that a buffer object is released and the background worker restarted even after calling them. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +--- drivers/gpu/drm/radeon/radeon_device.c | 5 - drivers/gpu/drm/radeon/radeon_pm.c | 4 +--- drivers/gpu/drm/ttm/ttm_bo.c| 14 -- include/drm/ttm/ttm_bo_api.h| 16 6 files changed, 3 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index de61a85c4b02..8c57a5b7ecba 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1717,7 +1717,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring) static int amdgpu_debugfs_ib_preempt(void *data, u64 val) { - int r, resched, length; + int r, length; struct amdgpu_ring *ring; struct dma_fence **fences = NULL; struct amdgpu_device *adev = (struct amdgpu_device *)data; @@ -1747,8 +1747,6 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) /* stop the scheduler */ kthread_park(ring->sched.thread); - resched = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev); - /* preempt the IB */ r = amdgpu_ring_preempt_ib(ring); if (r) { @@ -1785,8 +1783,6 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) up_read(&adev->reset_domain->sem); - ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, resched); - pro_end: kfree(fences); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index cc65df9f2419..ff2ae0be2c28 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3965,10 +3965,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) } amdgpu_fence_driver_hw_fini(adev); - if (adev->mman.initialized) { + if (adev->mman.initialized) 
flush_delayed_work(&adev->mman.bdev.wq); - ttm_bo_lock_delayed_workqueue(&adev->mman.bdev); - } if (adev->pm_sysfs_en) amdgpu_pm_sysfs_fini(adev); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index a556b6be1137..ea10306809cf 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1768,7 +1768,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) bool saved = false; int i, r; - int resched; down_write(&rdev->exclusive_lock); @@ -1780,8 +1779,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) atomic_inc(&rdev->gpu_reset_counter); radeon_save_bios_scratch_regs(rdev); - /* block TTM */ - resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev); radeon_suspend(rdev); radeon_hpd_fini(rdev); @@ -1840,8 +1837,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) /* reset hpd state */ radeon_hpd_init(rdev); - ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched); - rdev->in_reset = true; rdev->needs_reset = false; diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c index 04c693ca419a..cbc554928bcc 100644 --- a/drivers/gpu/drm/radeon/radeon_pm.c +++ b/drivers/gpu/drm/radeon/radeon_pm.c @@ -1853,11 +1853,10 @@ static bool radeon_pm_debug_check_in_vbl(struct radeon_device *rdev, bool finish static void radeon_dynpm_idle_work_handler(struct work_struct *work) { struct radeon_device *rdev; - int resched; + rdev = container_of(work, struct radeon_device, pm.dynpm_idle_work.work); - resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev); mutex_lock(&rdev->pm.mutex); if (rdev->pm.dynpm_state == DYNPM_STATE_ACTIVE) { int not_processed = 0; @@ -1908,7 +1907,6 @@ static void radeon_dynpm_idle_work_handler(struct work_struct *work) msecs_to_jiffies(RADEON_IDLE_LOOP_MS)); } mutex_unlock(&rdev->pm.mutex); - ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched); } /* diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 
c3f4b33136e5..b77262a623e0 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -418,20 +418,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo) } EXPORT_SYMBOL(ttm_bo_put); -int ttm_bo_lock_delayed_workqueue(struct ttm_device *bdev) -{ - return cancel_delayed_work_sync(&bdev->wq); -} -EXPORT_SYMBOL(ttm_bo_lock_delayed_workqueue); - -void ttm_bo_unlock_delayed_workqueue(struct ttm_device *bdev, int resc
[PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
The logical operator '&&' will make smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when smu->ppt_funcs is NULL. Signed-off-by: Yu Songping --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index b880f4d7d67e..1cb728b0b7ee 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context *smu, int smu_set_gfx_power_up_by_imu(struct smu_context *smu) { - if (!smu->ppt_funcs && !smu->ppt_funcs->set_gfx_power_up_by_imu) + if (!smu->ppt_funcs || !smu->ppt_funcs->set_gfx_power_up_by_imu) return -EOPNOTSUPP; return smu->ppt_funcs->set_gfx_power_up_by_imu(smu); -- 2.17.1