RE: [PATCH] drm/amd/pm: keep the BACO feature enabled for suspend
[Public]

Reviewed-by: Guchun Chen

Regards,
Guchun

-----Original Message-----
From: Quan, Evan
Sent: Thursday, December 30, 2021 6:01 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Chen, Guchun; Quan, Evan
Subject: [PATCH] drm/amd/pm: keep the BACO feature enabled for suspend

To pair with the workaround which always reset the ASIC in suspend.
Otherwise, the reset which relies on BACO will fail.

Fixes: 50583690930d ("drm/amdgpu: always reset the asic in suspend (v2)")

Signed-off-by: Evan Quan
Change-Id: I39ed072af16e34ef1e1c16b50ace6d46fbc388b9
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 4d867778a65c..7628be2f2301 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1308,10 +1308,16 @@ static int smu_disable_dpms(struct smu_context *smu)
 {
 	struct amdgpu_device *adev = smu->adev;
 	int ret = 0;
+	/*
+	 * TODO: (adev->in_suspend && !adev->in_s0ix) is added to pair
+	 * the workaround which always reset the asic in suspend.
+	 * It's likely that workaround will be dropped in the future.
+	 * Then the change here should be dropped together.
+	 */
 	bool use_baco = !smu->is_apu &&
 		((amdgpu_in_reset(adev) &&
 		  (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)) ||
-		 ((adev->in_runpm || adev->in_s4) && amdgpu_asic_supports_baco(adev)));
+		 ((adev->in_runpm || adev->in_s4 || (adev->in_suspend &&
+		   !adev->in_s0ix)) && amdgpu_asic_supports_baco(adev)));

 	/*
 	 * For custom pptable uploading, skip the DPM features
--
2.29.0
Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV
Sure, I guess I can drop this patch then.

Andrey

On 2021-12-24 4:57 a.m., JingWen Chen wrote:

I do agree with shaoyun: if the host finds the GPU engine hang first and does the FLR, the guest-side thread may not know this and may still try to access HW (e.g. kfd uses amdgpu_in_reset and reset_sem a lot to identify the reset status). This may lead to very bad results.

On 2021/12/24 4:58 PM, Deng, Emily wrote:

These patches look good to me. JingWen will pull these patches, run some basic TDR tests in an SR-IOV environment, and give feedback.

Best wishes
Emily Deng

-----Original Message-----
From: Liu, Monk
Sent: Thursday, December 23, 2021 6:14 PM
To: Koenig, Christian; Grodzovsky, Andrey; dri-de...@lists.freedesktop.org; amd-g...@lists.freedesktop.org; Chen, Horace; Chen, JingWen; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

[AMD Official Use Only]

@Chen, Horace @Chen, JingWen @Deng, Emily

Please review Andrey's patch.

Thanks
---
Monk Liu | Cloud GPU & Virtualization Solution | AMD
---
we are hiring software manager for CVS core team
---

-----Original Message-----
From: Koenig, Christian
Sent: Thursday, December 23, 2021 4:42 PM
To: Grodzovsky, Andrey; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk; Chen, Horace
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

On 22.12.21 at 23:14, Andrey Grodzovsky wrote:

Since now flr work is serialized against GPU resets there is no need for this.

Signed-off-by: Andrey Grodzovsky

Acked-by: Christian König

---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 ---
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 ---
 2 files changed, 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 487cd654b69e..7d59a66e3988 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -248,15 +248,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt);
 	int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT;

-	/* block amdgpu_gpu_recover till msg FLR COMPLETE received,
-	 * otherwise the mailbox msg will be ruined/reseted by
-	 * the VF FLR.
-	 */
-	if (!down_write_trylock(&adev->reset_sem))
-		return;
-
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);

 	xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);

@@ -269,9 +261,6 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);

 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
-
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
 		&& (!amdgpu_device_has_job_running(adev) ||
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index e3869067a31d..f82c066c8e8d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -277,15 +277,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt);
 	int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT;

-	/* block amdgpu_gpu_recover till msg FLR COMPLETE received,
-	 * otherwise the mailbox msg will be ruined/reseted by
-	 * the VF FLR.
-	 */
-	if (!down_write_trylock(&adev->reset_sem))
-		return;
-
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);

 	xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);

@@ -298,9 +290,6 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);

 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
-
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
 		&& (!amdgpu_device_has_job_running(adev) ||
Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV
Thanks a lot, please let me know.

Andrey

On 2021-12-24 3:58 a.m., Deng, Emily wrote:

These patches look good to me. JingWen will pull these patches, run some basic TDR tests in an SR-IOV environment, and give feedback.

Best wishes
Emily Deng

-----Original Message-----
From: Liu, Monk
Sent: Thursday, December 23, 2021 6:14 PM
To: Koenig, Christian; Grodzovsky, Andrey; dri-de...@lists.freedesktop.org; amd-g...@lists.freedesktop.org; Chen, Horace; Chen, JingWen; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

[AMD Official Use Only]

@Chen, Horace @Chen, JingWen @Deng, Emily

Please review Andrey's patch.

Thanks
---
Monk Liu | Cloud GPU & Virtualization Solution | AMD
---
we are hiring software manager for CVS core team
---

-----Original Message-----
From: Koenig, Christian
Sent: Thursday, December 23, 2021 4:42 PM
To: Grodzovsky, Andrey; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk; Chen, Horace
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

On 22.12.21 at 23:14, Andrey Grodzovsky wrote:

Since now flr work is serialized against GPU resets there is no need for this.

Signed-off-by: Andrey Grodzovsky

Acked-by: Christian König

---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 ---
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 ---
 2 files changed, 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 487cd654b69e..7d59a66e3988 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -248,15 +248,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt);
 	int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT;

-	/* block amdgpu_gpu_recover till msg FLR COMPLETE received,
-	 * otherwise the mailbox msg will be ruined/reseted by
-	 * the VF FLR.
-	 */
-	if (!down_write_trylock(&adev->reset_sem))
-		return;
-
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);

 	xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);

@@ -269,9 +261,6 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);

 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
-
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
 		&& (!amdgpu_device_has_job_running(adev) ||
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index e3869067a31d..f82c066c8e8d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -277,15 +277,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt);
 	int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT;

-	/* block amdgpu_gpu_recover till msg FLR COMPLETE received,
-	 * otherwise the mailbox msg will be ruined/reseted by
-	 * the VF FLR.
-	 */
-	if (!down_write_trylock(&adev->reset_sem))
-		return;
-
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);

 	xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);

@@ -298,9 +290,6 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);

 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
-
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
 		&& (!amdgpu_device_has_job_running(adev) ||
Re: [PATCH] gpu/drm/radeon:Fix null pointer risk
On 28.12.21 at 08:31, Wen Zhiwei wrote:
> If the null pointer is not judged in advance, there is a risk that the
> pointer will cross the boundary

As far as I can see, that case is impossible. Why do you want to add a check for it?

Regards,
Christian.

> Signed-off-by: Wen Zhiwei
> ---
>  drivers/gpu/drm/radeon/radeon_vm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> index bb53016f3138..d3d342041adf 100644
> --- a/drivers/gpu/drm/radeon/radeon_vm.c
> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> @@ -951,7 +951,7 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
>
>  		if (mem->mem_type == TTM_PL_TT) {
>  			bo_va->flags |= RADEON_VM_PAGE_SYSTEM;
> -			if (!(bo_va->bo->flags & (RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC)))
> +			if (bo_va->bo && !(bo_va->bo->flags & (RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC)))
>  				bo_va->flags |= RADEON_VM_PAGE_SNOOPED;
>
>  		} else {
Re: [PATCH] drm/radeon: use kernel is_power_of_2 rather than local version
On 30.12.21 at 06:00, Jonathan Gray wrote:
> Follow the amdgpu change made in
> 7611750784664db46d0db95631e322aeb263dde7 and replace local radeon
> function with is_power_of_2().
>
> Signed-off-by: Jonathan Gray

Reviewed-by: Christian König

> ---
>  drivers/gpu/drm/radeon/radeon_device.c | 19 +++
>  1 file changed, 3 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
> index 4f0fbf667431..15692cb241fc 100644
> --- a/drivers/gpu/drm/radeon/radeon_device.c
> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> @@ -1085,19 +1085,6 @@ static unsigned int radeon_vga_set_decode(struct pci_dev *pdev, bool state)
>  		return VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
>  }
>
> -/**
> - * radeon_check_pot_argument - check that argument is a power of two
> - *
> - * @arg: value to check
> - *
> - * Validates that a certain argument is a power of two (all asics).
> - * Returns true if argument is valid.
> - */
> -static bool radeon_check_pot_argument(int arg)
> -{
> -	return (arg & (arg - 1)) == 0;
> -}
> -
>  /**
>   * radeon_gart_size_auto - Determine a sensible default GART size
>   * according to ASIC family.
> @@ -1126,7 +1113,7 @@ static int radeon_gart_size_auto(enum radeon_family family)
>  static void radeon_check_arguments(struct radeon_device *rdev)
>  {
>  	/* vramlimit must be a power of two */
> -	if (!radeon_check_pot_argument(radeon_vram_limit)) {
> +	if (!is_power_of_2(radeon_vram_limit)) {
>  		dev_warn(rdev->dev, "vram limit (%d) must be a power of 2\n",
>  				radeon_vram_limit);
>  		radeon_vram_limit = 0;
> @@ -1140,7 +1127,7 @@ static void radeon_check_arguments(struct radeon_device *rdev)
>  		dev_warn(rdev->dev, "gart size (%d) too small\n",
>  				radeon_gart_size);
>  		radeon_gart_size = radeon_gart_size_auto(rdev->family);
> -	} else if (!radeon_check_pot_argument(radeon_gart_size)) {
> +	} else if (!is_power_of_2(radeon_gart_size)) {
>  		dev_warn(rdev->dev, "gart size (%d) must be a power of 2\n",
>  				radeon_gart_size);
>  		radeon_gart_size = radeon_gart_size_auto(rdev->family);
> @@ -1163,7 +1150,7 @@ static void radeon_check_arguments(struct radeon_device *rdev)
>  		break;
>  	}
>
> -	if (!radeon_check_pot_argument(radeon_vm_size)) {
> +	if (!is_power_of_2(radeon_vm_size)) {
>  		dev_warn(rdev->dev, "VM size (%d) must be a power of 2\n",
>  				radeon_vm_size);
>  		radeon_vm_size = 4;
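One detail of this swap may be worth checking during review: the removed local helper and the kernel helper treat 0 differently. The sketch below is a userspace illustration, not driver code. `radeon_check_pot_argument()` copies the expression from the diff, while `is_power_of_2_sketch()` is a renamed stand-in written to mirror the kernel's `is_power_of_2()` in include/linux/log2.h, which also requires a non-zero argument. `pot_self_check()` is a hypothetical helper added only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace copy of the removed radeon helper (same expression as in the diff). */
static bool radeon_check_pot_argument(int arg)
{
	return (arg & (arg - 1)) == 0;
}

/*
 * Renamed userspace stand-in for the kernel's is_power_of_2();
 * unlike the local helper, it rejects 0.
 */
static bool is_power_of_2_sketch(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical self-check: the helpers agree for positive values, differ for 0. */
static void pot_self_check(void)
{
	assert(radeon_check_pot_argument(4096) == is_power_of_2_sketch(4096));
	assert(radeon_check_pot_argument(48) == is_power_of_2_sketch(48));
	assert(radeon_check_pot_argument(0));  /* old helper accepts 0 */
	assert(!is_power_of_2_sketch(0));      /* kernel-style helper rejects 0 */
}
```

If 0 is a meaningful setting for any of the checked module parameters, the warning branch would now be taken for it, so that case probably deserves a quick look.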
[pull] amdgpu, amdkfd drm-next-5.17
Hi Dave, Daniel,

Fixes for 5.17. Now with more S-o-b.

The following changes since commit a342655865b2f14d1fbf346356d3b3360e63e872:

  drm/radeon: Fix syntax errors in comments (2021-12-14 16:11:02 -0500)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-5.17-2021-12-30

for you to fetch changes up to 0637d41786a3a9551f33ad8e15bdb40416362028:

  drm/amdgpu: no DC support for headless chips (2021-12-30 08:54:45 -0500)

amd-drm-next-5.17-2021-12-30:

amdgpu:
- Suspend/resume fixes
- Fence fix
- Misc code cleanups
- IP discovery fixes
- SRIOV fixes
- RAS fixes
- GMC 8 VRAM detection fix
- FRU fixes for Aldebaran
- Display fixes

amdkfd:
- SVM fixes
- IP discovery fixes

Alex Deucher (5):
      drm/amdgpu: clean up some leftovers from bring up
      drm/amdgpu: add support for IP discovery gc_info table v2
      drm/amdgpu: fix runpm documentation
      drm/amdgpu: always reset the asic in suspend (v2)
      drm/amdgpu: no DC support for headless chips

Alvin Lee (1):
      drm/amd/display: Fix check for null function ptr

Angus Wang (1):
      drm/amd/display: Changed pipe split policy to allow for multi-display pipe split

Anthony Koo (1):
      drm/amd/display: [FW Promotion] Release 0.0.98

Aric Cyr (1):
      drm/amd/display: 3.2.167

Bokun Zhang (1):
      drm/amdgpu: Filter security violation registers

Changcheng Deng (1):
      drm/amdkfd: use max() and min() to make code cleaner

Charlene Liu (1):
      drm/amd/display: fix B0 TMDS deepcolor no dislay issue

Evan Quan (1):
      drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform

George Shen (2):
      drm/amd/display: Limit max link cap with LTTPR caps
      drm/amd/display: Remove CR AUX RD Interval limit for LTTPR

Guchun Chen (2):
      drm/amdkfd: correct sdma queue number in kfd device init (v3)
      drm/amdgpu: drop redundant semicolon

Huang Rui (1):
      drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence

Jiapeng Chong (1):
      drm/amd/display: Fix warning comparing pointer to 0

José Expósito (1):
      drm/amd/display: fix dereference before NULL check

Kent Russell (4):
      drm/amdgpu: Increase potential product_name to 64 characters
      drm/amdgpu: Enable unique_id for Aldebaran
      drm/amdgpu: Only overwrite serial if field is empty
      drm/amdgpu: Access the FRU on Aldebaran

Lai, Derek (1):
      drm/amd/display: Added power down for DCN10

Leslie Shi (1):
      drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure

Lijo Lazar (1):
      drm/amd/pm: Fix xgmi link control on aldebaran

Marina Nikolic (1):
      amdgpu/pm: Make sysfs pm attributes as read-only for VFs

Mario Limonciello (2):
      drivers/amd/pm: smu13: use local variable adev
      drm/amd/pm: restore SMU version print statement for dGPUs

Martin Leung (1):
      drm/amd/display: Undo ODM combine

Nicholas Kazlauskas (4):
      drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_config
      drm/amd/display: Block z-states when stutter period exceeds criteria
      drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization
      drm/amd/display: Set optimize_pwr_state for DCN31

Philip Yang (1):
      drm/amdkfd: fix svm_bo release invalid wait context warning

Prike Liang (1):
      drm/amd/pm: skip setting gfx cgpg in the s0ix suspend-resume

Rajneesh Bhardwaj (1):
      drm/amdgpu: Don't inherit GEM object VMAs in child process

Shen, George (1):
      drm/amd/display: Refactor vendor specific link training sequence

Surbhi Kakarya (1):
      drm/amdgpu: Check the memory can be accesssed by ttm_device_clear_dma_mappings.

Tao Zhou (5):
      drm/amdgpu: add gpu reset control for umc page retirement
      drm/amdkfd: add reset parameter for unmap queues
      drm/amdkfd: add reset queue function for RAS poison (v2)
      drm/amdkfd: reset queue which consumes RAS poison (v2)
      drm/amdgpu: save error count in RAS poison handler

Victor Skvortsov (6):
      drm/amdgpu: Separate vf2pf work item init from virt data exchange
      drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions
      drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov
      drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov
      drm/amdgpu: get xgmi info before ip_init
      drm/amdgpu: Modify indirect register access for gfx9 sriov

Wenjing Liu (5):
      drm/amd/display: define link res and make it accessible to all link interfaces
      drm/amd/display: populate link res in both detection and validation
      drm/amd/display: access hpo dp link encoder only through link resource
      drm/amd/display: support dynamic HPO DP link encoder allocation
      drm/amd/display: get and restore link res map

Wesley
Re: [PATCH] drm/amd/pm: keep the BACO feature enabled for suspend
Reviewed-by: Alex Deucher

On Thu, Dec 30, 2021 at 5:01 AM Evan Quan wrote:
>
> To pair with the workaround which always reset the ASIC in suspend.
> Otherwise, the reset which relies on BACO will fail.
>
> Fixes: 50583690930d ("drm/amdgpu: always reset the asic in suspend (v2)")
>
> Signed-off-by: Evan Quan
> Change-Id: I39ed072af16e34ef1e1c16b50ace6d46fbc388b9
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index 4d867778a65c..7628be2f2301 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -1308,10 +1308,16 @@ static int smu_disable_dpms(struct smu_context *smu)
>  {
>  	struct amdgpu_device *adev = smu->adev;
>  	int ret = 0;
> +	/*
> +	 * TODO: (adev->in_suspend && !adev->in_s0ix) is added to pair
> +	 * the workaround which always reset the asic in suspend.
> +	 * It's likely that workaround will be dropped in the future.
> +	 * Then the change here should be dropped together.
> +	 */
>  	bool use_baco = !smu->is_apu &&
>  		((amdgpu_in_reset(adev) &&
>  		  (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO))
> ||
> -		 ((adev->in_runpm || adev->in_s4) &&
> amdgpu_asic_supports_baco(adev)));
> +		 ((adev->in_runpm || adev->in_s4 || (adev->in_suspend &&
> !adev->in_s0ix)) && amdgpu_asic_supports_baco(adev)));
>
>  	/*
>  	 * For custom pptable uploading, skip the DPM features
> --
> 2.29.0
>
Re: [pull] amdgpu drm-fixes-5.16
On Thu, Dec 30, 2021 at 12:29 AM Dave Airlie wrote:
>
> On Thu, 30 Dec 2021 at 01:51, Alex Deucher wrote:
> >
> > Hi Dave, Daniel,
>
> Just FYI on merging this into tip I got a conflict I'm not sure what
> answer is right.
>
> fixes has:
> ee2698cf79cc759a397c61086c758d4cc85938bf
> Author: Angus Wang
> Date: Thu Dec 9 17:27:01 2021 -0500
>
> drm/amd/display: Changed pipe split policy to allow for
> multi-display pipe split
>
> next has:
> 1edf5ae1fdaffb67c1b93e98df670cbe535d13cf
> Author: Zhan Liu
> Date: Mon Nov 8 19:31:00 2021 -0500
>
> drm/amd/display: enable seamless boot for DCN301
>
> -.pipe_split_policy = MPC_SPLIT_AVOID_MULT_DISP,
> fixes is +.pipe_split_policy = MPC_SPLIT_DYNAMIC,
> next is +.pipe_split_policy = MPC_SPLIT_AVOID,
>
> I've chosen the -fixes answer for now, but it would be good to have
> someone review it before Linus merges.

It should ultimately be MPC_SPLIT_DYNAMIC. -next has an extra patch
which changes it to an intermediate value before this patch changes it
to MPC_SPLIT_DYNAMIC.

Alex

>
> Dave.
[PATCH] drm/amd/pm: keep the BACO feature enabled for suspend
To pair with the workaround which always reset the ASIC in suspend.
Otherwise, the reset which relies on BACO will fail.

Fixes: 50583690930d ("drm/amdgpu: always reset the asic in suspend (v2)")

Signed-off-by: Evan Quan
Change-Id: I39ed072af16e34ef1e1c16b50ace6d46fbc388b9
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 4d867778a65c..7628be2f2301 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1308,10 +1308,16 @@ static int smu_disable_dpms(struct smu_context *smu)
 {
 	struct amdgpu_device *adev = smu->adev;
 	int ret = 0;
+	/*
+	 * TODO: (adev->in_suspend && !adev->in_s0ix) is added to pair
+	 * the workaround which always reset the asic in suspend.
+	 * It's likely that workaround will be dropped in the future.
+	 * Then the change here should be dropped together.
+	 */
 	bool use_baco = !smu->is_apu &&
 		((amdgpu_in_reset(adev) &&
 		  (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)) ||
-		 ((adev->in_runpm || adev->in_s4) && amdgpu_asic_supports_baco(adev)));
+		 ((adev->in_runpm || adev->in_s4 || (adev->in_suspend && !adev->in_s0ix)) && amdgpu_asic_supports_baco(adev)));

 	/*
 	 * For custom pptable uploading, skip the DPM features
--
2.29.0
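To make the effect of the one-line change easier to see, the condition can be modelled as a pure function of the flags it reads. This is only a userspace sketch: `struct baco_state` is a hypothetical flattened view (the real code reads `adev`/`smu` and calls the amdgpu helpers named in the comments), and `use_baco_before()`, `use_baco_after()` and `baco_self_check()` are illustrative names, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the condition reads. */
struct baco_state {
	bool is_apu;             /* smu->is_apu */
	bool in_reset;           /* amdgpu_in_reset(adev) */
	bool reset_method_baco;  /* amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO */
	bool in_runpm;           /* adev->in_runpm */
	bool in_s4;              /* adev->in_s4 */
	bool in_suspend;         /* adev->in_suspend */
	bool in_s0ix;            /* adev->in_s0ix */
	bool supports_baco;      /* amdgpu_asic_supports_baco(adev) */
};

/* Condition as it reads on the '-' side of the diff. */
static bool use_baco_before(const struct baco_state *s)
{
	return !s->is_apu &&
	       ((s->in_reset && s->reset_method_baco) ||
		((s->in_runpm || s->in_s4) && s->supports_baco));
}

/* Condition as it reads on the '+' side of the diff. */
static bool use_baco_after(const struct baco_state *s)
{
	return !s->is_apu &&
	       ((s->in_reset && s->reset_method_baco) ||
		((s->in_runpm || s->in_s4 ||
		  (s->in_suspend && !s->in_s0ix)) && s->supports_baco));
}

/* Hypothetical self-check: only a non-s0ix suspend on a dGPU changes result. */
static void baco_self_check(void)
{
	struct baco_state s3 = { .in_suspend = true, .supports_baco = true };
	struct baco_state s0ix = { .in_suspend = true, .in_s0ix = true,
				   .supports_baco = true };

	assert(!use_baco_before(&s3) && use_baco_after(&s3));
	assert(use_baco_before(&s0ix) == use_baco_after(&s0ix));
}
```

Under these assumptions the only input for which the two predicates disagree is a suspend that is not s0ix on a BACO-capable dGPU, which matches the stated intent of pairing with the reset-in-suspend workaround.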
Re: [PATCH] drm/amdgpu: add dummy event6 for vega10
Reviewed-by: Jingwen Chen

On 2021/12/29 6:38 PM, James Yao wrote:
> [why]
> Malicious mailbox event1 fails driver loading on vega10.
> An dummy event6 prevent driver from taking response from malicious event1 as
> its own.
>
> [how]
> On vega10, send a mailbox event6 before sending event1.
>
> Signed-off-by: James Yao
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c    | 11 +++
>  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h    | 2 ++
>  3 files changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index f8e574cc0e22..d9509c3482e2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -727,6 +727,10 @@ void amdgpu_detect_virtualization(struct amdgpu_device
> *adev)
>  		vi_set_virt_ops(adev);
>  		break;
>  	case CHIP_VEGA10:
> +		soc15_set_virt_ops(adev);
> +		/* send a dummy GPU_INIT_DATA request to host on vega10
> */
> +		amdgpu_virt_request_init_data(adev);
> +		break;
>  	case CHIP_VEGA20:
>  	case CHIP_ARCTURUS:
>  	case CHIP_ALDEBARAN:
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> index 0077e738db31..56da5ab82987 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> @@ -180,6 +180,11 @@ static int xgpu_ai_send_access_requests(struct
> amdgpu_device *adev,
>  			RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
> mmBIF_BX_PF0_MAILBOX_MSGBUF_RCV_DW2));
>  		}
> +	} else if (req == IDH_REQ_GPU_INIT_DATA){
> +		/* Dummy REQ_GPU_INIT_DATA handling */
> +		r = xgpu_ai_poll_msg(adev, IDH_REQ_GPU_INIT_DATA_READY);
> +		/* version set to 0 since dummy */
> +		adev->virt.req_init_data_ver = 0;
>  	}
>
>  	return 0;
> @@ -381,10 +386,16 @@ void xgpu_ai_mailbox_put_irq(struct amdgpu_device *adev)
>  	amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0);
>  }
>
> +static int xgpu_ai_request_init_data(struct amdgpu_device *adev)
> +{
> +	return xgpu_ai_send_access_requests(adev, IDH_REQ_GPU_INIT_DATA);
> +}
> +
>  const struct amdgpu_virt_ops xgpu_ai_virt_ops = {
>  	.req_full_gpu = xgpu_ai_request_full_gpu_access,
>  	.rel_full_gpu = xgpu_ai_release_full_gpu_access,
>  	.reset_gpu = xgpu_ai_request_reset,
>  	.wait_reset = NULL,
>  	.trans_msg = xgpu_ai_mailbox_trans_msg,
> +	.req_init_data = xgpu_ai_request_init_data,
>  };
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
> b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
> index f9aa4d0bb638..fa7e13e0459e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
> @@ -35,6 +35,7 @@ enum idh_request {
>  	IDH_REQ_GPU_FINI_ACCESS,
>  	IDH_REL_GPU_FINI_ACCESS,
>  	IDH_REQ_GPU_RESET_ACCESS,
> +	IDH_REQ_GPU_INIT_DATA,
>
>  	IDH_LOG_VF_ERROR = 200,
>  	IDH_READY_TO_RESET = 201,
> @@ -48,6 +49,7 @@ enum idh_event {
>  	IDH_SUCCESS,
>  	IDH_FAIL,
>  	IDH_QUERY_ALIVE,
> +	IDH_REQ_GPU_INIT_DATA_READY,
>
>  	IDH_TEXT_MESSAGE = 255,
>  };
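The patch wires a new `.req_init_data` hook into the vega10 ops table so that the generic init-data request has something to call. The sketch below shows that optional-callback pattern in userspace C. Everything in it is hypothetical (`struct vf_dev`, `struct vf_ops`, `vf_request_init_data()` and the rest), and it assumes the generic wrapper dispatches only when the hook is set; it does not model the mailbox or the host side.

```c
#include <assert.h>
#include <stddef.h>

struct vf_dev;

/* Hypothetical, trimmed-down ops table; only the new hook is modelled. */
struct vf_ops {
	int (*req_init_data)(struct vf_dev *dev);
};

struct vf_dev {
	const struct vf_ops *ops;
	int req_init_data_ver;
	int requests_sent;  /* counts requests, for illustration only */
};

/* Stand-in for the dummy handler: count the request, report version 0. */
static int dummy_req_init_data(struct vf_dev *dev)
{
	dev->requests_sent++;
	dev->req_init_data_ver = 0;
	return 0;
}

/* Assumed shape of the generic wrapper: a no-op unless the hook is set. */
static int vf_request_init_data(struct vf_dev *dev)
{
	if (dev->ops && dev->ops->req_init_data)
		return dev->ops->req_init_data(dev);
	return 0;
}

static const struct vf_ops ops_without_hook = { .req_init_data = NULL };
static const struct vf_ops ops_with_hook = { .req_init_data = dummy_req_init_data };

/* Hypothetical self-check: the request is only sent once the hook exists. */
static void vf_self_check(void)
{
	struct vf_dev before = { .ops = &ops_without_hook, .req_init_data_ver = -1 };
	struct vf_dev after = { .ops = &ops_with_hook, .req_init_data_ver = -1 };

	assert(vf_request_init_data(&before) == 0 && before.requests_sent == 0);
	assert(vf_request_init_data(&after) == 0 && after.requests_sent == 1);
	assert(after.req_init_data_ver == 0);
}
```

If the real wrapper behaves this way, the call added under `CHIP_VEGA10` would have no effect without the matching `.req_init_data` entry, which is why both halves arrive in the same patch.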