Re: [PATCH] drm/amdgpu: re-create idle bo's PTE during VM state machine reset

2023-12-18 Thread Christian König
On 19.12.23 at 08:00, ZhenGuo Yin wrote: Idle bo's PTE needs to be re-created when resetting VM state machine. Set idle bo's vm_bo as moved to mark it as invalid. Fixes: 55bf196f60df ("drm/amdgpu: reset VM when an error is detected") Signed-off-by: ZhenGuo Yin Good catch, Reviewed-by: Christ

Re: [PATCH] drm/amdgpu: Let KFD sync with VM fences

2023-12-18 Thread Christian König
On 19.12.23 at 08:51, Christian König wrote: On 18.12.23 at 22:21, Felix Kuehling wrote: Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update

Re: [PATCH] drm/amdgpu: Let KFD sync with VM fences

2023-12-18 Thread Christian König
On 18.12.23 at 22:21, Felix Kuehling wrote: Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tables for VM mappings managed through re

[PATCH] drm/amd/display: Adjust kdoc for 'dcn35_hw_block_power_down' & 'dcn35_hw_block_power_up'

2023-12-18 Thread Srinivasan Shanmugam
Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn35/dcn35_hwseq.c:1124: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Cc: Charlene Liu Cc: Muhammad Ahmed Cc: Hamza Mahfooz Cc: Rodrigo Si

[PATCH] drm/amd/display: Address function parameter 'context' not described in 'dc_state_rem_all_planes_for_stream' & 'populate_subvp_cmd_drr_info'

2023-12-18 Thread Srinivasan Shanmugam
Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_state.c:524: warning: Function parameter or member 'state' not described in 'dc_state_rem_all_planes_for_stream' drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_state.c:524: warning: Excess function parameter 'c

[PATCH] drm/amdgpu: re-create idle bo's PTE during VM state machine reset

2023-12-18 Thread ZhenGuo Yin
Idle bo's PTE needs to be re-created when resetting VM state machine. Set idle bo's vm_bo as moved to mark it as invalid. Fixes: 55bf196f60df ("drm/amdgpu: reset VM when an error is detected") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 + 1 file changed, 1 insertio

[PATCH] drm/amdgpu: Check resize bar register when system uses large bar

2023-12-18 Thread Ma Jun
Print a warning message if the system can't access the resize bar register when using large bar. Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drive

[PATCH] drm/amdgpu: Let KFD sync with VM fences

2023-12-18 Thread Felix Kuehling
Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tables for VM mappings managed through render nodes. Signed-off-by: Felix Kuehling ---

Re: [PATCH v3 1/2] drm/buddy: Implement tracking clear page feature

2023-12-18 Thread kernel test robot
ge feature config: arc-randconfig-001-20231215 (https://download.01.org/0day-ci/archive/20231218/202312180258.cty6xurg-...@intel.com/config) compiler: arc-elf-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231218/202312180258.cty6xurg-...@intel.com/repr

RE: [PATCH Review 1/1] drm/amdgpu: Fix ecc irq enable/disable unpaired

2023-12-18 Thread Yang, Stanley
[AMD Official Use Only - General] For mode2 reset, only SDMA/GFX suspend is called to disable the SDMA/GFX ecc_irq, so the driver just needs to enable the SDMA/GFX ecc_irq during the resume process. Consider the following scenario on aqua vanjaram: the user modprobes amdgpu with reset_method=3, and the driver will do GPU recovery if the SDM

RE: [PATCH 4/4] drm/amd/pm: smu v13_0_6 supports ecc info by default

2023-12-18 Thread Zhang, Hawking
[AMD Official Use Only - General] Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Chai, Thomas Sent: Monday, December 18, 2023 15:10 To: amd-gfx@lists.freedesktop.org Cc: Chai, Thomas ; Zhang, Hawking ; Zhou1, Tao ; Li, Candice ; Yang, Stanley ; Chai, T

Re: Regression in 6.6: trying to set DPMS mode kills radeon (r600)

2023-12-18 Thread Holger Hoffstätte
On 2023-12-16 18:36, Holger Hoffstätte wrote: The affected machine is an older SandyBridge desktop with a fanless r600 Redwood GPU, using the radeon driver. "Recently" - some time after the last few 6.6.x stable updates - it started to die with GPU lockups. I first blamed this on standby/resume

Re: regression/bisected/6.7rc1: Instead of desktop I see a horizontal flashing bar with a picture of the desktop background on white screen

2023-12-18 Thread Mikhail Gavrilov
On Fri, Dec 15, 2023 at 9:14 PM Hamza Mahfooz wrote: > > Can you try the following patch with old fw (version 0x07002100 should > be fine)?: https://patchwork.freedesktop.org/patch/572298/ > Tested-by: Mikhail Gavrilov on 7900XTX hardware. Can I ask? What does SubVP actually do? I read on Phoro

RE: [PATCH Review 1/1] drm/amdgpu: Fix ecc irq enable/disable unpaired

2023-12-18 Thread Yang, Stanley
[AMD Official Use Only - General] Yes, we can only call gfx/sdma ras late init in aldebaran_mode2_restore_ip, will update. Regards, Stanley > -Original Message- > From: Zhang, Hawking > Sent: Monday, December 18, 2023 8:37 PM > To: Yang, Stanley ; amd-gfx@lists.freedesktop.org > Subject

Re: [PATCH v3 3/3] drm/amd: Retry delayed work handler if sensor is busy

2023-12-18 Thread Lazar, Lijo
On 12/16/2023 1:25 AM, Mario Limonciello wrote: The SW CTF delayed work handler triggers a shutdown if a sensor read failed for any reason. The specific circumstance of a busy sensor should be retried however to ensure that a good value can be returned. Signed-off-by: Mario Limonciello --- d

[PATCH 1/4] drm/amdgpu: MCA supports recording umc address information

2023-12-18 Thread YiPeng Chai
MCA supports recording umc address information. V2: Move err_addr variable from struct ras_err_node to struct ras_err_info. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 22 +++--- drivers

[PATCH 3/4] drm/amdgpu: Add umc page retirement for umc v12_0

2023-12-18 Thread YiPeng Chai
Add umc page retirement for umc v12_0. V2: 1. Changed umc page retirement check condition to call umc_v12_0_is_uncorrectable_error. 2. Use memset to clear the contents of the umc error address structure. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 56 ++

[PATCH 4/4] drm/amd/pm: smu v13_0_6 supports ecc info by default

2023-12-18 Thread YiPeng Chai
smu v13_0_6 supports ecc info by default. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c i

RE: [PATCH Review 1/1] drm/amdgpu: Fix ecc irq enable/disable unpaired

2023-12-18 Thread Zhang, Hawking
[AMD Official Use Only - General] In such case, can we call amdgpu_gfx_ras_late_init and amdgpu_sdma_ras_late_init in aldebaran_mode2_restore_ip? Regards, Hawking -Original Message- From: Yang, Stanley Sent: Monday, December 18, 2023 17:30 To: Zhang, Hawking ; amd-gfx@lists.freedesktop

[PATCH 2/4] drm/amdgpu: Add poison mode check error condition for umc v12_0

2023-12-18 Thread YiPeng Chai
Add poison mode check error condition for umc v12_0. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c| 20 ++- drivers/gpu/drm/amd/amdgpu/umc_v12_0.h| 4 ++-- .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 4 ++-- 3 files changed, 19 insert

RE: [PATCH Review 1/1] drm/amdgpu: Fix ecc irq enable/disable unpaired

2023-12-18 Thread Zhang, Hawking
[AMD Official Use Only - General] Can we put the irq resume in amdgpu_ras_resume? Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Stanley.Yang Sent: Saturday, December 16, 2023 00:50 To: amd-gfx@lists.freedesktop.org Cc: Yang, Stanley Subject: [PATCH Review 1/1] drm/amdg

Re: Crashes under Xen with Radeon graphics card

2023-12-18 Thread Juergen Gross
On 15.12.23 17:19, Deucher, Alexander wrote: [AMD Official Use Only - General] -Original Message- From: Juergen Gross Sent: Friday, December 15, 2023 11:13 AM To: Deucher, Alexander ; lkml ; xen-de...@lists.xenproject.org; amd- g...@lists.freedesktop.org Cc: Koenig, Christian ; Pan, Xi

Re: [PATCH] gpu: drm: amd: fixed typos

2023-12-18 Thread Ghanshyam Agrawal
On Fri, Dec 15, 2023 at 9:28 PM Alex Deucher wrote: > > On Fri, Dec 15, 2023 at 3:40 AM Ghanshyam Agrawal > wrote: > > > > On Fri, Dec 15, 2023 at 10:59 AM Randy Dunlap wrote: > > > > > > Hi-- > > > > > > On 12/14/23 21:20, Ghanshyam Agrawal wrote: > > > > Fixed multiple typos in atomfirmware.h