[PATCH v3] drm/amdgpu: reset vm state machine after gpu reset(vram lost)

2024-07-23 Thread ZhenGuo Yin
token. v2: Check vm->generation instead of calling drm_sched_entity_error in amdgpu_vm_validate. v3: Use new generation token instead of vram_lost_counter for check. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-)

[PATCH v2] drm/amdgpu: reset vm state machine after gpu reset(vram lost)

2024-07-22 Thread ZhenGuo Yin
2: Check vm->generation instead of calling drm_sched_entity_error in amdgpu_vm_validate. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/

[PATCH] drm/amdgpu: reset vm state machine after gpu reset(vram lost)

2024-07-19 Thread ZhenGuo Yin
r of the device. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 3abfa66d72a2..fd7f912816dc 100644 --- a/drivers

[PATCH] drm/amdgpu: clear set_q_mode_offs when VM changed

2024-04-01 Thread ZhenGuo Yin
d to avoid SET_Q_MODE packets v2") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 7a906318e451..c11c6299711e 100644 --- a/drivers/gpu

[PATCH] drm/amd/amdgpu: add pipe1 hardware support

2024-03-14 Thread ZhenGuo Yin
Enable pipe1 support starting from SIENNA CICHLID asic. Need to use correct ref/mask for pipe1 hdp flush. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2117 Fixes: 085292c3d780 ("Revert "drm/amd/amdgpu: add pipe1 hardware support"") Signed-off-by: ZhenGuo Yin ---

[PATCH] drm/amdgpu: Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV

2024-03-06 Thread ZhenGuo Yin
[Why] RLCG interface returns "out-of-range" error under SRIOV VF when accessing PF-only registers. [How] Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 8 ++-- drivers/gpu/drm/amd/amdgpu/gfx
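The [How] above amounts to guarding register writes on whether the driver runs as a virtual function. Below is a minimal standalone sketch of that pattern, not the amdgpu code: `struct toy_reg_write`, `toy_program_regs`, the `pf_only` flag, and the array standing in for the register file are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one register write; not a kernel type. */
struct toy_reg_write {
    uint32_t offset;  /* index into the fake register file */
    uint32_t value;   /* value to program */
    bool pf_only;     /* assumption: marks registers only the PF may touch */
};

/*
 * Program a list of registers into a fake register file.
 * When is_vf is true, PF-only registers are skipped instead of written,
 * mirroring the idea of not touching them from an SRIOV VF.
 * Returns the number of registers actually written.
 */
size_t toy_program_regs(const struct toy_reg_write *regs, size_t n,
                        bool is_vf, uint32_t *regfile, size_t regfile_len)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_vf && regs[i].pf_only)
            continue;               /* skip: the VF has no access */
        if (regs[i].offset >= regfile_len)
            continue;               /* ignore out-of-bounds entries */
        regfile[regs[i].offset] = regs[i].value;
        written++;
    }
    return written;
}
```

Calling `toy_program_regs` with `is_vf` set to true leaves the PF-only entries of the fake register file untouched; with `is_vf` false, every in-range entry is written.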

[PATCH] drm/amdgpu: re-create idle bo's PTE during VM state machine reset

2023-12-18 Thread ZhenGuo Yin
Idle bo's PTE needs to be re-created when resetting VM state machine. Set idle bo's vm_bo as moved to mark it as invalid. Fixes: 55bf196f60df ("drm/amdgpu: reset VM when an error is detected") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1

[PATCH] drm/amdgpu: Skip access gfx11 golden registers under SRIOV

2023-11-23 Thread ZhenGuo Yin
[Why] Golden registers are PF-only registers on gfx11. RLCG interface will return "out-of-range" under SRIOV VF. [How] Skip access gfx11 golden registers under SRIOV. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 +++ 1 file changed, 3 insertions(+) di

[PATCH v3] drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit

2023-11-22 Thread ZhenGuo Yin
[Why] Memory leaks of gang_ctx_bo and wptr_bo. [How] Free gang_ctx_bo and wptr_bo in pqm_uninit. v2: add a common function pqm_clean_queue_resource to free queue's resources. v3: reset pdd->pqd.num_gws when destroying GWS queue. Signed-off-by: ZhenGuo Yin --- .../am

[PATCH v2] drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit

2023-11-19 Thread ZhenGuo Yin
[Why] Memory leaks of gang_ctx_bo and wptr_bo. [How] Free gang_ctx_bo and wptr_bo in pqm_uninit. v2: add a common function pqm_clean_queue_resource to free queue's resources. Signed-off-by: ZhenGuo Yin --- .../amd/amdkfd/kfd_process_queue_manager.c| 46 ++- 1 file ch

[PATCH] drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit

2023-11-06 Thread ZhenGuo Yin
[Why] There will be a warning trace when cleaning up the gtt drm_mm allocator during unloading driver since gang_ctx_bo and wptr_bo do not get freed. [How] Free gang_ctx_bo and wptr_bo in pqm_uninit. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 8

[PATCH v2] drm/amdgpu: access RLC_SPM_MC_CNTL through MMIO in SRIOV runtime

2023-08-28 Thread ZhenGuo Yin
Register RLC_SPM_MC_CNTL is not blocked by L1 policy, VF can directly access it through MMIO during SRIOV runtime. v2: use SOC15 interface to access registers Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 +++-- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 13

[PATCH] drm/amdgpu: access RLC_SPM_MC_CNTL through MMIO in SRIOV

2023-08-27 Thread ZhenGuo Yin
Register RLC_SPM_MC_CNTL is not blocked by L1 policy, VF can directly access it through MMIO. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 10 ++ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 10 ++ 2 files changed, 4 insertions(+), 16 deletions(-) diff

[PATCH] drm/amdgpu: add entity error check in amdgpu_ctx_get_entity

2023-05-11 Thread ZhenGuo Yin
[Why] UMD is not aware of entity error, and will keep submitting jobs into the error entity. [How] Add entity error check when getting entity from ctx. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff
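The check described in this entry can be pictured with a small standalone sketch. It is not the kernel implementation: `struct toy_entity`, `struct toy_ctx`, and `toy_ctx_get_entity` are hypothetical names, and returning the entity's stored error code to the caller is an assumption about how the failure might be reported.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler entity; not the kernel's struct. */
struct toy_entity {
    int last_error;  /* 0 if healthy, negative errno after a failed job */
};

/* Hypothetical context holding a fixed array of entities. */
struct toy_ctx {
    struct toy_entity *entities;
    size_t num_entities;
};

/*
 * Look up an entity and refuse to hand it out once it has recorded an
 * error, so the caller stops submitting new jobs to it.
 * Returns 0 on success, -EINVAL for bad arguments, or the stored error.
 */
int toy_ctx_get_entity(struct toy_ctx *ctx, size_t idx,
                       struct toy_entity **out)
{
    if (ctx == NULL || out == NULL || idx >= ctx->num_entities)
        return -EINVAL;

    struct toy_entity *e = &ctx->entities[idx];

    if (e->last_error != 0)
        return e->last_error;  /* assumption: propagate the stored error */

    *out = e;
    return 0;
}
```

With this shape, a submitter learns about the failed entity at lookup time rather than after queueing more work on it.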

[PATCH 1/2] drm/amdgpu: set finished fence error if job timedout

2023-05-09 Thread ZhenGuo Yin
Set finished fence to ETIME error if job timedout. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 57f8f8b3cd8a..f2c02e4167fe
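As a rough illustration of the idea in this entry, the standalone sketch below marks a job's finished fence with `-ETIME` from a timeout handler. `struct toy_fence`, `toy_fence_set_error`, and `toy_job_timedout` are hypothetical names, and refusing to overwrite an error on an already signaled fence is an assumption of this sketch, not something the snippet states.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical fence: an error slot plus a signaled flag. */
struct toy_fence {
    int error;      /* 0 means no error recorded */
    bool signaled;  /* true once waiters have been released */
};

/*
 * Record an error on a fence. Assumption of this sketch: an error may
 * only be set before the fence is signaled, so waiters see a stable value.
 */
void toy_fence_set_error(struct toy_fence *f, int error)
{
    if (f->signaled)
        return;
    f->error = error;
}

/* Timeout handler: tag the job's finished fence with -ETIME. */
void toy_job_timedout(struct toy_fence *finished)
{
    toy_fence_set_error(finished, -ETIME);
}
```

Anything waiting on the finished fence can then tell a timed-out job apart from one that completed normally by inspecting the error field.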

[PATCH 2/2] drm/scheduler: avoid infinite loop if entity's dependency is a scheduled error fence

2023-05-09 Thread ZhenGuo Yin
error fence, add drm_sched_entity_wakeup callback for the dependency with scheduled error fence. Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/scheduler/sched_entity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/sche

[PATCH] drm/amdgpu: update documentation of parameter amdgpu_gtt_size

2022-11-18 Thread ZhenGuo Yin
Fixes: f7ba887f606b ("drm/amdgpu: Adjust logic around GTT size (v3)") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/am

[PATCH] drm/amd/pm: Init pm_attr_list when dpm is disabled

2022-10-12 Thread ZhenGuo Yin
roups+0x20/0x90 [amdgpu] Call Trace: amdgpu_pm_sysfs_fini+0x2f/0x40 [amdgpu] amdgpu_device_fini_hw+0xdf/0x290 [amdgpu] [How] List pm_attr_list should be initialized when dpm is disabled. Fixes: 894483d76ada ("drm/amd/pm: Remove redundant check condition") Signed-off-by: ZhenGuo Yi

[PATCH v2] drm/ttm: update bulk move object of ghost BO

2022-09-06 Thread ZhenGuo Yin
function ttm_bo_move_to_lru_tail_unlocked. v2: set bulk move to NULL manually if no resource associated with ghost BO Fixed: 5b951e487fd6bf5f ("drm/ttm: fix bulk move handling v2") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +++ 1 file changed, 3 insertions(+)

[PATCH] drm/ttm: update bulk move object of ghost BO

2022-09-01 Thread ZhenGuo Yin
function ttm_bo_move_to_lru_tail_unlocked. Fixed: 5b951e487fd6bf5f ("drm/ttm: fix bulk move handling v2") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/ttm/ttm_bo_util.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_ut

[PATCH v2] drm/amdgpu: fix scratch register access method in SRIOV

2022-06-05 Thread ZhenGuo Yin
The scratch register should be accessed through MMIO instead of RLCG in SRIOV, since it is being used in the RLCG register access function. Fixes: 0e1314781b9c ("drm/amdgpu: nuke dynamic gfx scratch reg allocation") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7

[PATCH] drm/amdgpu: fix scratch register access method in SRIOV

2022-06-01 Thread ZhenGuo Yin
The scratch register should be accessed through MMIO instead of RLCG in SRIOV, since it is being used in the RLCG register access function. Fixes: 0e1314781b9c ("drm/amdgpu: nuke dynamic gfx scratch reg allocation") --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 ++-- 1 file changed, 2 insertions(+), 2 d