Commit 4220d2c7c41b ("drm/amdgpu: remove is_mes_queue flag") sets
ring->adev->rings[ring->idx] to NULL at the end of amdgpu_ring_fini(),
which causes amdgpu_fence_driver_sw_fini() to skip drm_sched_fini() and
the freeing of fence_drv.fence, resulting in a memory leak.
Remove set rings[ring->idx] a
Commit 4220d2c7c41b ("drm/amdgpu: remove is_mes_queue flag") sets
ring->adev->rings[ring->idx] to NULL at the end of amdgpu_ring_fini(),
which causes amdgpu_fence_driver_sw_fini() to skip drm_sched_fini() and
the freeing of fence_drv.fence, resulting in a memory leak.
Release these resource at the
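The leak described in the two messages above can be illustrated with a small userspace model. This is only a sketch, not the amdgpu code: `toy_device`, `toy_ring`, `toy_ring_fini()` and `toy_fence_driver_sw_fini()` are hypothetical stand-ins, and the model assumes that the later teardown stage skips every ring slot that is NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_RINGS 4

struct toy_device;

struct toy_ring {
    struct toy_device *dev;
    unsigned int idx;
    int *fences;                 /* stand-in for per-ring fence storage */
};

struct toy_device {
    struct toy_ring *rings[TOY_MAX_RINGS];
};

/* First teardown stage. With clear_slot == true it clears the device's
 * slot for this ring, which is the behaviour the messages describe. */
static void toy_ring_fini(struct toy_ring *ring, bool clear_slot)
{
    if (clear_slot)
        ring->dev->rings[ring->idx] = NULL;
}

/* Second teardown stage: frees fence storage, skipping empty slots. */
static void toy_fence_driver_sw_fini(struct toy_device *dev)
{
    for (size_t i = 0; i < TOY_MAX_RINGS; i++) {
        struct toy_ring *ring = dev->rings[i];

        if (!ring || !ring->fences)
            continue;            /* a cleared slot is never cleaned up */
        free(ring->fences);
        ring->fences = NULL;
    }
}

/* Runs both stages for one ring. Returns 1 if the fence storage was
 * left allocated (leaked), 0 if it was freed, -1 on allocation failure. */
static int toy_teardown_leaks(bool clear_slot)
{
    struct toy_device dev = { { NULL } };
    struct toy_ring ring = { &dev, 0, NULL };
    int leaked;

    ring.fences = calloc(8, sizeof(*ring.fences));
    if (!ring.fences)
        return -1;
    dev.rings[0] = &ring;

    toy_ring_fini(&ring, clear_slot);
    toy_fence_driver_sw_fini(&dev);

    leaked = ring.fences != NULL;
    free(ring.fences);           /* keep the demo itself leak-free */
    return leaked;
}
```

In this model, `toy_teardown_leaks(true)` reports a leak because the second stage can no longer reach the ring, while `toy_teardown_leaks(false)` does not.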
When application A submits jobs (a1, a2, a3) and application B submits
job b1 with a dependency on a2's scheduler fence, killing application A
before run_job(a1) causes drm_sched_entity_kill_jobs_work() to force
signal all jobs sequentially. However, due to missing work_run_job or
work_free_job in
Commit daf823f1d0cd ("drm/amdgpu: Remove duplicated "context still
alive" check") removed the ctx put, so amdgpu_ctx_fini() is never
called. As a result, finished fences added by amdgpu_ctx_add_fence()
are never released, which leaks memory.
Signed-off-by: Lin.Cao
---
drivers/gpu
The cleaner shader causes a function level reset when compute and gfx
benchmarks run at the same time in a multi-VF environment.
Disable the cleaner shader in multi-VF environments.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion
Use job && job->vm to check whether the IB has a vmid, and use
job && job->vmid to check whether the switch buffer should be emitted.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 7 ++-
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
b/driv
Commit 6e66dc05b54f ("drm/amdgpu: set the VM pointer to NULL in
amdgpu_job_prepare") sets job->vm to NULL if there is no fence. This
causes the switch buffer emission to be skipped when job->vm is NULL.
Checking job rather than vm solves this problem.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/
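The difference between keying a check on the VM pointer and keying it on the job can be shown with a toy model. This is a sketch under stated assumptions, not the amdgpu_ib.c logic: `toy_job`, `toy_vm` and both predicates are hypothetical names, and the model only assumes that `vm` may have been cleared to NULL on an otherwise valid job.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_vm {
    int placeholder;
};

struct toy_job {
    struct toy_vm *vm;           /* may be cleared to NULL before submission */
    unsigned int vmid;
};

/* Predicate keyed on the VM pointer: false once vm has been cleared. */
static bool toy_emit_switch_buffer_by_vm(const struct toy_job *job)
{
    return job != NULL && job->vm != NULL;
}

/* Predicate keyed on the job itself: unaffected by clearing vm. */
static bool toy_emit_switch_buffer_by_job(const struct toy_job *job)
{
    return job != NULL;
}

/* Builds a job whose vm has been cleared and evaluates one predicate. */
static bool toy_eval_cleared_vm(bool key_on_job)
{
    struct toy_job job = { NULL, 1 };

    return key_on_job ? toy_emit_switch_buffer_by_job(&job)
                      : toy_emit_switch_buffer_by_vm(&job);
}
```

For a job with a cleared `vm`, the VM-keyed predicate returns false and the job-keyed predicate returns true.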
SR-IOV does not need to force reprogramming of the HW state on init;
that state should be set from the host side.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_
If the buddy manager has more than one root and each root has sub-blocks
that need to be freed, then when drm_buddy_fini() is called, the first
loop of force_merge will merge and free all of the sub-blocks of the
first root, whose offset is 0x0 and whose size is the biggest (more than
half of the mm size).
In subsequent force_merge
If the buddy manager has more than one root and each root has sub-blocks
that need to be freed, then when drm_buddy_fini() is called, the first
loop of force_merge will merge and free all of the sub-blocks of the
first root, whose offset is 0x0 and whose size is the biggest (more than
half of the mm size).
In subsequent force_merge
The flag "mes.ring.sched.ready" is set to true after MES hw init and set
to false on MES hw fini to avoid duplicate initialization. However, hw
fini is not called on function level reset, so MES hw init is skipped
during FLR, which causes mapping of the legacy queue to fail. Set thi
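The skipped re-initialization can be modelled in a few lines. This is a sketch only: `toy_mes`, `toy_mes_hw_init()` and `toy_function_level_reset()` are hypothetical names, and the `clear_ready` switch is an assumption used to contrast the two behaviours, since the message above is cut off before it states the actual change.

```c
#include <stdbool.h>

struct toy_mes {
    bool ready;                  /* stand-in for the "ready" flag */
    int hw_init_count;           /* how many times real init work ran */
};

/* Init guarded by the ready flag to avoid duplicate initialization. */
static void toy_mes_hw_init(struct toy_mes *mes)
{
    if (mes->ready)
        return;                  /* treated as already initialized */
    mes->hw_init_count++;
    mes->ready = true;
}

/* Reset path that does not go through hw fini. With clear_ready set,
 * the stale flag is dropped before init is attempted again. */
static void toy_function_level_reset(struct toy_mes *mes, bool clear_ready)
{
    if (clear_ready)
        mes->ready = false;
    toy_mes_hw_init(mes);
}

/* Returns how many times init work ran across boot plus one reset. */
static int toy_init_count_after_flr(bool clear_ready)
{
    struct toy_mes mes = { false, 0 };

    toy_mes_hw_init(&mes);                        /* initial bring-up */
    toy_function_level_reset(&mes, clear_ready);  /* reset, no hw fini */
    return mes.hw_init_count;
}
```

With a stale flag the init work runs only once, so the hardware is not reprogrammed after the reset; with the flag cleared it runs twice.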
In interrupt context, the write to dbg_ev_file is run by a work queue.
The write can therefore execute after debug_trap_disable, which causes
a NULL pointer access.
v2: cancel work "debug_event_workarea" before setting dbg_ev_file to NULL.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdkf
In interrupt context, the write to dbg_ev_file is run by a work queue.
The write can therefore execute after debug_trap_disable, which causes
a NULL pointer access.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
pp_dpm_*clk should be read-only in SR-IOV one-VF mode. Remove the
S_IWUGO flag and the _store function of these debugfs entries in one-VF mode.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/pm/amdgpu_pm.c | 10 +-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/p
The GFX doorbell range should be set after FLR; otherwise the GFX doorbell
range will overlap with MEC.
v2: remove the "amdgpu_sriov_vf" and "amdgpu_in_reset" checks, and add grbm
select for the case of 2 gfx rings.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 +++
1 file
The GFX doorbell range should be set after FLR; otherwise the GFX doorbell
range will overlap with MEC.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_
In an SR-IOV environment, the value of pcie_table->num_of_link_levels is
0, so num_of_levels - 1 causes an out-of-bounds array index.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/s
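The out-of-bounds index comes from subtracting 1 from a zero count. The sketch below is a userspace illustration, not the smu_v13_0.c code: `toy_pcie_table`, `TOY_MAX_LINK_LEVELS` and `toy_highest_pcie_gen()` are hypothetical, and the counter is assumed to be unsigned, in which case 0 - 1 wraps to a very large index.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LINK_LEVELS 3

struct toy_pcie_table {
    uint8_t pcie_gen[TOY_MAX_LINK_LEVELS];
    uint32_t num_of_link_levels;     /* assumed to be 0 in the failing case */
};

/* Returns the gen value of the last populated level, or -1 when the
 * table is empty or the count exceeds the array size. The check runs
 * before "n - 1" is used as an index, so no wrapped value is used. */
static int toy_highest_pcie_gen(const struct toy_pcie_table *table)
{
    uint32_t n = table->num_of_link_levels;

    if (n == 0 || n > TOY_MAX_LINK_LEVELS)
        return -1;
    return table->pcie_gen[n - 1];
}
```

An empty table is rejected up front instead of being indexed, and a populated table still yields its last entry.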
Remove the SR-IOV restriction on max_pfn so that TBA and TMA can move to
high 47-bit addresses.
Regression test: change the range alloc flag of libdrm to
AMDGPU_VA_RANGE_HIGH; no FLR occurs when running amdgpu_test
of drm.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 ++
The JPEG init header overwrites the VCN init header info, which
loses some debug information.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
i
Return -EINVAL when MMSCH init fails so that it can be handled correctly
by amdgpu_device_reset_sriov.
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
b/drivers/gpu
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 10 ++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
index 8b6b2bd5c148..ed937f70895c 100644
--- a/drivers/gpu/d
Signed-off-by: Lin.Cao
Change-Id: I1322c010d1428b2c1df5080b72da94e90cf17fec
---
drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 12
1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h
inde
v1: vmbo->shadow is used to back up the VRAM bo when VRAM is lost, so we
should set shadow to vmbo->shadow to recover vmbo->bo.
v2: Change "if (vmbo->shadow) shadow = vmbo->shadow" to "if (!vmbo->shadow)
continue;"
Fix: 'commit e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm")'
Signed-off-by: L
vmbo->shadow is used to back up the VRAM bo when VRAM is lost, so we should
set shadow to vmbo->shadow to recover vmbo->bo.
Fix: 'commit e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm")'
Signed-off-by: Lin.Cao
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +++-
1 file change
In the case of SR-IOV, the register smnMp1_PMI_3_FIFO returns an invalid
value, which causes a "shift out of bound" error. On Ubuntu 22.04, this
issue is detected and the related call trace is reported in dmesg.
Signed-off-by: lin cao
---
drivers/gpu/drm/amd/pm/s
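In C, shifting a value by a count that is greater than or equal to the width of the promoted operand is undefined behaviour, and sanitizers such as UBSAN report it at runtime. The message above does not show how the register value is used, so the following is only a sketch of validating a shift count before use; `toy_safe_bit()` is a hypothetical helper, not a function from the driver.

```c
#include <stdint.h>

/* Returns a 32-bit value with only bit "shift" set, or 0 when the
 * requested shift count is not valid for a 32-bit operand. */
static uint32_t toy_safe_bit(uint32_t shift)
{
    if (shift >= 32)
        return 0;                /* reject counts that would be undefined */
    return UINT32_C(1) << shift;
}
```

A caller reading an untrusted register value can pass it through such a check so that an invalid reading yields a defined result instead of an undefined shift.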