[PATCH v2] drm/amdkfd: dqm fence memory corruption

2021-01-28 Thread Qu Huang
memory, but microcode does write 8 bytes of memory, so there is a memory corruption. Changes since v1: * Change dqm->fence_addr to a u64 pointer to fix this issue, and update query_status and amdkfd_fence_wait_timeout to use a 64-bit fence value so that they are consistent. Signed-off-by:
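
A minimal user-space sketch of the size mismatch described in this snippet, not the kernel code itself. The struct layouts, the `neighbor` field, and `fake_microcode_write()` are hypothetical stand-ins for illustration; only the 4-byte versus 8-byte fence sizes come from the patch description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 4-byte fence slot (offset 0) followed by unrelated data. */
struct narrow_layout {
    uint32_t fence;
    uint32_t neighbor;
};

/* Hypothetical layout after widening the fence slot to 8 bytes. */
struct wide_layout {
    uint64_t fence;
    uint32_t neighbor;
};

static_assert(sizeof(struct narrow_layout) == 8, "sketch assumes no padding");

/* Stand-in for the microcode: it always writes 8 bytes at the fence address. */
static void fake_microcode_write(void *fence_addr, uint64_t value)
{
    memcpy(fence_addr, &value, sizeof(value));
}

/* Returns 1 if the neighbor field keeps its value, 0 if it was overwritten. */
static int neighbor_survives_narrow(void)
{
    struct narrow_layout n = { .fence = 0, .neighbor = 0xdeadbeefu };

    /* The fence sits at offset 0, so the 8-byte write also covers neighbor. */
    fake_microcode_write(&n, UINT64_MAX);
    return n.neighbor == 0xdeadbeefu;
}

static int neighbor_survives_wide(void)
{
    struct wide_layout w = { .fence = 0, .neighbor = 0xdeadbeefu };

    /* The fence slot is now 8 bytes, so the write stays inside it. */
    fake_microcode_write(&w.fence, UINT64_MAX);
    return w.neighbor == 0xdeadbeefu;
}
```

Calling `neighbor_survives_narrow()` reports that the adjacent field was clobbered, while `neighbor_survives_wide()` reports that it was left intact, which is the effect the v2 change aims for by making the fence a 64-bit value.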

[PATCH] drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug

2021-03-21 Thread Qu Huang
ull) [ 1272.884564] RSP [ 1272.884566] CR2: 0000 Signed-off-by: Qu Huang --- drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c index 51171

Re: [PATCH] drm/amdkfd: dqm fence memory corruption

2021-03-26 Thread Qu Huang
On 2021/1/28 5:50, Felix Kuehling wrote: On 2021-01-27 at 7:33 a.m., Qu Huang wrote: The amdgpu driver uses a 4-byte data type for DQM fence memory and passes the GPU address of the fence memory to the microcode through the query status PM4 message. However, query status PM4 message definition and microcode

[PATCH] drm/amdgpu: Fix a potential sdma invalid access

2021-04-01 Thread Qu Huang
ll get an invalid address in amdgpu_fill_buffer(), resulting in a VMFAULT or memory corruption. To avoid it, we have to hold bo->base.resv lock first, and check whether the mem.mem_type is TTM_PL_VRAM. Signed-off-by: Qu Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++-- 1 fil

Re: [PATCH] drm/amdgpu: Fix a potential sdma invalid access

2021-04-02 Thread Qu Huang
Hi Christian, On 2021/4/3 0:25, Christian König wrote: Hi Qu, On 02.04.21 at 05:18, Qu Huang wrote: Before dma_resv_lock(bo->base.resv, NULL) in amdgpu_bo_release_notify(), the bo->base.resv lock may be held by ttm_mem_evict_first(), That can't happen since when bo_release_notif

Re: [PATCH] drm/amdgpu: Fix a potential sdma invalid access

2021-04-05 Thread Qu Huang
Hi Christian, On 2021/4/3 16:49, Christian König wrote: Hi Qu, On 03.04.21 at 07:08, Qu Huang wrote: Hi Christian, On 2021/4/3 0:25, Christian König wrote: Hi Qu, On 02.04.21 at 05:18, Qu Huang wrote: Before dma_resv_lock(bo->base.resv, NULL) in amdgpu_bo_release_notify(), the

[PATCH] drm/amdkfd: dqm fence memory corruption

2021-01-27 Thread Qu Huang
memory, but microcode does write 8 bytes of memory, so there is a memory corruption. Signed-off-by: Qu Huang --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers