RE: [PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation

2019-11-25 Thread Pan, Xinhui
Really cool patch!

Reviewed-by: xinhui pan 

-----Original Message-----
From: Kuehling, Felix  
Sent: November 26, 2019 3:35
To: amd-gfx@lists.freedesktop.org; Pan, Xinhui 
Subject: [PATCH 1/1] drm/amdgpu: Optimize KFD page table reservation

Be less pessimistic about estimated page table use for KFD. Most allocations 
use 2MB pages and therefore need less VRAM for page tables. This allows more 
VRAM to be used for applications especially on large systems with many GPUs and 
hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a1ed8a8e3752..e43a95514b41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -105,11 +105,24 @@ void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
(kfd_mem_limit.max_ttm_mem_limit >> 20));  }
 
+/* Estimate page table size needed to represent a given memory size
+ *
+ * With 4KB pages, we need one 8 byte PTE for each 4KB of memory
+ * (factor 512, >> 9). With 2MB pages, we need one 8 byte PTE for 2MB
+ * of memory (factor 256K, >> 18). ROCm user mode tries to optimize
+ * for 2MB pages for TLB efficiency. However, small allocations and
+ * fragmented system memory still need some 4KB pages. We choose a
+ * compromise that should work in most cases without reserving too
+ * much memory for page tables unnecessarily (factor 16K, >> 14).
+ */
+#define ESTIMATE_PT_SIZE(mem_size) ((mem_size) >> 14)
+
 static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
uint64_t size, u32 domain, bool sg)
 {
+   uint64_t reserved_for_pt =
+   ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
-   uint64_t reserved_for_pt = amdgpu_amdkfd_total_mem_size >> 9;
int ret = 0;
 
acc_size = ttm_bo_dma_acc_size(&adev->mman.bdev, size,
--
2.24.0
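
As a sanity check on the numbers above, here is a small standalone sketch (not
part of the patch); the only inputs are the 512GB example and the >> 9 / >> 14
shift factors quoted in the commit message:

#include <stdio.h>

int main(void)
{
    /* Example from the commit message: 8 GPUs x 32GB VRAM + 256GB
     * system memory = 512GB of memory visible to each GPU. */
    unsigned long long total_mem = 512ULL << 30;

    /* Old estimate: one 8-byte PTE per 4KB page -> mem_size >> 9. */
    unsigned long long old_pt = total_mem >> 9;

    /* New estimate: ESTIMATE_PT_SIZE(mem_size) = mem_size >> 14, a
     * compromise between 4KB pages (>> 9) and 2MB pages (>> 18). */
    unsigned long long new_pt = total_mem >> 14;

    printf("old: %llu MB, new: %llu MB\n", old_pt >> 20, new_pt >> 20);
    /* Prints "old: 1024 MB, new: 32 MB", matching the 1GB -> 32MB
     * reservation change quoted in the commit message. */
    return 0;
}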


Re: [PATCH] MAINTAINERS: Add Xinhui Pan as another AMDGPU contact

2021-05-10 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: xinhui pan 

From: Christian König 
Sent: Wednesday, May 5, 2021 7:01:46 PM
To: Pan, Xinhui ; Deucher, Alexander 
; amd-gfx@lists.freedesktop.org 

Subject: [PATCH] MAINTAINERS: Add Xinhui Pan as another AMDGPU contact

Since Chunming Zhou left AMD last year we are down to only
two maintainers once more. So add Xinhui Pan as another
contact as well.

Signed-off-by: Christian König 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 64ed8b77cfa9..e2cb5a8acdf1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14970,6 +14970,7 @@ F:  drivers/net/wireless/quantenna
 RADEON and AMDGPU DRM DRIVERS
 M:  Alex Deucher 
 M:  Christian König 
+M: Pan, Xinhui 
 L:  amd-gfx@lists.freedesktop.org
 S:  Supported
T:  git https://gitlab.freedesktop.org/agd5f/linux.git
--
2.25.1



Re: [PATCH] drm/amdgpu: fix PM reference leak in amdgpu_debugfs_gfxoff_rea()

2021-05-17 Thread Pan, Xinhui
[AMD Official Use Only]

Thanks, Kuai.
But the code below matches the other code blocks in this file.

r = pm_runtime_get_sync(dev->dev);
if (r < 0) {
pm_runtime_put_autosuspend(dev->dev);
return r;
}
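
For context, a minimal sketch of the two error-handling patterns being
compared; the wrapper function names are made up for illustration, only the
pm_runtime_* calls are real kernel APIs:

#include <linux/device.h>
#include <linux/pm_runtime.h>

/* Existing pattern in amdgpu_debugfs.c: pm_runtime_get_sync() bumps the
 * usage counter even when it fails, so the error path must drop it. */
static int hold_device_get_sync(struct device *dev)
{
        int r = pm_runtime_get_sync(dev);

        if (r < 0) {
                pm_runtime_put_autosuspend(dev);
                return r;
        }
        return 0;
}

/* Pattern from Yu Kuai's patch: pm_runtime_resume_and_get() drops the
 * counter itself on failure, so the caller can simply return the error. */
static int hold_device_resume_and_get(struct device *dev)
{
        int r = pm_runtime_resume_and_get(dev);

        if (r < 0)
                return r;
        return 0;
}

Either way the usage counter ends up balanced on failure; the point above is
only about staying consistent with the rest of the file.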


From: Yu Kuai 
Sent: May 17, 2021 16:16
To: Deucher, Alexander; Koenig, Christian; Pan, Xinhui; airl...@linux.ie; dan...@ffwll.ch
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; yuku...@huawei.com; yi.zh...@huawei.com
Subject: [PATCH] drm/amdgpu: fix PM reference leak in amdgpu_debugfs_gfxoff_rea()

pm_runtime_get_sync will increment the PM usage counter even if it fails.
Forgetting the matching put operation will result in a reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep the usage
counter balanced.

Reported-by: Hulk Robot 
Signed-off-by: Yu Kuai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index bcaf271b39bf..eb7f9d20dad7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1058,7 +1058,7 @@ static ssize_t amdgpu_debugfs_gfxoff_read(struct file *f, 
char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;

-   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+   r = pm_runtime_resume_and_get(adev_to_drm(adev)->dev);
if (r < 0)
return r;

--
2.25.4



Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
[AMD Official Use Only]

To observe the issue, I made one kfdtest case for debugging.
It just allocates userptr memory and detects whether the memory is corrupted.
I can hit this failure within 2 minutes. :(

diff --git a/tests/kfdtest/src/KFDMemoryTest.cpp 
b/tests/kfdtest/src/KFDMemoryTest.cpp
index 70c8033..a72f53f 100644
--- a/tests/kfdtest/src/KFDMemoryTest.cpp
+++ b/tests/kfdtest/src/KFDMemoryTest.cpp
@@ -584,6 +584,32 @@ TEST_F(KFDMemoryTest, ZeroMemorySizeAlloc) {
 TEST_END
 }

+TEST_F(KFDMemoryTest, swap) {
+TEST_START(TESTPROFILE_RUNALL)
+
+unsigned int size = 128<<20;
+unsigned int*tmp = (unsigned int *)mmap(0,
+   size,
+   PROT_READ | PROT_WRITE,
+   MAP_ANONYMOUS | MAP_PRIVATE,
+   -1,
+   0);
+EXPECT_NE(tmp, MAP_FAILED);
+
+LOG() << "pls run this with KFDMemoryTest.LargestSysBufferTest" << 
std::endl;
+do {
+   memset(tmp, 0xcc, size);
+
+   HsaMemoryBuffer buf(tmp, size);
+   sleep(1);
+   EXPECT_EQ(tmp[0], 0xcccccccc);
+} while (true);
+
+munmap(tmp, size);
+
+TEST_END
+}
+
 // Basic test for hsaKmtAllocMemory
 TEST_F(KFDMemoryTest, MemoryAlloc) {
 TEST_START(TESTPROFILE_RUNALL)
--
2.25.1

____________
From: Pan, Xinhui 
Sent: May 19, 2021 10:28
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; dan...@ffwll.ch; Pan, Xinhui
Subject: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

cpu 1                                       cpu 2
kfd alloc BO A(userptr)                     alloc BO B(GTT)
  -> init -> validate                         -> init -> validate -> populate
  init_user_pages                             -> swapout BO A // hit ttm pages limit
    -> get_user_pages (fill up ttm->pages)
                                              -> validate -> populate
                                              -> swapin BO A // Now hit the BUG

We know that get_user_pages may race with swapout on the same BO.
There are some issues I have met.
1) Memory corruption.
This is because we do a swapout before the memory is set up. ttm_tt_swapout()
just creates a swap_storage with its content being 0x0. So when we set up the
memory after the swapout, the following swapin corrupts the memory.

2) Panic.
When swapout happens concurrently with get_user_pages, they touch ttm->pages
without any lock. That causes memory corruption too, but I mostly hit a page
fault.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 928e8d57cd08..42460e4480f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
struct amdkfd_process_info *process_info = mem->process_info;
struct amdgpu_bo *bo = mem->bo;
struct ttm_operation_ctx ctx = { true, false };
+   struct page **pages;
int ret = 0;

mutex_lock(&process_info->lock);
@@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
goto out;
}

-   ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
+   pages = kvmalloc_array(bo->tbo.ttm->num_pages,
+   sizeof(struct page *),
+   GFP_KERNEL | __GFP_ZERO);
+   if (!pages)
+   goto unregister_out;
+
+   ret = amdgpu_ttm_tt_get_user_pages(bo, pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
goto unregister_out;
@@ -863,6 +870,12 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
pr_err("%s: Failed to reserve BO\n", __func__);
goto release_out;
}
+
+   WARN_ON_ONCE(bo->tbo.ttm->page_flags & TTM_PAGE_FLAG_SWAPPED);
+
+   memcpy(bo->tbo.ttm->pages,
+   pages,
+   sizeof(struct page*) * bo->tbo.ttm->num_pages);
amdgpu_bo_placement_from_domain(bo, mem->domain);
ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (ret)
@@ -872,6 +885,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr)
 release_out:
amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
 unregister_out:
+   kvfree(pages);
if (ret)
amdgpu_mn_unregister(bo);
 out:
--
2.25.1



Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
[AMD Official Use Only]

Yes, we really don't swap out SG BOs.
The problem is that before we validate a userptr BO, we create this BO in the
CPU domain by default. So this BO has a chance to be swapped out.

We set the TTM_PAGE_FLAG_SG flag on the userptr BO in populate(), which is too
late. I have not tried to revert Chris' patch as I think it doesn't help. Or I
can have a try later.


From: Kuehling, Felix 
Sent: May 19, 2021 11:29
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; dan...@ffwll.ch
Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

Swapping SG BOs makes no sense, because TTM doesn't own the pages of
this type of BO.

Last I checked, userptr BOs (and other SG BOs) were protected from
swapout by the fact that they would not be added to the swap-LRU. But it
looks like Christian just removed the swap-LRU. I guess this broke that
protection:

commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
Author: Christian König 
Date:   Tue Oct 6 16:30:09 2020 +0200

 drm/ttm: remove swap LRU v3

 Instead evict round robin from each devices SYSTEM and TT domain.

 v2: reorder num_pages access reported by Dan's script
 v3: fix rebase fallout, num_pages should be 32bit

 Signed-off-by: Christian König 
 Tested-by: Nirmoy Das 
 Reviewed-by: Huang Rui 
 Reviewed-by: Matthew Auld 
 Link: https://patchwork.freedesktop.org/patch/424009/

Regards,
   Felix


On 2021-05-18 10:28 p.m., xinhui pan wrote:
> cpu 1   cpu 2
> kfd alloc BO A(userptr) alloc BO B(GTT)
>  ->init -> validate   -> init -> validate -> 
> populate
>  init_user_pages-> swapout BO A //hit ttm 
> pages limit
>   -> get_user_pages (fill up ttm->pages)
>-> validate -> populate
>-> swapin BO A // Now hit the BUG
>
> We know that get_user_pages may race with swapout on same BO.
> Threre are some issues I have met.
> 1) memory corruption.
> This is because we do a swap before memory is setup. ttm_tt_swapout()
> just create a swap_storage with its content being 0x0. So when we setup
> memory after the swapout. The following swapin makes the memory
> corrupted.
>
> 2) panic
> When swapout happes with get_user_pages, they touch ttm->pages without
> anylock. It causes memory corruption too. But I hit page fault mostly.
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..42460e4480f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   struct amdkfd_process_info *process_info = mem->process_info;
>   struct amdgpu_bo *bo = mem->bo;
>   struct ttm_operation_ctx ctx = { true, false };
> + struct page **pages;
>   int ret = 0;
>
>   mutex_lock(&process_info->lock);
> @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   goto out;
>   }
>
> - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
> + pages = kvmalloc_array(bo->tbo.ttm->num_pages,
> + sizeof(struct page *),
> + GFP_KERNEL | __GFP_ZERO);
> + if (!pages)
> + goto unregister_out;
> +
> + ret = amdgpu_ttm_tt_get_user_pages(bo, pages);
>   if (ret) {
>   pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
>   goto unregister_out;
> @@ -863,6 +870,12 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   pr_err("%s: Failed to reserve BO\n", __func__);
>   goto release_out;
>   }
> +
> + WARN_ON_ONCE(bo->tbo.ttm->page_flags & TTM_PAGE_FLAG_SWAPPED);
> +
> + memcpy(bo->tbo.ttm->pages,
> + pages,
> + sizeof(struct page*) * bo->tbo.ttm->num_pages);
>   amdgpu_bo_placement_from_domain(bo, mem->domain);
>   ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>   if (ret)
> @@ -872,6 +885,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   release_out:
>   amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
>   unregister_out:
> + kvfree(pages);
>   if (ret)
>   amdgpu_mn_unregister(bo);
>   out:


Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
[AMD Official Use Only]

I have reverted Chris' patch and still hit this failure.
Just look at two lines in Chris' patch: any BO in the CPU domain would be
swapped out first. That is why we hit this issue frequently now. But the bug
has been there for a long time.

-   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-   list_for_each_entry(bo, &glob->swap_lru[i], swap) {
[snip]
+   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
+   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {


________
From: Pan, Xinhui 
Sent: May 19, 2021 12:09
To: Kuehling, Felix; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; dan...@ffwll.ch
Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

Yes, we really don't swap out SG BOs.
The problem is that before we validate a userptr BO, we create this BO in the
CPU domain by default. So this BO has a chance to be swapped out.

We set the TTM_PAGE_FLAG_SG flag on the userptr BO in populate(), which is too
late. I have not tried to revert Chris' patch as I think it doesn't help. Or I
can have a try later.


From: Kuehling, Felix 
Sent: May 19, 2021 11:29
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; dan...@ffwll.ch
Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

Swapping SG BOs makes no sense, because TTM doesn't own the pages of
this type of BO.

Last I checked, userptr BOs (and other SG BOs) were protected from
swapout by the fact that they would not be added to the swap-LRU. But it
looks like Christian just removed the swap-LRU. I guess this broke that
protection:

commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
Author: Christian König 
Date:   Tue Oct 6 16:30:09 2020 +0200

 drm/ttm: remove swap LRU v3

 Instead evict round robin from each devices SYSTEM and TT domain.

 v2: reorder num_pages access reported by Dan's script
 v3: fix rebase fallout, num_pages should be 32bit

 Signed-off-by: Christian König 
 Tested-by: Nirmoy Das 
 Reviewed-by: Huang Rui 
 Reviewed-by: Matthew Auld 
 Link: https://patchwork.freedesktop.org/patch/424009/

Regards,
   Felix


On 2021-05-18 10:28 p.m., xinhui pan wrote:
> cpu 1   cpu 2
> kfd alloc BO A(userptr) alloc BO B(GTT)
>  ->init -> validate   -> init -> validate -> 
> populate
>  init_user_pages-> swapout BO A //hit ttm 
> pages limit
>   -> get_user_pages (fill up ttm->pages)
>-> validate -> populate
>-> swapin BO A // Now hit the BUG
>
> We know that get_user_pages may race with swapout on same BO.
> Threre are some issues I have met.
> 1) memory corruption.
> This is because we do a swap before memory is setup. ttm_tt_swapout()
> just create a swap_storage with its content being 0x0. So when we setup
> memory after the swapout. The following swapin makes the memory
> corrupted.
>
> 2) panic
> When swapout happes with get_user_pages, they touch ttm->pages without
> anylock. It causes memory corruption too. But I hit page fault mostly.
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..42460e4480f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -835,6 +835,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   struct amdkfd_process_info *process_info = mem->process_info;
>   struct amdgpu_bo *bo = mem->bo;
>   struct ttm_operation_ctx ctx = { true, false };
> + struct page **pages;
>   int ret = 0;
>
>   mutex_lock(&process_info->lock);
> @@ -852,7 +853,13 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
> user_addr)
>   goto out;
>   }
>
> - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
> + pages = kvmalloc_array(bo->tbo.ttm->num_pages,
> + sizeof(struct page *),
> + GFP_KERNEL | __GFP_ZERO);
> + if (!pages)
> + goto unregister_out;
> +
> + ret = amdgpu_ttm_tt_get_user_pages(bo, pages);
>   if (ret) {
>   pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
>   goto unregister_out;
> @@ -863,6 

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
[AMD Official Use Only]

The swapout function creates one swap storage which is filled with zeros, and
sets ttm->page_flags to TTM_PAGE_FLAG_SWAPPED. Because the ttm has no backend
pages at this time, no real data is swapped out to this swap storage.

The swapin function is then called during populate because
TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem: we swap data into the
ttm backend memory from that swap storage, which just overwrites the memory.


From: Christian König 
Sent: May 19, 2021 18:01
To: Pan, Xinhui; Kuehling, Felix; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-de...@lists.freedesktop.org
Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

I'm scratching my head how that is even possible.

See when a BO is created in the system domain it is just an empty hull,
e.g. without backing store and allocated pages.

So the swapout function will just ignore it.

Christian.

On 19.05.21 at 07:07, Pan, Xinhui wrote:
> [AMD Official Use Only]
>
> I have reverted Chris'  patch, still hit this failure.
> Just see two lines in Chris' patch. Any BO in cpu domian would be swapout 
> first. That is why we hit this issue frequently now. But the bug is there 
> long time ago.
>
> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -   list_for_each_entry(bo, &glob->swap_lru[i], swap) {
> [snip]
> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>
>
> 
> 发件人: Pan, Xinhui 
> 发送时间: 2021年5月19日 12:09
> 收件人: Kuehling, Felix; amd-gfx@lists.freedesktop.org
> 抄送: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; 
> dan...@ffwll.ch
> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
> swapin
>
> yes, we really dont swapout SG BOs.
> The problems is that before we validate a userptr BO, we create this BO in 
> CPU domain by default. So this BO has chance to swapout.
>
> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too late.
> I have not try to revert Chris' patch as I think it desnt help. Or I can have 
> a try later.
>
> 
> 发件人: Kuehling, Felix 
> 发送时间: 2021年5月19日 11:29
> 收件人: Pan, Xinhui; amd-gfx@lists.freedesktop.org
> 抄送: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; 
> dan...@ffwll.ch
> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
> swapin
>
> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
> this type of BO.
>
> Last I checked, userptr BOs (and other SG BOs) were protected from
> swapout by the fact that they would not be added to the swap-LRU. But it
> looks like Christian just removed the swap-LRU. I guess this broke that
> protection:
>
> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
> Author: Christian König 
> Date:   Tue Oct 6 16:30:09 2020 +0200
>
>   drm/ttm: remove swap LRU v3
>
>   Instead evict round robin from each devices SYSTEM and TT domain.
>
>   v2: reorder num_pages access reported by Dan's script
>   v3: fix rebase fallout, num_pages should be 32bit
>
>   Signed-off-by: Christian König 
>   Tested-by: Nirmoy Das 
>   Reviewed-by: Huang Rui 
>   Reviewed-by: Matthew Auld 
>   Link: 
> https://patchwork.freedesktop.org/patch/424009/
>
> Regards,
> Felix
>
>
> On 2021-05-18 10:28 p.m., xinhui pan wrote:
>> cpu 1   cpu 2
>> kfd alloc BO A(userptr) alloc BO B(GTT)
>>   ->init -> validate   -> init -> validate 
>> -> populate
>>   init_user_pages-> swapout BO A //hit ttm 
>> pages limit
>>-> get_user_pages (fill up ttm->pages)
>> -> validate -> populate
>> -> swapin BO A // Now hit the BUG
>>
>> We know that get_user_pages may race with swapout on same BO.
>> Threre are some issues I have met.
>> 1) memory corruption.
>> This is because we do a swap before memory is setup. ttm_tt_swapout()
>> just create a swap_storage with its content

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
[AMD Official Use Only]

I am not sure if we can create a ttm_bo_type_sg BO for userptr. But I have
another idea now: we can use the flag AMDGPU_AMDKFD_CREATE_USERPTR_BO to create
the userptr BO.

From: Kuehling, Felix 
Sent: May 19, 2021 23:11
To: Christian König; Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-de...@lists.freedesktop.org
Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

Looks like we're creating the userptr BO as ttm_bo_type_device. I guess
we should be using ttm_bo_type_sg? BTW, amdgpu_gem_userptr_ioctl also
uses ttm_bo_type_device.

Regards,
  Felix


On 2021-05-19 at 6:01 a.m., Christian König wrote:
> I'm scratching my head how that is even possible.
>
> See when a BO is created in the system domain it is just an empty
> hull, e.g. without backing store and allocated pages.
>
> So the swapout function will just ignore it.
>
> Christian.
>
> Am 19.05.21 um 07:07 schrieb Pan, Xinhui:
>> [AMD Official Use Only]
>>
>> I have reverted Chris'  patch, still hit this failure.
>> Just see two lines in Chris' patch. Any BO in cpu domian would be
>> swapout first. That is why we hit this issue frequently now. But the
>> bug is there long time ago.
>>
>> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>> -   list_for_each_entry(bo, &glob->swap_lru[i], swap) {
>> [snip]
>> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
>> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>>
>>
>> 
>> 发件人: Pan, Xinhui 
>> 发送时间: 2021年5月19日 12:09
>> 收件人: Kuehling, Felix; amd-gfx@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-de...@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> yes, we really dont swapout SG BOs.
>> The problems is that before we validate a userptr BO, we create this
>> BO in CPU domain by default. So this BO has chance to swapout.
>>
>> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too
>> late.
>> I have not try to revert Chris' patch as I think it desnt help. Or I
>> can have a try later.
>>
>> 
>> 发件人: Kuehling, Felix 
>> 发送时间: 2021年5月19日 11:29
>> 收件人: Pan, Xinhui; amd-gfx@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian;
>> dri-de...@lists.freedesktop.org; dan...@ffwll.ch
>> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to
>> swapout and swapin
>>
>> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
>> this type of BO.
>>
>> Last I checked, userptr BOs (and other SG BOs) were protected from
>> swapout by the fact that they would not be added to the swap-LRU. But it
>> looks like Christian just removed the swap-LRU. I guess this broke that
>> protection:
>>
>> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
>> Author: Christian König 
>> Date:   Tue Oct 6 16:30:09 2020 +0200
>>
>>   drm/ttm: remove swap LRU v3
>>
>>   Instead evict round robin from each devices SYSTEM and TT domain.
>>
>>   v2: reorder num_pages access reported by Dan's script
>>   v3: fix rebase fallout, num_pages should be 32bit
>>
>>   Signed-off-by: Christian König 
>>   Tested-by: Nirmoy Das 
>>   Reviewed-by: Huang Rui 
>>   Reviewed-by: Matthew Auld 
>>   Link: https://patchwork.freedesktop.org/patch/424009/
>>
>> Regards,
>> Felix
>>
>>
>> On 2021-05-18 10:28 p.m., xinhui pan wrote:
>>> cpu 1   cpu 2
>>> kfd alloc BO A(userptr) alloc BO B(GTT)
>>>   ->init -> validate   -> init ->
>>> validate -> populate
>>>   init_user_pages-> swapout BO A
>>> //hit ttm pages limit
>>>-> get_user_pages (fill up ttm->pages)
>>> -> validate -> populate
>>> -> swapin BO A // Now hit the BUG
>>>
>>> We know that get_user_pages may race with swapout on same BO.
>>> Threre are some issues I have met.
>>> 1) memory corruption.
>>> This is because we do a swap before memory is setup. ttm_tt_swapout()
>>> just create a swap_storage with 

Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Pan, Xinhui
I just sent out the patch below yesterday. Swapping an unpopulated BO is
useless indeed.

[RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page.


Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Pan, Xinhui
[AMD Official Use Only]

I just sent out the patch below yesterday. Swapping an unpopulated BO is
useless indeed.

[RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page.
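
That RFC is not quoted in this thread, so the following is only a rough sketch
of the idea under the TTM helpers of that time (ttm_tt_is_populated(),
TTM_PAGE_FLAG_SG); the helper name and placement are assumptions, not the
posted patch:

#include <drm/ttm/ttm_tt.h>

/* Rough sketch of the "skip swapout when ttm has no backend page" idea;
 * not the posted patch, just the check it argues for. */
static bool ttm_tt_worth_swapping(struct ttm_tt *ttm)
{
        /* An unpopulated TT has no backend pages yet, so swapping it out
         * only creates a zero-filled swap storage that a later swapin
         * copies over the pages filled by get_user_pages(). */
        if (!ttm_tt_is_populated(ttm))
                return false;

        /* SG TTs (userptr, dma-buf imports) do not own their pages, so
         * they should never be swapped out either. */
        if (ttm->page_flags & TTM_PAGE_FLAG_SG)
                return false;

        return true;
}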


From: Christian König 
Sent: May 20, 2021 14:39
To: Pan, Xinhui; Kuehling, Felix; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-de...@lists.freedesktop.org
Subject: Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

> swapout function create one swap storage which is filled with zero. And set 
> ttm->page_flags as TTM_PAGE_FLAG_SWAPPED.  Just because ttm has no backend 
> page this time, no real data is swapout to this swap storage.

That's the fundamental problem. A TT object which isn't populated
shouldn't be considered for swapout nor eviction in the first place.

I'm going to take a look later today.

Christian.

On 20.05.21 at 04:55, Pan, Xinhui wrote:
> [AMD Official Use Only]
>
> swapout function create one swap storage which is filled with zero. And set 
> ttm->page_flags as TTM_PAGE_FLAG_SWAPPED.  Just because ttm has no backend 
> page this time, no real data is swapout to this swap storage.
>
> swapin function is called during populate as TTM_PAGE_FLAG_SWAPPED is set.
> Now here is the problem, we swapin data to ttm bakend memory from swap 
> storage. That just causes the memory been overwritten.
>
> ________
> 发件人: Christian König 
> 发送时间: 2021年5月19日 18:01
> 收件人: Pan, Xinhui; Kuehling, Felix; amd-gfx@lists.freedesktop.org
> 抄送: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; 
> dri-de...@lists.freedesktop.org
> 主题: Re: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout 
> and swapin
>
> I'm scratching my head how that is even possible.
>
> See when a BO is created in the system domain it is just an empty hull,
> e.g. without backing store and allocated pages.
>
> So the swapout function will just ignore it.
>
> Christian.
>
> Am 19.05.21 um 07:07 schrieb Pan, Xinhui:
>> [AMD Official Use Only]
>>
>> I have reverted Chris'  patch, still hit this failure.
>> Just see two lines in Chris' patch. Any BO in cpu domian would be swapout 
>> first. That is why we hit this issue frequently now. But the bug is there 
>> long time ago.
>>
>> -   for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>> -   list_for_each_entry(bo, &glob->swap_lru[i], swap) {
>> [snip]
>> +   for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
>> +   for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
>>
>>
>> 
>> 发件人: Pan, Xinhui 
>> 发送时间: 2021年5月19日 12:09
>> 收件人: Kuehling, Felix; amd-gfx@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; 
>> dan...@ffwll.ch
>> 主题: 回复: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
>> swapin
>>
>> yes, we really dont swapout SG BOs.
>> The problems is that before we validate a userptr BO, we create this BO in 
>> CPU domain by default. So this BO has chance to swapout.
>>
>> we set flag TTM_PAGE_FLAG_SG on userptr BO in popluate() which is too late.
>> I have not try to revert Chris' patch as I think it desnt help. Or I can 
>> have a try later.
>>
>> 
>> 发件人: Kuehling, Felix 
>> 发送时间: 2021年5月19日 11:29
>> 收件人: Pan, Xinhui; amd-gfx@lists.freedesktop.org
>> 抄送: Deucher, Alexander; Koenig, Christian; dri-de...@lists.freedesktop.org; 
>> dan...@ffwll.ch
>> 主题: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and 
>> swapin
>>
>> Swapping SG BOs makes no sense, because TTM doesn't own the pages of
>> this type of BO.
>>
>> Last I checked, userptr BOs (and other SG BOs) were protected from
>> swapout by the fact that they would not be added to the swap-LRU. But it
>> looks like Christian just removed the swap-LRU. I guess this broke that
>> protection:
>>
>> commit 2cb51d22d70b18eaf339abf9758bf0b7608da65c
>> Author: Christian König 
>> Date:   Tue Oct 6 16:30:09 2020 +0200
>>
>>drm/ttm: remove swap LRU v3
>>
>>Instead evict round robin from each devices SYSTEM and TT domain.
>>
>>v2: reorder num_pages access reported by Dan's script
>>v3: fix rebase fallout, num_pages should be 32bit
>>
>>Signed-off-by: Christian König 
>>Tested-by: Nirmoy Das 
>>Reviewed-by: Huang Rui 

Re: [PATCH] drm/amdgpu: Use dma_resv_lock instead in BO release_notify

2021-05-21 Thread Pan, Xinhui
[AMD Official Use Only]

Oh, sorry for that. I noticed the lockdep warning too.
I just think we use trylock elsewhere because we mostly hold the lru_lock.
So I think we can do something like below. Let me verify it later.

@@ -318,7 +318,9 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct 
amdgpu_bo *bo)
ef = container_of(dma_fence_get(&info->eviction_fence->base),
struct amdgpu_amdkfd_fence, base);

+   spin_lock(&bo->tbo.bdev->lru_lock);
BUG_ON(!dma_resv_trylock(bo->tbo.base.resv));
+   spin_unlock(&bo->tbo.bdev->lru_lock);
ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
dma_resv_unlock(bo->tbo.base.resv);



From: Kuehling, Felix 
Sent: May 22, 2021 2:24
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian
Subject: Re: [PATCH] drm/amdgpu: Use dma_resv_lock instead in BO release_notify


On 2021-05-21 at 1:26 a.m., xinhui pan wrote:
> The reservation object might be locked again by evict/swap after
> individualized. The race is like below.
> cpu 0                                       cpu 1
> BO release                                  BO evict or swap
> ttm_bo_individualize_resv {resv = &_resv}
>                                               ttm_bo_evict_swapout_allowable
>                                               dma_resv_trylock(resv)
> ->release_notify() {BUG_ON(!trylock(resv))}
>                                               if (!ttm_bo_get_unless_zero))
>                                               dma_resv_unlock(resv)
> Actually this is not a bug if trylock fails. So use dma_resv_lock
> instead.

Please test this with LOCKDEP enabled. I believe the trylock here was
needed to avoid potential deadlocks. Maybe Christian can fill in more
details.

Regards,
  Felix


>
> Signed-off-by: xinhui pan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..beacb46265f8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -318,7 +318,7 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct 
> amdgpu_bo *bo)
>   ef = container_of(dma_fence_get(&info->eviction_fence->base),
>   struct amdgpu_amdkfd_fence, base);
>
> - BUG_ON(!dma_resv_trylock(bo->tbo.base.resv));
> + dma_resv_lock(bo->tbo.base.resv, NULL);
>   ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
>   dma_resv_unlock(bo->tbo.base.resv);
>


Re: [RFC PATCH] drm/ttm: Do page counting after populate callback succeed

2021-06-15 Thread Pan, Xinhui


> On June 15, 2021, at 20:01, Christian König wrote:
> 
> Am 15.06.21 um 13:57 schrieb xinhui pan:
>> Amdgpu set SG flag in populate callback. So TTM still count pages in SG
>> BO.
> 
> It's probably better to fix this instead. E.g. why does amdgpu modify the SG 
> flag during populate and not during initial creation? That doesn't seem to 
> make sense.

Fair enough. Let me have a try.
No idea why we set the SG flag in populate years ago.

> 
> Christian.
> 
>> One easy way to fix this is lets count pages after populate callback.
>> 
>> We hit one issue that amdgpu alloc many SG BOs, but TTM try to do swap
>> again and again even if swapout does not swap SG BOs at all.
>> 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/ttm/ttm_tt.c | 32 +---
>>  1 file changed, 13 insertions(+), 19 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
>> index a1a25410ec74..4fa0a8cd71c0 100644
>> --- a/drivers/gpu/drm/ttm/ttm_tt.c
>> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>> @@ -317,13 +317,6 @@ int ttm_tt_populate(struct ttm_device *bdev,
>>  if (ttm_tt_is_populated(ttm))
>>  return 0;
>>  -   if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
>> -atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
>> -if (bdev->pool.use_dma32)
>> -atomic_long_add(ttm->num_pages,
>> -&ttm_dma32_pages_allocated);
>> -}
>> -
>>  while (atomic_long_read(&ttm_pages_allocated) > ttm_pages_limit ||
>> atomic_long_read(&ttm_dma32_pages_allocated) >
>> ttm_dma32_pages_limit) {
>> @@ -342,6 +335,13 @@ int ttm_tt_populate(struct ttm_device *bdev,
>>  if (ret)
>>  goto error;
>>  +   if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
>> +atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
>> +if (bdev->pool.use_dma32)
>> +atomic_long_add(ttm->num_pages,
>> +&ttm_dma32_pages_allocated);
>> +}
>> +
>>  ttm_tt_add_mapping(bdev, ttm);
>>  ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED;
>>  if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
>> @@ -355,12 +355,6 @@ int ttm_tt_populate(struct ttm_device *bdev,
>>  return 0;
>>error:
>> -if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
>> -atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
>> -if (bdev->pool.use_dma32)
>> -atomic_long_sub(ttm->num_pages,
>> -&ttm_dma32_pages_allocated);
>> -}
>>  return ret;
>>  }
>>  EXPORT_SYMBOL(ttm_tt_populate);
>> @@ -384,12 +378,6 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct 
>> ttm_tt *ttm)
>>  if (!ttm_tt_is_populated(ttm))
>>  return;
>>  -   ttm_tt_clear_mapping(ttm);
>> -if (bdev->funcs->ttm_tt_unpopulate)
>> -bdev->funcs->ttm_tt_unpopulate(bdev, ttm);
>> -else
>> -ttm_pool_free(&bdev->pool, ttm);
>> -
>>  if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) {
>>  atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
>>  if (bdev->pool.use_dma32)
>> @@ -397,6 +385,12 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct 
>> ttm_tt *ttm)
>>  &ttm_dma32_pages_allocated);
>>  }
>>  +   ttm_tt_clear_mapping(ttm);
>> +if (bdev->funcs->ttm_tt_unpopulate)
>> +bdev->funcs->ttm_tt_unpopulate(bdev, ttm);
>> +else
>> +ttm_pool_free(&bdev->pool, ttm);
>> +
>>  ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;
>>  }
>>  
> 



Re: [PATCH] drm/amdkfd: Fix circular lock in nocpsch path

2021-06-16 Thread Pan, Xinhui


> On June 16, 2021, at 02:22, Kuehling, Felix wrote:
> 
> [+Xinhui]
> 
> 
> Am 2021-06-15 um 1:50 p.m. schrieb Amber Lin:
>> Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a
>> circular lock. destroy_queue_nocpsch_locked is called under a DQM lock,
>> which is taken in MMU notifiers, potentially in FS reclaim context.
>> Taking another lock, which is BO reservation lock from free_mqd, while
>> causing an FS reclaim inside the DQM lock creates a problematic circular
>> lock dependency. Therefore move free_mqd out of
>> destroy_queue_nocpsch_locked and call it after unlocking DQM.
>> 
>> Signed-off-by: Amber Lin 
>> Reviewed-by: Felix Kuehling 
> 
> Let's submit this patch as is. I'm making some comments inline for
> things that Xinhui can address in his race condition patch.
> 
> 
>> ---
>> .../drm/amd/amdkfd/kfd_device_queue_manager.c  | 18 +-
>> 1 file changed, 13 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index 72bea5278add..c069fa259b30 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -486,9 +486,6 @@ static int destroy_queue_nocpsch_locked(struct 
>> device_queue_manager *dqm,
>>  if (retval == -ETIME)
>>  qpd->reset_wavefronts = true;
>> 
>> -
>> -mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>> -
>>  list_del(&q->list);
>>  if (list_empty(&qpd->queues_list)) {
>>  if (qpd->reset_wavefronts) {
>> @@ -523,6 +520,8 @@ static int destroy_queue_nocpsch(struct 
>> device_queue_manager *dqm,
>>  int retval;
>>  uint64_t sdma_val = 0;
>>  struct kfd_process_device *pdd = qpd_to_pdd(qpd);
>> +struct mqd_manager *mqd_mgr =
>> +dqm->mqd_mgrs[get_mqd_type_from_queue_type(q->properties.type)];
>> 
>>  /* Get the SDMA queue stats */
>>  if ((q->properties.type == KFD_QUEUE_TYPE_SDMA) ||
>> @@ -540,6 +539,8 @@ static int destroy_queue_nocpsch(struct 
>> device_queue_manager *dqm,
>>  pdd->sdma_past_activity_counter += sdma_val;
>>  dqm_unlock(dqm);
>> 
>> +mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>> +
>>  return retval;
>> }
>> 
>> @@ -1629,7 +1630,7 @@ static bool set_cache_memory_policy(struct 
>> device_queue_manager *dqm,
>> static int process_termination_nocpsch(struct device_queue_manager *dqm,
>>  struct qcm_process_device *qpd)
>> {
>> -struct queue *q, *next;
>> +struct queue *q;
>>  struct device_process_node *cur, *next_dpn;
>>  int retval = 0;
>>  bool found = false;
>> @@ -1637,12 +1638,19 @@ static int process_termination_nocpsch(struct 
>> device_queue_manager *dqm,
>>  dqm_lock(dqm);
>> 
>>  /* Clear all user mode queues */
>> -list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
>> +while (!list_empty(&qpd->queues_list)) {
>> +struct mqd_manager *mqd_mgr;
>>  int ret;
>> 
>> +q = list_first_entry(&qpd->queues_list, struct queue, list);
>> +mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
>> +q->properties.type)];
>>  ret = destroy_queue_nocpsch_locked(dqm, qpd, q);
>>  if (ret)
>>  retval = ret;
>> +dqm_unlock(dqm);
>> +mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>> +dqm_lock(dqm);
> 
> This is the correct way to clean up the list when dropping the dqm-lock
> in the middle. Xinhui, you can use the same method in
> process_termination_cpsch.
> 

Yes, that is the right way to walk through the list. Thanks.


> I believe the swapping of the q->mqd with a temporary variable is not
> needed. When free_mqd is called, the queue is no longer on the
> qpd->queues_list, so destroy_queue cannot race with it. If we ensure
> that queues are always removed from the list before calling free_mqd,
> and that list-removal happens under the dqm_lock, then there should be
> no risk of a race condition that causes a double-free.
> 

No, the double free exists because pqm_destroy_queue fetches the queue from
the qid by get_queue_by_qid(). The race is like below.

pqm_destroy_queue
  get_queue_by_qid
                                      process_termination_cpsch
  destroy_queue_cpsch
                                        lock
                                        list_for_each_entry_safe
                                          list_del(q)
                                        unlock
                                        free_mqd
  lock
  list_del(q)
  unlock
  free_mqd



 
> Regards,
>   Felix
> 
> 
>>  }
>> 
>>  /* Unregister process */


Re: [PATCH] drm/amdkfd: Fix circular lock in nocpsch path

2021-06-16 Thread Pan, Xinhui


> On June 16, 2021, at 12:36, Kuehling, Felix wrote:
> 
> Am 2021-06-16 um 12:01 a.m. schrieb Pan, Xinhui:
>>> 2021年6月16日 02:22,Kuehling, Felix  写道:
>>> 
>>> [+Xinhui]
>>> 
>>> 
>>> Am 2021-06-15 um 1:50 p.m. schrieb Amber Lin:
>>>> Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a
>>>> circular lock. destroy_queue_nocpsch_locked is called under a DQM lock,
>>>> which is taken in MMU notifiers, potentially in FS reclaim context.
>>>> Taking another lock, which is BO reservation lock from free_mqd, while
>>>> causing an FS reclaim inside the DQM lock creates a problematic circular
>>>> lock dependency. Therefore move free_mqd out of
>>>> destroy_queue_nocpsch_locked and call it after unlocking DQM.
>>>> 
>>>> Signed-off-by: Amber Lin 
>>>> Reviewed-by: Felix Kuehling 
>>> Let's submit this patch as is. I'm making some comments inline for
>>> things that Xinhui can address in his race condition patch.
>>> 
>>> 
>>>> ---
>>>> .../drm/amd/amdkfd/kfd_device_queue_manager.c  | 18 +-
>>>> 1 file changed, 13 insertions(+), 5 deletions(-)
>>>> 
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>>> index 72bea5278add..c069fa259b30 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>>> @@ -486,9 +486,6 @@ static int destroy_queue_nocpsch_locked(struct 
>>>> device_queue_manager *dqm,
>>>>if (retval == -ETIME)
>>>>qpd->reset_wavefronts = true;
>>>> 
>>>> -
>>>> -  mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>>>> -
>>>>list_del(&q->list);
>>>>if (list_empty(&qpd->queues_list)) {
>>>>if (qpd->reset_wavefronts) {
>>>> @@ -523,6 +520,8 @@ static int destroy_queue_nocpsch(struct 
>>>> device_queue_manager *dqm,
>>>>int retval;
>>>>uint64_t sdma_val = 0;
>>>>struct kfd_process_device *pdd = qpd_to_pdd(qpd);
>>>> +  struct mqd_manager *mqd_mgr =
>>>> +  dqm->mqd_mgrs[get_mqd_type_from_queue_type(q->properties.type)];
>>>> 
>>>>/* Get the SDMA queue stats */
>>>>if ((q->properties.type == KFD_QUEUE_TYPE_SDMA) ||
>>>> @@ -540,6 +539,8 @@ static int destroy_queue_nocpsch(struct 
>>>> device_queue_manager *dqm,
>>>>pdd->sdma_past_activity_counter += sdma_val;
>>>>dqm_unlock(dqm);
>>>> 
>>>> +  mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>>>> +
>>>>return retval;
>>>> }
>>>> 
>>>> @@ -1629,7 +1630,7 @@ static bool set_cache_memory_policy(struct 
>>>> device_queue_manager *dqm,
>>>> static int process_termination_nocpsch(struct device_queue_manager *dqm,
>>>>struct qcm_process_device *qpd)
>>>> {
>>>> -  struct queue *q, *next;
>>>> +  struct queue *q;
>>>>struct device_process_node *cur, *next_dpn;
>>>>int retval = 0;
>>>>bool found = false;
>>>> @@ -1637,12 +1638,19 @@ static int process_termination_nocpsch(struct 
>>>> device_queue_manager *dqm,
>>>>dqm_lock(dqm);
>>>> 
>>>>/* Clear all user mode queues */
>>>> -  list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
>>>> +  while (!list_empty(&qpd->queues_list)) {
>>>> +  struct mqd_manager *mqd_mgr;
>>>>int ret;
>>>> 
>>>> +  q = list_first_entry(&qpd->queues_list, struct queue, list);
>>>> +  mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
>>>> +  q->properties.type)];
>>>>ret = destroy_queue_nocpsch_locked(dqm, qpd, q);
>>>>if (ret)
>>>>retval = ret;
>>>> +  dqm_unlock(dqm);
>>>> +  mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
>>>> +  dqm_lock(dqm);
>>> This is the correct way to clean up the list when dropping the dqm-lock
>>> in the middle. Xi

Re: [PATCH 1/2] drm/amdkfd: Fix some double free when destroy queue fails

2021-06-16 Thread Pan, Xinhui


> On June 17, 2021, at 06:55, Kuehling, Felix wrote:
> 
> On 2021-06-16 4:35 a.m., xinhui pan wrote:
>> Some resource are freed even destroy queue fails.
> 
> Looks like you're keeping this behaviour for -ETIME. That is consistent with 
> what pqn_destroy_queue does. What you're fixing here is the behaviour for 
> non-timeout errors. Please make that clear in the patch description.
Will do that in v2.

> 
> Out of curiosity, what kind of error were you getting? The only other ones 
> that are not a fatal memory shortage, are some EINVAL cases in 
> pm_unmap_queues_v*. But that would indicate some internal error, that a queue 
> was created with an invalid type, or maybe the queue data structure was 
> somehow corrupted.
> 
This is just because amdkfd_fence_wait_timeout timed out.
execute_queues_cpsch returns -EIO as dqm->is_hws_hang is true.
I hit this issue with kfdtest --gtest_filter=*QM*.
> 
>>  That will cause double
>> free when user-space issue another destroy_queue ioctl.
>> 
>> Paste some log below.
>> 
>> amdgpu: Can't create new usermode queue because -1 queues were already
>> created
>> 
>> refcount_t: underflow; use-after-free.
>> Call Trace:
>>  kobject_put+0xe6/0x1b0
>>  kfd_procfs_del_queue+0x37/0x50 [amdgpu]
>>  pqm_destroy_queue+0x17a/0x390 [amdgpu]
>>  kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
>>  kfd_ioctl+0x463/0x690 [amdgpu]
>> 
>> BUG kmalloc-32 (Tainted: GW): Object already free
>> INFO: Allocated in allocate_sdma_mqd+0x30/0xb0 [amdgpu] age=4796 cpu=2
>> pid=2511
>>  __slab_alloc+0x72/0x80
>>  kmem_cache_alloc_trace+0x81f/0x8c0
>>  allocate_sdma_mqd+0x30/0xb0 [amdgpu]
>>  create_queue_cpsch+0xbf/0x470 [amdgpu]
>>  pqm_create_queue+0x28d/0x6d0 [amdgpu]
>>  kfd_ioctl_create_queue+0x492/0xae0 [amdgpu]
>> INFO: Freed in free_mqd_hiq_sdma+0x20/0x60 [amdgpu] age=2537 cpu=7
>> pid=2511
>>  kfree+0x322/0x340
>>  free_mqd_hiq_sdma+0x20/0x60 [amdgpu]
>>  destroy_queue_cpsch+0x20c/0x330 [amdgpu]
>>  pqm_destroy_queue+0x1a3/0x390 [amdgpu]
>>  kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
>> 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 2 ++
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 4 +++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 +
>>  3 files changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index e6366b408420..c24ab8f17eb6 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -1529,6 +1529,8 @@ static int destroy_queue_cpsch(struct 
>> device_queue_manager *dqm,
>>  KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
>>  if (retval == -ETIME)
>>  qpd->reset_wavefronts = true;
>> +else if (retval)
>> +goto failed_try_destroy_debugged_queue;
>>  if (q->properties.is_gws) {
>>  dqm->gws_queue_count--;
>>  qpd->mapped_gws_queue = false;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 09b98a83f670..984197e5929f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -607,11 +607,13 @@ static int kfd_procfs_add_sysfs_files(struct 
>> kfd_process *p)
>>void kfd_procfs_del_queue(struct queue *q)
>>  {
>> -if (!q)
>> +if (!q || !kobject_get_unless_zero(&q->kobj))
>>  return;
>>  kobject_del(&q->kobj);
>>  kobject_put(&q->kobj);
>> +/* paired with the get above */
>> +kobject_put(&q->kobj);
>>  }
>>int kfd_process_create_wq(void)
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> index 95a6c36cea4c..4fcb64bc43dd 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> @@ -373,6 +373,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, 
>> unsigned int qid)
>>  dqm = pqn->kq->dev->dqm;
>>  dqm->ops.destroy_kernel_queue(dqm, pqn->kq, &pdd->qpd);
>>  kernel_queue_uninit(pqn->kq, false);
>> +pqn->kq = NULL;
> 
> This seems unrelated to this patch. But if you're fixing this, I'd expect a 
> similar fix after uninit_queue(pqn->q).
> 
> Regards,
>   Felix
> 
> 
>>  }
>>  if (pqn->q) {



Re: [PATCH v2 1/2] drm/amdkfd: Fix some double free when destroy queue fails

2021-06-17 Thread Pan, Xinhui
Felix,
What I am thinking of, like below, looks simpler. :)

@@ -1501,6 +1501,11 @@ static int destroy_queue_cpsch(struct 
device_queue_manager *dqm,
/* remove queue from list to prevent rescheduling after preemption */
dqm_lock(dqm);
 
+   if (dqm->is_hws_hang) {
+   retval = -EIO;
+   goto failed_try_destroy_debugged_queue;
+   }
+
if (qpd->is_debug) {
/*
 * error, currently we do not allow to destroy a queue

> On June 17, 2021, at 20:02, Pan, Xinhui wrote:
> 
> Handle queue destroy failure while CP hang.
> Once CP got hang, kfd trigger GPU reset and set related flags to stop
> driver touching the queue. As we leave the queue as it is, we need keep
> the resource as it is too.
> 
> Regardless user-space tries to destroy the queue again or not. We need
> put queue back to the list so process termination would do the cleanup
> work. What's more, if userspace tries to destroy the queue again, we
> would not free its resource twice.
> 
> Kfd return -EIO in this case, so lets handle it now.
> 
> Paste some error log below without this patch.
> 
> amdgpu: Can't create new usermode queue because -1 queues were already
> created
> 
> refcount_t: underflow; use-after-free.
> Call Trace:
> kobject_put+0xe6/0x1b0
> kfd_procfs_del_queue+0x37/0x50 [amdgpu]
> pqm_destroy_queue+0x17a/0x390 [amdgpu]
> kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
> kfd_ioctl+0x463/0x690 [amdgpu]
> 
> BUG kmalloc-32 (Tainted: GW): Object already free
> INFO: Allocated in allocate_sdma_mqd+0x30/0xb0 [amdgpu] age=4796 cpu=2
> pid=2511
> __slab_alloc+0x72/0x80
> kmem_cache_alloc_trace+0x81f/0x8c0
> allocate_sdma_mqd+0x30/0xb0 [amdgpu]
> create_queue_cpsch+0xbf/0x470 [amdgpu]
> pqm_create_queue+0x28d/0x6d0 [amdgpu]
> kfd_ioctl_create_queue+0x492/0xae0 [amdgpu]
> INFO: Freed in free_mqd_hiq_sdma+0x20/0x60 [amdgpu] age=2537 cpu=7
> pid=2511
> kfree+0x322/0x340
> free_mqd_hiq_sdma+0x20/0x60 [amdgpu]
> destroy_queue_cpsch+0x20c/0x330 [amdgpu]
> pqm_destroy_queue+0x1a3/0x390 [amdgpu]
> kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
> 
> Signed-off-by: xinhui pan 
> ---
> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   | 13 +
> drivers/gpu/drm/amd/amdkfd/kfd_process.c|  4 +++-
> .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  2 ++
> 3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index c069fa259b30..63a9a19a3987 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1530,6 +1530,11 @@ static int destroy_queue_cpsch(struct 
> device_queue_manager *dqm,
>   KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
>   if (retval == -ETIME)
>   qpd->reset_wavefronts = true;
> + /* In gpu reset? We leave the queue as it is, so do NOT
> +  * cleanup the resource.
> +  */
> + else if (retval == -EIO)
> + goto failed_execute_queue;
>   if (q->properties.is_gws) {
>   dqm->gws_queue_count--;
>   qpd->mapped_gws_queue = false;
> @@ -1551,6 +1556,14 @@ static int destroy_queue_cpsch(struct 
> device_queue_manager *dqm,
> 
>   return retval;
> 
> +failed_execute_queue:
> + /* Put queue back to the list, then we have chance to destroy it.
> +  * FIXME: we do NOT want the queue in the runlist again.
> +  */
> + list_add(&q->list, &qpd->queues_list);
> + qpd->queue_count++;
> + if (q->properties.is_active)
> + increment_queue_count(dqm, q->properties.type);
> failed_try_destroy_debugged_queue:
> 
>   dqm_unlock(dqm);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 09b98a83f670..984197e5929f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -607,11 +607,13 @@ static int kfd_procfs_add_sysfs_files(struct 
> kfd_process *p)
> 
> void kfd_procfs_del_queue(struct queue *q)
> {
> - if (!q)
> + if (!q || !kobject_get_unless_zero(&q->kobj))
>   return;
> 
>   kobject_del(&q->kobj);
>   kobject_put(&q->kobj);
> + /* paired with the get above */
> + kobject_put(&q->kobj);
> }
> 
> int kfd_process_create_wq(void)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queu

Re: [PATCH v2 1/2] drm/amdkfd: Fix some double free when destroy queue fails

2021-06-17 Thread Pan, Xinhui
Felix,
What I am wondering is: if the CP got hung, could we assume all usermode
queues have stopped?
If so, we can do the cleanup work regardless of the retval of execute_queues_cpsch().

> On June 17, 2021, at 20:11, Pan, Xinhui wrote:
> 
> Felix
> what I am thinking of like below looks like more simple. :)
> 
> @@ -1501,6 +1501,11 @@ static int destroy_queue_cpsch(struct 
> device_queue_manager *dqm,
>/* remove queue from list to prevent rescheduling after preemption */
>dqm_lock(dqm);
> 
> +   if (dqm->is_hws_hang) {
> +   retval = -EIO;
> +   goto failed_try_destroy_debugged_queue;
> +   }
> +
>if (qpd->is_debug) {
>/*
> * error, currently we do not allow to destroy a queue
> 
>> 2021年6月17日 20:02,Pan, Xinhui  写道:
>> 
>> Handle queue destroy failure while CP hang.
>> Once CP got hang, kfd trigger GPU reset and set related flags to stop
>> driver touching the queue. As we leave the queue as it is, we need keep
>> the resource as it is too.
>> 
>> Regardless user-space tries to destroy the queue again or not. We need
>> put queue back to the list so process termination would do the cleanup
>> work. What's more, if userspace tries to destroy the queue again, we
>> would not free its resource twice.
>> 
>> Kfd return -EIO in this case, so lets handle it now.
>> 
>> Paste some error log below without this patch.
>> 
>> amdgpu: Can't create new usermode queue because -1 queues were already
>> created
>> 
>> refcount_t: underflow; use-after-free.
>> Call Trace:
>> kobject_put+0xe6/0x1b0
>> kfd_procfs_del_queue+0x37/0x50 [amdgpu]
>> pqm_destroy_queue+0x17a/0x390 [amdgpu]
>> kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
>> kfd_ioctl+0x463/0x690 [amdgpu]
>> 
>> BUG kmalloc-32 (Tainted: GW): Object already free
>> INFO: Allocated in allocate_sdma_mqd+0x30/0xb0 [amdgpu] age=4796 cpu=2
>> pid=2511
>> __slab_alloc+0x72/0x80
>> kmem_cache_alloc_trace+0x81f/0x8c0
>> allocate_sdma_mqd+0x30/0xb0 [amdgpu]
>> create_queue_cpsch+0xbf/0x470 [amdgpu]
>> pqm_create_queue+0x28d/0x6d0 [amdgpu]
>> kfd_ioctl_create_queue+0x492/0xae0 [amdgpu]
>> INFO: Freed in free_mqd_hiq_sdma+0x20/0x60 [amdgpu] age=2537 cpu=7
>> pid=2511
>> kfree+0x322/0x340
>> free_mqd_hiq_sdma+0x20/0x60 [amdgpu]
>> destroy_queue_cpsch+0x20c/0x330 [amdgpu]
>> pqm_destroy_queue+0x1a3/0x390 [amdgpu]
>> kfd_ioctl_destroy_queue+0x57/0xc0 [amdgpu]
>> 
>> Signed-off-by: xinhui pan 
>> ---
>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c   | 13 +
>> drivers/gpu/drm/amd/amdkfd/kfd_process.c|  4 +++-
>> .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c  |  2 ++
>> 3 files changed, 18 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index c069fa259b30..63a9a19a3987 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -1530,6 +1530,11 @@ static int destroy_queue_cpsch(struct 
>> device_queue_manager *dqm,
>>  KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
>>  if (retval == -ETIME)
>>  qpd->reset_wavefronts = true;
>> +/* In gpu reset? We leave the queue as it is, so do NOT
>> + * cleanup the resource.
>> + */
>> +else if (retval == -EIO)
>> +goto failed_execute_queue;
>>  if (q->properties.is_gws) {
>>  dqm->gws_queue_count--;
>>  qpd->mapped_gws_queue = false;
>> @@ -1551,6 +1556,14 @@ static int destroy_queue_cpsch(struct 
>> device_queue_manager *dqm,
>> 
>>  return retval;
>> 
>> +failed_execute_queue:
>> +/* Put queue back to the list, then we have chance to destroy it.
>> + * FIXME: we do NOT want the queue in the runlist again.
>> + */
>> +list_add(&q->list, &qpd->queues_list);
>> +qpd->queue_count++;
>> +if (q->properties.is_active)
>> +increment_queue_count(dqm, q->properties.type);
>> failed_try_destroy_debugged_queue:
>> 
>>  dqm_unlock(dqm);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 09b98a83f670..984197e5929f 100644
>> --- a/drivers/g

RE: [PATCH 5/8] drm/amdgpu: use the new cursor in amdgpu_ttm_access_memory

2021-03-21 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

No, the patch from Nirmoy did not fully fix this issue. I will send another fix 
patch later.


-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: March 20, 2021 17:08
To: Kuehling, Felix ; Paneer Selvam, Arunpravin 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 5/8] drm/amdgpu: use the new cursor in 
amdgpu_ttm_access_memory

Yeah, Nirmoy already stumbled over the correct fix.

The size check is off by one. Patch to fix this should be pushed on Monday.

Regards,
Christian.

Am 19.03.21 um 22:23 schrieb Felix Kuehling:
> This is causing a deadlock in amdgpu_ttm_access_memory during the
> PtraceAccess test in kfdtest. Unfortunately it doesn't get flagged by
> LOCKDEP. See the kernel log snippet below. I don't have a good
> explanation what's going on other than maybe some data structure
> corruption.
>
> With this patch reverted the PtraceAccess test still fails, but it
> doesn't hang any more. If I also revert "use new cursor in
> amdgpu_ttm_io_mem_pfn" (which is used via amdgpu_find_mm_node in
> amdgpu_ttm_access_memory), Ptrace access starts working correctly.
> That tells me that there is some fundamental bug in the resource
> cursor implementation that's breaking several users.
>
> Regards,
>   Felix
>
>
> [  129.446085] watchdog: BUG: soft lockup - CPU#8 stuck for 22s!
> [kfdtest:3588]
> [  129.455379] Modules linked in: ip6table_filter ip6_tables
> iptable_filter amdgpu x86_pkg_temp_thermal drm_ttm_helper ttm iommu_v2
> gpu_sched ip_tables x_tables [  129.455428] irq event stamp: 75294000
> [  129.455432] hardirqs last  enabled at (75293999):
> [] _raw_spin_unlock_irqrestore+0x2d/0x40
> [  129.455447] hardirqs last disabled at (75294000):
> [] sysvec_apic_timer_interrupt+0xa/0xa0
> [  129.455457] softirqs last  enabled at (75184000):
> [] __do_softirq+0x306/0x429 [  129.455467] softirqs
> last disabled at (75183995):
> [] asm_call_irq_on_stack+0xf/0x20 [  129.455477]
> CPU: 8 PID: 3588 Comm: kfdtest Not tainted 5.11.0-kfd-fkuehlin #194 [
> 129.455485] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS
> 3201 06/17/2016
> [  129.455490] RIP: 0010:_raw_spin_lock_irqsave+0xb/0x50
> [  129.455498] Code: d2 31 f6 e8 e7 e9 31 ff 48 89 df 58 5b e9 7d 32
> 32 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89
> fd 53 9c <5b> fa f6 c7 02 74 05 e8 59 06 3e ff 65 ff 05 92 73 24 56 48
> 8d 7d [  129.455505] RSP: 0018:a3eb407f3c58 EFLAGS: 0246 [
> 129.455513] RAX: 96466e2010a0 RBX: 96466e20 RCX:
> 
> [  129.455519] RDX: a3eb407f3e70 RSI: 0190 RDI:
> 96466e2010a0
> [  129.455524] RBP: 96466e2010a0 R08:  R09:
> 0001
> [  129.455528] R10: a3eb407f3c60 R11: 96466e2010b8 R12:
> 0190
> [  129.455533] R13: 0190 R14: a3eb407f3e70 R15:
> 0190
> [  129.455538] FS:  7f5aad0f0740() GS:96467fc0()
> knlGS:
> [  129.455544] CS:  0010 DS:  ES:  CR0: 80050033 [
> 129.455549] CR2: 563ea76ad0f0 CR3: 0007c6e92005 CR4:
> 001706e0
> [  129.44] Call Trace:
> [  129.455563]  amdgpu_device_vram_access+0xc1/0x200 [amdgpu] [
> 129.455820]  ? _raw_spin_unlock_irqrestore+0x2d/0x40
> [  129.455834]  amdgpu_ttm_access_memory+0x29e/0x320 [amdgpu] [
> 129.456063]  ttm_bo_vm_access+0x1c8/0x3a0 [ttm] [  129.456089]
> __access_remote_vm+0x289/0x390 [  129.456112]
> ptrace_access_vm+0x98/0xc0 [  129.456127]
> generic_ptrace_peekdata+0x31/0x80 [  129.456138]
> ptrace_request+0x13b/0x5d0 [  129.456155]  arch_ptrace+0x24f/0x2f0 [
> 129.456165]  __x64_sys_ptrace+0xc9/0x140 [  129.456177]
> do_syscall_64+0x2d/0x40 [  129.456185]
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  129.456194] RIP: 0033:0x7f5aab861a3f [  129.456199] Code: 48 89 44
> 24 18 48 8d 44 24 30 c7 44 24 10 18 00
> 00 00 8b 70 08 48 8b 50 10 48 89 44 24 20 4c 0f 43 50 18 b8 65 00 00
> 00 0f 05 <48> 3d 00 f0 ff ff 77 41 48 85 c0 78 06 41 83 f8 02 76 1e 48
> 8b 4c [  129.456205] RSP: 002b:7ffd27b68750 EFLAGS: 0293
> ORIG_RAX:
> 0065
> [  129.456214] RAX: ffda RBX: 0001 RCX:
> 7f5aab861a3f
> [  129.456219] RDX: 7f5aab30 RSI: 0dfa RDI:
> 0002
> [  129.456224] RBP: 7ffd27b68870 R08: 0001 R09:
> 
> [  129.456228] R10: 7ffd27b68758 R11: 0293 R12:
> 563ea764e2aa
> [  129.456233] R13:  R14: 0021 R15:
> 
>
> On 2021-03-08 8:40 a.m., Christian König wrote:
>> Separate the drm_mm_node walking from the actual handling.
>>
>> Signed-off-by: Christian König 
>> Acked-by: Oak Zeng 
>> Tested-by: Nirmoy Das 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 67
>> +++--
>>   1 file changed, 18 insertions(+), 49 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/a

RE: [PATCH 5/8] drm/amdgpu: use the new cursor in amdgpu_ttm_access_memory

2021-03-21 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Because this is not a deadlock of the lock itself.
It is just something like

	while (more_to_copy) {
		spin_lock_irqsave(&lock, flags);
		/* copy one chunk through the VRAM window */
		spin_unlock_irqrestore(&lock, flags);
		/* no sleeping call anywhere in the loop */
	}

I think the kernel preemption model is voluntary, so the task never gets
scheduled out as long as there is no sleeping call in the loop, and the soft
lockup shows up once interrupts are enabled again.

-Original Message-
From: amd-gfx  On Behalf Of Felix 
Kuehling
Sent: March 20, 2021 5:23
To: Christian König ; Paneer Selvam, 
Arunpravin ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 5/8] drm/amdgpu: use the new cursor in 
amdgpu_ttm_access_memory

This is causing a deadlock in amdgpu_ttm_access_memory during the PtraceAccess 
test in kfdtest. Unfortunately it doesn't get flagged by LOCKDEP. See the 
kernel log snippet below. I don't have a good explanation what's going on other 
than maybe some data structure corruption.

With this patch reverted the PtraceAccess test still fails, but it doesn't hang 
any more. If I also revert "use new cursor in amdgpu_ttm_io_mem_pfn" (which is 
used via amdgpu_find_mm_node in amdgpu_ttm_access_memory), Ptrace access starts 
working correctly. That tells me that there is some fundamental bug in the 
resource cursor implementation that's breaking several users.

Regards,
   Felix


[  129.446085] watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [kfdtest:3588] 
[  129.455379] Modules linked in: ip6table_filter ip6_tables iptable_filter 
amdgpu x86_pkg_temp_thermal drm_ttm_helper ttm iommu_v2 gpu_sched ip_tables 
x_tables [  129.455428] irq event stamp: 75294000 [  129.455432] hardirqs last  
enabled at (75293999): [] 
_raw_spin_unlock_irqrestore+0x2d/0x40
[  129.455447] hardirqs last disabled at (75294000): [] 
sysvec_apic_timer_interrupt+0xa/0xa0
[  129.455457] softirqs last  enabled at (75184000): [] 
__do_softirq+0x306/0x429 [  129.455467] softirqs last disabled at (75183995): 
[] asm_call_irq_on_stack+0xf/0x20 [  129.455477] CPU: 8 PID: 
3588 Comm: kfdtest Not tainted 5.11.0-kfd-fkuehlin #194 [  129.455485] Hardware 
name: ASUS All Series/X99-E WS/USB 3.1, BIOS 3201 06/17/2016 [  129.455490] 
RIP: 0010:_raw_spin_lock_irqsave+0xb/0x50
[  129.455498] Code: d2 31 f6 e8 e7 e9 31 ff 48 89 df 58 5b e9 7d 32 32 ff 0f 
1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 fd 53 9c <5b> fa f6 
c7 02 74 05 e8 59 06 3e ff 65 ff 05 92 73 24 56 48 8d 7d [  129.455505] RSP: 
0018:a3eb407f3c58 EFLAGS: 0246 [  129.455513] RAX: 96466e2010a0 
RBX: 96466e20 RCX:  [  129.455519] RDX: 
a3eb407f3e70 RSI: 0190 RDI: 96466e2010a0 [  129.455524] 
RBP: 96466e2010a0 R08:  R09: 0001 [  
129.455528] R10: a3eb407f3c60 R11: 96466e2010b8 R12: 0190 [ 
 129.455533] R13: 0190 R14: a3eb407f3e70 R15: 0190 
[  129.455538] FS:  7f5aad0f0740() GS:96467fc0() 
knlGS: [  129.455544] CS:  0010 DS:  ES:  CR0: 
80050033 [  129.455549] CR2: 563ea76ad0f0 CR3: 0007c6e92005 
CR4: 001706e0 [  129.44] Call Trace:
[  129.455563]  amdgpu_device_vram_access+0xc1/0x200 [amdgpu] [  129.455820]  ? 
_raw_spin_unlock_irqrestore+0x2d/0x40
[  129.455834]  amdgpu_ttm_access_memory+0x29e/0x320 [amdgpu] [  129.456063]  
ttm_bo_vm_access+0x1c8/0x3a0 [ttm] [  129.456089]  
__access_remote_vm+0x289/0x390 [  129.456112]  ptrace_access_vm+0x98/0xc0 [  
129.456127]  generic_ptrace_peekdata+0x31/0x80 [  129.456138]  
ptrace_request+0x13b/0x5d0 [  129.456155]  arch_ptrace+0x24f/0x2f0 [  
129.456165]  __x64_sys_ptrace+0xc9/0x140 [  129.456177]  
do_syscall_64+0x2d/0x40 [  129.456185]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  129.456194] RIP: 0033:0x7f5aab861a3f
[  129.456199] Code: 48 89 44 24 18 48 8d 44 24 30 c7 44 24 10 18 00 00 00 8b 
70 08 48 8b 50 10 48 89 44 24 20 4c 0f 43 50 18 b8 65 00 00 00 0f 05 <48> 3d 00 
f0 ff ff 77 41 48 85 c0 78 06 41 83 f8 02 76 1e 48 8b 4c [  129.456205] RSP: 
002b:7ffd27b68750 EFLAGS: 0293 ORIG_RAX: 0065 [  
129.456214] RAX: ffda RBX: 0001 RCX: 7f5aab861a3f [ 
 129.456219] RDX: 7f5aab30 RSI: 0dfa RDI: 0002 
[  129.456224] RBP: 7ffd27b68870 R08: 0001 R09: 
 [  129.456228] R10: 7ffd27b68758 R11: 0293 
R12: 563ea764e2aa [  129.456233] R13:  R14: 
0021 R15: 

On 2021-03-08 8:40 a.m., Christian König wrote:
> Separate the drm_mm_node walking from the actual handling.
>
> Signed-off-by: Christian König 
> Acked-by: Oak Zeng 
> Tested-by: Nirmoy Das 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 67 +++--
>   1 file changed, 18 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 517611b709fa..2cbe4ace591f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/a

RE: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

2021-03-23 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

I don't think so. Start is an offset here. We get the valid physical address from
pages_addr[offset] when we update the mapping.
Btw, what issue are we seeing?
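
For reference, this is roughly how that offset is consumed on the mapping side
when the BO is backed by a pages_addr array (a sketch from memory, not the
exact upstream code; "cur" is the resource cursor):

	uint64_t pfn = cur.start >> PAGE_SHIFT;	/* page index within the BO */
	dma_addr_t addr = pages_addr[pfn];	/* real DMA address of that page */
	/* it is addr, not cur.start itself, that ends up in the PTE */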

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: March 23, 2021 22:55
To: amd-gfx@lists.freedesktop.org
Cc: Das, Nirmoy ; Chen, Guchun 
Subject: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

When we don't have a physically backing store we should use zero instead of the 
virtual start address since that isn't necessary a valid physical one.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index 40f2adf305bc..e94362ccf9d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -54,7 +54,7 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
 struct drm_mm_node *node;

 if (!res || !res->mm_node) {
-cur->start = start;
+cur->start = 0;
 cur->size = size;
 cur->remaining = size;
 cur->node = NULL;
--
2.25.1



Re: [PATCH 1/2] drm//amdgpu: Always sync fence before unlock eviction_lock

2020-03-13 Thread Pan, Xinhui


> 2020年3月13日 16:52,Koenig, Christian  写道:
> 
> Am 13.03.20 um 08:43 schrieb xinhui pan:
>> The fence generated in ->commit is a shared one, so add it to resv.
>> And we need do that with eviction lock hold.
>> 
>> Currently we only sync last_direct/last_delayed before ->prepare. But we
>> fail to sync the last fence generated by ->commit. That cuases problems
>> if eviction happenes later, but it does not sync the last fence.
> 
> NAK, that won't work.
> 
> We can only add fences when the dma_resv object is locked and that is only 
> the case when validating.
> 
Well, that is true.
But considering this is a PT BO, only eviction races on it AFAIK.
As for the individualized resv in BO release, we unref the PT BO just after that.
I am still thinking about other races in the real world.
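
For context, the locking rule being referred to, as a minimal sketch (generic
dma_resv API of that time, shown purely as an illustration):

	dma_resv_lock(bo->tbo.base.resv, NULL);
	r = dma_resv_reserve_shared(bo->tbo.base.resv, 1);
	if (!r)
		dma_resv_add_shared_fence(bo->tbo.base.resv, fence);
	dma_resv_unlock(bo->tbo.base.resv);

That is, the resv must actually be held when the shared fence is added, which
is only guaranteed on the validation paths; that is the point of the NAK above.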

thanks
xinhui

> I'm considering to just partially revert the patch originally stopping to add 
> fences and instead only not add them when invalidating in a direct submit.
> 
> Christian.
> 
>> 
>> Cc: Christian König 
>> Cc: Alex Deucher 
>> Cc: Felix Kuehling 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 73398831196f..f424b5969930 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1582,6 +1582,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  struct amdgpu_vm_update_params params;
>>  enum amdgpu_sync_mode sync_mode;
>>  int r;
>> +struct amdgpu_bo *root = vm->root.base.bo;
>>  memset(&params, 0, sizeof(params));
>>  params.adev = adev;
>> @@ -1604,8 +1605,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  }
>>  if (flags & AMDGPU_PTE_VALID) {
>> -struct amdgpu_bo *root = vm->root.base.bo;
>> -
>>  if (!dma_fence_is_signaled(vm->last_direct))
>>  amdgpu_bo_fence(root, vm->last_direct, true);
>>  @@ -1623,6 +1622,12 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  r = vm->update_funcs->commit(&params, fence);
>>  +   if (!dma_fence_is_signaled(vm->last_direct))
>> +amdgpu_bo_fence(root, vm->last_direct, true);
>> +
>> +if (!dma_fence_is_signaled(vm->last_delayed))
>> +amdgpu_bo_fence(root, vm->last_delayed, true);
>> +
>>  error_unlock:
>>  amdgpu_vm_eviction_unlock(vm);
>>  return r;
> 



Re: [PATCH 1/2] drm//amdgpu: Always sync fence before unlock eviction_lock

2020-03-13 Thread Pan, Xinhui


> 2020年3月13日 17:55,Koenig, Christian  写道:
> 
> Am 13.03.20 um 10:29 schrieb Pan, Xinhui:
>> 
>>> 2020年3月13日 16:52,Koenig, Christian  写道:
>>> 
>>> Am 13.03.20 um 08:43 schrieb xinhui pan:
>>>> The fence generated in ->commit is a shared one, so add it to resv.
>>>> And we need do that with eviction lock hold.
>>>> 
>>>> Currently we only sync last_direct/last_delayed before ->prepare. But we
>>>> fail to sync the last fence generated by ->commit. That cuases problems
>>>> if eviction happenes later, but it does not sync the last fence.
>>> NAK, that won't work.
>>> 
>>> We can only add fences when the dma_resv object is locked and that is only 
>>> the case when validating.
>>> 
>> well, tha tis true.
>> but considering this is a PT BO, and only eviction has race on it AFAIK.
>> as for the individualized resv in bo release, we unref PT BO just after that.
>> I am still thinking of other races in the real world.
> 
> We should probably just add all pipelined/delayed submissions directly to the 
> reservation object in amdgpu_vm_sdma_commit().
> 
> Only the direct and invalidating submissions can't be added because we can't 
> grab the reservation object in the MMU notifier.
> 
> Can you prepare a patch for this?
> 
Yep, I can.
Adding the fence to the BO resv in every commit introduces a little overhead,
though? We only need to add the last fence to the resv, given that the job
scheduler ring is FIFO.
Yes, the code should be simple anyway as long as it works.
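
To make sure I read the suggestion right, it would be roughly this inside
amdgpu_vm_sdma_commit() (sketch only, exact placement still to be worked out):

	/* delayed (scheduled) page table updates run with the root resv held,
	 * so the job fence can be added as a shared fence right here
	 */
	if (!p->direct)
		dma_resv_add_shared_fence(p->vm->root.base.bo->tbo.base.resv, f);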

thanks
xinhui

> Regards,
> Christian.
> 
>> 
>> thanks
>> xinhui
>> 
>>> I'm considering to just partially revert the patch originally stopping to 
>>> add fences and instead only not add them when invalidating in a direct 
>>> submit.
>>> 
>>> Christian.
>>> 
>>>> Cc: Christian König 
>>>> Cc: Alex Deucher 
>>>> Cc: Felix Kuehling 
>>>> Signed-off-by: xinhui pan 
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
>>>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 73398831196f..f424b5969930 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -1582,6 +1582,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>struct amdgpu_vm_update_params params;
>>>>enum amdgpu_sync_mode sync_mode;
>>>>int r;
>>>> +  struct amdgpu_bo *root = vm->root.base.bo;
>>>>memset(¶ms, 0, sizeof(params));
>>>>params.adev = adev;
>>>> @@ -1604,8 +1605,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>}
>>>>if (flags & AMDGPU_PTE_VALID) {
>>>> -  struct amdgpu_bo *root = vm->root.base.bo;
>>>> -
>>>>if (!dma_fence_is_signaled(vm->last_direct))
>>>>amdgpu_bo_fence(root, vm->last_direct, true);
>>>>  @@ -1623,6 +1622,12 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>> amdgpu_device *adev,
>>>>r = vm->update_funcs->commit(¶ms, fence);
>>>>  + if (!dma_fence_is_signaled(vm->last_direct))
>>>> +  amdgpu_bo_fence(root, vm->last_direct, true);
>>>> +
>>>> +  if (!dma_fence_is_signaled(vm->last_delayed))
>>>> +  amdgpu_bo_fence(root, vm->last_delayed, true);
>>>> +
>>>>  error_unlock:
>>>>amdgpu_vm_eviction_unlock(vm);
>>>>return r;
> 



Re: [PATCH 1/2] drm//amdgpu: Always sync fence before unlock eviction_lock

2020-03-13 Thread Pan, Xinhui


> 2020年3月13日 18:23,Koenig, Christian  写道:
> 
> Am 13.03.20 um 11:21 schrieb Pan, Xinhui:
>> 
>>> 2020年3月13日 17:55,Koenig, Christian  写道:
>>> 
>>> Am 13.03.20 um 10:29 schrieb Pan, Xinhui:
>>>>> 2020年3月13日 16:52,Koenig, Christian  写道:
>>>>> 
>>>>> Am 13.03.20 um 08:43 schrieb xinhui pan:
>>>>>> The fence generated in ->commit is a shared one, so add it to resv.
>>>>>> And we need do that with eviction lock hold.
>>>>>> 
>>>>>> Currently we only sync last_direct/last_delayed before ->prepare. But we
>>>>>> fail to sync the last fence generated by ->commit. That cuases problems
>>>>>> if eviction happenes later, but it does not sync the last fence.
>>>>> NAK, that won't work.
>>>>> 
>>>>> We can only add fences when the dma_resv object is locked and that is 
>>>>> only the case when validating.
>>>>> 
>>>> well, tha tis true.
>>>> but considering this is a PT BO, and only eviction has race on it AFAIK.
>>>> as for the individualized resv in bo release, we unref PT BO just after 
>>>> that.
>>>> I am still thinking of other races in the real world.
>>> We should probably just add all pipelined/delayed submissions directly to 
>>> the reservation object in amdgpu_vm_sdma_commit().
>>> 
>>> Only the direct and invalidating submissions can't be added because we 
>>> can't grab the reservation object in the MMU notifier.

wait, I see amdgpu_vm_handle_fault will grab the resv lock of root BO.
so no race then?

>>> 
>>> Can you prepare a patch for this?
>>> 
>> yep, I can.
>> Adding fence to bo resv in every commit introduce a little overload?
> 
> Yes it does, but we used to have this before and it wasn't really measurable.
> 
> With the unusual exception of mapping really large chunks of fragmented 
> system memory we only use one commit for anything <1GB anyway.
> 
> Christian.
> 
>> As we only need add the last fence to resv given the fact the job scheduer 
>> ring is FIFO.
>> yes,  code should be simple anyway as long as it works.
>> 
>> thanks
>> xinhui
>> 
>>> Regards,
>>> Christian.
>>> 
>>>> thanks
>>>> xinhui
>>>> 
>>>>> I'm considering to just partially revert the patch originally stopping to 
>>>>> add fences and instead only not add them when invalidating in a direct 
>>>>> submit.
>>>>> 
>>>>> Christian.
>>>>> 
>>>>>> Cc: Christian König 
>>>>>> Cc: Alex Deucher 
>>>>>> Cc: Felix Kuehling 
>>>>>> Signed-off-by: xinhui pan 
>>>>>> ---
>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++--
>>>>>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>> 
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> index 73398831196f..f424b5969930 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>> @@ -1582,6 +1582,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>>> amdgpu_device *adev,
>>>>>>  struct amdgpu_vm_update_params params;
>>>>>>  enum amdgpu_sync_mode sync_mode;
>>>>>>  int r;
>>>>>> +struct amdgpu_bo *root = vm->root.base.bo;
>>>>>>  memset(¶ms, 0, sizeof(params));
>>>>>>  params.adev = adev;
>>>>>> @@ -1604,8 +1605,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>>> amdgpu_device *adev,
>>>>>>  }
>>>>>>  if (flags & AMDGPU_PTE_VALID) {
>>>>>> -struct amdgpu_bo *root = vm->root.base.bo;
>>>>>> -
>>>>>>  if (!dma_fence_is_signaled(vm->last_direct))
>>>>>>  amdgpu_bo_fence(root, vm->last_direct, true);
>>>>>>  @@ -1623,6 +1622,12 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>>>>> amdgpu_device *adev,
>>>>>>  r = vm->update_funcs->commit(¶ms, fence);
>>>>>>  +   if (!dma_fence_is_signaled(vm->last_direct))
>>>>>> +amdgpu_bo_fence(root, vm->last_direct, true);
>>>>>> +
>>>>>> +if (!dma_fence_is_signaled(vm->last_delayed))
>>>>>> +amdgpu_bo_fence(root, vm->last_delayed, true);
>>>>>> +
>>>>>>  error_unlock:
>>>>>>  amdgpu_vm_eviction_unlock(vm);
>>>>>>  return r;
> 



Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

2020-03-13 Thread Pan, Xinhui
Wait, this leaves the test case stuck in the Sl+ state. I need to take a look.
 
> 2020年3月13日 19:53,Pan, Xinhui  写道:
> 
> If a job need sync the bo resv, it is likely that bo need the job fence
> to sync with others.
> 
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Suggested-by: Christian König 
> Signed-off-by: xinhui pan 
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 5 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 9 +
> 2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index b5705fcfc935..ca6021b4200b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -226,6 +226,11 @@ struct amdgpu_vm_update_params {
>* @num_dw_left: number of dw left for the IB
>*/
>   unsigned int num_dw_left;
> +
> + /**
> +  * @resv: sync the resv and add job fence to it conditionally.
> +  */
> + struct dma_resv *resv;
> };
> 
> struct amdgpu_vm_update_funcs {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 4cc7881f438c..0cfac59bff36 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -70,6 +70,8 @@ static int amdgpu_vm_sdma_prepare(struct 
> amdgpu_vm_update_params *p,
> 
>   p->num_dw_left = ndw;
> 
> + p->resv = resv;
> +
>   if (!resv)
>   return 0;
> 
> @@ -111,6 +113,13 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>   swap(p->vm->last_delayed, tmp);
>   dma_fence_put(tmp);
> 
> + /* add job fence to resv.
> +  * MM notifier path is an exception as we can not grab the
> +  * resv lock.
> +  */
> + if (!p->direct && p->resv)
> + dma_resv_add_shared_fence(p->resv, f);
> +
>   if (fence && !p->direct)
>   swap(*fence, f);
>   dma_fence_put(f);
> -- 
> 2.17.1
> 



Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

2020-03-13 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Page table BOs share the same resv. It should be OK to use any of them, the
root BO resv or the BO resv.
I forgot to unref the BOs, which caused problems. Not good at rebasing...
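
For reference, the reason they share it: when a page table BO is allocated, its
bo_param reuses the root BO's reservation object, roughly like this (a sketch
from memory of that era's allocation path, not the exact code):

	struct amdgpu_bo_param bp;

	memset(&bp, 0, sizeof(bp));
	/* size, domain and flags filled in as usual ... */
	bp.resv = vm->root.base.bo->tbo.base.resv;	/* every PT reuses the root resv */
	r = amdgpu_bo_create(adev, &bp, &pt);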



From: Koenig, Christian 
Sent: Friday, March 13, 2020 9:34:42 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Kuehling, Felix 

Subject: Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

Am 13.03.20 um 12:53 schrieb xinhui pan:
> If a job need sync the bo resv, it is likely that bo need the job fence
> to sync with others.

That won't work because this is the wrong resv object :)

You added the fence to the mapped BO and not the page table.

No wonder that this doesn't work,
Christian.

>
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Suggested-by: Christian König 
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 5 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 9 +
>   2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index b5705fcfc935..ca6021b4200b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -226,6 +226,11 @@ struct amdgpu_vm_update_params {
> * @num_dw_left: number of dw left for the IB
> */
>unsigned int num_dw_left;
> +
> + /**
> +  * @resv: sync the resv and add job fence to it conditionally.
> +  */
> + struct dma_resv *resv;
>   };
>
>   struct amdgpu_vm_update_funcs {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 4cc7881f438c..0cfac59bff36 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -70,6 +70,8 @@ static int amdgpu_vm_sdma_prepare(struct 
> amdgpu_vm_update_params *p,
>
>p->num_dw_left = ndw;
>
> + p->resv = resv;
> +
>if (!resv)
>return 0;
>
> @@ -111,6 +113,13 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>swap(p->vm->last_delayed, tmp);
>dma_fence_put(tmp);
>
> + /* add job fence to resv.
> +  * MM notifier path is an exception as we can not grab the
> +  * resv lock.
> +  */
> + if (!p->direct && p->resv)
> + dma_resv_add_shared_fence(p->resv, f);
> +
>if (fence && !p->direct)
>swap(*fence, f);
>dma_fence_put(f);



Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

2020-03-13 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Page table BOs share the same resv. It should be OK to use any of them, the
root BO resv or the BO resv.
I forgot to unref the BOs, which caused problems. Not good at rebasing...

thanks
xinhui

From: Koenig, Christian 
Sent: Friday, March 13, 2020 9:34:42 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Kuehling, Felix 

Subject: Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

Am 13.03.20 um 12:53 schrieb xinhui pan:
> If a job need sync the bo resv, it is likely that bo need the job fence
> to sync with others.

That won't work because this is the wrong resv object :)

You added the fence to the mapped BO and not the page table.

No wonder that this doesn't work,
Christian.

>
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Suggested-by: Christian König 
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 5 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 9 +
>   2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index b5705fcfc935..ca6021b4200b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -226,6 +226,11 @@ struct amdgpu_vm_update_params {
> * @num_dw_left: number of dw left for the IB
> */
>unsigned int num_dw_left;
> +
> + /**
> +  * @resv: sync the resv and add job fence to it conditionally.
> +  */
> + struct dma_resv *resv;
>   };
>
>   struct amdgpu_vm_update_funcs {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 4cc7881f438c..0cfac59bff36 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -70,6 +70,8 @@ static int amdgpu_vm_sdma_prepare(struct 
> amdgpu_vm_update_params *p,
>
>p->num_dw_left = ndw;
>
> + p->resv = resv;
> +
>if (!resv)
>return 0;
>
> @@ -111,6 +113,13 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>swap(p->vm->last_delayed, tmp);
>dma_fence_put(tmp);
>
> + /* add job fence to resv.
> +  * MM notifier path is an exception as we can not grab the
> +  * resv lock.
> +  */
> + if (!p->direct && p->resv)
> + dma_resv_add_shared_fence(p->resv, f);
> +
>if (fence && !p->direct)
>swap(*fence, f);
>dma_fence_put(f);



Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

2020-03-13 Thread Pan, Xinhui
Yep, will send v3.

Thanks
xinhui

From: "Koenig, Christian" 
Date: Friday, March 13, 2020, 21:46
To: "Pan, Xinhui" , "amd-gfx@lists.freedesktop.org" 

Cc: "Deucher, Alexander" , "Kuehling, Felix" 

Subject: Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

Yeah, but this is still the wrong resv object :)

See the object passed to amdgpu_vm_sdma_prepare() is the one of the BO which is 
mapped into the page tables and NOT the one of the page tables.

You need to use p->vm->root.base.bo->tbo.base.resv here.

Regards,
Christian.

Am 13.03.20 um 14:43 schrieb Pan, Xinhui:

[AMD Official Use Only - Internal Distribution Only]

page table BOs share same resv.It should be ok using any of them, root bo resv 
or bo resv.
I forgot to unref bos which cause problems. not good at rebasing...



From: Koenig, Christian 
Sent: Friday, March 13, 2020 9:34:42 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Kuehling, Felix 

Subject: Re: [PATCH 1/2] drm//amdgpu: Add job fence to resv conditionally

Am 13.03.20 um 12:53 schrieb xinhui pan:
> If a job need sync the bo resv, it is likely that bo need the job fence
> to sync with others.

That won't work because this is the wrong resv object :)

You added the fence to the mapped BO and not the page table.

No wonder that this doesn't work,
Christian.

>
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Suggested-by: Christian König 
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 5 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 9 +
>   2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index b5705fcfc935..ca6021b4200b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -226,6 +226,11 @@ struct amdgpu_vm_update_params {
> * @num_dw_left: number of dw left for the IB
> */
>unsigned int num_dw_left;
> +
> + /**
> +  * @resv: sync the resv and add job fence to it conditionally.
> +  */
> + struct dma_resv *resv;
>   };
>
>   struct amdgpu_vm_update_funcs {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 4cc7881f438c..0cfac59bff36 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -70,6 +70,8 @@ static int amdgpu_vm_sdma_prepare(struct 
> amdgpu_vm_update_params *p,
>
>p->num_dw_left = ndw;
>
> + p->resv = resv;
> +
>if (!resv)
>return 0;
>
> @@ -111,6 +113,13 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>swap(p->vm->last_delayed, tmp);
>dma_fence_put(tmp);
>
> + /* add job fence to resv.
> +  * MM notifier path is an exception as we can not grab the
> +  * resv lock.
> +  */
> + if (!p->direct && p->resv)
> + dma_resv_add_shared_fence(p->resv, f);
> +
>if (fence && !p->direct)
>swap(*fence, f);
>dma_fence_put(f);




Re: [PATCH v3 1/2] drm/amdgpu: Add job fence to resv conditionally

2020-03-13 Thread Pan, Xinhui


> 2020年3月13日 22:13,Koenig, Christian  写道:
> 
> Am 13.03.20 um 15:07 schrieb xinhui pan:
>> Provide resv as a parameter for vm sdma commit.
>> Job fence on page table should be a shared one, so add it to the resv.
>> 
>> Cc: Christian König 
>> Cc: Alex Deucher 
>> Cc: Felix Kuehling 
>> Suggested-by: Christian König 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 4 ++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 5 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 7 +++
>>  3 files changed, 14 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 73398831196f..809ca6e8f40f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1579,6 +1579,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>> dma_addr_t *pages_addr,
>> struct dma_fence **fence)
>>  {
>> +struct amdgpu_bo *root = vm->root.base.bo;
>>  struct amdgpu_vm_update_params params;
>>  enum amdgpu_sync_mode sync_mode;
>>  int r;
>> @@ -1604,8 +1605,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  }
>>  if (flags & AMDGPU_PTE_VALID) {
>> -struct amdgpu_bo *root = vm->root.base.bo;
>> -
>>  if (!dma_fence_is_signaled(vm->last_direct))
>>  amdgpu_bo_fence(root, vm->last_direct, true);
>>  @@ -1613,6 +1612,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  amdgpu_bo_fence(root, vm->last_delayed, true);
>>  }
>>  +   params.resv = root->tbo.base.resv;
>>  r = vm->update_funcs->prepare(¶ms, resv, sync_mode);
>>  if (r)
>>  goto error_unlock;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index b5705fcfc935..ca6021b4200b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -226,6 +226,11 @@ struct amdgpu_vm_update_params {
>>   * @num_dw_left: number of dw left for the IB
>>   */
>>  unsigned int num_dw_left;
>> +
>> +/**
>> + * @resv: sync the resv and add job fence to it conditionally.
>> + */
>> +struct dma_resv *resv;
>>  };
>>struct amdgpu_vm_update_funcs {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> index 4cc7881f438c..a1b270a4da8e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> @@ -111,6 +111,13 @@ static int amdgpu_vm_sdma_commit(struct 
>> amdgpu_vm_update_params *p,
>>  swap(p->vm->last_delayed, tmp);
>>  dma_fence_put(tmp);
>>  +   /* add job fence to resv.
>> + * MM notifier path is an exception as we can not grab the
>> + * resv lock.
>> + */
>> +if (!p->direct && p->resv)
> 
> You can just use p->vm->root.base.bo->tbo.base.resv here, no need for a new 
> field in the paramater structure.
> 

Yep, makes sense.

> And it would probably be best to also remove the vm->last_delayed field 
> entirely.
> 
> In other words use something like this here
> 
> if (p->direct) {
> tmp = dma_fence_get(f);
> swap(p->vm->last_direct, tmp);
> dma_fence_put(tmp);
> } else {
> dma_resv_add_shared_fence(p->vm->root.base.bo->tbo.base.resv, f);
> }
> 

I think we still need to update last_delayed. It looks like eviction and some
other ioctls test this field to determine the VM state.
Removing it should be done in a separate patch if possible.
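
For example, the kind of check that still looks at these fields (a hedged
sketch; the helper name is illustrative, not the exact upstream function):

	static bool amdgpu_vm_updates_idle(struct amdgpu_vm *vm)
	{
		return dma_fence_is_signaled(vm->last_direct) &&
		       dma_fence_is_signaled(vm->last_delayed);
	}

So removing last_delayed means finding another way to answer "are the delayed
page table updates done yet".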

> Regards,
> Christian.
> 
>> +dma_resv_add_shared_fence(p->resv, f);
>> +
>>  if (fence && !p->direct)
>>  swap(*fence, f);
>>  dma_fence_put(f);
> 



Re: [PATCH v3 2/2] drm/amdgpu: unref the bo after job submit

2020-03-13 Thread Pan, Xinhui


> 2020年3月13日 22:20,Koenig, Christian  写道:
> 
> Am 13.03.20 um 15:07 schrieb xinhui pan:
>> Otherwise we might free BOs before job has completed.
>> We add the fence to BO after commit, so free BOs after that.
> 
> I'm not sure if this is necessary, but probably better save than sorry.   

Without this patch, we hit a GMC page fault.

We have individualized the BO resv during BO releasing,
so any fence added to the root PT BO resv after that is effectively untested.

> Some comments below.
> 
>> 
>> Cc: Christian König 
>> Cc: Alex Deucher 
>> Cc: Felix Kuehling 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 38 +++---
>>  1 file changed, 28 insertions(+), 10 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 809ca6e8f40f..605a1bb40280 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -942,13 +942,17 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device 
>> *adev,
>>   *
>>   * @entry: PDE to free
>>   */
>> -static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry)
>> +static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry,
>> +struct list_head *head)
>>  {
>>  if (entry->base.bo) {
>>  entry->base.bo->vm_bo = NULL;
>>  list_del(&entry->base.vm_status);
>> -amdgpu_bo_unref(&entry->base.bo->shadow);
>> -amdgpu_bo_unref(&entry->base.bo);
>> +if (!head) {
>> +amdgpu_bo_unref(&entry->base.bo->shadow);
>> +amdgpu_bo_unref(&entry->base.bo);
>> +} else
>> +list_add(&entry->base.vm_status, head);
> 
> Instead of adding a parameter make this a new state in the VM. Something like 
> vm->zombies or something similar.
> 
>>  }
>>  kvfree(entry->entries);
>>  entry->entries = NULL;
>> @@ -965,7 +969,8 @@ static void amdgpu_vm_free_table(struct amdgpu_vm_pt 
>> *entry)
>>   */
>>  static void amdgpu_vm_free_pts(struct amdgpu_device *adev,
>> struct amdgpu_vm *vm,
>> -   struct amdgpu_vm_pt_cursor *start)
>> +   struct amdgpu_vm_pt_cursor *start,
>> +   struct list_head *head)
>>  {
>>  struct amdgpu_vm_pt_cursor cursor;
>>  struct amdgpu_vm_pt *entry;
>> @@ -973,10 +978,10 @@ static void amdgpu_vm_free_pts(struct amdgpu_device 
>> *adev,
>>  vm->bulk_moveable = false;
>>  for_each_amdgpu_vm_pt_dfs_safe(adev, vm, start, cursor, entry)
>> -amdgpu_vm_free_table(entry);
>> +amdgpu_vm_free_table(entry, head);
>>  if (start)
>> -amdgpu_vm_free_table(start->entry);
>> +amdgpu_vm_free_table(start->entry, head);
>>  }
>>/**
>> @@ -1428,7 +1433,8 @@ static void amdgpu_vm_fragment(struct 
>> amdgpu_vm_update_params *params,
>>   */
>>  static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>>   uint64_t start, uint64_t end,
>> - uint64_t dst, uint64_t flags)
>> + uint64_t dst, uint64_t flags,
>> + struct list_head *head)
>>  {
>>  struct amdgpu_device *adev = params->adev;
>>  struct amdgpu_vm_pt_cursor cursor;
>> @@ -1539,7 +1545,7 @@ static int amdgpu_vm_update_ptes(struct 
>> amdgpu_vm_update_params *params,
>>   * completely covered by the range and so potentially 
>> still in use.
>>   */
>>  while (cursor.pfn < frag_start) {
>> -amdgpu_vm_free_pts(adev, params->vm, &cursor);
>> +amdgpu_vm_free_pts(adev, params->vm, &cursor, 
>> head);
>>  amdgpu_vm_pt_next(adev, &cursor);
>>  }
>>  @@ -1583,6 +1589,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  struct amdgpu_vm_update_params params;
>>  enum amdgpu_sync_mode sync_mode;
>>  int r;
>> +struct list_head head;
>>  memset(&params, 0, sizeof(params));
>>  params.adev = adev;
>> @@ -1590,6 +1597,8 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  params.direct = direct;
>>  params.pages_addr = pages_addr;
>>  +   INIT_LIST_HEAD(&head);
>> +
>>  /* Implicitly sync to command submissions in the same VM before
>>   * unmapping. Sync to moving fences before mapping.
>>   */
>> @@ -1617,13 +1626,22 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  if (r)
>>  goto error_unlock;
>>  -   r = amdgpu_vm_update_ptes(¶ms, start, last + 1, addr, flags);
>> +r = amdgpu_vm_update_ptes(¶ms, start, last + 1, addr, flags, &head);
>>  if (r)
>>  goto error_unlock;
>>  r = vm->update_funcs->commit(&params

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-14 Thread Pan, Xinhui
Hi all,
I think I found the root cause. Here is what happened.

user: alloc/map memory
    kernel: validate the memory, update the BO mapping and update the page table
        -> amdgpu_vm_bo_update_mapping
            -> amdgpu_vm_update_ptes
                -> amdgpu_vm_alloc_pts
                    -> amdgpu_vm_clear_bo // submits a job and we get a fence,
                                          // BUT it is NOT added to the resv
user: free/unmap memory
    kernel: unmap the memory and update the page table
        -> amdgpu_vm_bo_update_mapping
            sync the last_delayed fence if flags & AMDGPU_PTE_VALID
            // of course we do not sync it here, as this is an unmapping
            -> amdgpu_vm_update_ptes
                -> amdgpu_vm_free_pts // unrefs the page table BO

So from the sequence above, we know there is a race between BO releasing and BO
clearing: the BO might have been released before the clear job runs.

We can fix it in several ways:
1) Sync last_delayed in both the mapping and the unmapping case.
   Christian, you only sync last_delayed in the mapping case; would it be OK to
   sync it in the unmapping case as well? (A minimal sketch follows this list.)

2) Always add the fence to the resv after commit.
   This is done by patchset v4, and only patch 1 is needed; there is no need to
   move the BO unref after commit.

3) Move the BO unref after commit, and add the last delayed fence to the resv.
   This is done by patchset v1.
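
A minimal sketch of option 1 in amdgpu_vm_bo_update_mapping(), simply dropping
the AMDGPU_PTE_VALID condition around the existing fence handling (untested,
illustration only):

	struct amdgpu_bo *root = vm->root.base.bo;

	/* attach the outstanding update fences for unmap as well as map */
	if (!dma_fence_is_signaled(vm->last_direct))
		amdgpu_bo_fence(root, vm->last_direct, true);

	if (!dma_fence_is_signaled(vm->last_delayed))
		amdgpu_bo_fence(root, vm->last_delayed, true);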


any ideas?

thanks
xinhui

> 2020年3月14日 02:05,Koenig, Christian  写道:
> 
> The page table is not updated and then freed. A higher level PDE is updated 
> and because of this the lower level page tables is freed.
> 
> Without this it could be that the memory backing the freed page table is 
> reused while the PDE is still pointing to it.
> 
> Rather unlikely that this causes problems, but better save than sorry.
> 
> Regards,
> Christian.
> 
> Am 13.03.20 um 18:36 schrieb Felix Kuehling:
>> This seems weird. This means that we update a page table, and then free it 
>> in the same amdgpu_vm_update_ptes call? That means the update is redundant. 
>> Can we eliminate the redundant PTE update if the page table is about to be 
>> freed anyway?
>> 
>> Regards,
>>   Felix
>> 
>> On 2020-03-13 12:09, xinhui pan wrote:
>>> Free page table bo before job submit is insane.
>>> We might touch invalid memory while job is runnig.
>>> 
>>> we now have individualized bo resv during bo releasing.
>>> So any fences added to root PT bo is actually untested when
>>> a normal PT bo is releasing.
>>> 
>>> We might hit gmc page fault or memory just got overwrited.
>>> 
>>> Cc: Christian König 
>>> Cc: Alex Deucher 
>>> Cc: Felix Kuehling 
>>> Signed-off-by: xinhui pan 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 24 +---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  3 +++
>>>   2 files changed, 24 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 73398831196f..346e2f753474 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -937,6 +937,21 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device 
>>> *adev,
>>>   return r;
>>>   }
>>>   +static void amdgpu_vm_free_zombie_bo(struct amdgpu_device *adev,
>>> +struct amdgpu_vm *vm)
>>> +{
>>> +struct amdgpu_vm_pt *entry;
>>> +
>>> +while (!list_empty(&vm->zombies)) {
>>> +entry = list_first_entry(&vm->zombies, struct amdgpu_vm_pt,
>>> +base.vm_status);
>>> +list_del(&entry->base.vm_status);
>>> +
>>> +amdgpu_bo_unref(&entry->base.bo->shadow);
>>> +amdgpu_bo_unref(&entry->base.bo);
>>> +}
>>> +}
>>> +
>>>   /**
>>>* amdgpu_vm_free_table - fre one PD/PT
>>>*
>>> @@ -945,10 +960,9 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device 
>>> *adev,
>>>   static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry)
>>>   {
>>>   if (entry->base.bo) {
>>> +list_move(&entry->base.vm_status,
>>> + &entry->base.bo->vm_bo->vm->zombies);
>>>   entry->base.bo->vm_bo = NULL;
>>> -list_del(&entry->base.vm_status);
>>> -amdgpu_bo_unref(&entry->base.bo->shadow);
>>> -amdgpu_bo_unref(&entry->base.bo);
>>>   }
>>>   kvfree(entry->entries);
>>>   entry->entries = NULL;
>>> @@ -1624,6 +1638,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
>>> amdgpu_device *adev,
>>>   r = vm->update_funcs->commit(¶ms, fence);
>>> error_unlock:
>>> +amdgpu_vm_free_zombie_bo(adev, vm);
>>>   amdgpu_vm_eviction_unlock(vm);
>>>   return r;
>>>   }
>>> @@ -2807,6 +2822,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
>>> amdgpu_vm *vm,
>>>   INIT_LIST_HEAD(&vm->invalidated);
>>>   spin_lock_init(&vm->invalidated_lock);
>>>   INIT_LIST_HEAD(&vm->freed);
>>> +INIT_LIST_HEAD(&vm->zombies);
>>> 

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

I still hit a page fault with option 1 while running the OCLPerf test.
Looks like we need to sync the fence after commit.

From: Tao, Yintian 
Sent: Monday, March 16, 2020 4:15:01 PM
To: Pan, Xinhui ; Koenig, Christian 

Cc: Deucher, Alexander ; Kuehling, Felix 
; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org 
Subject: RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

Hi Xinhui


I encounter the same problem(page fault) when test vk_example benchmark.
I use your first option which can fix the problem. Can you help submit one 
patch?


-   if (flags & AMDGPU_PTE_VALID) {
-   struct amdgpu_bo *root = vm->root.base.bo;
-   if (!dma_fence_is_signaled(vm->last_direct))
-   amdgpu_bo_fence(root, vm->last_direct, true);
+   if (!dma_fence_is_signaled(vm->last_direct))
+   amdgpu_bo_fence(root, vm->last_direct, true);

-   if (!dma_fence_is_signaled(vm->last_delayed))
-   amdgpu_bo_fence(root, vm->last_delayed, true);
-   }
+   if (!dma_fence_is_signaled(vm->last_delayed))
+   amdgpu_bo_fence(root, vm->last_delayed, true);


Best Regards
Yintian Tao

-Original Message-----
From: amd-gfx  On Behalf Of Pan, Xinhui
Sent: March 14, 2020 21:07
To: Koenig, Christian 
Cc: Deucher, Alexander ; Kuehling, Felix 
; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

hi, All
I think I found the root cause. here is what happened.

user: alloc/mapping memory
  kernel: validate memory and update the bo mapping, and update the 
page table
-> amdgpu_vm_bo_update_mapping
-> amdgpu_vm_update_ptes
-> amdgpu_vm_alloc_pts
-> amdgpu_vm_clear_bo // it 
will submit a job and we have a fence. BUT it is NOT added in resv.
user: free/unmapping memory
kernel: unmapping mmeory and udpate the page table
-> amdgpu_vm_bo_update_mapping
sync last_delay fence if flag & AMDGPU_PTE_VALID // of 
source we did not sync it here, as this is unmapping.
-> amdgpu_vm_update_ptes
-> amdgpu_vm_free_pts // unref page 
table bo.

So from the sequence above, we know there is a race betwen bo releasing and bo 
clearing.
bo might have been released before job running.

we can fix it in several ways,
1) sync last_delay in both mapping and unmapping case.
 Chris, you just sync last_delay in mapping case, should it be ok to sync it 
also in unmapping case?

2) always add fence to resv after commit.
 this is done by patchset v4. And only need patch 1. no need to move unref bo 
after commit.

3) move unref bo after commit, and add the last delay fence to resv.
This is done by patchset V1.


any ideas?

thanks
xinhui

> 2020年3月14日 02:05,Koenig, Christian  写道:
>
> The page table is not updated and then freed. A higher level PDE is updated 
> and because of this the lower level page tables is freed.
>
> Without this it could be that the memory backing the freed page table is 
> reused while the PDE is still pointing to it.
>
> Rather unlikely that this causes problems, but better save than sorry.
>
> Regards,
> Christian.
>
> Am 13.03.20 um 18:36 schrieb Felix Kuehling:
>> This seems weird. This means that we update a page table, and then free it 
>> in the same amdgpu_vm_update_ptes call? That means the update is redundant. 
>> Can we eliminate the redundant PTE update if the page table is about to be 
>> freed anyway?
>>
>> Regards,
>>   Felix
>>
>> On 2020-03-13 12:09, xinhui pan wrote:
>>> Free page table bo before job submit is insane.
>>> We might touch invalid memory while job is runnig.
>>>
>>> we now have individualized bo resv during bo releasing.
>>> So any fences added to root PT bo is actually untested when a normal
>>> PT bo is releasing.
>>>
>>> We might hit gmc page fault or memory just got overwrited.
>>>
>>> Cc: Christian König 
>>> Cc: Alex Deucher 
>>> Cc: Felix Kuehling 
>>> Signed-off-by: xinhui pan 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 24 +---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  3 +++
>>>   2 files changed, 24 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 73398831196f..346e2f753474 100644
>>> --- a/driv

Re: [PATCH] drm/amdgpu: fix and cleanup amdgpu_gem_object_close v2

2020-03-18 Thread Pan, Xinhui
I wonder if it really fixes anything with such a small delay, but it should do
no harm anyway.

Reviewed-by: xinhui pan 

> 2020年3月18日 15:51,Christian König  写道:
> 
> Ping? Xinhui can I get an rb for this?
> 
> Thanks,
> Christian.
> 
> Am 16.03.20 um 14:22 schrieb Christian König:
>> The problem is that we can't add the clear fence to the BO
>> when there is an exclusive fence on it since we can't
>> guarantee the the clear fence will complete after the
>> exclusive one.
>> 
>> To fix this refactor the function and wait for any potential
>> exclusive fence with a small timeout before adding the
>> shared clear fence.
>> 
>> v2: fix warning
>> 
>> Signed-off-by: Christian König 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 43 +++--
>>  1 file changed, 26 insertions(+), 17 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index 5bec66e6b1f8..49c91dac35a0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -161,10 +161,11 @@ void amdgpu_gem_object_close(struct drm_gem_object 
>> *obj,
>>  struct amdgpu_bo_list_entry vm_pd;
>>  struct list_head list, duplicates;
>> +struct dma_fence *fence = NULL;
>>  struct ttm_validate_buffer tv;
>>  struct ww_acquire_ctx ticket;
>>  struct amdgpu_bo_va *bo_va;
>> -int r;
>> +long r;
>>  INIT_LIST_HEAD(&list);
>>  INIT_LIST_HEAD(&duplicates);
>> @@ -178,28 +179,36 @@ void amdgpu_gem_object_close(struct drm_gem_object 
>> *obj,
>>  r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
>>  if (r) {
>>  dev_err(adev->dev, "leaking bo va because "
>> -"we fail to reserve bo (%d)\n", r);
>> +"we fail to reserve bo (%ld)\n", r);
>>  return;
>>  }
>>  bo_va = amdgpu_vm_bo_find(vm, bo);
>> -if (bo_va && --bo_va->ref_count == 0) {
>> -amdgpu_vm_bo_rmv(adev, bo_va);
>> +if (!bo_va || --bo_va->ref_count)
>> +goto out_unlock;
>>  -   if (amdgpu_vm_ready(vm)) {
>> -struct dma_fence *fence = NULL;
>> +amdgpu_vm_bo_rmv(adev, bo_va);
>> +if (!amdgpu_vm_ready(vm))
>> +goto out_unlock;
>>  -   r = amdgpu_vm_clear_freed(adev, vm, &fence);
>> -if (unlikely(r)) {
>> -dev_err(adev->dev, "failed to clear page "
>> -"tables on GEM object close (%d)\n", r);
>> -}
>>  -   if (fence) {
>> -amdgpu_bo_fence(bo, fence, true);
>> -dma_fence_put(fence);
>> -}
>> -}
>> -}
>> +r = amdgpu_vm_clear_freed(adev, vm, &fence);
>> +if (r || !fence)
>> +goto out_unlock;
>> +
>> +r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, false, false,
>> +  msecs_to_jiffies(10));
>> +if (r == 0)
>> +r = -ETIMEDOUT;
>> +if (r)
>> +goto out_unlock;
>> +
>> +amdgpu_bo_fence(bo, fence, true);
>> +dma_fence_put(fence);
>> +
>> +out_unlock:
>> +if (unlikely(r < 0))
>> +dev_err(adev->dev, "failed to clear page "
>> +"tables on GEM object close (%ld)\n", r);
>>  ttm_eu_backoff_reservation(&ticket, &list);
>>  }
>>  
> 



Re: [PATCH] drm/amdgpu: fix and cleanup amdgpu_gem_object_close v2

2020-03-18 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Yes, adding the exclusive fence again as a shared one is more reliable.
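
A minimal sketch of that alternative in amdgpu_gem_object_close(), instead of
the 10ms wait (illustration only; it assumes the BO is still reserved here and
glosses over shared-slot reservation):

	struct dma_fence *excl = dma_resv_get_excl(bo->tbo.base.resv);

	/* attach the current exclusive fence as an extra shared fence so the
	 * clear fence can be added right away instead of waiting for it
	 */
	if (excl && !dma_fence_is_signaled(excl))
		dma_resv_add_shared_fence(bo->tbo.base.resv, excl);

	amdgpu_bo_fence(bo, fence, true);
	dma_fence_put(fence);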

From: Christian König 
Sent: Wednesday, March 18, 2020 4:03:14 PM
To: Pan, Xinhui 
Cc: amd-gfx@lists.freedesktop.org ; Liu, Monk 

Subject: Re: [PATCH] drm/amdgpu: fix and cleanup amdgpu_gem_object_close v2

The key point is that 10ms should be sufficient that either the move or
the update is finished.

One alternative which came to my mind would be to add the exclusive
fence as shared as well in this case.

This way we won't need to block at all.

Christian.

Am 18.03.20 um 09:00 schrieb Pan, Xinhui:
> I wonder if it really fix anything with such small delay. but it should be no 
> harm anyway.
>
> Reviewed-by: xinhui pan 
>
>> 2020年3月18日 15:51,Christian König  写道:
>>
>> Ping? Xinhui can I get an rb for this?
>>
>> Thanks,
>> Christian.
>>
>> Am 16.03.20 um 14:22 schrieb Christian König:
>>> The problem is that we can't add the clear fence to the BO
>>> when there is an exclusive fence on it since we can't
>>> guarantee the the clear fence will complete after the
>>> exclusive one.
>>>
>>> To fix this refactor the function and wait for any potential
>>> exclusive fence with a small timeout before adding the
>>> shared clear fence.
>>>
>>> v2: fix warning
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 43 +++--
>>>   1 file changed, 26 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> index 5bec66e6b1f8..49c91dac35a0 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> @@ -161,10 +161,11 @@ void amdgpu_gem_object_close(struct drm_gem_object 
>>> *obj,
>>>  struct amdgpu_bo_list_entry vm_pd;
>>>  struct list_head list, duplicates;
>>> +   struct dma_fence *fence = NULL;
>>>  struct ttm_validate_buffer tv;
>>>  struct ww_acquire_ctx ticket;
>>>  struct amdgpu_bo_va *bo_va;
>>> -   int r;
>>> +   long r;
>>>  INIT_LIST_HEAD(&list);
>>>  INIT_LIST_HEAD(&duplicates);
>>> @@ -178,28 +179,36 @@ void amdgpu_gem_object_close(struct drm_gem_object 
>>> *obj,
>>>  r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
>>>  if (r) {
>>>  dev_err(adev->dev, "leaking bo va because "
>>> -   "we fail to reserve bo (%d)\n", r);
>>> +   "we fail to reserve bo (%ld)\n", r);
>>>  return;
>>>  }
>>>  bo_va = amdgpu_vm_bo_find(vm, bo);
>>> -   if (bo_va && --bo_va->ref_count == 0) {
>>> -   amdgpu_vm_bo_rmv(adev, bo_va);
>>> +   if (!bo_va || --bo_va->ref_count)
>>> +   goto out_unlock;
>>>   - if (amdgpu_vm_ready(vm)) {
>>> -   struct dma_fence *fence = NULL;
>>> +   amdgpu_vm_bo_rmv(adev, bo_va);
>>> +   if (!amdgpu_vm_ready(vm))
>>> +   goto out_unlock;
>>>   - r = amdgpu_vm_clear_freed(adev, vm, &fence);
>>> -   if (unlikely(r)) {
>>> -   dev_err(adev->dev, "failed to clear page "
>>> -   "tables on GEM object close (%d)\n", r);
>>> -   }
>>>   - if (fence) {
>>> -   amdgpu_bo_fence(bo, fence, true);
>>> -   dma_fence_put(fence);
>>> -   }
>>> -   }
>>> -   }
>>> +   r = amdgpu_vm_clear_freed(adev, vm, &fence);
>>> +   if (r || !fence)
>>> +   goto out_unlock;
>>> +
>>> +   r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, false, false,
>>> + msecs_to_jiffies(10));
>>> +   if (r == 0)
>>> +   r = -ETIMEDOUT;
>>> +   if (r)
>>> +   goto out_unlock;
>>> +
>>> +   amdgpu_bo_fence(bo, fence, true);
>>> +   dma_fence_put(fence);
>>> +
>>> +out_unlock:
>>> +   if (unlikely(r < 0))
>>> +   dev_err(adev->dev, "failed to clear page "
>>> +   "tables on GEM object close (%ld)\n", r);
>>>  ttm_eu_backoff_reservation(&ticket, &list);
>>>   }
>>>

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: fix the wrong no space return while acquiring packet buffer

2020-03-19 Thread Pan, Xinhui


> On 2020-03-19 18:51, Huang Rui wrote:
> 
> The queue buffer index starts from position 0, so the available buffer size
> which starts from position 0 to rptr should be the "rptr" index value. When
> packet_size_in_dwords == rptr, the available buffer is just good enough.
> 
> Signed-off-by: Huang Rui 
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index bae7064..4667c8f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -263,7 +263,7 @@ int kq_acquire_packet_buffer(struct kernel_queue *kq,
>   /* make sure after rolling back to position 0, there is
>* still enough space.
>*/
> - if (packet_size_in_dwords >= rptr)
> + if (packet_size_in_dwords > rptr)

rptr should always be > wptr unless the ring is empty.

Say rptr is 4 and packet_size_in_dwords is 4. Then wptr changes from 0 to 4, which 
is illegal.
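A concrete walk-through of that corner case (hypothetical numbers):

	/* queue size 16 dwords:                                      */
	/*   rptr = 4, wptr = 14, packet_size_in_dwords = 4           */
	/* 14 + 4 > 16, so the packet is rolled back to position 0    */
	/* writing 4 dwords from 0 would set wptr = 4 == rptr         */
	/* wptr == rptr is indistinguishable from an empty queue,     */
	/* which is exactly the illegal state described above, so the */
	/* original "packet_size_in_dwords >= rptr" check has to stay */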


>   goto err_no_space;
> 
>   /* fill nops, roll back and start at position 0 */
> -- 
> 2.7.4
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-24 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Does this issue occur during gpu recovery?
I just checked the code: a fence timeout will free the job and put its fence, but gpu 
recovery might resubmit the job.
Correct me if I am wrong.

From: amd-gfx  on behalf of Andrey 
Grodzovsky 
Sent: Tuesday, March 24, 2020 11:40:06 AM
To: Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org 
Subject: Re: [PATCH] drm/amdgpu: hold the reference of finished fence


On 3/23/20 10:22 AM, Yintian Tao wrote:
> There is one corner case at dma_fence_signal_locked
> which will raise the NULL pointer problem just like below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>->test_and_set_bit
> here dma_fence_release is triggered due to the fence refcount reaching zero.


Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ? The finished fence is created with
refcount set to 1 in drm_sched_fence_create->dma_fence_init and then the
refcount is decremented in
drm_sched_main->amdgpu_job_free_cb->drm_sched_job_cleanup. This should
only happen after fence is already signaled (see
drm_sched_get_cleanup_job). On top of that the finished fence is
referenced from other places (e.g. entity->last_scheduled e.t.c)...


>
> ->dma_fence_put
>  ->dma_fence_release
>->drm_sched_fence_release_scheduled
>->call_rcu
> here makes the union field “cb_list” at the finished fence
> go to NULL because struct rcu_head contains two pointers,
> which is the same as struct list_head cb_list
>
> Therefore, to hold the reference of finished fence at drm_sched_process_job
> to prevent the null pointer during finished fence dma_fence_signal
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 0008
> [  732.914815] #PF: supervisor write access in kernel mode
> [  732.915731] #PF: error_code(0x0002) - not-present page
> [  732.916621] PGD 0 P4D 0
> [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   OE 
> 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50
> [  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
> [  732.941910]  dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50
> [  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
> [  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> v2: hold the finished fence at drm_sched_process_job instead of
>  amdgpu_fence_process
> v3: resume the blank line
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a18eabf692e4..8e731ed0d9d9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -651,7 +651,9 @@ static void drm_sched_process_job(struct dma_fence *f, 
> struct dma_fence_cb *cb)
>
>trace_drm_sched_process_job(s_fence);
>
> + dma_fence_get(&s_fence->finished);
>drm_sched_fence_finished(s_fence);


If the fence was already released during call to
drm_sched_fence_finished->dma_fence_signal->... why is it safe to
reference the s_fence just before that call ? Can't it already be
released by this time ?

Andrey



> + dma_fence_put(&s_fence->finished);
>wake_up_interruptible(&sched->wake_up_worker);
>   }
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui


> On 2020-03-25 15:48, Koenig, Christian wrote:
> 
> On 25.03.20 at 06:47, xinhui pan wrote:
>> Hit panic during GPU recovery test. drm_sched_entity_select_rq might
>> set NULL to rq. So add a check like drm_sched_job_init does.
> 
> NAK, the rq should never be set to NULL in the first place.
> 
> How did that happened?

Well, I have not checked the details,
but just got the call trace below.
It looks like the sched is not ready, and drm_sched_entity_select_rq sets entity->rq to 
NULL.
In the next amdgpu_vm_sdma_commit, we hit a panic when we dereference entity->rq.
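A rough sketch of the sequence (based on the trace below):

	gpu recovery -> sdma suspend               =>  sched->ready = false
	job submit   -> drm_sched_entity_select_rq()
	                  finds no ready scheduler =>  entity->rq = NULL
	amdgpu_vm_sdma_commit()
	  ring = container_of(entity->rq->sched, ...)  <-- NULL dereference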

297567 [   44.667677] amdgpu :03:00.0: GPU reset begin!
297568 [   44.929047] [drm] scheduler sdma0 is not ready, skipping
297569 [   44.929048] [drm] scheduler sdma1 is not ready, skipping
297570 [   44.934608] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
297571 [   44.947941] BUG: kernel NULL pointer dereference, address: 
0038
297572 [   44.955132] #PF: supervisor read access in kernel mode
297573 [   44.960451] #PF: error_code(0x) - not-present page
297574 [   44.965714] PGD 0 P4D 0
297575 [   44.968331] Oops:  [#1] SMP PTI
297576 [   44.971911] CPU: 7 PID: 2496 Comm: gnome-shell Tainted: GW
 5.4.0-rc7+ #1
297577 [   44.980221] Hardware name: System manufacturer System Product 
Name/Z170-A, BIOS 1702 01/28/2016
297578 [   44.989177] RIP: 0010:amdgpu_vm_sdma_commit+0x55/0x190 [amdgpu]
297579 [   44.995242] Code: 47 20 80 7f 10 00 4c 8b a0 88 01 00 00 48 8b 47 08 
4c 8d a8 70 01 00 00 75 07 4c 8d a8 88 02 00 00 49 8b 45 10 41 8b 54 24 08 <48> 
8b 40 38 85 d2 48 8d b8 30 ff ff f   f 0f 84 06 01 00 00 48 8b 80
297580 [   45.014931] RSP: 0018:b66e008839d0 EFLAGS: 00010246
297581 [   45.020504] RAX:  RBX: b66e00883a30 RCX: 
00100400
297582 [   45.028062] RDX: 003c RSI: 8df123662138 RDI: 
b66e00883a30
297583 [   45.035662] RBP: b66e00883a00 R08: b66e0088395c R09: 
b66e00883960
297584 [   45.043298] R10: 00100240 R11: 0035 R12: 
8df1425385e8
297585 [   45.050916] R13: 8df13cfd1288 R14: 8df123662138 R15: 
8df13cfd1000
297586 [   45.058524] FS:  7fcc8f6b2100() GS:8df15e38() 
knlGS:
297587 [   45.067114] CS:  0010 DS:  ES:  CR0: 80050033
297588 [   45.073206] CR2: 0038 CR3: 000641fb6006 CR4: 
003606e0
297589 [   45.080791] DR0:  DR1:  DR2: 

297590 [   45.088277] DR3:  DR6: fffe0ff0 DR7: 
0400
297591 [   45.095773] Call Trace:
297592 [   45.098354]  amdgpu_vm_bo_update_mapping+0x1c1/0x1f0 [amdgpu]
297593 [   45.104427]  ? mark_held_locks+0x4d/0x80
297594 [   45.108682]  amdgpu_vm_bo_update+0x3b7/0x960 [amdgpu]
297595 [   45.114049]  ? rcu_read_lock_sched_held+0x4f/0x80
297596 [   45.119111]  amdgpu_gem_va_ioctl+0x4f3/0x510 [amdgpu]
297597 [   45.124495]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
297598 [   45.130250]  drm_ioctl_kernel+0xb0/0x100 [drm]
297599 [   45.134988]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
297600 [   45.140742]  ? drm_ioctl_kernel+0xb0/0x100 [drm]
297601 [   45.145622]  drm_ioctl+0x389/0x450 [drm]
297602 [   45.149804]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
297603 [   45.11]  ? trace_hardirqs_on+0x3b/0xf0
297604 [   45.159892]  amdgpu_drm_ioctl+0x4f/0x80 [amdgpu]
297605 [   45.172104]  do_vfs_ioctl+0xa9/0x6f0
297606 [   45.175909]  ? tomoyo_file_ioctl+0x19/0x20
297607 [   45.180241]  ksys_ioctl+0x75/0x80
297608 [   45.183760]  ? do_syscall_64+0x17/0x230
297609 [   45.187833]  __x64_sys_ioctl+0x1a/0x20
297610 [   45.191846]  do_syscall_64+0x5f/0x230
297611 [   45.195764]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
297612 [   45.201126] RIP: 0033:0x7fcc8c7725d7

> 
> Regards,
> Christian.
> 
>> 
>> Cc: Christian König 
>> Cc: Alex Deucher 
>> Cc: Felix Kuehling 
>> Signed-off-by: xinhui pan 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> index cf96c335b258..d30d103e48a2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
>> @@ -95,6 +95,8 @@ static int amdgpu_vm_sdma_commit(struct 
>> amdgpu_vm_update_params *p,
>>  int r;
>>  entity = p->direct ? &p->vm->direct : &p->vm->delayed;
>> +if (!entity->rq)
>> +return -ENOENT;
>>  ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
>>  WARN_ON(ib->length_dw == 0);
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui


> On 2020-03-25 17:23, Pan, Xinhui wrote:
> 
> 
> 
>> On 2020-03-25 15:48, Koenig, Christian wrote:
>> 
>> On 25.03.20 at 06:47, xinhui pan wrote:
>>> Hit panic during GPU recovery test. drm_sched_entity_select_rq might
>>> set NULL to rq. So add a check like drm_sched_job_init does.
>> 
>> NAK, the rq should never be set to NULL in the first place.
>> 
>> How did that happened?
> 
> well, I have not check the details.

So recovery will disable the sdma ring, and sched->ready will be false then.
Any job submitted between suspend and resume will hit this issue.

[   99.011614] amdgpu :03:00.0: GPU reset begin!
[   99.265504] CPU: 5 PID: 163 Comm: kworker/5:1 Tainted: GW 
5.4.0-rc7+ #1
[   99.273659] Hardware name: System manufacturer System Product Name/Z170-A, 
BIOS 1702 01/28/2016
[   99.282522] Workqueue: events drm_sched_job_timedout [gpu_sched]
[   99.288682] Call Trace:
[   99.291193]  dump_stack+0x98/0xd5
[   99.294629]  sdma_v5_0_enable+0x1ab/0x1d0 [amdgpu]
[   99.299563]  sdma_v5_0_suspend+0x2a/0x30 [amdgpu]
[   99.304360]  amdgpu_device_ip_suspend_phase2+0xa3/0x110 [amdgpu]
[   99.310504]  ? amdgpu_device_ip_suspend_phase1+0x5b/0xe0 [amdgpu]
[   99.316727]  amdgpu_device_ip_suspend+0x37/0x60 [amdgpu]
[   99.322159]  amdgpu_device_pre_asic_reset+0x81/0x1f0 [amdgpu]
[   99.328054]  amdgpu_device_gpu_recover+0x27f/0xc60 [amdgpu]
[   99.333767]  amdgpu_job_timedout+0x123/0x140 [amdgpu]
[   99.338898]  drm_sched_job_timedout+0x85/0xe0 [gpu_sched]
[   99.35]  ? amdgpu_cgs_destroy_device+0x10/0x10 [amdgpu]
[   99.350145]  ? drm_sched_job_timedout+0x85/0xe0 [gpu_sched]
[   99.355834]  process_one_work+0x231/0x5c0
[   99.359927]  worker_thread+0x3f/0x3b0
[   99.363641]  ? __kthread_parkme+0x61/0x90
[   99.367701]  kthread+0x12c/0x150
[   99.371010]  ? process_one_work+0x5c0/0x5c0
[   99.375318]  ? kthread_park+0x90/0x90
[   99.379042]  ret_from_fork+0x3a/0x50


> but just got the call trace below.
> looks like sched is not ready, and drm_sched_entity_select_rq set entity->rq 
> to NULL.
> in the next amdgpu_vm_sdma_commit, hit panic when we deference entity->rq.
> 
> 297567 [   44.667677] amdgpu :03:00.0: GPU reset begin!
> 297568 [   44.929047] [drm] scheduler sdma0 is not ready, skipping
> 297569 [   44.929048] [drm] scheduler sdma1 is not ready, skipping
> 297570 [   44.934608] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
> update BO_VA (-2)
> 297571 [   44.947941] BUG: kernel NULL pointer dereference, address: 
> 0038
> 297572 [   44.955132] #PF: supervisor read access in kernel mode
> 297573 [   44.960451] #PF: error_code(0x) - not-present page
> 297574 [   44.965714] PGD 0 P4D 0
> 297575 [   44.968331] Oops:  [#1] SMP PTI
> 297576 [   44.971911] CPU: 7 PID: 2496 Comm: gnome-shell Tainted: GW  
>5.4.0-rc7+ #1
> 297577 [   44.980221] Hardware name: System manufacturer System Product 
> Name/Z170-A, BIOS 1702 01/28/2016
> 297578 [   44.989177] RIP: 0010:amdgpu_vm_sdma_commit+0x55/0x190 [amdgpu]
> 297579 [   44.995242] Code: 47 20 80 7f 10 00 4c 8b a0 88 01 00 00 48 8b 47 
> 08 4c 8d a8 70 01 00 00 75 07 4c 8d a8 88 02 00 00 49 8b 45 10 41 8b 54 24 08 
> <48> 8b 40 38 85 d2 48 8d b8 30 ff ff f   f 0f 84 06 01 00 00 48 8b 80
> 297580 [   45.014931] RSP: 0018:b66e008839d0 EFLAGS: 00010246
> 297581 [   45.020504] RAX:  RBX: b66e00883a30 RCX: 
> 00100400
> 297582 [   45.028062] RDX: 003c RSI: 8df123662138 RDI: 
> b66e00883a30
> 297583 [   45.035662] RBP: b66e00883a00 R08: b66e0088395c R09: 
> b66e00883960
> 297584 [   45.043298] R10: 00100240 R11: 0035 R12: 
> 8df1425385e8
> 297585 [   45.050916] R13: 8df13cfd1288 R14: 8df123662138 R15: 
> 8df13cfd1000
> 297586 [   45.058524] FS:  7fcc8f6b2100() GS:8df15e38() 
> knlGS:
> 297587 [   45.067114] CS:  0010 DS:  ES:  CR0: 80050033
> 297588 [   45.073206] CR2: 0038 CR3: 000641fb6006 CR4: 
> 003606e0
> 297589 [   45.080791] DR0:  DR1:  DR2: 
> 
> 297590 [   45.088277] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> 297591 [   45.095773] Call Trace:
> 297592 [   45.098354]  amdgpu_vm_bo_update_mapping+0x1c1/0x1f0 [amdgpu]
> 297593 [   45.104427]  ? mark_held_locks+0x4d/0x80
> 297594 [   45.108682]  amdgpu_vm_bo_update+0x3b7/0x960 [amdgpu]
> 297595 [   45.114049]  ? rcu_read_lock_sched_held+0x4f/0x80
> 297596 [   45.119111]  amdgpu_gem_va_ioctl+0x4f3/0x510 [amdgpu]
> 297597 [   45.124495]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
> 297598 [   45.130250]  drm_ioctl_kernel+0xb0/0x100 [drm]
> 297

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Well, submitting a job with the HW disabled should be no harm.

The only concern is that we might use up IBs if we park the scheduler during 
recovery. I have seen recovery get stuck in the sa new function.

The ring test allocs IBs to check whether recovery succeeded or not. But if there are not 
enough IBs it will wait for fences to signal. However, we have parked the scheduler 
thread, so the jobs will never run and no fences will be signaled.

See, a deadlock indeed. Now we are allowing job submission here, so it is more 
likely that IBs might be used up.
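The dependency cycle, roughly (see the deadlock calltrace in the follow-up mail):

	gpu recovery
	  -> park scheduler threads        (pending jobs never run, their fences
	                                    never signal, their IBs never free)
	  -> ring/IB tests: amdgpu_ib_get()
	       -> amdgpu_sa_bo_new()                  (IB suballocator exhausted)
	            -> dma_fence_wait_any_timeout()   waits on those same fences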


From: Koenig, Christian 
Sent: Wednesday, March 25, 2020 7:13:13 PM
To: Das, Nirmoy 
Cc: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
; Kuehling, Felix 
Subject: Re: [PATCH] drm/amdgpu: Check entity rq

Hi guys,

thanks for pointing this out Nirmoy.

Yeah, could be that I forgot to commit the patch. Currently I don't know at 
which end of the chaos I should start to clean up.

Christian.

On 25.03.2020 12:09, "Das, Nirmoy" wrote:
Hi Xinhui,


Can you please check if you can reproduce the crash with
https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html

Christian fix it earlier, I think he forgot to push it.


Regards,

Nirmoy

On 3/25/20 12:07 PM, xinhui pan wrote:
> gpu recover will call sdma suspend/resume. In this period, ring will be
> disabled. So the vm_pte_scheds(sdma.instance[X].ring.sched)->ready will
> be false.
>
> If we submit any jobs in this ring-disabled period. We fail to pick up
> a rq for vm entity and entity->rq will set to NULL.
> amdgpu_vm_sdma_commit did not check the entity->rq, so fix it. Otherwise
> hit panic.
>
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index cf96c335b258..d30d103e48a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -95,6 +95,8 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>int r;
>
>entity = p->direct ? &p->vm->direct : &p->vm->delayed;
> + if (!entity->rq)
> + return -ENOENT;
>ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
>
>WARN_ON(ib->length_dw == 0);


On 25.03.2020 12:09, "Das, Nirmoy" wrote:
Hi Xinhui,


Can you please check if you can reproduce the crash with
https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html

Christian fix it earlier, I think he forgot to push it.


Regards,

Nirmoy

On 3/25/20 12:07 PM, xinhui pan wrote:
> gpu recover will call sdma suspend/resume. In this period, ring will be
> disabled. So the vm_pte_scheds(sdma.instance[X].ring.sched)->ready will
> be false.
>
> If we submit any jobs in this ring-disabled period. We fail to pick up
> a rq for vm entity and entity->rq will set to NULL.
> amdgpu_vm_sdma_commit did not check the entity->rq, so fix it. Otherwise
> hit panic.
>
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Felix Kuehling 
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index cf96c335b258..d30d103e48a2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -95,6 +95,8 @@ static int amdgpu_vm_sdma_commit(struct 
> amdgpu_vm_update_params *p,
>int r;
>
>entity = p->direct ? &p->vm->direct : &p->vm->delayed;
> + if (!entity->rq)
> + return -ENOENT;
>ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
>
>WARN_ON(ib->length_dw == 0);


On 25.03.2020 12:09, "Das, Nirmoy" wrote:
Hi Xinhui,


Can you please check if you can reproduce the crash with
https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html

Christian fix it earlier, I think he forgot to push it.


Regards,

Nirmoy

On 3/25/20 12:07 PM, xinhui pan wrote:
> gpu recover will call sdma suspend/resume. In this period, ring will be
> disabled. So the vm_pte_scheds(sdma.instance[X].ring.sched)->ready will
> be false.
>
> If we submit any jobs in this ring-disabled period. We fail to pick up
> a rq for vm entity and entity->rq will set to NULL.
> amdgpu_vm_sdma_commit did not check the entity->rq, so fix it. Otherwise
> hit pan

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
Well, submitting a job with the HW disabled should be no harm.

The only concern is that we might use up IBs if we park the scheduler thread during 
recovery. 
I have seen recovery get stuck in the sa new function. 
The ring test allocs IBs to check whether recovery succeeded or not. But if there are not 
enough IBs it will wait for fences to signal. 
However, we have parked the scheduler thread, so the jobs will never run and no 
fences will be signaled.

See, a deadlock indeed. Now we are allowing job submission here, so it is more 
likely that IBs might be used up.

Deadlock calltrace:
271384 [27069.375047] INFO: task gnome-shell:2507 blocked for more than 120 
seconds.
271385 [27069.382510]   Tainted: GW 5.4.0-rc7+ #1
271386 [27069.388207] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
271387 [27069.396221] gnome-shell D0  2507   2487 0x
271388 [27069.401869] Call Trace:
271389 [27069.404404]  __schedule+0x2ab/0x860
271390 [27069.408009]  ? dma_fence_wait_any_timeout+0x1a4/0x2b0
271391 [27069.413198]  schedule+0x3a/0xc0
271392 [27069.416432]  schedule_timeout+0x21d/0x3c0
271393 [27069.420583]  ? trace_hardirqs_on+0x3b/0xf0
271394 [27069.424815]  ? dma_fence_add_callback+0x6e/0xe0
271395 [27069.429449]  ? dma_fence_wait_any_timeout+0x1a4/0x2b0
271396 [27069.434640]  dma_fence_wait_any_timeout+0x205/0x2b0
271397 [27069.439633]  ? dma_fence_wait_any_timeout+0x238/0x2b0
271398 [27069.444944]  amdgpu_sa_bo_new+0x4d7/0x5c0 [amdgpu]
271399 [27069.449949]  amdgpu_ib_get+0x36/0xa0 [amdgpu]
271400 [27069.454534]  amdgpu_job_alloc_with_ib+0x4d/0x70 [amdgpu]
271401 [27069.460057]  amdgpu_vm_sdma_prepare+0x28/0x60 [amdgpu]
271402 [27069.465370]  amdgpu_vm_bo_update_mapping+0xd7/0x1f0 [amdgpu]
271403 [27069.471171]  ? mark_held_locks+0x4d/0x80
271404 [27069.475281]  amdgpu_vm_bo_update+0x3b7/0x960 [amdgpu]
271405 [27069.480538]  amdgpu_gem_va_ioctl+0x4f3/0x510 [amdgpu]
271406 [27069.485838]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
271407 [27069.491380]  drm_ioctl_kernel+0xb0/0x100 [drm]
271408 [27069.496045]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
271409 [27069.501569]  ? drm_ioctl_kernel+0xb0/0x100 [drm]
271410 [27069.506353]  drm_ioctl+0x389/0x450 [drm]
271411 [27069.510458]  ? amdgpu_gem_va_map_flags+0x70/0x70 [amdgpu]
271412 [27069.516000]  ? trace_hardirqs_on+0x3b/0xf0
271413 [27069.520305]  amdgpu_drm_ioctl+0x4f/0x80 [amdgpu]
271414 [27069.525048]  do_vfs_ioctl+0xa9/0x6f0
271415 [27069.528753]  ? tomoyo_file_ioctl+0x19/0x20
271416 [27069.532972]  ksys_ioctl+0x75/0x80
271417 [27069.536396]  ? do_syscall_64+0x17/0x230
271418 [27069.540357]  __x64_sys_ioctl+0x1a/0x20
271419 [27069.544239]  do_syscall_64+0x5f/0x230


> On 2020-03-25 19:13, Koenig, Christian wrote:
> 
> Hi guys,
> 
> thanks for pointing this out Nirmoy.
> 
> Yeah, could be that I forgot to commit the patch. Currently I don't know at 
> which end of the chaos I should start to clean up.
> 
> Christian.
> 
> On 25.03.2020 12:09, "Das, Nirmoy" wrote:
> Hi Xinhui,
> 
> 
> Can you please check if you can reproduce the crash with 
> https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html
> 
> Christian fix it earlier, I think he forgot to push it.
> 
> 
> Regards,
> 
> Nirmoy
> 
> On 3/25/20 12:07 PM, xinhui pan wrote:
> > gpu recover will call sdma suspend/resume. In this period, ring will be
> > disabled. So the vm_pte_scheds(sdma.instance[X].ring.sched)->ready will
> > be false.
> >
> > If we submit any jobs in this ring-disabled period. We fail to pick up
> > a rq for vm entity and entity->rq will set to NULL.
> > amdgpu_vm_sdma_commit did not check the entity->rq, so fix it. Otherwise
> > hit panic.
> >
> > Cc: Christian König 
> > Cc: Alex Deucher 
> > Cc: Felix Kuehling 
> > Signed-off-by: xinhui pan 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> > index cf96c335b258..d30d103e48a2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> > @@ -95,6 +95,8 @@ static int amdgpu_vm_sdma_commit(struct 
> > amdgpu_vm_update_params *p,
> >int r;
> >   
> >entity = p->direct ? &p->vm->direct : &p->vm->delayed;
> > + if (!entity->rq)
> > + return -ENOENT;
> >ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
> >   
> >WARN_ON(ib->length_dw == 0);
> 
> 
> On 25.03.2020 12:09, "Das, Nirmoy" wrote:
> Hi Xinhui,
> 
> 
> Can you please check if you can reproduce the crash with 
> https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html
> 
> Christian fix it earlier, I think he forgot to push it.
> 
> 
> Regards,
> 
> Nirmoy
> 
> On 3/25/20 12:07 PM, xinhui pan wrote:
> > gpu recover will call sdma suspend/resume. In this period, ring will be
> > disabled. So the vm_pte_scheds(sdma.instance[X].ring.sched)->ready wi

Re: [RFC PATCH 0/2] add direct IB pool

2020-03-25 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Yes, the IB test and vram restore will alloc IBs.

I hit this issue quite a long time ago. We ran benchmarks on an ARM server 
which is running Android.
Hundreds of processes hit too many issues: panic and memory corruption 
everywhere.

Now I have a little time to fix this deadlock.

If you want to repro it, set the gpu timeout to 50ms, then run vulkan, ocl, 
amdgputest, etc. together.
I believe you will see more weird issues.


From: Liu, Monk 
Sent: Thursday, March 26, 2020 1:31:04 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Kuehling, Felix 
; Pan, Xinhui ; Koenig, Christian 

Subject: RE: [RFC PATCH 0/2] add direct IB pool

That sounds like a roughly doable plan to me, although we didn't hit this issue in 
our virtualization stress test; it does look like a possible issue.

>>> So the ring test above got stuck if no ib to alloc.
Why is there an IB alloc happening in the ring test? I remember there is no IB 
allocated for the ring test; are you referring to the IB test?



_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of xinhui pan
Sent: Thursday, March 26, 2020 10:02 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kuehling, Felix 
; Pan, Xinhui ; Koenig, Christian 

Subject: [RFC PATCH 0/2] add direct IB pool

During gpu recovery, we alloc ibs for ring tests to check whether recovery succeeded or 
not.

As gpu recovery has parked the gpu scheduler thread, any pending job holding an ib 
resource has no chance to free it. So the ring test above gets stuck if there is no ib to 
alloc.

If we schedule IBs directly in job_submit_direct, we can alloc ibs from the new 
ib pool. That should have less contention.

If the IB can be freed in time, IOW not depending on any scheduler nor any 
other blocking code, it is better to alloc ibs from the direct pool.
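Rough sketch of how the call sites are meant to pick a pool, using the enum and
parameter order from the v2 patch later in this archive (the vm and size arguments
stay whatever each call site already passes):

	r = amdgpu_ib_get(adev, vm, size, &ib, AMDGPU_IB_POOL_DIRECT); /* IB tests, direct submit */
	r = amdgpu_ib_get(adev, vm, size, &ib, AMDGPU_IB_POOL_VM);     /* direct VM updates       */
	r = amdgpu_ib_get(adev, vm, size, &ib, AMDGPU_IB_POOL_NORMAL); /* scheduler jobs          */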

xinhui pan (2):
  drm/amdgpu: add direct ib pool
  drm/amdgpu: use new job alloc variation if possible

 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 12 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  4 ++--
 13 files changed, 35 insertions(+), 18 deletions(-)

--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [RFC PATCH 1/2] drm/amdgpu: add direct ib pool

2020-03-25 Thread Pan, Xinhui


> On 2020-03-26 13:38, Koenig, Christian wrote:
> 
> Yeah that's on my TODO list for quite a while as well.
> 
> But we even need three IB pools. One very small for the IB tests, one for 
> direct VM updates and one for the rest.
> 
> So please make the pool a parameter to ib_get() and not the hack you have 
> below.

Yep, I will make the IB pool a parameter.

IB tests for gfx need many IBs; PAGE_SIZE for the ib pool is still not enough.
But the default size for the ib pool is 2MB now, just one hugepage, and today we have 
memory in TB.
So there is no need to make a different size for the IB tests pool.

> 
> Thanks,
> Christian.
> 
> On 26.03.2020 03:02, "Pan, Xinhui" wrote:
> Another ib pool for direct submit.
> Any job which schedules IBs without dependence on the gpu scheduler should use
> this pool first.
> 
> Signed-off-by: xinhui pan 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 12 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  3 ++-
>  5 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 7dd74253e7b6..c01423ffb8ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -849,6 +849,7 @@ struct amdgpu_device {
>  struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
>  boolib_pool_ready;
>  struct amdgpu_sa_managerring_tmp_bo;
> +   struct amdgpu_sa_managerring_tmp_bo_direct;
>  
>  /* interrupts */
>  struct amdgpu_irq   irq;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 8304d0c87899..28be4efb3d5b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -920,7 +920,7 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
>  parser->entity = entity;
>  
>  ring = to_amdgpu_ring(entity->rq->sched);
> -   r =  amdgpu_ib_get(adev, vm, ring->funcs->parse_cs ?
> +   r =  amdgpu_ib_get(adev, (unsigned long )vm|0x1, 
> ring->funcs->parse_cs ?
> chunk_ib->ib_bytes : 0, ib);
>  if (r) {
>  DRM_ERROR("Failed to get ib !\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index bece01f1cf09..f2e08c372d57 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -66,7 +66,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
>  int r;
>  
>  if (size) {
> -   r = amdgpu_sa_bo_new(&adev->ring_tmp_bo,
> +   r = amdgpu_sa_bo_new(vm ? &adev->ring_tmp_bo : 
> &adev->ring_tmp_bo_direct,
>&ib->sa_bo, size, 256);
>  if (r) {
>  dev_err(adev->dev, "failed to get a new IB (%d)\n", 
> r);
> @@ -75,7 +75,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
>  
>  ib->ptr = amdgpu_sa_bo_cpu_addr(ib->sa_bo);
>  
> -   if (!vm)
> +   if (!((unsigned long)vm & ~0x1))
>  ib->gpu_addr = amdgpu_sa_bo_gpu_addr(ib->sa_bo);
>  }
>  
> @@ -310,6 +310,13 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
>  return r;
>  }
>  
> +   r = amdgpu_sa_bo_manager_init(adev, &adev->ring_tmp_bo_direct,
> + AMDGPU_IB_POOL_SIZE*64*1024,
> + AMDGPU_GPU_PAGE_SIZE,
> + AMDGPU_GEM_DOMAIN_GTT);
> +   if (r) {
> +   return r;
> +   }
>  adev->ib_pool_ready = true;
>  
>  return 0;
> @@ -327,6 +334,7 @@ void amdgpu_ib_pool_fini(struct amdgpu_device *adev)
>  {
>  if (adev->ib_pool_ready) {
>  amdgpu_sa_bo_manager_fini(adev, &adev->ring_tmp_bo);
> +   amdgpu_sa_bo_manager_fini(adev, &adev->ring_tmp_bo_direct);
>  adev->ib_pool_ready = false;
>  }
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 4981e443a884..6a63826c6760 100644
> --- a/drivers/gpu/drm/amd/am

Re: [RFC PATCH 1/2] drm/amdgpu: add direct ib pool

2020-03-25 Thread Pan, Xinhui


> On 2020-03-26 14:36, Koenig, Christian wrote:
> 
> 
> 
> On 26.03.2020 07:15, "Pan, Xinhui" wrote:
> 
> 
> > On 2020-03-26 13:38, Koenig, Christian wrote:
> > 
> > Yeah that's on my TODO list for quite a while as well.
> > 
> > But we even need three IB pools. One very small for the IB tests, one for 
> > direct VM updates and one for the rest.
> > 
> > So please make the pool a parameter to ib_get() and not the hack you have 
> > below.
> 
> yep, I will make IB pool  a parameter.
> 
> IB tests for gfx need many IBs, PAGE_SIZE for ib pool is still not enough.
> but the default size for ib pool is 2MB now, just one hugepage, today we have 
> memory in TB.
> so no need make a different size for IB tests pool.
> 
> 2MB is probably a bit much and we don't have huge page optimisation for 
> kernel allocations at the moment anyway. Keep in mind that we have only 
> limited space in the GART.
The gart table is just 512MB.
Do you mean every entry in the gart table just points to one 4KB page, and we need 5 
gart table entries for one 2M hugepage?

> 
> Maybe make this 4*PAGE_SIZE for the new IB pool for now and test if that 
> works or not.
> 
> Christian.
> 
> 
> 
> 
> > 
> > Thanks,
> > Christian.
> > 
> > On 26.03.2020 03:02, "Pan, Xinhui" wrote:
> > Another ib poll for direct submit.
> > Any jobs schedule IBs without dependence on gpu scheduler should use
> > this pool firstly.
> > 
> > Signed-off-by: xinhui pan 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 12 ++--
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 +++-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  3 ++-
> >  5 files changed, 21 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 7dd74253e7b6..c01423ffb8ed 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -849,6 +849,7 @@ struct amdgpu_device {
> >  struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
> >  boolib_pool_ready;
> >  struct amdgpu_sa_managerring_tmp_bo;
> > +   struct amdgpu_sa_managerring_tmp_bo_direct;
> >  
> >  /* interrupts */
> >  struct amdgpu_irq   irq;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 8304d0c87899..28be4efb3d5b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -920,7 +920,7 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
> >  parser->entity = entity;
> >  
> >  ring = to_amdgpu_ring(entity->rq->sched);
> > -   r =  amdgpu_ib_get(adev, vm, ring->funcs->parse_cs ?
> > +   r =  amdgpu_ib_get(adev, (unsigned long )vm|0x1, 
> > ring->funcs->parse_cs ?
> > chunk_ib->ib_bytes : 0, ib);
> >  if (r) {
> >  DRM_ERROR("Failed to get ib !\n");
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> > index bece01f1cf09..f2e08c372d57 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> > @@ -66,7 +66,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct 
> > amdgpu_vm *vm,
> >  int r;
> >  
> >  if (size) {
> > -   r = amdgpu_sa_bo_new(&adev->ring_tmp_bo,
> > +   r = amdgpu_sa_bo_new(vm ? &adev->ring_tmp_bo : 
> > &adev->ring_tmp_bo_direct,
> >&ib->sa_bo, size, 256);
> >  if (r) {
> >  dev_err(adev->dev, "failed to get a new IB 
> > (%d)\n", r);
> > @@ -75,7 +75,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct 
> > amdgpu_vm *vm,
> >  
> >  ib->ptr = amdgpu_sa_bo_cpu_addr(ib->sa_bo);
> >  
> > -   if (!vm)
> > +   if (!((unsigned long)vm & ~0x1))
> >  ib->gpu_addr = amdgpu_sa_bo_gpu_addr(ib->sa_bo);
> >  }
> >  
> > @@ -310,6 +310,13 @@ int amdgpu

Re: [RFC PATCH 1/2] drm/amdgpu: add direct ib pool

2020-03-26 Thread Pan, Xinhui


> On 2020-03-26 14:51, Koenig, Christian wrote:
> 
> 
> 
> On 26.03.2020 07:45, "Pan, Xinhui" wrote:
> 
> 
> > On 2020-03-26 14:36, Koenig, Christian wrote:
> > 
> > 
> > 
> > On 26.03.2020 07:15, "Pan, Xinhui" wrote:
> > 
> > 
> > > On 2020-03-26 13:38, Koenig, Christian wrote:
> > > 
> > > Yeah that's on my TODO list for quite a while as well.
> > > 
> > > But we even need three IB pools. One very small for the IB tests, one for 
> > > direct VM updates and one for the rest.
> > > 
> > > So please make the pool a parameter to ib_get() and not the hack you have 
> > > below.
> > 
> > yep, I will make IB pool  a parameter.
> > 
> > IB tests for gfx need many IBs, PAGE_SIZE for ib pool is still not enough.
> > but the default size for ib pool is 2MB now, just one hugepage, today we 
> > have memory in TB.
> > so no need make a different size for IB tests pool.
> > 
> > 2MB is probably a bit much and we don't have huge page optimisation for 
> > kernel allocations at the moment anyway. Keep in mind that we have only 
> > limited space in the GART.
> gart table is just 512MB.
> do you mean every entry in gart table just points to one 4KB page? and need 5 
> gart table entries for one 2M hugepage? 
> 
> Yes, we need 512 * 4KB entries for a 2MB page in GART. The table for the 
> system VM is flat because of hardware restrictions.
> 
oh yes, 512 entries.
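(That is, 2MB / 4KB = 512 GART entries per 2MB page; assuming the usual 8 bytes per 
entry, that is 4KB of GART table space for every 2MB of GART-mapped memory.)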

> IIRC we tried 256MB for the GART initially and in general we try to keep that 
> as small as possible because it eats up visible VRAM space.
That is true. But our roadmap tells us that we are a Linux ML team. 
I cannot imagine that customers use small pci bar servers. Well, they care about the 
cost, but I would prefer they use new products. :)

Anyway, I will choose the smallest workable default value for now.

> 
> Christian.
> 
> 
> > 
> > Maybe make this 4*PAGE_SIZE for the new IB pool for now and test if that 
> > works or not.
> > 
> > Christian.
> > 
> > 
> > 
> > 
> > > 
> > > Thanks,
> > > Christian.
> > > 
> > > On 26.03.2020 03:02, "Pan, Xinhui" wrote:
> > > Another ib poll for direct submit.
> > > Any jobs schedule IBs without dependence on gpu scheduler should use
> > > this pool firstly.
> > > 
> > > Signed-off-by: xinhui pan 
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 12 ++--
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  8 +++-
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  3 ++-
> > >  5 files changed, 21 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > index 7dd74253e7b6..c01423ffb8ed 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > > @@ -849,6 +849,7 @@ struct amdgpu_device {
> > >  struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
> > >  boolib_pool_ready;
> > >  struct amdgpu_sa_managerring_tmp_bo;
> > > +   struct amdgpu_sa_managerring_tmp_bo_direct;
> > >  
> > >  /* interrupts */
> > >  struct amdgpu_irq   irq;
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index 8304d0c87899..28be4efb3d5b 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -920,7 +920,7 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device 
> > > *adev,
> > >  parser->entity = entity;
> > >  
> > >  ring = to_amdgpu_ring(entity->rq->sched);
> > > -   r =  amdgpu_ib_get(adev, vm, ring->funcs->parse_cs ?
> > > +   r =  amdgpu_ib_get(adev, (unsigned long )vm|0x1, 
> > > ring->funcs->parse_cs ?
> > > chunk_ib->ib_bytes : 0, ib);
> > >  if (r) {
> > >  DRM_ERROR("Failed to get ib !\n");
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> > > ind

Re: [PATCH v2] drm/amdgpu: implement more ib pools

2020-03-27 Thread Pan, Xinhui


> On 2020-03-27 16:24, Koenig, Christian wrote:
> 
>> On 27.03.20 at 04:08, xinhui pan wrote:
>> We have three ib pools: the normal, VM and direct pools.
>> 
>> Any job which schedules IBs without dependence on the gpu scheduler should
>> use the DIRECT pool.
>> 
>> Any job which schedules direct VM update IBs should use the VM pool.
>> 
>> Any other jobs use the NORMAL pool.
>> 
>> Signed-off-by: xinhui pan 
> 
> Two more coding style suggestions below, with those fixed feel free to add a 
> Reviewed-by: Christian König .
> 
> But in general your function parameter indentation is sometimes off. Not much 
> of an issue, but what editor and settings are you using?

I use vim with
set tabstop=4
set shiftwidth=4

But this time I used sed to replace some code.

> 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 12 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  | 41 +++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  5 ++-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h |  4 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c|  3 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  8 ++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c |  6 ++-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c |  9 +++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |  6 ++-
>>  drivers/gpu/drm/amd/amdgpu/cik_sdma.c   |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c   |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  6 ++-
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  6 ++-
>>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  2 +-
>>  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/si_dma.c |  3 +-
>>  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   |  6 ++-
>>  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   |  6 ++-
>>  25 files changed, 103 insertions(+), 49 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 7dd74253e7b6..649bf5b8ea4e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -390,6 +390,13 @@ struct amdgpu_sa_bo {
>>  int amdgpu_fence_slab_init(void);
>>  void amdgpu_fence_slab_fini(void);
>>  +enum amdgpu_ib_pool_type {
>> +AMDGPU_IB_POOL_NORMAL = 0,
>> +AMDGPU_IB_POOL_VM,
>> +AMDGPU_IB_POOL_DIRECT,
>> +
>> +AMDGPU_IB_POOL_MAX
>> +};
>>  /*
>>   * IRQS.
>>   */
>> @@ -441,7 +448,8 @@ struct amdgpu_fpriv {
>>  int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv);
>>int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>> -  unsigned size, struct amdgpu_ib *ib);
>> +  unsigned size, struct amdgpu_ib *ib,
>> +  enum amdgpu_ib_pool_type pool);
>>  void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib,
>>  struct dma_fence *f);
>>  int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>> @@ -848,7 +856,7 @@ struct amdgpu_device {
>>  unsignednum_rings;
>>  struct amdgpu_ring  *rings[AMDGPU_MAX_RINGS];
>>  boolib_pool_ready;
>> -struct amdgpu_sa_managerring_tmp_bo;
>> +struct amdgpu_sa_managerring_tmp_bo[AMDGPU_IB_POOL_MAX];
>>  /* interrupts */
>>  struct amdgpu_irq   irq;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 59ec5e2be211..0f26668ae6f7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -921,7 +921,7 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
>>  ring = to_amdgpu_ring(entity->rq->sched);
>>  r =  amdgpu_ib_get(adev, vm, ring->funcs->parse_cs ?
>> -   chunk_ib->ib_bytes : 0, ib);
>> +   chunk_ib->ib_bytes : 0, ib, 
>> AMDGPU_IB_POOL_NORMAL);
>>  if (r) {
>>  DRM_ERROR("Failed to get ib !\n");
>>  return r;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> index bece01f1cf09..0bfcd30df051 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
>> @@ -61,12 +61,13 @@
>>   * Returns 0 on success, error on failure.
>>   */
>>  int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>> -  unsigned size, struct amdgpu_ib *ib)
>> +  unsigned size, struct amdgpu_ib *ib,
>> +  enum amdgpu_ib

Re: [PATCH 1/2] drm/amdgpu: fix and cleanup amdgpu_gem_object_close v2

2020-03-30 Thread Pan, Xinhui
Reviewed-by: xinhui pan 


> On 2020-03-30 18:50, Christian König wrote:
> 
> The problem is that we can't add the clear fence to the BO
> when there is an exclusive fence on it since we can't
> guarantee that the clear fence will complete after the
> exclusive one.
> 
> To fix this refactor the function and also add the exclusive
> fence as shared to the resv object.
> 
> v2: fix warning
> v3: add excl fence as shared instead
> 
> Signed-off-by: Christian König 
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 41 ++---
> 1 file changed, 23 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 5bec66e6b1f8..a0be80513e96 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -161,16 +161,17 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
> 
>   struct amdgpu_bo_list_entry vm_pd;
>   struct list_head list, duplicates;
> + struct dma_fence *fence = NULL;
>   struct ttm_validate_buffer tv;
>   struct ww_acquire_ctx ticket;
>   struct amdgpu_bo_va *bo_va;
> - int r;
> + long r;
> 
>   INIT_LIST_HEAD(&list);
>   INIT_LIST_HEAD(&duplicates);
> 
>   tv.bo = &bo->tbo;
> - tv.num_shared = 1;
> + tv.num_shared = 2;
>   list_add(&tv.head, &list);
> 
>   amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
> @@ -178,28 +179,32 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
>   r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
>   if (r) {
>   dev_err(adev->dev, "leaking bo va because "
> - "we fail to reserve bo (%d)\n", r);
> + "we fail to reserve bo (%ld)\n", r);
>   return;
>   }
>   bo_va = amdgpu_vm_bo_find(vm, bo);
> - if (bo_va && --bo_va->ref_count == 0) {
> - amdgpu_vm_bo_rmv(adev, bo_va);
> + if (!bo_va || --bo_va->ref_count)
> + goto out_unlock;
> 
> - if (amdgpu_vm_ready(vm)) {
> - struct dma_fence *fence = NULL;
> + amdgpu_vm_bo_rmv(adev, bo_va);
> + if (!amdgpu_vm_ready(vm))
> + goto out_unlock;
> 
> - r = amdgpu_vm_clear_freed(adev, vm, &fence);
> - if (unlikely(r)) {
> - dev_err(adev->dev, "failed to clear page "
> - "tables on GEM object close (%d)\n", r);
> - }
> + fence = dma_resv_get_excl(bo->tbo.base.resv);
> + amdgpu_bo_fence(bo, fence, true);
> + fence = NULL;
> 
> - if (fence) {
> - amdgpu_bo_fence(bo, fence, true);
> - dma_fence_put(fence);
> - }
> - }
> - }
> + r = amdgpu_vm_clear_freed(adev, vm, &fence);
> + if (r || !fence)
> + goto out_unlock;
> +
> + amdgpu_bo_fence(bo, fence, true);
> + dma_fence_put(fence);
> +
> +out_unlock:
> + if (unlikely(r < 0))
> + dev_err(adev->dev, "failed to clear page "
> + "tables on GEM object close (%ld)\n", r);
>   ttm_eu_backoff_reservation(&ticket, &list);
> }
> 
> -- 
> 2.17.1
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: fix fence handling in amdgpu_gem_object_close

2020-03-31 Thread Pan, Xinhui
Reviewed-by: xinhui pan 

> On 2020-03-31 22:25, Christian König wrote:
> 
> The exclusive fence is only optional.
> 
> Signed-off-by: Christian König 
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 --
> 1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index a0be80513e96..77d988a0033f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -191,8 +191,10 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
>   goto out_unlock;
> 
>   fence = dma_resv_get_excl(bo->tbo.base.resv);
> - amdgpu_bo_fence(bo, fence, true);
> - fence = NULL;
> + if (fence) {
> + amdgpu_bo_fence(bo, fence, true);
> + fence = NULL;
> + }
> 
>   r = amdgpu_vm_clear_freed(adev, vm, &fence);
>   if (r || !fence)
> -- 
> 2.17.1
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

2020-04-09 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Why do we break out of the loop when there are pending bos to be released?

And I just checked process_one_work. Right after the work item callback is 
called, the workqueue itself will call cond_resched. So I think

From: Koenig, Christian 
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete 
worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 09.04.2020 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device and might be very huge. And in
>>> a heavy workload test, the list might never be empty. That will
>>> trigger RCU stall warnings or softlockups in non-preemptible kernels.
>>> Let's schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Workitems can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be
scheduled purposely. E.g. you shouldn't call schedule() or
cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly
fine.

>   The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not if
not really necessary (e.g. waiting for a lock).

Otherwise as you said new threads for work item processing are started
up and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most
likely the better option.

Regards,
Christian.
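A sketch of what that alternative loop could look like (note: there is no
should_reschedule() helper in the kernel; need_resched() is presumably what is
meant):

	while (!list_empty(&bdev->ddestroy) && !need_resched()) {
		/* ... release one BO, exactly as the existing loop body does ... */
	}
	/* anything left over is picked up by the next delayed delete cycle */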

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy)
>> && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan 
>>> ---
>>>drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device 
>>> *bdev, bool remove_all)
>>>  }
>>>
>>>  ttm_bo_put(bo);
>>> +   cond_resched();
>>>  spin_lock(&glob->lru_lock);
>>>  }
>>>  list_splice_tail(&removed, &bdev->ddestroy);
>> ___
>> dri-devel mailing list
>> dri-de...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker

2020-04-09 Thread Pan, Xinhui
I think it doesn't matter if the workitem schedules out. Even if we did not schedule 
out, the workqueue itself will schedule out later.
So this patch does not break anything, I think.

From: Pan, Xinhui 
Sent: Thursday, April 9, 2020 10:07:09 PM
To: Lucas Stach ; amd-gfx@lists.freedesktop.org 
; Koenig, Christian 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete 
worker

Why we break out the loops when there are pending bos to be released?

And I just checked the process_one_work. Right after the work item callback is 
called,  the workqueue itself will call cond_resched. So I think

From: Koenig, Christian 
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete 
worker

On 09.04.20 at 15:25, Lucas Stach wrote:
> On Thursday, 09.04.2020 at 14:35 +0200, Christian König wrote:
>> On 09.04.20 at 03:31, xinhui pan wrote:
>>> The delayed delete list is per device which might be very huge. And in
>>> a heavy workload test, the list might always not be empty. That will
>>> trigger any RCU stall warnings or softlockups in non-preemptible kernels
>>> Lets do schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Workitems can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be
scheduled purposely. E.g. you shouldn't call schedule() or
cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly
fine.

>   The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not if
not really necessary (e.g. waiting for a lock).

Otherwise as you said new threads for work item processing are started
up and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most
likely the better option.

Regards,
Christian.

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy)
>> && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan 
>>> ---
>>>drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device 
>>> *bdev, bool remove_all)
>>>  }
>>>
>>>  ttm_bo_put(bo);
>>> +   cond_resched();
>>>  spin_lock(&glob->lru_lock);
>>>  }
>>>  list_splice_tail(&removed, &bdev->ddestroy);
>> ___
>> dri-devel mailing list
>> dri-de...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel



Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

2020-04-09 Thread Pan, Xinhui
https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026

This is another example of the usage of cond_resched.

From: Pan, Xinhui 
Sent: Thursday, April 9, 2020 10:11:08 PM
To: Lucas Stach ; amd-gfx@lists.freedesktop.org 
; Koenig, Christian 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
worker

I think it doesn't matter if workitem schedule out. Even we did not schedule 
out, the workqueue itself will schedule out later.
So it did not break anything with this patch I think.

From: Pan, Xinhui 
Sent: Thursday, April 9, 2020 10:07:09 PM
To: Lucas Stach ; amd-gfx@lists.freedesktop.org 
; Koenig, Christian 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
worker

Why we break out the loops when there are pending bos to be released?

And I just checked the process_one_work. Right after the work item callback is 
called,  the workqueue itself will call cond_resched. So I think

From: Koenig, Christian 
Sent: Thursday, April 9, 2020 9:38:24 PM
To: Lucas Stach ; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org 
Cc: dri-de...@lists.freedesktop.org 
Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
worker

Am 09.04.20 um 15:25 schrieb Lucas Stach:
> Am Donnerstag, den 09.04.2020, 14:35 +0200 schrieb Christian König:
>> Am 09.04.20 um 03:31 schrieb xinhui pan:
>>> The delayed delete list is per device which might be very huge. And in
>>> a heavy workload test, the list might always not be empty. That will
>>> trigger any RCU stall warnings or softlockups in non-preemptible kernels
>>> Lets do schedule out if possible in that case.
>> Mhm, I'm not sure if that is actually allowed. This is called from a
>> work item and those are not really supposed to be scheduled away.
> Huh? Workitems can schedule out just fine, otherwise they would be
> horribly broken when it comes to sleeping locks.

Let me refine the sentence: Work items are not really supposed to be
scheduled purposely. E.g. you shouldn't call schedule() or
cond_resched() like in the case here.

Getting scheduled away because we wait for a lock is of course perfectly
fine.

>   The workqueue code
> even has measures to keep the workqueues at the expected concurrency
> level by starting other workitems when one of them goes to sleep.

Yeah, and exactly that's what I would say we should avoid here :)

In other words work items can be scheduled away, but they should not if
not really necessary (e.g. waiting for a lock).

Otherwise as you said new threads for work item processing are started
up and I don't think we want that.

Just returning from the work item and waiting for the next cycle is most
likely the better option.

Regards,
Christian.

>
> Regards,
> Lucas
>
>> Maybe rather change the while into while (!list_empty(&bdev->ddestroy)
>> && !should_reschedule(0)).
>>
>> Christian.
>>
>>> Signed-off-by: xinhui pan 
>>> ---
>>>drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 9e07c3f75156..b8d853cab33b 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device 
>>> *bdev, bool remove_all)
>>>  }
>>>
>>>  ttm_bo_put(bo);
>>> +   cond_resched();
>>>  spin_lock(&glob->lru_lock);
>>>  }
>>>  list_splice_tail(&removed, &bdev->ddestroy);



Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

2020-04-09 Thread Pan, Xinhui


> On Apr 9, 2020, at 22:59, Koenig, Christian  wrote:
> 
>> Why we break out the loops when there are pending bos to be released?
> 
> We do this anyway if we can't acquire the necessary locks. Freeing already 
> deleted BOs is just a very lazy background work.

That is true. Eviction will reclaim the BO resources too.

> 
>> So it did not break anything with this patch I think.
> 
> Oh, the patch will certainly work. I'm just not sure if it's the ideal 
> behavior.
> 
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>> 
>> This is another example of the usage of  cond_sched.
> 
> Yes, and that is also a good example of what I mean here:
> 
>>  if (!mutex_trylock(&slab_mutex))
>> 
>>  
>> /* Give up. Setup the next iteration. */
>> 
>>  
>> goto out;
> 
> If the function can't acquire the lock immediately it gives up and waits for 
> the next iteration.
> 
> I think it would be better if we do this in TTM as well if we spend to much 
> time cleaning up old BOs.

Fair enough.
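For illustration, here is a rough sketch of that "give up and retry next cycle" idea
applied to the loop this patch touches. The time budget and its value are made up for
the example; this is not actual TTM code:

    /* inside ttm_bo_delayed_delete(), per the hunk quoted in this thread */
    unsigned long deadline = jiffies + msecs_to_jiffies(10);   /* made-up budget */

    while (!list_empty(&bdev->ddestroy)) {
            if (time_after(jiffies, deadline))
                    break;  /* give up; the next delayed-delete cycle continues */
            ...     /* existing per-BO cleanup, ttm_bo_put(bo), relock, etc. */
    }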

> 
> On the other hand you are right that cond_resched() has the advantage that we 
> could spend more time on cleaning up old BOs if there is nothing else for the 
> CPU TODO.
> 
> Regards,
> Christian.
> 
> Am 09.04.20 um 16:24 schrieb Pan, Xinhui:
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>> 
>> This is another example of the usage of  cond_sched.
>> From: Pan, Xinhui 
>> Sent: Thursday, April 9, 2020 10:11:08 PM
>> To: Lucas Stach ; amd-gfx@lists.freedesktop.org 
>> ; Koenig, Christian 
>> Cc: dri-de...@lists.freedesktop.org 
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
>> worker
>>  
>> I think it doesn't matter if workitem schedule out. Even we did not schedule 
>> out, the workqueue itself will schedule out later.
>> So it did not break anything with this patch I think.
>> From: Pan, Xinhui 
>> Sent: Thursday, April 9, 2020 10:07:09 PM
>> To: Lucas Stach ; amd-gfx@lists.freedesktop.org 
>> ; Koenig, Christian 
>> Cc: dri-de...@lists.freedesktop.org 
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
>> worker
>>  
>> Why we break out the loops when there are pending bos to be released?
>> 
>> And I just checked the process_one_work. Right after the work item callback 
>> is called,  the workqueue itself will call cond_resched. So I think
>> From: Koenig, Christian 
>> Sent: Thursday, April 9, 2020 9:38:24 PM
>> To: Lucas Stach ; Pan, Xinhui ; 
>> amd-gfx@lists.freedesktop.org 
>> Cc: dri-de...@lists.freedesktop.org 
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed delete 
>> worker
>>  
>> Am 09.04.20 um 15:25 schrieb Lucas Stach:
>> > Am Donnerstag, den 09.04.2020, 14:35 +0200 schrieb Christian König:
>> >> Am 09.04.20 um 03:31 schrieb xinhui pan:
>> >>> The delayed delete list is per device which might be very huge. And in
>> >>> a heavy workload test, the list might always not be empty. That will
>> >>> trigger any RCU stall warnings or softlockups in non-preemptible kernels
>> >>> Lets do schedule out if possible in that case.
>> >> Mhm, I'm not sure if that is actually allowed. This is called from a
>> >> work item and those are not really supposed to be scheduled away.
>> > Huh? Workitems can schedule out just fine, otherwise they would be
>> > horribly broken when it comes to sleeping locks.
>> 
>> Let me refine the sentence: Work items are not really supposed to be 
>> scheduled purposely. E.g. you shouldn't call schedule() or 
>> cond_resched() like in the case here.
>> 
>> Getting scheduled away because we wait for a lock is of course perfectly 
>> fine.
>> 
>> >   The workqueue code
>> > even has measures to keep the workqueues at the expected concurrency
>> > level by starting other workitems when one of them goes to sleep.
>> 
>> Yeah, and exactly that's what I would say we should avoid here :)
>> 
>> In other words work items can be scheduled away, but they should not if 
>> not really necessary (e.g. waiting for a lock).
>> 
>> Otherwise as you said new threads for work item processing are started 
>> up and I don't think we want that.
>> 
>> Just returning from the work item and waiting for the next cycle is most 
>> likely the better option.
>> 
>> Regards,
>> Christian.
>> 

Re: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video playback (v2)

2020-04-11 Thread Pan, Xinhui
Prike,
I hit this issue too. Reboot hung with my Vega10; it is OK with Navi10.

From: amd-gfx  on behalf of Liang, Prike 

Sent: Sunday, April 12, 2020 11:49:39 AM
To: Johannes Hirte 
Cc: Deucher, Alexander ; Huang, Ray 
; Quan, Evan ; 
amd-gfx@lists.freedesktop.org 
Subject: RE: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video 
playback (v2)

Thanks update and verify. Could you give more detail information and error log 
message
about you observed issue?

Thanks,
Prike
> -Original Message-
> From: Johannes Hirte 
> Sent: Sunday, April 12, 2020 7:56 AM
> To: Liang, Prike 
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Huang, Ray ;
> Quan, Evan 
> Subject: Re: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video
> playback (v2)
>
> On 2020 Apr 07, Prike Liang wrote:
> > The system will be hang up during S3 suspend because of SMU is pending
> > for GC not respose the register CP_HQD_ACTIVE access request.This
> > issue root cause of accessing the GC register under enter GFX CGGPG
> > and can be fixed by disable GFX CGPG before perform suspend.
> >
> > v2: Use disable the GFX CGPG instead of RLC safe mode guard.
> >
> > Signed-off-by: Prike Liang 
> > Tested-by: Mengbing Wang 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 2e1f955..bf8735b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2440,8 +2440,6 @@ static int
> > amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)  {
> >  int i, r;
> >
> > -   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> > -   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> >
> >  for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
> >  if (!adev->ip_blocks[i].status.valid)
> > @@ -3470,6 +3468,9 @@ int amdgpu_device_suspend(struct drm_device
> *dev, bool fbcon)
> >  }
> >  }
> >
> > +   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> > +   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> > +
> >  amdgpu_amdkfd_suspend(adev, !fbcon);
> >
> >  amdgpu_ras_suspend(adev);
>
>
> This breaks shutdown/reboot on my system (Dell latitude 5495).
>
> --
> Regards,
>   Johannes Hirte



Re: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video playback (v2)

2020-04-12 Thread Pan, Xinhui
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 
10 [Radeon PRO WX 8100] (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Vega
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel modules: amdgpu

Adapter  1SEG=, BN=06, DN=00, PCIID=68681002, SSID=0A0C1002)
Asic Family:  Vega10 
Flash Type :  M25P80  (1024 KB)
Product Name   :  Vega10 D05111 32Mx128 8GB 852e/1000m 1.00V 
Bios Config File   :  D0511100.109   
Bios P/N   :  113-D0511100-109
Bios Version   :  016.001.001.000.011125
Bios Date  :  09/22/18 10:48 
ROM Image Type :  Hybrid Images
ROM Image Details  :  
Image[0]: Size(61952 Bytes), Type(Legacy Image)
Image[1]: Size(43520 Bytes), Type(EFI Image)

From: "Liang, Prike" 
Date: Monday, April 13, 2020, 12:23
To: "Pan, Xinhui" , Johannes Hirte 

Cc: "Deucher, Alexander" , "Huang, Ray" 
, "Quan, Evan" , 
"amd-gfx@lists.freedesktop.org" 
Subject: RE: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video playback 
(v2)

Could you share the PCI sub revision and I try check the issue on the 
Vega10(1002:687f) but can’t find the 
reboot hang up.
 
Thanks,
Prike
From: Pan, Xinhui  
Sent: Sunday, April 12, 2020 2:58 PM
To: Johannes Hirte ; Liang, Prike 

Cc: Deucher, Alexander ; Huang, Ray 
; Quan, Evan ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video 
playback (v2)
 
Prike
I hit this issue too. reboot hung with my vega10.  it is ok with navi10.

From: amd-gfx  on behalf of Liang, Prike 

Sent: Sunday, April 12, 2020 11:49:39 AM
To: Johannes Hirte 
Cc: Deucher, Alexander ; Huang, Ray 
; Quan, Evan ; 
amd-gfx@lists.freedesktop.org 
Subject: RE: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video 
playback (v2) 
 
Thanks update and verify. Could you give more detail information and error log 
message   
about you observed issue? 

Thanks,
Prike
> -Original Message-
> From: Johannes Hirte 
> Sent: Sunday, April 12, 2020 7:56 AM
> To: Liang, Prike 
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Huang, Ray ;
> Quan, Evan 
> Subject: Re: [PATCH v2] drm/amdgpu: fix gfx hang during suspend with video
> playback (v2)
> 
> On 2020 Apr 07, Prike Liang wrote:
> > The system will be hang up during S3 suspend because of SMU is pending
> > for GC not respose the register CP_HQD_ACTIVE access request.This
> > issue root cause of accessing the GC register under enter GFX CGGPG
> > and can be fixed by disable GFX CGPG before perform suspend.
> >
> > v2: Use disable the GFX CGPG instead of RLC safe mode guard.
> >
> > Signed-off-by: Prike Liang 
> > Tested-by: Mengbing Wang 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 2e1f955..bf8735b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2440,8 +2440,6 @@ static int
> > amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)  {
> >  int i, r;
> >
> > -   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> > -   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> >
> >  for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
> >  if (!adev->ip_blocks[i].status.valid)
> > @@ -3470,6 +3468,9 @@ int amdgpu_device_suspend(struct drm_device
> *dev, bool fbcon)
> >  }
> >  }
> >
> > +   amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> > +   amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
> > +
> >  amdgpu_amdkfd_suspend(adev, !fbcon);
> >
> >  amdgpu_ras_suspend(adev);
> 
> 
> This breaks shutdown/reboot on my system (Dell latitude 5495).
> 
> --
> Regards,
>   Johannes Hirte



Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

This patch should fix the panic.

But I would like you to NOT add the adev xgmi head to the local device list if a RAS
UE occurs while the GPU is already in GPU recovery.

From: amd-gfx  on behalf of Christian 
K?nig 
Sent: Friday, April 17, 2020 5:17:22 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org 
; Zhang, Hawking ; Li, 
Dennis ; Clements, John 
Subject: Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on 
sGPU

Am 16.04.20 um 17:47 schrieb Guchun Chen:
> When running ras uncorrectable error injection and trigger GPU
> reset on sGPU, below issue is observed. It's caused by the list
> uninitialized when accessing.
>
> [   80.047227] BUG: unable to handle page fault for address: c0f4f750
> [   80.047300] #PF: supervisor write access in kernel mode
> [   80.047351] #PF: error_code(0x0003) - permissions violation
> [   80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 
> 404316061
> [   80.047477] Oops: 0003 [#1] SMP PTI
> [   80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G   OE 
> 5.4.0-rc7-guchchen #1
> [   80.047594] Hardware name: System manufacturer System Product Name/TUF 
> Z370-PLUS GAMING II, BIOS 0411 09/21/2018
> [   80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
>
> Signed-off-by: Guchun Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index b27d9d62c9df..260b4a42e0ae 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1448,9 +1448,10 @@ static void amdgpu_ras_do_recovery(struct work_struct 
> *work)
>struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, false);
>
>/* Build list of devices to query RAS related errors */
> - if  (hive && adev->gmc.xgmi.num_physical_nodes > 1) {
> + if  (hive && adev->gmc.xgmi.num_physical_nodes > 1)
>device_list_handle = &hive->device_list;
> - } else {
> + else {

The coding style here is incorrect. If one branch of an if/else uses {}
the other(s) should use it as well.

> + INIT_LIST_HEAD(&device_list);

That was suggested before but then reverted, but I'm not sure why.

Regards,
Christian.

>list_add_tail(&adev->gmc.xgmi.head, &device_list);
>device_list_handle = &device_list;
>}
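For reference, a minimal sketch of the branch with consistent braces, based only on
the hunk above (illustrative, not the final patch):

    /* Build list of devices to query RAS related errors */
    if (hive && adev->gmc.xgmi.num_physical_nodes > 1) {
            device_list_handle = &hive->device_list;
    } else {
            INIT_LIST_HEAD(&device_list);
            list_add_tail(&adev->gmc.xgmi.head, &device_list);
            device_list_handle = &device_list;
    }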



Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
That breaks the device list in GPU recovery.

From: Pan, Xinhui 
Sent: Friday, April 17, 2020 7:11:40 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org 
; Zhang, Hawking ; Li, 
Dennis ; Clements, John ; Koenig, 
Christian 
Subject: Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on 
sGPU

This patch shluld fix the panic.

but I would like you do NOT add adev xgmi head to the local device list. if ras 
ue occurs while the gpu is already in gpu recovery.

From: amd-gfx  on behalf of Christian 
König 
Sent: Friday, April 17, 2020 5:17:22 PM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org 
; Zhang, Hawking ; Li, 
Dennis ; Clements, John 
Subject: Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on 
sGPU

Am 16.04.20 um 17:47 schrieb Guchun Chen:
> When running ras uncorrectable error injection and trigger GPU
> reset on sGPU, below issue is observed. It's caused by the list
> uninitialized when accessing.
>
> [   80.047227] BUG: unable to handle page fault for address: c0f4f750
> [   80.047300] #PF: supervisor write access in kernel mode
> [   80.047351] #PF: error_code(0x0003) - permissions violation
> [   80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 
> 404316061
> [   80.047477] Oops: 0003 [#1] SMP PTI
> [   80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G   OE 
> 5.4.0-rc7-guchchen #1
> [   80.047594] Hardware name: System manufacturer System Product Name/TUF 
> Z370-PLUS GAMING II, BIOS 0411 09/21/2018
> [   80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
>
> Signed-off-by: Guchun Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index b27d9d62c9df..260b4a42e0ae 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1448,9 +1448,10 @@ static void amdgpu_ras_do_recovery(struct work_struct 
> *work)
>struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, false);
>
>/* Build list of devices to query RAS related errors */
> - if  (hive && adev->gmc.xgmi.num_physical_nodes > 1) {
> + if  (hive && adev->gmc.xgmi.num_physical_nodes > 1)
>device_list_handle = &hive->device_list;
> - } else {
> + else {

The coding style here is incorrect. If one branch of an if/else uses {}
the other(s) should use it as well.

> + INIT_LIST_HEAD(&device_list);

That was suggested before but then reverted, but I'm not sure why.

Regards,
Christian.

>list_add_tail(&adev->gmc.xgmi.head, &device_list);
>device_list_handle = &device_list;
>}



Re: [bug report] drm/amdgpu: add amdgpu_ras.c to support ras (v2)

2020-05-06 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

No. The function below checks whether the block is valid or not.
I think you need to check your code checker, or you were checking against a very old
codebase?

/* check if ras is supported on block, say, sdma, gfx */
static inline int amdgpu_ras_is_supported(struct amdgpu_device *adev,
unsigned int block)
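For illustration, this is the kind of gate the helper above provides once the parsed
block id is in hand (a sketch only, not the actual driver code):

    if (!amdgpu_ras_is_supported(adev, data->head.block))
            return -EINVAL;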

From: Dan Carpenter 
Sent: Wednesday, May 6, 2020 5:17:34 PM
To: Zhou1, Tao 
Cc: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Subject: Re: [bug report] drm/amdgpu: add amdgpu_ras.c to support ras (v2)

On Wed, May 06, 2020 at 07:26:16AM +, Zhou1, Tao wrote:
> [AMD Public Use]
>
> Hi Dan:
>
> Please check the following piece of code in 
> amdgpu_ras_debugfs_ctrl_parse_data:
>
>if (op != -1) {
>if (amdgpu_ras_find_block_id_by_name(block_name, &block_id))
>return -EINVAL;
>
>data->head.block = block_id;
>
> amdgpu_ras_find_block_id_by_name will return error directly if someone try to 
> provide an invalid block_name intentionally via debugfs.
>

No.  It's the line after that which are the problem.

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
   147  static int amdgpu_ras_debugfs_ctrl_parse_data(struct file *f,
   148  const char __user *buf, size_t size,
   149  loff_t *pos, struct ras_debug_if *data)
   150  {
   151  ssize_t s = min_t(u64, 64, size);
   152  char str[65];
   153  char block_name[33];
   154  char err[9] = "ue";
   155  int op = -1;
   156  int block_id;
   157  uint32_t sub_block;
   158  u64 address, value;
   159
   160  if (*pos)
   161  return -EINVAL;
   162  *pos = size;
   163
   164  memset(str, 0, sizeof(str));
   165  memset(data, 0, sizeof(*data));
   166
   167  if (copy_from_user(str, buf, s))
   168  return -EINVAL;
   169
   170  if (sscanf(str, "disable %32s", block_name) == 1)
   171  op = 0;
   172  else if (sscanf(str, "enable %32s %8s", block_name, err) == 2)
   173  op = 1;
   174  else if (sscanf(str, "inject %32s %8s", block_name, err) == 2)
   175  op = 2;
   176  else if (str[0] && str[1] && str[2] && str[3])
   177  /* ascii string, but commands are not matched. */

Say we don't write an ascii string.

   178  return -EINVAL;
   179
   180  if (op != -1) {
   181  if (amdgpu_ras_find_block_id_by_name(block_name, 
&block_id))
   182  return -EINVAL;
   183
   184  data->head.block = block_id;
   185  /* only ue and ce errors are supported */
   186  if (!memcmp("ue", err, 2))
   187  data->head.type = 
AMDGPU_RAS_ERROR__MULTI_UNCORRECTABLE;
   188  else if (!memcmp("ce", err, 2))
   189  data->head.type = 
AMDGPU_RAS_ERROR__SINGLE_CORRECTABLE;
   190  else
   191  return -EINVAL;
   192
   193  data->op = op;
   194
   195  if (op == 2) {
   196  if (sscanf(str, "%*s %*s %*s %u %llu %llu",
   197  &sub_block, &address, 
&value) != 3)
   198  if (sscanf(str, "%*s %*s %*s 0x%x 
0x%llx 0x%llx",
   199  &sub_block, 
&address, &value) != 3)
   200  return -EINVAL;
   201  data->head.sub_block_index = sub_block;
   202  data->inject.address = address;
   203  data->inject.value = value;
   204  }
   205  } else {
   206  if (size < sizeof(*data))
   207  return -EINVAL;
   208
   209  if (copy_from_user(data, buf, sizeof(*data)))
^^^
This lets us set the data->head.block to whatever we want.  Premusably
there is a trusted app which knows how to write the correct values.
But if it has a bug that will cause a crash and we'll have to find a
way to disable it in the kernel for kernel lock down mode etc so either
way we'll need to do a bit of work.

   210  return -EINVAL;
   211  }
   212
   213  return 0;
   214  }

regards,
dan carpenter



[PATCH] drm/amdgpu: Fix a redundant kfree

2020-09-01 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

drm_dev_alloc() allocates *dev* and sets managed.final_kfree to dev so that it frees
itself.
Now, since commit 5cdd68498918 ("drm/amdgpu: Embed drm_device into
amdgpu_device (v3)"), we allocate *adev* and ddev is just a member of it.
So drm_dev_release then tries to free a wrong pointer.

Also, the driver's release tries to free adev, but drm_dev_release will
access dev after calling the driver's release.

To fix it, remove the driver's release and set managed.final_kfree to adev.

[   36.269348] BUG: unable to handle page fault for address: a0c279940028
[   36.276841] #PF: supervisor read access in kernel mode
[   36.282434] #PF: error_code(0x) - not-present page
[   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 
8008066bf060
[   36.296868] Oops:  [#1] SMP DEBUG_PAGEALLOC NOPTI
[   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G   O  
5.9.0-rc2+ #46
[   36.310670] Hardware name: System manufacturer System Product Name/PRIME 
Z390-A, BIOS 1401 11/26/2019
[   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
[   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 
00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 
18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
[   36.347217] RSP: 0018:b9424141fce0 EFLAGS: 00010282
[   36.352931] RAX: 0006 RBX: a0c279940010 RCX: 0006
[   36.360718] RDX: c0419f5a RSI: 0200 RDI: a0c279940010
[   36.368503] RBP: b9424141fd10 R08: 0001 R09: 0001
[   36.376304] R10:  R11:  R12: a0c279940010
[   36.384070] R13: c0e2a000 R14: a0c26924e220 R15: fff2
[   36.391845] FS:  7fc4a277b740() GS:a0c288e0() 
knlGS:
[   36.400669] CS:  0010 DS:  ES:  CR0: 80050033
[   36.406937] CR2: a0c279940028 CR3: 000792304006 CR4: 003706e0
[   36.414732] DR0:  DR1:  DR2: 
[   36.422550] DR3:  DR6: fffe0ff0 DR7: 0400
[   36.430354] Call Trace:
[   36.433044]  drm_dev_put.part.0+0x40/0x60 [drm]
[   36.438017]  drm_dev_put+0x13/0x20 [drm]
[   36.442398]  amdgpu_pci_remove+0x56/0x60 [amdgpu]
[   36.447528]  pci_device_remove+0x3e/0xb0
[   36.451807]  device_release_driver_internal+0xff/0x1d0
[   36.457416]  device_release_driver+0x12/0x20
[   36.462094]  pci_stop_bus_device+0x70/0xa0
[   36.466588]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   36.472786]  remove_store+0x7b/0x90
[   36.476614]  dev_attr_store+0x17/0x30
[   36.480646]  sysfs_kf_write+0x4b/0x60
[   36.484655]  kernfs_fop_write+0xe8/0x1d0
[   36.488952]  vfs_write+0xf5/0x230
[   36.492562]  ksys_write+0x70/0xf0
[   36.496206]  __x64_sys_write+0x1a/0x20
[   36.500292]  do_syscall_64+0x38/0x90
[   36.504219]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c12e9acd421d..52fc0c6c8538 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1165,7 +1165,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 if (ret)
 goto err_free;

-drmm_add_final_kfree(ddev, ddev);
+drmm_add_final_kfree(ddev, adev);

 if (!supports_atomic)
 ddev->driver_features &= ~DRIVER_ATOMIC;
@@ -1221,13 +1221,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)
 drm_dev_put(dev);
 }

-static void amdgpu_driver_release(struct drm_device *ddev)
-{
-struct amdgpu_device *adev = drm_to_adev(ddev);
-
-kfree(adev);
-}
-
 static void
 amdgpu_pci_shutdown(struct pci_dev *pdev)
 {
@@ -1522,7 +1515,6 @@ static struct drm_driver kms_driver = {
 .open = amdgpu_driver_open_kms,
 .postclose = amdgpu_driver_postclose_kms,
 .lastclose = amdgpu_driver_lastclose_kms,
-.release   = amdgpu_driver_release,
 .irq_handler = amdgpu_irq_handler,
 .ioctls = amdgpu_ioctls_kms,
 .gem_free_object_unlocked = amdgpu_gem_object_free,
--
2.25.1



[PATCH] drm/amd/display: Fix a list corruption

2020-09-01 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Remove the private obj from the internal list before we free aconnector.

[   56.925828] BUG: unable to handle page fault for address: 8f84a870a560
[   56.933272] #PF: supervisor read access in kernel mode
[   56.938801] #PF: error_code(0x) - not-present page
[   56.944376] PGD 18e605067 P4D 18e605067 PUD 86a614067 PMD 86a4d0067 PTE 
8008578f5060
[   56.953260] Oops:  [#1] SMP DEBUG_PAGEALLOC NOPTI
[   56.958815] CPU: 6 PID: 1407 Comm: bash Tainted: G   O  
5.9.0-rc2+ #46
[   56.967092] Hardware name: System manufacturer System Product Name/PRIME 
Z390-A, BIOS 1401 11/26/2019
[   56.977162] RIP: 0010:__list_del_entry_valid+0x31/0xa0
[   56.982768] Code: 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 
48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2d 49 8b 30 48 39 fe 75 3d <48> 8b 52 
08 48 39 f2 75 4c b8 01 00 00 00 5d c3 48 89 7
[   57.003327] RSP: 0018:b40c81687c90 EFLAGS: 00010246
[   57.009048] RAX: dead0122 RBX: 8f84ea41f4f0 RCX: 0006
[   57.016871] RDX: 8f84a870a558 RSI: 8f84ea41f4f0 RDI: 8f84ea41f4f0
[   57.024672] RBP: b40c81687c90 R08: 8f84ea400998 R09: 0001
[   57.032490] R10:  R11:  R12: 0006
[   57.040287] R13: 8f84ea422a90 R14: 8f84b4129a20 R15: fff2
[   57.048105] FS:  7f550d885740() GS:8f850960() 
knlGS:
[   57.056979] CS:  0010 DS:  ES:  CR0: 80050033
[   57.063260] CR2: 8f84a870a560 CR3: 0007e5144001 CR4: 003706e0
[   57.071053] DR0:  DR1:  DR2: 
[   57.078849] DR3:  DR6: fffe0ff0 DR7: 0400
[   57.086684] Call Trace:
[   57.089381]  drm_atomic_private_obj_fini+0x29/0x82 [drm]
[   57.095247]  amdgpu_dm_fini+0x83/0x170 [amdgpu]
[   57.100264]  dm_hw_fini+0x23/0x30 [amdgpu]
[   57.104814]  amdgpu_device_fini+0x1df/0x4fe [amdgpu]
[   57.110271]  amdgpu_driver_unload_kms+0x43/0x70 [amdgpu]
[   57.116136]  amdgpu_pci_remove+0x3b/0x60 [amdgpu]
[   57.121291]  pci_device_remove+0x3e/0xb0
[   57.125583]  device_release_driver_internal+0xff/0x1d0
[   57.131223]  device_release_driver+0x12/0x20
[   57.135903]  pci_stop_bus_device+0x70/0xa0
[   57.140401]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   57.146571]  remove_store+0x7b/0x90
[   57.150429]  dev_attr_store+0x17/0x30
[   57.154441]  sysfs_kf_write+0x4b/0x60
[   57.158479]  kernfs_fop_write+0xe8/0x1d0
[   57.162788]  vfs_write+0xf5/0x230
[   57.166426]  ksys_write+0x70/0xf0
[   57.170087]  __x64_sys_write+0x1a/0x20
[   57.174219]  do_syscall_64+0x38/0x90
[   57.178145]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f52533ee7372..cb624ee70545 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5076,6 +5076,7 @@ static void amdgpu_dm_connector_destroy(struct 
drm_connector *connector)
 struct amdgpu_device *adev = drm_to_adev(connector->dev);
 struct amdgpu_display_manager *dm = &adev->dm;

+drm_atomic_private_obj_fini(&aconnector->mst_mgr.base);
 #if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
 defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)

--
2.25.1



Re: [PATCH] drm/amdgpu: Fix a redundant kfree

2020-09-01 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]


The correct thing to do this is to
_leave the amdgpu_driver_release()_ alone,
remove "drmm_add_final_kfree()" and qualify
the WARN_ON() in drm_dev_register() by
the existence of drm_driver.release() (i.e. non-NULL).

Re: this drm driver release callback is more like a release notification. It is
called at the beginning of the total release sequence. As you have made the drm
device a member of adev, you cannot free adev in the driver release callback.

Maybe change the release sequence, say, put the drm driver release at the end of
the total release sequence.
Or still use final_kfree to free adev and have our release callback just do some
other cleanup work.
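A minimal sketch of that second option, assuming the final_kfree change from the
patch in this thread stays in place; the near-empty body is illustrative only:

    /* driver release callback: cleanup only, no kfree of adev here;
     * drm_dev_release() frees it via managed.final_kfree afterwards */
    static void amdgpu_driver_release(struct drm_device *ddev)
    {
            struct amdgpu_device *adev = drm_to_adev(ddev);

            /* other cleanup work for adev goes here */
            (void)adev;
    }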

From: Tuikov, Luben 
Sent: Wednesday, September 2, 2020 4:35:32 AM
To: Alex Deucher ; Pan, Xinhui ; 
Daniel Vetter 
Cc: amd-gfx@lists.freedesktop.org ; Deucher, 
Alexander 
Subject: Re: [PATCH] drm/amdgpu: Fix a redundant kfree

On 2020-09-01 10:12 a.m., Alex Deucher wrote:
> On Tue, Sep 1, 2020 at 3:46 AM Pan, Xinhui  wrote:
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free
>> itself.
>> Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into
>> amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it.
>> So drm_dev_release try to free a wrong pointer then.
>>
>> Also driver's release trys to free adev, but drm_dev_release will
>> access dev after call drvier's release.
>>
>> To fix it, remove driver's release and set managed.final_kfree to adev.
>
> I've got to admit, the documentation around drm_dev_init is hard to
> follow.  We aren't really using the drmm stuff, but you have to use
> drmm_add_final_kfree to avoid a warning.  The logic seems to make
> sense though.
> Acked-by: Alex Deucher 

The logic in patch 3/3 uses the kref infrastructure
as described in drm_drv.c's comment on what the DRM
usage is, i.e. taking advantage of the kref infrastructure.

In amdgpu_pci_probe() we call drm_dev_init() which takes
a ref of 1 on the kref in the DRM device structure,
and from then on, only when the kref transitions
from non-zero to 0, do we free the container of
DRM device, and this is beautifully shown in the
kernel stack below (please take a look at the kernel
stack below).

Using a kref is very powerful as it is implicit:
when the kref transitions from non-zero to 0,
then call the release method.

Furthermore, we own the release method, and we
like that, as it is pure, as well as,
there may be more things we'd like to do in the future
before we free the amdgpu device: maybe free memory we're
using specifically for that PCI device, maybe write
some registers, maybe notify someone or something, etc.

On another note, adding "drmm_add_final_kfree()" in the middle
of amdgpu_pci_probe() seems hackish--it's neither part
of drm_dev_init() nor of drm_dev_register(). We really
don't need it, since we rely on the kref infrastructure
to tell us when to free the device, and if you'd look
at the beautiful stack below, it knows exactly when that is,
i.e. when to free it.

The correct thing to do this is to
_leave the amdgpu_driver_release()_ alone,
remove "drmm_add_final_kfree()" and qualify
the WARN_ON() in drm_dev_register() by
the existence of drm_driver.release() (i.e. non-NULL).

I'll submit a sequence of patches to fix this right.

Regards,
Luben

>
>>
>> [   36.269348] BUG: unable to handle page fault for address: a0c279940028
>> [   36.276841] #PF: supervisor read access in kernel mode
>> [   36.282434] #PF: error_code(0x) - not-present page
>> [   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 
>> 8008066bf060
>> [   36.296868] Oops:  [#1] SMP DEBUG_PAGEALLOC NOPTI
>> [   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G   O  
>> 5.9.0-rc2+ #46
>> [   36.310670] Hardware name: System manufacturer System Product Name/PRIME 
>> Z390-A, BIOS 1401 11/26/2019
>> [   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
>> [   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 
>> be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 
>> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
>> [   36.347217] RSP: 0018:b9424141fce0 EFLAGS: 00010282
>> [   36.352931] RAX: 0006 RBX: a0c279940010 RCX: 
>> 0006
>> [   36.360718] RDX: c0419f5a RSI: 0200 RDI: 
>> a0c279940010
>> [   36.368503] RBP: b9424141fd10 R08: 0001 R09: 
>> 0001
>> [   36.376304] 

Re: [PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Pan, Xinhui
If you take a look at the below function, you should not use driver's release 
to free adev. As dev is embedded in adev.

 809 static void drm_dev_release(struct kref *ref)
 810 {
 811 struct drm_device *dev = container_of(ref, struct drm_device, ref);
 812
 813 if (dev->driver->release)
 814 dev->driver->release(dev);
 815 
 816 drm_managed_release(dev);
 817 
 818 kfree(dev->managed.final_kfree);
 819 }

You have to make another change something like
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 13068fdf4331..2aabd2b4c63b 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
 
drm_managed_release(dev);
 
-   kfree(dev->managed.final_kfree);
+   if (dev->driver->final_release)
+   dev->driver->final_release(dev);
 }

And in the final_release callback we would free the dev. But that is a little complex
now, so I prefer to still use final_kfree.
Of course we can do some cleanup work in the driver's release callback, BUT no
kfree.
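A sketch of what the driver-side counterpart to the diff above could look like. Note
that final_release is only the hypothetical callback proposed here, not an existing
drm_driver member:

    static void amdgpu_final_release(struct drm_device *ddev)
    {
            struct amdgpu_device *adev = drm_to_adev(ddev);

            kfree(adev);    /* ddev is embedded in adev, so free the container last */
    }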

-Original Message-
From: "Tuikov, Luben" 
Date: Wednesday, September 2, 2020, 09:07
To: "amd-gfx@lists.freedesktop.org" , 
"dri-de...@lists.freedesktop.org" 
Cc: "Deucher, Alexander" , Daniel Vetter 
, "Pan, Xinhui" , "Tuikov, Luben" 

Subject: [PATCH 0/3] Use implicit kref infra

Use the implicit kref infrastructure to free the container
struct amdgpu_device, container of struct drm_device.

First, in drm_dev_register(), do not indiscriminately warn
when a DRM driver hasn't opted for managed.final_kfree,
but instead check if the driver has provided its own
"release" function callback in the DRM driver structure.
If that is the case, no warning.

Remove drmm_add_final_kfree(). We take care of that, in the
kref "release" callback when all refs are down to 0, via
drm_dev_put(), i.e. the free is implicit.

Remove superfluous NULL check, since the DRM device to be
suspended always exists, so long as the underlying PCI and
DRM devices exist.

Luben Tuikov (3):
  drm: No warn for drivers who provide release
  drm/amdgpu: Remove drmm final free
  drm/amdgpu: Remove superfluous NULL check

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2 --
 drivers/gpu/drm/drm_drv.c  | 3 ++-
 3 files changed, 2 insertions(+), 6 deletions(-)

-- 
2.28.0.394.ge197136389





Re: [PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Pan, Xinhui


> On Sep 2, 2020, at 11:46, Tuikov, Luben  wrote:
> 
> On 2020-09-01 21:42, Pan, Xinhui wrote:
>> If you take a look at the below function, you should not use driver's 
>> release to free adev. As dev is embedded in adev.
> 
> Do you mean "look at the function below", using "below" as an adverb?
> "below" is not an adjective.
> 
> I know dev is embedded in adev--I did that patchset.
> 
>> 
>> 809 static void drm_dev_release(struct kref *ref)
>> 810 {
>> 811 struct drm_device *dev = container_of(ref, struct drm_device, 
>> ref);
>> 812
>> 813 if (dev->driver->release)
>> 814 dev->driver->release(dev);
>> 815 
>> 816 drm_managed_release(dev);
>> 817 
>> 818 kfree(dev->managed.final_kfree);
>> 819 }
> 
> That's simple--this comes from change c6603c740e0e3
> and it should be reverted. Simple as that.
> 
> The version before this change was absolutely correct:
> 
> static void drm_dev_release(struct kref *ref)
> {
>   if (dev->driver->release)
>   dev->driver->release(dev);
>   else
>   drm_dev_fini(dev);
> }
> 
> Meaning, "the kref is now 0"--> if the driver
> has a release, call it, else use our own.
> But note that nothing can be assumed after this point,
> about the existence of "dev".
> 
> It is exactly because struct drm_device is statically
> embedded into a container, struct amdgpu_device,
> that this change above should be reverted.
> 
> This is very similar to how fops has open/release
> but no close. That is, the "release" is called
> only when the last kref is released, i.e. when
> kref goes from non-zero to zero.
> 
> This uses the kref infrastructure which has been
> around for about 20 years in the Linux kernel.
> 
> I suggest reading the comments
> in drm_dev.c mostly, "DOC: driver instance overview"
> starting at line 240 onwards. This is right above
> drm_put_dev(). There is actually an example of a driver
> in the comment. Also the comment to drm_dev_init().
> 
> Now, take a look at this:
> 
> /**
> * drm_dev_put - Drop reference of a DRM device
> * @dev: device to drop reference of or NULL
> *
> * This decreases the ref-count of @dev by one. The device is destroyed if the
> * ref-count drops to zero.
> */
> void drm_dev_put(struct drm_device *dev)
> {
>if (dev)
>kref_put(&dev->ref, drm_dev_release);
> }
> EXPORT_SYMBOL(drm_dev_put);
> 
> Two things:
> 
> 1. It is us, who kzalloc the amdgpu device, which contains
> the drm_device (you'll see this discussed in the reading
> material I pointed to above). We do this because we're
> probing the PCI device whether we'll work it it or not.
> 

That is true.
My understanding of the drm core code is like something below.
struct B {
struct A
}
We initialize A firstly and initialize B in the end. But destroy B firstly and
destroy A in the end.
But yes, practice is more complex.
If B has nothing to be destroyed, we can destroy A directly, otherwise destroy
B firstly.

In this case, we can do something like below in our release():
/* some cleanup work of B */
drm_dev_fini(dev);   /* destroy A */
kfree(adev);
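Spelled out a bit more, assuming the pre-c6603c740e0e3 behaviour where the driver's
release is the last thing drm_dev_release() calls (a sketch, not the actual driver
code):

    static void amdgpu_driver_release(struct drm_device *dev)
    {
            struct amdgpu_device *adev = drm_to_adev(dev);

            /* some cleanup work of B (the amdgpu_device) */
            drm_dev_fini(dev);      /* destroy A (the embedded drm_device) */
            kfree(adev);            /* finally free the container */
    }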

> 2. Using the kref infrastructure, when the ref goes to 0,
> drm_dev_release is called. And here's the KEY:
> Because WE allocated the container, we should free it--after the release
> method is called, DRM cannot assume anything about the drm
> device or the container. The "release" method is final.
> 
> We allocate, we free. And we free only when the ref goes to 0.
> 
> DRM can, in due time, "free" itself of the DRM device and stop
> having knowledge of it--that's fine, but as long as the ref
> is not 0, the amdgpu device and thus the contained DRM device,
> cannot be freed.
> 
>> 
>> You have to make another change something like
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 13068fdf4331..2aabd2b4c63b 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
>> 
>>drm_managed_release(dev);
>> 
>> -   kfree(dev->managed.final_kfree);
>> +   if (dev->driver->final_release)
>> +   dev->driver->final_release(dev);
>> }
> 
> No. What's this?
> There is no such thing as "final" release, nor is there a "partial" release.
> When the kref goes to 0, the device disappears. Simple.
>

Re: [PATCH V2] drm/amdgpu: Do not move root PT bo to relocated list

2020-09-01 Thread Pan, Xinhui


> On Sep 1, 2020, at 21:54, Christian König  wrote:
> 
> Agreed, that change doesn't seem to make sense and your backtrace is mangled 
> so barely readable.

It is the reply that messed up the logs.

And this patch was sent on 10th Feb. 
> 
> Christian.
> 
> Am 01.09.20 um 14:59 schrieb Liu, Monk:
>> [AMD Official Use Only - Internal Distribution Only]
>> 
>> See that we already have such logic:
>> 
>> 282 static void amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base *vm_bo)
>>  283 {
>>  284 if (vm_bo->bo->parent)
>>  285 list_move(&vm_bo->vm_status, &vm_bo->vm->relocated);
>>  286 else
>>  287 amdgpu_vm_bo_idle(vm_bo);
>>  288 }
>> 
>> Why you need to do the bo->parent check out side ?

Because it was me who moved such logic into amdgpu_vm_bo_relocated.

>> 
>> -Original Message-
>> From: amd-gfx  on behalf of Pan, Xinhui
>> Sent: February 10, 2020, 9:04
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Deucher, Alexander ; Koenig, Christian 
>> 
>> Subject: [PATCH V2] drm/amdgpu: Do not move root PT bo to relocated list
>> 
>> hit panic when we update the page tables.
>> 
>> <1>[  122.103290] BUG: kernel NULL pointer dereference, address: 
>> 0008 <1>[  122.103348] #PF: supervisor read access in kernel 
>> mode <1>[  122.103376] #PF: error_code(0x) - not-present page <6>[  
>> 122.103403] PGD 0 P4D 0 <4>[  122.103421] Oops:  [#1] SMP PTI
>> <4>[  122.103442] CPU: 13 PID: 2133 Comm: kfdtest Tainted: G   OE
>>  5.4.0-rc7+ #7
>> <4>[  122.103480] Hardware name: Supermicro SYS-7048GR-TR/X10DRG-Q, BIOS 
>> 3.0b 03/09/2018 <4>[  122.103657] RIP: 
>> 0010:amdgpu_vm_update_pdes+0x140/0x330 [amdgpu] <4>[  122.103689] Code: 03 
>> 4c 89 73 08 49 89 9d c8 00 00 00 48 8b 7b f0 c6 43 10 00 45 31 c0 48 8b 87 
>> 28 04 00 00 48 85 c0 74 07 4c 8b 80 20 04 00 00 <4d> 8b 70 08 31 f6 49 8b 86 
>> 28 04 00 00 48 85 c0 74 0f 48 8b 80 28 <4>[  122.103769] RSP: 
>> 0018:b49a0a6a3a98 EFLAGS: 00010246 <4>[  122.103797] RAX: 
>>  RBX: 9020f823c148 RCX: dead0122 <4>[  
>> 122.103831] RDX: 9020ece70018 RSI: 9020f823c0c8 RDI: 
>> 9010ca31c800 <4>[  122.103865] RBP: b49a0a6a3b38 R08: 
>>  R09: 0001 <4>[  122.103899] R10: 
>> 6044f994 R11: df57fb58 R12: 9020f823c000 <4>[  
>> 122.103933] R13: 9020f823c000 R14: 9020f823c0c8 R15: 
>> 9010d5d2 <4>[  122.103968] FS:  7f32c83dc780() 
>> GS:9020ff38() knlGS: <4>[  122.104006] CS:  0010 
>> DS:  ES:  CR0: 80050033 <4>[  122.104035] CR2: 
>> 0008 CR3: 002036bba005 CR4: 003606e0 <4>[  
>> 122.104069] DR0:  DR1:  DR2: 
>>  <4>[  122.104103] DR3:  DR6: 
>> fffe0ff0 DR7: 0400 <4>[  122.104137] Call Trace:
>> <4>[  122.104241]  vm_update_pds+0x31/0x50 [amdgpu] <4>[  122.104347]  
>> amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x2ef/0x690 [amdgpu] <4>[  122.104466] 
>>  kfd_process_alloc_gpuvm+0x98/0x190 [amdgpu] <4>[  122.104576]  
>> kfd_process_device_init_vm.part.8+0xf3/0x1f0 [amdgpu] <4>[  122.104688]  
>> kfd_process_device_init_vm+0x24/0x30 [amdgpu] <4>[  122.104794]  
>> kfd_ioctl_acquire_vm+0xa4/0xc0 [amdgpu] <4>[  122.104900]  
>> kfd_ioctl+0x277/0x500 [amdgpu] <4>[  122.105001]  ? 
>> kfd_ioctl_free_memory_of_gpu+0xc0/0xc0 [amdgpu] <4>[  122.105039]  ? 
>> rcu_read_lock_sched_held+0x4f/0x80
>> <4>[  122.105068]  ? kmem_cache_free+0x2ba/0x300 <4>[  122.105093]  ? 
>> vm_area_free+0x18/0x20 <4>[  122.105117]  ? find_held_lock+0x35/0xa0 <4>[  
>> 122.105143]  do_vfs_ioctl+0xa9/0x6f0 <4>[  122.106001]  ksys_ioctl+0x75/0x80 
>> <4>[  122.106802]  ? do_syscall_64+0x17/0x230 <4>[  122.107605]  
>> __x64_sys_ioctl+0x1a/0x20 <4>[  122.108378]  do_syscall_64+0x5f/0x230 <4>[  
>> 122.109118]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> <4>[  122.109842] RIP: 0033:0x7f32c6b495d7
>> 
>> Signed-off-by: xinhui pan 
>> ---
>> change from v1:
>>move root pt bo to idle state instead.
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>

Re: [PATCH] drm/amdgpu: fix max_entries calculation

2020-09-02 Thread Pan, Xinhui


> On Sep 2, 2020, at 20:05, Christian König  wrote:
> 
> Calculate the correct value for max_entries or we might run after the
> page_address array.
> 
> Signed-off-by: Christian König 
> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8bc2253939be..8aa9584c184f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1697,7 +1697,8 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   } else {
>   addr = 0;
> - max_entries = S64_MAX;
> + max_entries = ((mapping->last - mapping->start) >>
> +AMDGPU_GPU_PAGE_SHIFT) + 1;

Should it be like below?
max_entries = (mapping->last - mapping->start + 1 - pfn) *
AMDGPU_GPU_PAGES_IN_CPU_PAGE;

last and start are already PFNs. Why still >> AMDGPU_GPU_PAGE_SHIFT? Am I
missing something?

>   }
> 
>   if (pages_addr) {
> -- 
> 2.17.1
> 



Re: [PATCH] drm/amdgpu: fix max_entries calculation v2

2020-09-02 Thread Pan, Xinhui


> On Sep 2, 2020, at 22:05, Christian König  wrote:
> 
> Calculate the correct value for max_entries or we might run after the
> page_address array.
> 
> v2: Xinhui pointed out we don't need the shift
> 
> Signed-off-by: Christian König 
> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8bc2253939be..be886bdca5c6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1697,7 +1697,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   } else {
>   addr = 0;
> - max_entries = S64_MAX;
> + max_entries = mapping->last - mapping->start + 1;

You need to subtract pfn here.

The range we are going to touch is [start + offset, last],
so max_entries is last - (start + offset) + 1, and offset is pfn in this
case.

I still hit a panic with this patch in practice.

>   }
> 
>   if (pages_addr) {
> -- 
> 2.17.1
> 



Re: [PATCH] drm/amdgpu: fix max_entries calculation v2

2020-09-02 Thread Pan, Xinhui


> On Sep 2, 2020, at 22:31, Christian König  wrote:
> 
> Am 02.09.20 um 16:27 schrieb Pan, Xinhui:
>> 
>>> On Sep 2, 2020, at 22:05, Christian König  wrote:
>>> 
>>> Calculate the correct value for max_entries or we might run after the
>>> page_address array.
>>> 
>>> v2: Xinhui pointed out we don't need the shift
>>> 
>>> Signed-off-by: Christian König 
>>> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 8bc2253939be..be886bdca5c6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -1697,7 +1697,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
>>> amdgpu_device *adev,
>>> AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>> } else {
>>> addr = 0;
>>> -   max_entries = S64_MAX;
>>> +   max_entries = mapping->last - mapping->start + 1;
>> You need minus pfn here.
> 
> That doesn't sound correct either. The pfn is the destination of the mapping, 
> e.g. the offset inside the BO and not related to the virtual address range we 
> map.

I mean we need to subtract pfn too. pfn is mapping->offset >> PAGE_SHIFT.

In amdgpu_vm_bo_map(), there is a check below:
if (bo && offset + size > amdgpu_bo_size(bo))
        return -EINVAL;
So mapping->offset is just an offset in bytes inside the BO, as you said.

mapping->start and mapping->last are virtual addresses in PFNs, so the range we
are going to touch is [start + offset_in_pfns, last].
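A quick worked example with made-up numbers: if mapping->start = 100, mapping->last = 199
and pfn = 16, the pages still to be mapped are [116, 199], that is 199 - (100 + 16) + 1 = 84
entries, not 199 - 100 + 1 = 100.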

> 
>> The range we are going to touch is [start + offset, last].
>> so the max_entries is last - (start + offset) + 1. and offset is pfn in this 
>> case.
>> 
>> I still hit panic with this patch in practice.
> 
> Thanks for testing, I think I know what the problem is.
> 
> We need to start instead of mapping->start or otherwise the values is to 
> large after the first iteration.
> 
> Give me a second for a v3.
> 
> Christian.
> 
>> 
>>> }
>>> 
>>> if (pages_addr) {
>>> -- 
>>> 2.17.1
>>> 
> 



Re: [PATCH] drm/amdgpu: fix max_entries calculation v3

2020-09-02 Thread Pan, Xinhui


> On Sep 2, 2020, at 23:21, Christian König  wrote:
> 
> Calculate the correct value for max_entries or we might run after the
> page_address array.
> 
> v2: Xinhui pointed out we don't need the shift
> v3: use local copy of start and simplify some calculation
> 
> Signed-off-by: Christian König 
> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8bc2253939be..7cbee1a7ec84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1697,7 +1697,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>   } else {
>   addr = 0;
> - max_entries = S64_MAX;
> + max_entries = mapping->last - start + 1;
>   }
> 
>   if (pages_addr) {
> @@ -1727,7 +1727,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   addr += pfn << PAGE_SHIFT;
>   }
> 
> - last = min((uint64_t)mapping->last, start + max_entries - 1);
> + last = start + max_entries - 1;

I am not sure about the case where we calculate max_entries from nodes->size. Does
it work in that case?

>   r = amdgpu_vm_bo_update_mapping(adev, vm, false, false, resv,
>   start, last, flags, addr,
>   dma_addr, fence);
> -- 
> 2.17.1
> 


Re: [PATCH 0/3] Use implicit kref infra

2020-09-02 Thread Pan, Xinhui


> On Sep 2, 2020, at 22:50, Tuikov, Luben  wrote:
> 
> On 2020-09-02 00:43, Pan, Xinhui wrote:
>> 
>> 
>>> On Sep 2, 2020, at 11:46, Tuikov, Luben  wrote:
>>> 
>>> On 2020-09-01 21:42, Pan, Xinhui wrote:
>>>> If you take a look at the below function, you should not use driver's 
>>>> release to free adev. As dev is embedded in adev.
>>> 
>>> Do you mean "look at the function below", using "below" as an adverb?
>>> "below" is not an adjective.
>>> 
>>> I know dev is embedded in adev--I did that patchset.
>>> 
>>>> 
>>>> 809 static void drm_dev_release(struct kref *ref)
>>>> 810 {
>>>> 811 struct drm_device *dev = container_of(ref, struct drm_device, 
>>>> ref);
>>>> 812
>>>> 813 if (dev->driver->release)
>>>> 814 dev->driver->release(dev);
>>>> 815 
>>>> 816 drm_managed_release(dev);
>>>> 817 
>>>> 818 kfree(dev->managed.final_kfree);
>>>> 819 }
>>> 
>>> That's simple--this comes from change c6603c740e0e3
>>> and it should be reverted. Simple as that.
>>> 
>>> The version before this change was absolutely correct:
>>> 
>>> static void drm_dev_release(struct kref *ref)
>>> {
>>> if (dev->driver->release)
>>> dev->driver->release(dev);
>>> else
>>> drm_dev_fini(dev);
>>> }
>>> 
>>> Meaning, "the kref is now 0"--> if the driver
>>> has a release, call it, else use our own.
>>> But note that nothing can be assumed after this point,
>>> about the existence of "dev".
>>> 
>>> It is exactly because struct drm_device is statically
>>> embedded into a container, struct amdgpu_device,
>>> that this change above should be reverted.
>>> 
>>> This is very similar to how fops has open/release
>>> but no close. That is, the "release" is called
>>> only when the last kref is released, i.e. when
>>> kref goes from non-zero to zero.
>>> 
>>> This uses the kref infrastructure which has been
>>> around for about 20 years in the Linux kernel.
>>> 
>>> I suggest reading the comments
>>> in drm_dev.c mostly, "DOC: driver instance overview"
>>> starting at line 240 onwards. This is right above
>>> drm_put_dev(). There is actually an example of a driver
>>> in the comment. Also the comment to drm_dev_init().
>>> 
>>> Now, take a look at this:
>>> 
>>> /**
>>> * drm_dev_put - Drop reference of a DRM device
>>> * @dev: device to drop reference of or NULL
>>> *
>>> * This decreases the ref-count of @dev by one. The device is destroyed if 
>>> the
>>> * ref-count drops to zero.
>>> */
>>> void drm_dev_put(struct drm_device *dev)
>>> {
>>>   if (dev)
>>>   kref_put(&dev->ref, drm_dev_release);
>>> }
>>> EXPORT_SYMBOL(drm_dev_put);
>>> 
>>> Two things:
>>> 
>>> 1. It is us, who kzalloc the amdgpu device, which contains
>>> the drm_device (you'll see this discussed in the reading
>>> material I pointed to above). We do this because we're
>>> probing the PCI device to see whether we'll work with it or not.
>>> 
>> 
>> that is true.
> 
> Of course it's true--good morning!
> 
>> My understanding of the drm core code is something like the following.
> 
> Let me stop you right there--just read the documentation I pointed
> to you at.
> 
>> struct B { 
>>  struct A 
>> }
>> we initialize A first and initialize B at the end, but destroy B first 
>> and destroy A at the end.
> 
> No!
> B, which is the amdgpu_device struct "exists" before A, which is the DRM 
> struct.
> This is why DRM recommends to _embed_ it into the driver's own device struct,
> as the documentation I pointed you to at.
> 
I think you are misleading me here. A pci dev, as you said below, can act in 
many roles: a drm dev, a tty dev, etc.
say, struct B{
struct A;
struct TTY;
struct printer;
...
}
but TTY and the other members have nothing to do with our discussion.

B of course exists before A, but the code logic is not like that. Code like the 
one below is really rare in the drm world.
create_B()
{
init B members
return create_A()
}
So usually B has more work to do
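
For reference, a minimal sketch of the embed-and-release kref idiom being
argued about, with made-up names (my_dev is not a drm or amdgpu structure):

#include <linux/kref.h>
#include <linux/slab.h>

struct my_dev {
	struct kref ref;	/* the last kref_put() triggers my_dev_release() */
	int id;
};

static void my_dev_release(struct kref *ref)
{
	/* called exactly once, when the refcount drops to zero */
	struct my_dev *mdev = container_of(ref, struct my_dev, ref);

	kfree(mdev);		/* nothing may touch mdev after this point */
}

static struct my_dev *my_dev_create(void)
{
	struct my_dev *mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);

	if (mdev)
		kref_init(&mdev->ref);	/* refcount starts at 1 */
	return mdev;
}

static void my_dev_put(struct my_dev *mdev)
{
	if (mdev)
		kref_put(&mdev->ref, my_dev_release);
}

The disagreement in the thread is only about who provides the release callback
when the drm_device is embedded in the driver's own structure; the kref
mechanics themselves are as above.
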

Re: [PATCH] drm/amdgpu: fix max_entries calculation v4

2020-09-03 Thread Pan, Xinhui
Reviewed-by: xinhui pan 


> On 2020-09-03 17:03, Christian König wrote:
> 
> Calculate the correct value for max_entries or we might run past the end of
> the page_address array.
> 
> v2: Xinhui pointed out we don't need the shift
> v3: use local copy of start and simplify some calculation
> v4: fix the case that we map less VA range than BO size
> 
> Signed-off-by: Christian König 
> Fixes: 1e691e244487 ("drm/amdgpu: stop allocating dummy GTT nodes")
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
> 1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8bc2253939be..d6dcd58a8f1a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1691,13 +1691,13 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   uint64_t max_entries;
>   uint64_t addr, last;
> 
> + max_entries = mapping->last - start + 1;
>   if (nodes) {
>   addr = nodes->start << PAGE_SHIFT;
> - max_entries = (nodes->size - pfn) *
> - AMDGPU_GPU_PAGES_IN_CPU_PAGE;
> + max_entries = min((nodes->size - pfn) *
> + AMDGPU_GPU_PAGES_IN_CPU_PAGE, max_entries);
>   } else {
>   addr = 0;
> - max_entries = S64_MAX;
>   }
> 
>   if (pages_addr) {
> @@ -1727,7 +1727,7 @@ static int amdgpu_vm_bo_split_mapping(struct 
> amdgpu_device *adev,
>   addr += pfn << PAGE_SHIFT;
>   }
> 
> - last = min((uint64_t)mapping->last, start + max_entries - 1);
> + last = start + max_entries - 1;
>   r = amdgpu_vm_bo_update_mapping(adev, vm, false, false, resv,
>   start, last, flags, addr,
>   dma_addr, fence);
> -- 
> 2.17.1
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
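
To make the v4 clamping easier to follow, here it is as a standalone sketch.
The names mirror the patch, but this is plain user-space C rather than the
driver code, and have_node stands in for the nodes check:

#include <stdint.h>

/* Number of GPU pages that may be handled in one iteration: never run
 * past the end of the VA mapping, and never past the end of the current
 * GTT node either.
 */
static uint64_t split_max_entries(uint64_t map_last, uint64_t start,
				  uint64_t node_size, uint64_t pfn,
				  uint64_t gpu_pages_per_cpu_page,
				  int have_node)
{
	uint64_t max_entries = map_last - start + 1;	/* VA limit */

	if (have_node) {
		uint64_t node_left = (node_size - pfn) * gpu_pages_per_cpu_page;

		if (node_left < max_entries)	/* node limit, the v4 min() */
			max_entries = node_left;
	}
	return max_entries;
}

v3 dropped the clamp against mapping->last in the nodes case, so a mapping
covering less VA than the BO could walk past its end; v4 keeps both limits by
taking the minimum of the two.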


RE: [PATCH] amd/amdgpu: Fix resv shared fence overflow

2020-09-28 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Pls ignore this patch.

-Original Message-
From: Pan, Xinhui 
Sent: September 29, 2020 13:17
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Deucher, Alexander 
; Pan, Xinhui 
Subject: [PATCH] amd/amdgpu: Fix resv shared fence overflow

[  179.556745] kernel BUG at drivers/dma-buf/dma-resv.c:282!
[snip]
[  179.702910] Call Trace:
[  179.705696]  amdgpu_bo_fence+0x21/0x50 [amdgpu] [  179.710707]  
amdgpu_vm_sdma_commit+0x299/0x430 [amdgpu] [  179.716497]  
amdgpu_vm_bo_update_mapping.constprop.0+0x29f/0x390 [amdgpu] [  179.723927]  ? 
find_held_lock+0x38/0x90 [  179.728183]  amdgpu_vm_handle_fault+0x1af/0x420 
[amdgpu] [  179.734063]  gmc_v9_0_process_interrupt+0x245/0x2e0 [amdgpu] [  
179.740347]  ? kgd2kfd_interrupt+0xb8/0x1e0 [amdgpu] [  179.745808]  
amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu] [  179.751380]  ? 
amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu] [  179.757159]  
amdgpu_ih_process+0xbb/0x1a0 [amdgpu] [  179.762466]  
amdgpu_irq_handle_ih1+0x27/0x40 [amdgpu] [  179.767997]  
process_one_work+0x23c/0x580 [  179.772371]  worker_thread+0x50/0x3b0 [  
179.776356]  ? process_one_work+0x580/0x580 [  179.780939]  kthread+0x128/0x160 
[  179.784462]  ? kthread_park+0x90/0x90 [  179.788466]  ret_from_fork+0x1f/0x30

For the unlocked case, we add the last_unlocked fence to the root bo resv if it 
has not been signaled.
And we will add another job fence to the root bo resv in ->commit(). That makes 
the shared fence count bigger than what was reserved.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..77689cecd189 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1615,6 +1615,7 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
 struct dma_fence *tmp = dma_fence_get_stub();

 amdgpu_bo_fence(vm->root.base.bo, vm->last_unlocked, true);
+dma_resv_reserve_shared(vm->root.base.bo->tbo.base.resv, 1);
 swap(vm->last_unlocked, tmp);
 dma_fence_put(tmp);
 }
--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
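
The BUG_ON in the trace fires because dma_resv only accepts as many shared
fences as slots that were reserved beforehand. A minimal sketch of the
reserve-then-add contract (generic kernel-style code, not the amdgpu fix; the
two-fence helper is made up for illustration):

#include <linux/dma-resv.h>

static int add_two_shared_fences(struct dma_resv *resv,
				 struct dma_fence *a, struct dma_fence *b)
{
	int r;

	r = dma_resv_lock(resv, NULL);
	if (r)
		return r;

	/* reserve one slot per fence we are about to add ... */
	r = dma_resv_reserve_shared(resv, 2);
	if (!r) {
		/* ... otherwise dma_resv_add_shared_fence() trips its BUG_ON */
		dma_resv_add_shared_fence(resv, a);
		dma_resv_add_shared_fence(resv, b);
	}

	dma_resv_unlock(resv);
	return r;
}

The withdrawn patch reserves an extra slot for the same reason: as the commit
message explains, another job fence is added to the root bo resv later in
->commit().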


[PATCH] drm/amdgpu: Do gpu reset if we lost some gpu reset requests

2019-08-04 Thread Pan, Xinhui

Because gpu reset races with ras interrupts, we might lose a chance to
do gpu recovery. To guarantee the gpu has recovered successfully, we use
an atomic counter to track gpu reset requests, and issue another gpu
reset if there are any pending requests.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index a96b0f17c619..c1f444b74b19 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1220,7 +1220,15 @@ static void amdgpu_ras_do_recovery(struct work_struct 
*work)
container_of(work, struct amdgpu_ras, recovery_work);
 
amdgpu_device_gpu_recover(ras->adev, 0);
-   atomic_set(&ras->in_recovery, 0);
+   /* if there is no competition, in_recovery changes from 1 to 0.
+* if ras_reset_gpu is called while we are doing gpu recovery,
+* because of the atomic protection, we may lose some recovery
+* requests.
+* So we use atomic_xchg to check the count of requests, and
+* issue another gpu reset request to perform the gpu recovery.
+*/
+   if (atomic_xchg(&ras->in_recovery, 0) > 1)
+   amdgpu_ras_reset_gpu(ras->adev, 0);
 }
 
 static int amdgpu_ras_release_vram(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 2765f2dbb1e6..ba423a4a3013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -498,7 +498,7 @@ static inline int amdgpu_ras_reset_gpu(struct amdgpu_device 
*adev,
 {
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
-   if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
+   if (atomic_inc_return(&ras->in_recovery) == 1)
schedule_work(&ras->recovery_work);
return 0;
 }
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
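
A minimal self-contained sketch of the counting pattern above, with the driver
specifics stripped out (do_gpu_recovery() and the work item are stand-ins, not
real amdgpu symbols):

#include <linux/atomic.h>
#include <linux/workqueue.h>

static void do_gpu_recovery(void);		/* stand-in for the real recovery */

static atomic_t reset_requests = ATOMIC_INIT(0);
static struct work_struct recovery_work;	/* INIT_WORK()ed with recovery_worker */

/* request side: only the 0 -> 1 transition schedules the worker,
 * every later request just bumps the counter
 */
static void request_gpu_reset(void)
{
	if (atomic_inc_return(&reset_requests) == 1)
		schedule_work(&recovery_work);
}

/* worker side: if more requests arrived while we were recovering,
 * re-arm instead of silently dropping them
 */
static void recovery_worker(struct work_struct *work)
{
	do_gpu_recovery();

	if (atomic_xchg(&reset_requests, 0) > 1)
		request_gpu_reset();
}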

[PATCH] drm/amdgpu: Fix panic during gpu reset

2019-08-04 Thread Pan, Xinhui
Clear the flag after hw suspend, otherwise the corresponding hw resume is
skipped.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 31abd8885fde..f62d4f30e810 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2256,6 +2256,7 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
DRM_ERROR("suspend of IP block <%s> failed %d\n",
  adev->ip_blocks[i].version->funcs->name, r);
}
+   adev->ip_blocks[i].status.hw = false;
/* handle putting the SMC in the appropriate state */
if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC) {
if (is_support_sw_smu(adev)) {
@@ -2270,7 +2271,6 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
  adev->mp1_state, r);
return r;
}
-   adev->ip_blocks[i].status.hw = false;
}
}
}
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
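
A simplified sketch of the pairing the commit message describes (generic
pseudocode with a made-up ip_block structure, not the amdgpu IP-block
framework): if suspend does not clear the hw flag, the resume pass sees the
block as still up and skips it.

#include <stdbool.h>

struct ip_block {
	bool hw;			/* hardware for this block is up */
	int (*suspend)(void);
	int (*resume)(void);
};

static void suspend_all(struct ip_block *blocks, int n)
{
	int i;

	for (i = n - 1; i >= 0; i--) {
		if (!blocks[i].hw)
			continue;
		blocks[i].suspend();
		blocks[i].hw = false;	/* must be cleared for every block, not only SMC */
	}
}

static void resume_all(struct ip_block *blocks, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (blocks[i].hw)	/* looks "still up", so resume is skipped */
			continue;
		if (!blocks[i].resume())
			blocks[i].hw = true;
	}
}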

[PATCH] drm/amdgpu: flush the fence on the bo after we individualize

2020-01-14 Thread Pan, Xinhui
As we move the ttm_bo_individualize_resv() upwards, we need to flush the
copied fences too. Otherwise the driver keeps waiting for the fence.

run&Kill kfdtest, then perf top.

  25.53%  [ttm] [k] ttm_bo_delayed_delete
  24.29%  [kernel]  [k] dma_resv_test_signaled_rcu
  19.72%  [kernel]  [k] ww_mutex_lock

Fixes: 378e2d5b ("drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more")
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8d91b0428af1..1494aebb8128 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -499,8 +499,10 @@ static void ttm_bo_cleanup_refs_or_queue(struct 
ttm_buffer_object *bo)
 
dma_resv_unlock(bo->base.resv);
}
-   if (bo->base.resv != &bo->base._resv)
+   if (bo->base.resv != &bo->base._resv) {
+   ttm_bo_flush_all_fences(bo);
dma_resv_unlock(&bo->base._resv);
+   }
 
 error:
kref_get(&bo->list_kref);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
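
A simplified sketch, not the real ttm_bo_delayed_delete(), of why one
unsignaled fence left in the copied (individualized) reservation object keeps
the delayed-delete worker spinning, which is what the perf profile above shows.
struct bo and final_cleanup() are made up:

#include <linux/dma-resv.h>
#include <linux/list.h>

struct bo {
	struct dma_resv resv_copy;	/* the individualized copy */
	struct list_head ddestroy_link;
};

static void final_cleanup(struct bo *b);	/* stand-in for the real teardown */

static void delayed_delete_tick(struct list_head *ddestroy)
{
	struct bo *b, *tmp;

	list_for_each_entry_safe(b, tmp, ddestroy, ddestroy_link) {
		/* every fence in the copy must be signaled before the BO
		 * can really be destroyed ...
		 */
		if (!dma_resv_test_signaled_rcu(&b->resv_copy, true))
			continue;	/* ... otherwise retry on the next tick */

		list_del_init(&b->ddestroy_link);
		final_cleanup(b);
	}
}

Flushing the copied fences, as the patch above argues, is what gives them a
chance to signal so this test eventually succeeds.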


[PATCH] drm/amdgpu: add the lost mutex_init back

2020-01-15 Thread Pan, Xinhui


Initialize notifier_lock.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2c64d2a83d61..c2453532fd95 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2851,6 +2851,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
hash_init(adev->mn_hash);
mutex_init(&adev->lock_reset);
mutex_init(&adev->psp.mutex);
+   mutex_init(&adev->notifier_lock);
 
r = amdgpu_device_check_arguments(adev);
if (r)
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: initialize bo_va_list when add gws to process

2020-01-21 Thread Pan, Xinhui


bo_va_list is a list_head, so initialize it.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9e7889c28f3e..ef721cb65868 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2126,6 +2126,7 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void 
*gws, struct kgd_mem **mem
return -ENOMEM;
 
mutex_init(&(*mem)->lock);
+   INIT_LIST_HEAD(&(*mem)->bo_va_list);
(*mem)->bo = amdgpu_bo_ref(gws_bo);
(*mem)->domain = AMDGPU_GEM_DOMAIN_GWS;
(*mem)->process_info = process_info;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
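
The rule this one-liner follows, shown in isolation (a generic sketch, not the
kgd_mem code): a zeroed list_head is not a valid empty list, so it must be
initialized before any list operation touches it.

#include <linux/list.h>
#include <linux/slab.h>

struct item {
	struct list_head link;
	int payload;
};

static struct item *item_create(void)
{
	struct item *it = kzalloc(sizeof(*it), GFP_KERNEL);

	if (!it)
		return NULL;
	/* kzalloc leaves prev/next as NULL, which is not an empty list;
	 * list_empty()/list_del() on it would misbehave without this
	 */
	INIT_LIST_HEAD(&it->link);
	return it;
}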


[RFC PATCH] drm/amdgpu: Remove eviction fence before release bo

2020-02-05 Thread Pan, Xinhui


No need to trigger eviction as the memory mapping will not be used anymore.

All pt/pd bos share the same resv, hence the same shared eviction fence. Every 
time a page table is freed, the fence will be signaled and that causes 
unexpected kfd evictions.

kfd bos use their own resv, so they are not affected.

Signed-off-by: xinhui pan 
---

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f29..265b1ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -96,6 +96,7 @@
   struct mm_struct *mm);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
 
 struct amdkfd_process_info {
/* List head of all VMs that belong to a KFD process */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef721cb..a3c55ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -276,6 +276,26 @@
return 0;
 }
 
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
+{
+   struct amdgpu_vm *vm;
+   int ret = 0;
+
+   if (bo->vm_bo && bo->vm_bo->vm) {
+   vm = bo->vm_bo->vm;
+   if (vm->process_info && vm->process_info->eviction_fence) {
+   BUG_ON(!dma_resv_trylock(&bo->tbo.base._resv));
+   if (bo->tbo.base.resv != &bo->tbo.base._resv) {
+   dma_resv_copy_fences(&bo->tbo.base._resv, 
bo->tbo.base.resv);
+   bo->tbo.base.resv = &bo->tbo.base._resv;
+   }
+   ret = amdgpu_amdkfd_remove_eviction_fence(bo, 
vm->process_info->eviction_fence);
+   dma_resv_unlock(bo->tbo.base.resv);
+   }
+   }
+   return ret;
+}
+
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 bool wait)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f60a58..4b5bee0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1307,6 +1307,9 @@
if (abo->kfd_bo)
amdgpu_amdkfd_unreserve_memory_limit(abo);
 
+   amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
+   abo->vm_bo = NULL;
+
if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index cc56eab..187cdb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -945,7 +945,6 @@
 static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry)
 {
if (entry->base.bo) {
-   entry->base.bo->vm_bo = NULL;
list_del(&entry->base.vm_status);
amdgpu_bo_unref(&entry->base.bo->shadow);
amdgpu_bo_unref(&entry->base.bo);
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[RFC PATCH v2] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-07 Thread Pan, Xinhui
No need to trigger eviction as the memory mapping will not be used anymore.

All pt/pd bos share the same resv, hence the same shared eviction fence. Every 
time a page table is freed, the fence will be signaled and that causes 
unexpected kfd evictions.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 78 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 10 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  5 +-
 drivers/gpu/drm/ttm/ttm_bo.c  | 38 +
 5 files changed, 111 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..265b1ed7264c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
context,
   struct mm_struct *mm);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
 
 struct amdkfd_process_info {
/* List head of all VMs that belong to a KFD process */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef721cb65868..11315095c29b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -223,7 +223,7 @@ void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo 
*bo)
 static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
struct amdgpu_amdkfd_fence *ef)
 {
-   struct dma_resv *resv = bo->tbo.base.resv;
+   struct dma_resv *resv = &bo->tbo.base._resv;
struct dma_resv_list *old, *new;
unsigned int i, j, k;
 
@@ -276,6 +276,78 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
return 0;
 }
 
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
+{
+   struct amdgpu_vm_bo_base*vm_bo = bo->vm_bo;
+   struct amdgpu_vm *vm;
+   struct amdkfd_process_info *info;
+   struct amdgpu_amdkfd_fence *ef;
+   struct amdgpu_bo *parent;
+   int locked;
+   int ret;
+   struct ttm_bo_global *glob = &ttm_bo_glob;
+
+   if (vm_bo == NULL)
+   return 0;
+
+   /* Page table bo has a reference of the parent bo.
+* BO itself can't guarantee the vm it points to is alive.
+* for example, VM is going to free page tables, and the pt/pd bo might 
be
+* freed by a workqueue. In that case, the vm might be freed already,
+* leaving bo->vm_bo pointing to vm.root.
+*
+* so to avoid that, when kfd free its vms,
+* 1) set vm->process_info to NULL if this is the last vm.
+* 2) set root_bo->vm_bo to NULL.
+*
+* but there are still races, just like
+* cpu 1cpu 2
+*  !vm_bo
+* ->info = NULL
+* free(info)
+* ->vm_bo = NULL
+* free (vm)
+*  info = vm->info //invalid vm
+*
+* So to avoid the race, use ttm_bo_glob lru_lock.
+* generally speaking, adding a new lock is acceptable.
+* But reusing this lock is simple.
+*/
+   parent = bo;
+   while (parent->parent)
+   parent = parent->parent;
+
+   spin_lock(&glob->lru_lock);
+   vm_bo = parent->vm_bo;
+   if (!vm_bo) {
+   spin_unlock(&glob->lru_lock);
+   return 0;
+   }
+
+   vm = vm_bo->vm;
+   if (!vm) {
+   spin_unlock(&glob->lru_lock);
+   return 0;
+   }
+
+   info = vm->process_info;
+   if (!info || !info->eviction_fence) {
+   spin_unlock(&glob->lru_lock);
+   return 0;
+   }
+
+   ef = container_of(dma_fence_get(&info->eviction_fence->base),
+   struct amdgpu_amdkfd_fence, base);
+   spin_unlock(&glob->lru_lock);
+
+   locked = dma_resv_trylock(&bo->tbo.base._resv);
+   ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
+   dma_fence_put(&ef->base);
+   if (locked)
+   dma_resv_unlock(&bo->tbo.base._resv);
+   return ret;
+}
+
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 bool wait)
 {
@@ -1030,6 +1102,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
 {
struct amdkfd_process_info *process_info = vm->process_info;
struct amdgpu_bo *pd = vm->root.base.bo;
+   struct ttm_bo_global *glob = &ttm_bo_glob;
 
if (!process_info)
return;
@@ -1051,6 +1124,9 @@ void amdgpu_amdk
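
The long comment in the patch above boils down to a fairly standard pattern:
chase the pointers under a lock that the teardown path also takes, grab a
reference before dropping the lock, and only use the referenced object
afterwards. A minimal sketch with made-up structures (the real code walks
amdgpu_vm structures under ttm_bo_glob.lru_lock):

#include <linux/dma-fence.h>
#include <linux/spinlock.h>

struct proc_info { struct dma_fence *eviction_fence; };
struct vm        { struct proc_info *info; };
struct vm_bo     { struct vm *vm; };
struct bo        { struct vm_bo *vm_bo; };

static struct dma_fence *get_eviction_fence(struct bo *bo, spinlock_t *lock)
{
	struct dma_fence *fence = NULL;

	spin_lock(lock);
	/* each pointer may be cleared by the teardown path, so only
	 * dereference it while holding the lock that path also takes
	 */
	if (bo->vm_bo && bo->vm_bo->vm && bo->vm_bo->vm->info &&
	    bo->vm_bo->vm->info->eviction_fence)
		fence = dma_fence_get(bo->vm_bo->vm->info->eviction_fence);
	spin_unlock(lock);

	return fence;	/* caller must dma_fence_put() when done */
}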

Re: [RFC PATCH v2] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-07 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]


From: Koenig, Christian 
Sent: February 7, 2020 21:46
To: Pan, Xinhui; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix; Deucher, Alexander
Subject: Re: [RFC PATCH v2] drm/amdgpu: Remove kfd eviction fence before release bo

On 2020-02-07 14:42, Pan, Xinhui wrote:
> No need to trigger eviction as the memory mapping will not be used anymore.
>
> All pt/pd bos share the same resv, hence the same shared eviction fence. 
> Every time a page table is freed, the fence will be signaled and that causes 
> unexpected kfd evictions.
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 78 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 10 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  5 +-
>   drivers/gpu/drm/ttm/ttm_bo.c  | 38 +
>   5 files changed, 111 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 47b0f2957d1f..265b1ed7264c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
> context,
>  struct mm_struct *mm);
>   bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
>   struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
> +int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
>
>   struct amdkfd_process_info {
>   /* List head of all VMs that belong to a KFD process */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index ef721cb65868..11315095c29b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -223,7 +223,7 @@ void amdgpu_amdkfd_unreserve_memory_limit(struct 
> amdgpu_bo *bo)
>   static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
>   struct amdgpu_amdkfd_fence *ef)
>   {
> - struct dma_resv *resv = bo->tbo.base.resv;
> + struct dma_resv *resv = &bo->tbo.base._resv;

That won't work either and probably break a bunch of other cases.

[xh] kfd bos which are allocated explicitly use _resv AFAIK.
only pt/pd bos share the root._resv

Christian.

>   struct dma_resv_list *old, *new;
>   unsigned int i, j, k;
>
> @@ -276,6 +276,78 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
> amdgpu_bo *bo,
>   return 0;
>   }
>
> +int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
> +{
> + struct amdgpu_vm_bo_base*vm_bo = bo->vm_bo;
> + struct amdgpu_vm *vm;
> + struct amdkfd_process_info *info;
> + struct amdgpu_amdkfd_fence *ef;
> + struct amdgpu_bo *parent;
> + int locked;
> + int ret;
> + struct ttm_bo_global *glob = &ttm_bo_glob;
> +
> + if (vm_bo == NULL)
> + return 0;
> +
> + /* Page table bo has a reference of the parent bo.
> +  * BO itself can't guarantee the vm it points to is alive.
> +  * for example, VM is going to free page tables, and the pt/pd bo might 
> be
> +  * freed by a workqueue. In that case, the vm might be freed already,
> +  * leaving bo->vm_bo pointing to vm.root.
> +  *
> +  * so to avoid that, when kfd free its vms,
> +  * 1) set vm->process_info to NULL if this is the last vm.
> +  * 2) set root_bo->vm_bo to NULL.
> +  *
> +  * but there are still races, just like
> +  * cpu 1cpu 2
> +  *  !vm_bo
> +  * ->info = NULL
> +  * free(info)
> +  * ->vm_bo = NULL
> +  * free (vm)
> +  *  info = vm->info //invalid vm
> +  *
> +  * So to avoid the race, use ttm_bo_glob lru_lock.
> +  * generally speaking, adding a new lock is acceptable.
> +  * But reusing this lock is simple.
> +  */
> + parent = bo;
> + while (parent->parent)
> + parent = parent->parent;
> +
> + spin_lock(&glob->lru_lock);
> + vm_bo = parent->vm_bo;
> + if (!vm_bo) {
> + spin_unlock(&glob->lru_lock);
> + return 0;
> + }
> +
> + vm = vm_bo->vm;
> + if (!vm) {
> + spin_unlock(&glob->lru_lock);
> + return 0;
> + }
> +
> + in

[RFC PATCH v3] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-08 Thread Pan, Xinhui
No need to trigger eviction as the memory mapping will not be used anymore.

All pt/pd bos share the same resv, hence the same shared eviction fence. Every 
time a page table is freed, the fence will be signaled and that causes 
unexpected kfd evictions.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 35 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  1 +
 drivers/gpu/drm/ttm/ttm_bo.c  | 16 +
 5 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..265b1ed7264c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
context,
   struct mm_struct *mm);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
 
 struct amdkfd_process_info {
/* List head of all VMs that belong to a KFD process */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef721cb65868..8a06ba3c9d41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -276,6 +276,40 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
return 0;
 }
 
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
+{
+   struct amdgpu_vm_bo_base *vm_bo;
+   struct amdgpu_vm *vm;
+   struct amdkfd_process_info *info;
+   struct amdgpu_amdkfd_fence *ef;
+   int ret;
+
+   while (bo->parent)
+   bo = bo->parent;
+
+   vm_bo = bo->vm_bo;
+   if (!vm_bo)
+   return 0;
+
+   vm = vm_bo->vm;
+   if (!vm)
+   return 0;
+
+   info = vm->process_info;
+   if (!info || !info->eviction_fence)
+   return 0;
+
+   ef = container_of(dma_fence_get(&info->eviction_fence->base),
+   struct amdgpu_amdkfd_fence, base);
+
+   BUG_ON(!dma_resv_trylock(&bo->tbo.base._resv));
+   ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
+   dma_resv_unlock(&bo->tbo.base._resv);
+
+   dma_fence_put(&ef->base);
+   return ret;
+}
+
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 bool wait)
 {
@@ -1051,6 +1085,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
WARN_ON(!list_empty(&process_info->userptr_valid_list));
WARN_ON(!list_empty(&process_info->userptr_inval_list));
 
+   vm->process_info = NULL;
dma_fence_put(&process_info->eviction_fence->base);
cancel_delayed_work_sync(&process_info->restore_userptr_work);
put_pid(process_info->pid);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f60a581e3ba..3784d178c965 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1307,6 +1307,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object 
*bo)
if (abo->kfd_bo)
amdgpu_amdkfd_unreserve_memory_limit(abo);
 
+   amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
+
if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 247f328b7223..eca4ec66c1ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3109,6 +3109,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
}
 
amdgpu_vm_free_pts(adev, vm, NULL);
+   root->vm_bo = NULL;
amdgpu_bo_unreserve(root);
amdgpu_bo_unref(&root);
WARN_ON(vm->root.base.bo);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 6c3cea509e25..855d3566381e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -399,8 +399,7 @@ static int ttm_bo_individualize_resv(struct 
ttm_buffer_object *bo)
BUG_ON(!dma_resv_trylock(&bo->base._resv));
 
r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
-   if (r)
-   dma_resv_unlock(&bo->base._resv);
+   dma_resv_unlock(&bo->base._resv);
 
return r;
 }
@@ -565,9 +564,6 @@ static void ttm_bo_release(struct kref *kref)
int ret;
 
if (!bo->deleted) {
-   

Re: [RFC PATCH v3] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-08 Thread Pan, Xinhui
Sorry, there is a coding error; I will send out a v4.


From: amd-gfx  on behalf of Pan, Xinhui 

Sent: February 8, 2020 22:48
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Kuehling, Felix; Koenig, Christian
Subject: [RFC PATCH v3] drm/amdgpu: Remove kfd eviction fence before release bo

No need to trigger eviction as the memory mapping will not be used anymore.

All pt/pd bos share the same resv, hence the same shared eviction fence. Every 
time a page table is freed, the fence will be signaled and that causes 
unexpected kfd evictions.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 35 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  1 +
 drivers/gpu/drm/ttm/ttm_bo.c  | 16 +
 5 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..265b1ed7264c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
context,
   struct mm_struct *mm);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);

 struct amdkfd_process_info {
/* List head of all VMs that belong to a KFD process */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef721cb65868..8a06ba3c9d41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -276,6 +276,40 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
return 0;
 }

+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
+{
+   struct amdgpu_vm_bo_base *vm_bo;
+   struct amdgpu_vm *vm;
+   struct amdkfd_process_info *info;
+   struct amdgpu_amdkfd_fence *ef;
+   int ret;
+
+   while (bo->parent)
+   bo = bo->parent;
+
+   vm_bo = bo->vm_bo;
+   if (!vm_bo)
+   return 0;
+
+   vm = vm_bo->vm;
+   if (!vm)
+   return 0;
+
+   info = vm->process_info;
+   if (!info || !info->eviction_fence)
+   return 0;
+
+   ef = container_of(dma_fence_get(&info->eviction_fence->base),
+   struct amdgpu_amdkfd_fence, base);
+
+   BUG_ON(!dma_resv_trylock(&bo->tbo.base._resv));
+   ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
+   dma_resv_unlock(&bo->tbo.base._resv);
+
+   dma_fence_put(&ef->base);
+   return ret;
+}
+
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 bool wait)
 {
@@ -1051,6 +1085,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
WARN_ON(!list_empty(&process_info->userptr_valid_list));
WARN_ON(!list_empty(&process_info->userptr_inval_list));

+   vm->process_info = NULL;
dma_fence_put(&process_info->eviction_fence->base);
cancel_delayed_work_sync(&process_info->restore_userptr_work);
put_pid(process_info->pid);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f60a581e3ba..3784d178c965 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1307,6 +1307,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object 
*bo)
if (abo->kfd_bo)
amdgpu_amdkfd_unreserve_memory_limit(abo);

+   amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
+
if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 247f328b7223..eca4ec66c1ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3109,6 +3109,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
}

amdgpu_vm_free_pts(adev, vm, NULL);
+   root->vm_bo = NULL;
amdgpu_bo_unreserve(root);
amdgpu_bo_unref(&root);
WARN_ON(vm->root.base.bo);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 6c3cea509e25..855d3566381e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -399,8 +399,7 @@ static int ttm_bo_

[RFC PATCH v4] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-08 Thread Pan, Xinhui
No need to trigger eviction as the memory mapping will not be used anymore.

All pt/pd bos share the same resv, hence the same shared eviction fence. Every 
time a page table is freed, the fence will be signaled and that causes 
unexpected kfd evictions.

Signed-off-by: xinhui pan 
---
change from v3:
fix a coding error

change from v2:
based on Chris' "drm/ttm: rework BO delayed delete" patchset.

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 36 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  1 +
 drivers/gpu/drm/ttm/ttm_bo.c  | 16 +
 5 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 47b0f2957d1f..265b1ed7264c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
context,
   struct mm_struct *mm);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
 
 struct amdkfd_process_info {
/* List head of all VMs that belong to a KFD process */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef721cb65868..d4b117065c1e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -276,6 +276,41 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
return 0;
 }
 
+int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
+{
+   struct amdgpu_bo *root = bo;
+   struct amdgpu_vm_bo_base *vm_bo;
+   struct amdgpu_vm *vm;
+   struct amdkfd_process_info *info;
+   struct amdgpu_amdkfd_fence *ef;
+   int ret;
+
+   while (root->parent)
+   root = root->parent;
+
+   vm_bo = root->vm_bo;
+   if (!vm_bo)
+   return 0;
+
+   vm = vm_bo->vm;
+   if (!vm)
+   return 0;
+
+   info = vm->process_info;
+   if (!info || !info->eviction_fence)
+   return 0;
+
+   ef = container_of(dma_fence_get(&info->eviction_fence->base),
+   struct amdgpu_amdkfd_fence, base);
+
+   BUG_ON(!dma_resv_trylock(&bo->tbo.base._resv));
+   ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
+   dma_resv_unlock(&bo->tbo.base._resv);
+
+   dma_fence_put(&ef->base);
+   return ret;
+}
+
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 bool wait)
 {
@@ -1051,6 +1086,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
WARN_ON(!list_empty(&process_info->userptr_valid_list));
WARN_ON(!list_empty(&process_info->userptr_inval_list));
 
+   vm->process_info = NULL;
dma_fence_put(&process_info->eviction_fence->base);
cancel_delayed_work_sync(&process_info->restore_userptr_work);
put_pid(process_info->pid);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f60a581e3ba..3784d178c965 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1307,6 +1307,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object 
*bo)
if (abo->kfd_bo)
amdgpu_amdkfd_unreserve_memory_limit(abo);
 
+   amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
+
if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 247f328b7223..eca4ec66c1ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3109,6 +3109,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
}
 
amdgpu_vm_free_pts(adev, vm, NULL);
+   root->vm_bo = NULL;
amdgpu_bo_unreserve(root);
amdgpu_bo_unref(&root);
WARN_ON(vm->root.base.bo);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 6c3cea509e25..855d3566381e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -399,8 +399,7 @@ static int ttm_bo_individualize_resv(struct 
ttm_buffer_object *bo)
BUG_ON(!dma_resv_trylock(&bo->base._resv));
 
r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
-   if (r)
-   dma_resv_unlock(&bo->base._resv);
+   dma_resv_unlock(&bo->bas

[PATCH] drm/amdgpu: skip update root page table

2020-02-08 Thread Pan, Xinhui
The pde is in the root page table; no need to update a parent's page table.

Change-Id: I2ec1015736039cf0278bdfa9bec35185ece506b5
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index cc56eaba1911..247f328b7223 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1235,10 +1235,14 @@ static int amdgpu_vm_update_pde(struct 
amdgpu_vm_update_params *params,
struct amdgpu_vm_pt *entry)
 {
struct amdgpu_vm_pt *parent = amdgpu_vm_pt_parent(entry);
-   struct amdgpu_bo *bo = parent->base.bo, *pbo;
+   struct amdgpu_bo *bo, *pbo;
uint64_t pde, pt, flags;
unsigned level;
 
+   if (!parent)
+   return 0;
+
+   bo = parent->base.bo;
for (level = 0, pbo = bo->parent; pbo; ++level)
pbo = pbo->parent;
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
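
Both this patch and the following one deal with the same property of the
page-table hierarchy: the root has no parent, so any code that assumes one must
guard against NULL. A minimal sketch (generic names; write_pde() is a stand-in,
not the amdgpu_vm code):

struct pt_entry {
	struct pt_entry *parent;	/* NULL for the root page directory */
	void *bo;
};

static int write_pde(void *parent_bo, struct pt_entry *entry);	/* stand-in */

static int update_pde(struct pt_entry *entry)
{
	struct pt_entry *parent = entry->parent;

	if (!parent)	/* root: there is no PDE above it to update */
		return 0;

	return write_pde(parent->bo, entry);
}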


[PATCH] drm/amdgpu: Do not move root PT bo to relocated list

2020-02-08 Thread Pan, Xinhui
We hit a panic when updating the page tables.

<1>[  122.103290] BUG: kernel NULL pointer dereference, address: 
0008
<1>[  122.103348] #PF: supervisor read access in kernel mode
<1>[  122.103376] #PF: error_code(0x) - not-present page
<6>[  122.103403] PGD 0 P4D 0 
<4>[  122.103421] Oops:  [#1] SMP PTI
<4>[  122.103442] CPU: 13 PID: 2133 Comm: kfdtest Tainted: G   OE 
5.4.0-rc7+ #7
<4>[  122.103480] Hardware name: Supermicro SYS-7048GR-TR/X10DRG-Q, BIOS 3.0b 
03/09/2018
<4>[  122.103657] RIP: 0010:amdgpu_vm_update_pdes+0x140/0x330 [amdgpu]
<4>[  122.103689] Code: 03 4c 89 73 08 49 89 9d c8 00 00 00 48 8b 7b f0 c6 43 
10 00 45 31 c0 48 8b 87 28 04 00 00 48 85 c0 74 07 4c 8b 80 20 04 00 00 <4d> 8b 
70 08 31 f6 49 8b 86 28 04 00 00 48 85 c0 74 0f 48 8b 80 28
<4>[  122.103769] RSP: 0018:b49a0a6a3a98 EFLAGS: 00010246
<4>[  122.103797] RAX:  RBX: 9020f823c148 RCX: 
dead0122
<4>[  122.103831] RDX: 9020ece70018 RSI: 9020f823c0c8 RDI: 
9010ca31c800
<4>[  122.103865] RBP: b49a0a6a3b38 R08:  R09: 
0001
<4>[  122.103899] R10: 6044f994 R11: df57fb58 R12: 
9020f823c000
<4>[  122.103933] R13: 9020f823c000 R14: 9020f823c0c8 R15: 
9010d5d2
<4>[  122.103968] FS:  7f32c83dc780() GS:9020ff38() 
knlGS:
<4>[  122.104006] CS:  0010 DS:  ES:  CR0: 80050033
<4>[  122.104035] CR2: 0008 CR3: 002036bba005 CR4: 
003606e0
<4>[  122.104069] DR0:  DR1:  DR2: 

<4>[  122.104103] DR3:  DR6: fffe0ff0 DR7: 
0400
<4>[  122.104137] Call Trace:
<4>[  122.104241]  vm_update_pds+0x31/0x50 [amdgpu]
<4>[  122.104347]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x2ef/0x690 [amdgpu]
<4>[  122.104466]  kfd_process_alloc_gpuvm+0x98/0x190 [amdgpu]
<4>[  122.104576]  kfd_process_device_init_vm.part.8+0xf3/0x1f0 [amdgpu]
<4>[  122.104688]  kfd_process_device_init_vm+0x24/0x30 [amdgpu]
<4>[  122.104794]  kfd_ioctl_acquire_vm+0xa4/0xc0 [amdgpu]
<4>[  122.104900]  kfd_ioctl+0x277/0x500 [amdgpu]
<4>[  122.105001]  ? kfd_ioctl_free_memory_of_gpu+0xc0/0xc0 [amdgpu]
<4>[  122.105039]  ? rcu_read_lock_sched_held+0x4f/0x80
<4>[  122.105068]  ? kmem_cache_free+0x2ba/0x300
<4>[  122.105093]  ? vm_area_free+0x18/0x20
<4>[  122.105117]  ? find_held_lock+0x35/0xa0
<4>[  122.105143]  do_vfs_ioctl+0xa9/0x6f0
<4>[  122.106001]  ksys_ioctl+0x75/0x80
<4>[  122.106802]  ? do_syscall_64+0x17/0x230
<4>[  122.107605]  __x64_sys_ioctl+0x1a/0x20
<4>[  122.108378]  do_syscall_64+0x5f/0x230
<4>[  122.109118]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  122.109842] RIP: 0033:0x7f32c6b495d7

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 3195bc90985a..3c388fdf335c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2619,7 +2619,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
continue;
bo_base->moved = true;
 
-   if (bo->tbo.type == ttm_bo_type_kernel)
+   if (bo->tbo.type == ttm_bo_type_kernel && bo->parent)
amdgpu_vm_bo_relocated(bo_base);
else if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv)
amdgpu_vm_bo_moved(bo_base);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Do not move root PT bo to relocated list

2020-02-09 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

If so, the function name does not match its functionality.


From: Christian König 
Sent: Sunday, February 9, 2020 4:21:13 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Koenig, Christian 

Subject: Re: [PATCH] drm/amdgpu: Do not move root PT bo to relocated list

On 2020-02-09 03:52, Pan, Xinhui wrote:
> We hit a panic when updating the page tables.
>
> <1>[  122.103290] BUG: kernel NULL pointer dereference, address: 
> 0008
> <1>[  122.103348] #PF: supervisor read access in kernel mode
> <1>[  122.103376] #PF: error_code(0x) - not-present page
> <6>[  122.103403] PGD 0 P4D 0
> <4>[  122.103421] Oops:  [#1] SMP PTI
> <4>[  122.103442] CPU: 13 PID: 2133 Comm: kfdtest Tainted: G   OE 
> 5.4.0-rc7+ #7
> <4>[  122.103480] Hardware name: Supermicro SYS-7048GR-TR/X10DRG-Q, BIOS 3.0b 
> 03/09/2018
> <4>[  122.103657] RIP: 0010:amdgpu_vm_update_pdes+0x140/0x330 [amdgpu]
> <4>[  122.103689] Code: 03 4c 89 73 08 49 89 9d c8 00 00 00 48 8b 7b f0 c6 43 
> 10 00 45 31 c0 48 8b 87 28 04 00 00 48 85 c0 74 07 4c 8b 80 20 04 00 00 <4d> 
> 8b 70 08 31 f6 49 8b 86 28 04 00 00 48 85 c0 74 0f 48 8b 80 28
> <4>[  122.103769] RSP: 0018:b49a0a6a3a98 EFLAGS: 00010246
> <4>[  122.103797] RAX:  RBX: 9020f823c148 RCX: 
> dead0122
> <4>[  122.103831] RDX: 9020ece70018 RSI: 9020f823c0c8 RDI: 
> 9010ca31c800
> <4>[  122.103865] RBP: b49a0a6a3b38 R08:  R09: 
> 0001
> <4>[  122.103899] R10: 6044f994 R11: df57fb58 R12: 
> 9020f823c000
> <4>[  122.103933] R13: 9020f823c000 R14: 9020f823c0c8 R15: 
> 9010d5d2
> <4>[  122.103968] FS:  7f32c83dc780() GS:9020ff38() 
> knlGS:
> <4>[  122.104006] CS:  0010 DS:  ES:  CR0: 80050033
> <4>[  122.104035] CR2: 0008 CR3: 002036bba005 CR4: 
> 003606e0
> <4>[  122.104069] DR0:  DR1:  DR2: 
> 
> <4>[  122.104103] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> <4>[  122.104137] Call Trace:
> <4>[  122.104241]  vm_update_pds+0x31/0x50 [amdgpu]
> <4>[  122.104347]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x2ef/0x690 [amdgpu]
> <4>[  122.104466]  kfd_process_alloc_gpuvm+0x98/0x190 [amdgpu]
> <4>[  122.104576]  kfd_process_device_init_vm.part.8+0xf3/0x1f0 [amdgpu]
> <4>[  122.104688]  kfd_process_device_init_vm+0x24/0x30 [amdgpu]
> <4>[  122.104794]  kfd_ioctl_acquire_vm+0xa4/0xc0 [amdgpu]
> <4>[  122.104900]  kfd_ioctl+0x277/0x500 [amdgpu]
> <4>[  122.105001]  ? kfd_ioctl_free_memory_of_gpu+0xc0/0xc0 [amdgpu]
> <4>[  122.105039]  ? rcu_read_lock_sched_held+0x4f/0x80
> <4>[  122.105068]  ? kmem_cache_free+0x2ba/0x300
> <4>[  122.105093]  ? vm_area_free+0x18/0x20
> <4>[  122.105117]  ? find_held_lock+0x35/0xa0
> <4>[  122.105143]  do_vfs_ioctl+0xa9/0x6f0
> <4>[  122.106001]  ksys_ioctl+0x75/0x80
> <4>[  122.106802]  ? do_syscall_64+0x17/0x230
> <4>[  122.107605]  __x64_sys_ioctl+0x1a/0x20
> <4>[  122.108378]  do_syscall_64+0x5f/0x230
> <4>[  122.109118]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> <4>[  122.109842] RIP: 0033:0x7f32c6b495d7
>
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 3195bc90985a..3c388fdf335c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2619,7 +2619,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
>continue;
>bo_base->moved = true;
>
> - if (bo->tbo.type == ttm_bo_type_kernel)
> + if (bo->tbo.type == ttm_bo_type_kernel && bo->parent)

Good catch, but that would mean that we move the root PD to the moved
state which in turn is illegal as well.

Maybe better adjust amdgpu_vm_bo_relocated() to move the root PD to the
idle state instead.

Christian.


>        amdgpu_vm_bo_relocated(bo_base);
>else if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv)
>amdgpu_vm_bo_moved(bo_base);


From: Christian König 
Sent: Sunday, February 9, 2020 4:21:13 PM
To: Pan, Xinhui ; amd-gfx@lists.freedes

[PATCH V2] drm/amdgpu: Do not move root PT bo to relocated list

2020-02-09 Thread Pan, Xinhui
We hit a panic when updating the page tables.

<1>[  122.103290] BUG: kernel NULL pointer dereference, address: 
0008
<1>[  122.103348] #PF: supervisor read access in kernel mode
<1>[  122.103376] #PF: error_code(0x) - not-present page
<6>[  122.103403] PGD 0 P4D 0
<4>[  122.103421] Oops:  [#1] SMP PTI
<4>[  122.103442] CPU: 13 PID: 2133 Comm: kfdtest Tainted: G   OE 
5.4.0-rc7+ #7
<4>[  122.103480] Hardware name: Supermicro SYS-7048GR-TR/X10DRG-Q, BIOS 3.0b 
03/09/2018
<4>[  122.103657] RIP: 0010:amdgpu_vm_update_pdes+0x140/0x330 [amdgpu]
<4>[  122.103689] Code: 03 4c 89 73 08 49 89 9d c8 00 00 00 48 8b 7b f0 c6 43 
10 00 45 31 c0 48 8b 87 28 04 00 00 48 85 c0 74 07 4c 8b 80 20 04 00 00 <4d> 8b 
70 08 31 f6 49 8b 86 28 04 00 00 48 85 c0 74 0f 48 8b 80 28
<4>[  122.103769] RSP: 0018:b49a0a6a3a98 EFLAGS: 00010246
<4>[  122.103797] RAX:  RBX: 9020f823c148 RCX: 
dead0122
<4>[  122.103831] RDX: 9020ece70018 RSI: 9020f823c0c8 RDI: 
9010ca31c800
<4>[  122.103865] RBP: b49a0a6a3b38 R08:  R09: 
0001
<4>[  122.103899] R10: 6044f994 R11: df57fb58 R12: 
9020f823c000
<4>[  122.103933] R13: 9020f823c000 R14: 9020f823c0c8 R15: 
9010d5d2
<4>[  122.103968] FS:  7f32c83dc780() GS:9020ff38() 
knlGS:
<4>[  122.104006] CS:  0010 DS:  ES:  CR0: 80050033
<4>[  122.104035] CR2: 0008 CR3: 002036bba005 CR4: 
003606e0
<4>[  122.104069] DR0:  DR1:  DR2: 

<4>[  122.104103] DR3:  DR6: fffe0ff0 DR7: 
0400
<4>[  122.104137] Call Trace:
<4>[  122.104241]  vm_update_pds+0x31/0x50 [amdgpu]
<4>[  122.104347]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x2ef/0x690 [amdgpu]
<4>[  122.104466]  kfd_process_alloc_gpuvm+0x98/0x190 [amdgpu]
<4>[  122.104576]  kfd_process_device_init_vm.part.8+0xf3/0x1f0 [amdgpu]
<4>[  122.104688]  kfd_process_device_init_vm+0x24/0x30 [amdgpu]
<4>[  122.104794]  kfd_ioctl_acquire_vm+0xa4/0xc0 [amdgpu]
<4>[  122.104900]  kfd_ioctl+0x277/0x500 [amdgpu]
<4>[  122.105001]  ? kfd_ioctl_free_memory_of_gpu+0xc0/0xc0 [amdgpu]
<4>[  122.105039]  ? rcu_read_lock_sched_held+0x4f/0x80
<4>[  122.105068]  ? kmem_cache_free+0x2ba/0x300
<4>[  122.105093]  ? vm_area_free+0x18/0x20
<4>[  122.105117]  ? find_held_lock+0x35/0xa0
<4>[  122.105143]  do_vfs_ioctl+0xa9/0x6f0
<4>[  122.106001]  ksys_ioctl+0x75/0x80
<4>[  122.106802]  ? do_syscall_64+0x17/0x230
<4>[  122.107605]  __x64_sys_ioctl+0x1a/0x20
<4>[  122.108378]  do_syscall_64+0x5f/0x230
<4>[  122.109118]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  122.109842] RIP: 0033:0x7f32c6b495d7

Signed-off-by: xinhui pan 
---
change from v1:
   move root pt bo to idle state instead.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 3195bc9..c3d1af5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2619,9 +2619,12 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
continue;
bo_base->moved = true;
 
-   if (bo->tbo.type == ttm_bo_type_kernel)
-   amdgpu_vm_bo_relocated(bo_base);
-   else if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv)
+   if (bo->tbo.type == ttm_bo_type_kernel) {
+   if (bo->parent)
+   amdgpu_vm_bo_relocated(bo_base);
+   else
+   amdgpu_vm_bo_idle(bo_base);
+   } else if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv)
amdgpu_vm_bo_moved(bo_base);
else
amdgpu_vm_bo_invalidated(bo_base);
-- 
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.

2020-02-10 Thread Pan, Xinhui
comments inline.
[xh]


> On 2020-02-10 23:09, Christian König wrote:
> 
> This patch reworks the whole delayed deletion of BOs which aren't idle.
> 
> Instead of having two counters for the BO structure we resurrect the BO
> when we find that a deleted BO is not idle yet.
> 
> This has many advantages, especially that we don't need to
> increment/decrement the BOs reference counter any more when it
> moves on the LRUs.
> 
> Signed-off-by: Christian König 
> ---
> drivers/gpu/drm/ttm/ttm_bo.c  | 217 +-
> drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
> include/drm/ttm/ttm_bo_api.h  |  11 +-
> 3 files changed, 97 insertions(+), 132 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index e12fc2c2d165..d0624685f5d2 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
>   return 1 << (type);
> }
> 
> -static void ttm_bo_release_list(struct kref *list_kref)
> -{
> - struct ttm_buffer_object *bo =
> - container_of(list_kref, struct ttm_buffer_object, list_kref);
> - size_t acc_size = bo->acc_size;
> -
> - BUG_ON(kref_read(&bo->list_kref));
> - BUG_ON(kref_read(&bo->kref));
> - BUG_ON(bo->mem.mm_node != NULL);
> - BUG_ON(!list_empty(&bo->lru));
> - BUG_ON(!list_empty(&bo->ddestroy));
> - ttm_tt_destroy(bo->ttm);
> - atomic_dec(&ttm_bo_glob.bo_count);
> - dma_fence_put(bo->moving);
> - if (!ttm_bo_uses_embedded_gem_object(bo))
> - dma_resv_fini(&bo->base._resv);
> - bo->destroy(bo);
> - ttm_mem_global_free(&ttm_mem_glob, acc_size);
> -}
> -
> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
> struct ttm_mem_reg *mem)
> {
> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct 
> ttm_buffer_object *bo,
> 
>   man = &bdev->man[mem->mem_type];
>   list_add_tail(&bo->lru, &man->lru[bo->priority]);
> - kref_get(&bo->list_kref);
> 
>   if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
>   !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
>TTM_PAGE_FLAG_SWAPPED))) {
>   list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
> - kref_get(&bo->list_kref);
>   }
> }
> 
> -static void ttm_bo_ref_bug(struct kref *list_kref)
> -{
> - BUG();
> -}
> -
> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> {
>   struct ttm_bo_device *bdev = bo->bdev;
> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct 
> ttm_buffer_object *bo)
> 
>   if (!list_empty(&bo->swap)) {
>   list_del_init(&bo->swap);
> - kref_put(&bo->list_kref, ttm_bo_ref_bug);
>   notify = true;
>   }
>   if (!list_empty(&bo->lru)) {
>   list_del_init(&bo->lru);
> - kref_put(&bo->list_kref, ttm_bo_ref_bug);
>   notify = true;
>   }
> 
> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct 
> ttm_buffer_object *bo)
>   BUG_ON(!dma_resv_trylock(&bo->base._resv));
> 
>   r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
> - if (r)
> - dma_resv_unlock(&bo->base._resv);
> + dma_resv_unlock(&bo->base._resv);
> 
>   return r;
> }
> @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct 
> ttm_buffer_object *bo)
>   rcu_read_unlock();
> }
> 
> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> -{
> - struct ttm_bo_device *bdev = bo->bdev;
> - int ret;
> -
> - ret = ttm_bo_individualize_resv(bo);
> - if (ret) {
> - /* Last resort, if we fail to allocate memory for the
> -  * fences block for the BO to become idle
> -  */
> - dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
> - 30 * HZ);
> - spin_lock(&ttm_bo_glob.lru_lock);
> - goto error;
> - }
> -
> - spin_lock(&ttm_bo_glob.lru_lock);
> - ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY;
> - if (!ret) {
> - if (dma_resv_test_signaled_rcu(&bo->base._resv, true)) {
> - ttm_bo_del_from_lru(bo);
> - spin_unlock(&ttm_bo_glob.lru_lock);
> - if (bo->base.resv != &bo->base._resv)
> - dma_resv_unlock(&bo->base._resv);
> -
> - ttm_bo_cleanup_memtype_use(bo);
> - dma_resv_unlock(bo->base.resv);
> - return;
> - }
> -
> - ttm_bo_flush_all_fences(bo);
> -
> - /*
> -  * Make NO_EVICT bos immediately available to
> -  * shrinkers, now that they are queued for
> -  * destruction.
> -  */
> - if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) 

Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.

2020-02-10 Thread Pan, Xinhui

comments inline

> On 2020-02-11 13:26, Pan, Xinhui wrote:
> 
> comments inline.
> [xh]
> 
> 
>> On 2020-02-10 23:09, Christian König wrote:
>> 
>> This patch reworks the whole delayed deletion of BOs which aren't idle.
>> 
>> Instead of having two counters for the BO structure we resurrect the BO
>> when we find that a deleted BO is not idle yet.
>> 
>> This has many advantages, especially that we don't need to
>> increment/decrement the BOs reference counter any more when it
>> moves on the LRUs.
>> 
>> Signed-off-by: Christian König 
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c  | 217 +-
>> drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
>> include/drm/ttm/ttm_bo_api.h  |  11 +-
>> 3 files changed, 97 insertions(+), 132 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index e12fc2c2d165..d0624685f5d2 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
>>  return 1 << (type);
>> }
>> 
>> -static void ttm_bo_release_list(struct kref *list_kref)
>> -{
>> -struct ttm_buffer_object *bo =
>> -container_of(list_kref, struct ttm_buffer_object, list_kref);
>> -size_t acc_size = bo->acc_size;
>> -
>> -BUG_ON(kref_read(&bo->list_kref));
>> -BUG_ON(kref_read(&bo->kref));
>> -BUG_ON(bo->mem.mm_node != NULL);
>> -BUG_ON(!list_empty(&bo->lru));
>> -BUG_ON(!list_empty(&bo->ddestroy));
>> -ttm_tt_destroy(bo->ttm);
>> -atomic_dec(&ttm_bo_glob.bo_count);
>> -dma_fence_put(bo->moving);
>> -if (!ttm_bo_uses_embedded_gem_object(bo))
>> -dma_resv_fini(&bo->base._resv);
>> -bo->destroy(bo);
>> -ttm_mem_global_free(&ttm_mem_glob, acc_size);
>> -}
>> -
>> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
>>struct ttm_mem_reg *mem)
>> {
>> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct 
>> ttm_buffer_object *bo,
>> 
>>  man = &bdev->man[mem->mem_type];
>>  list_add_tail(&bo->lru, &man->lru[bo->priority]);
>> -kref_get(&bo->list_kref);
>> 
>>  if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
>>  !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
>>   TTM_PAGE_FLAG_SWAPPED))) {
>>  list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
>> -kref_get(&bo->list_kref);
>>  }
>> }
>> 
>> -static void ttm_bo_ref_bug(struct kref *list_kref)
>> -{
>> -BUG();
>> -}
>> -
>> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
>> {
>>  struct ttm_bo_device *bdev = bo->bdev;
>> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct 
>> ttm_buffer_object *bo)
>> 
>>  if (!list_empty(&bo->swap)) {
>>  list_del_init(&bo->swap);
>> -kref_put(&bo->list_kref, ttm_bo_ref_bug);
>>  notify = true;
>>  }
>>  if (!list_empty(&bo->lru)) {
>>  list_del_init(&bo->lru);
>> -kref_put(&bo->list_kref, ttm_bo_ref_bug);
>>  notify = true;
>>  }
>> 
>> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct 
>> ttm_buffer_object *bo)
>>  BUG_ON(!dma_resv_trylock(&bo->base._resv));
>> 
>>  r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>> -if (r)
>> -dma_resv_unlock(&bo->base._resv);
>> +dma_resv_unlock(&bo->base._resv);
>> 
>>  return r;
>> }
>> @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct 
>> ttm_buffer_object *bo)
>>  rcu_read_unlock();
>> }
>> 
>> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>> -{
>> -struct ttm_bo_device *bdev = bo->bdev;
>> -int ret;
>> -
>> -ret = ttm_bo_individualize_resv(bo);
>> -if (ret) {
>> -/* Last resort, if we fail to allocate memory for the
>> - * fences block for the BO to become idle
>> - */
>> -dma_resv_wait_timeout_rcu(bo->base.resv, tru

Re: Cleanup TTMs delayed delete handling

2020-02-11 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

For patch 1/2/3/5/6
Reviewed-by: xinhui pan 

From: Christian König 
Sent: Monday, February 10, 2020 11:09:01 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 
; dri-de...@lists.freedesktop.org 

Subject: Cleanup TTMs delayed delete handling

This series of patches cleans up TTMs delayed delete handling.

The core of the new handling is that we new only have a single reference 
counter instead of two and use kref_get_unless_zero() to grab BOs from the LRU 
during eviction.

This reduces the overhead of LRU moves and allows us to properly individualize 
the BOs reservation object during deletion to allow adding BOs for clearing 
memory, unmapping page tables etc..

Please review and comment,
Christian.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: rework BO delayed delete. v2

2020-02-11 Thread Pan, Xinhui
Reviewed-by: xinhui pan 

> 2020年2月11日 21:43,Christian König  写道:
> 
> This patch reworks the whole delayed deletion of BOs which aren't idle.
> 
> Instead of having two counters for the BO structure we resurrect the BO
> when we find that a deleted BO is not idle yet.
> 
> This has many advantages, especially that we don't need to
> increment/decrement the BOs reference counter any more when it
> moves on the LRUs.
> 
> v2: remove duplicate ttm_tt_destroy, fix holde lock for LRU move
> 
> Signed-off-by: Christian König 
> ---
> drivers/gpu/drm/ttm/ttm_bo.c  | 217 +-
> drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
> include/drm/ttm/ttm_bo_api.h  |  11 +-
> 3 files changed, 97 insertions(+), 132 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 1fbc36f05d89..bfc42a9e4fb4 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
>   return 1 << (type);
> }
> 
> -static void ttm_bo_release_list(struct kref *list_kref)
> -{
> - struct ttm_buffer_object *bo =
> - container_of(list_kref, struct ttm_buffer_object, list_kref);
> - size_t acc_size = bo->acc_size;
> -
> - BUG_ON(kref_read(&bo->list_kref));
> - BUG_ON(kref_read(&bo->kref));
> - BUG_ON(bo->mem.mm_node != NULL);
> - BUG_ON(!list_empty(&bo->lru));
> - BUG_ON(!list_empty(&bo->ddestroy));
> - ttm_tt_destroy(bo->ttm);
> - atomic_dec(&ttm_bo_glob.bo_count);
> - dma_fence_put(bo->moving);
> - if (!ttm_bo_uses_embedded_gem_object(bo))
> - dma_resv_fini(&bo->base._resv);
> - bo->destroy(bo);
> - ttm_mem_global_free(&ttm_mem_glob, acc_size);
> -}
> -
> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
> struct ttm_mem_reg *mem)
> {
> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct 
> ttm_buffer_object *bo,
> 
>   man = &bdev->man[mem->mem_type];
>   list_add_tail(&bo->lru, &man->lru[bo->priority]);
> - kref_get(&bo->list_kref);
> 
>   if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
>   !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
>TTM_PAGE_FLAG_SWAPPED))) {
>   list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
> - kref_get(&bo->list_kref);
>   }
> }
> 
> -static void ttm_bo_ref_bug(struct kref *list_kref)
> -{
> - BUG();
> -}
> -
> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> {
>   struct ttm_bo_device *bdev = bo->bdev;
> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct 
> ttm_buffer_object *bo)
> 
>   if (!list_empty(&bo->swap)) {
>   list_del_init(&bo->swap);
> - kref_put(&bo->list_kref, ttm_bo_ref_bug);
>   notify = true;
>   }
>   if (!list_empty(&bo->lru)) {
>   list_del_init(&bo->lru);
> - kref_put(&bo->list_kref, ttm_bo_ref_bug);
>   notify = true;
>   }
> 
> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct 
> ttm_buffer_object *bo)
>   BUG_ON(!dma_resv_trylock(&bo->base._resv));
> 
>   r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
> - if (r)
> - dma_resv_unlock(&bo->base._resv);
> + dma_resv_unlock(&bo->base._resv);
> 
>   return r;
> }
> @@ -447,68 +417,10 @@ static void ttm_bo_flush_all_fences(struct 
> ttm_buffer_object *bo)
>   }
> }
> 
> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> -{
> - struct ttm_bo_device *bdev = bo->bdev;
> - int ret;
> -
> - ret = ttm_bo_individualize_resv(bo);
> - if (ret) {
> - /* Last resort, if we fail to allocate memory for the
> -  * fences block for the BO to become idle
> -  */
> - dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
> - 30 * HZ);
> - spin_lock(&ttm_bo_glob.lru_lock);
> - goto error;
> - }
> -
> - spin_lock(&ttm_bo_glob.lru_lock);
> - ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY;
> - if (!ret) {
> - if (dma_resv_test_signaled_rcu(&bo->base._resv, true)) {
> - ttm_bo_del_from_lru(bo);
> - spin_unlock(&ttm_bo_glob.lru_lock);
> - if (bo->base.resv != &bo->base._resv)
> - dma_resv_unlock(&bo->base._resv);
> -
> - ttm_bo_cleanup_memtype_use(bo);
> - dma_resv_unlock(bo->base.resv);
> - return;
> - }
> -
> - ttm_bo_flush_all_fences(bo);
> -
> - /*
> -  * Make NO_EVICT bos immediately available to
> -  * shrinkers, now that they are queued for
> -  * destruction.
> -  */
> -

Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2

2020-02-11 Thread Pan, Xinhui


> 2020年2月11日 22:14,Daniel Vetter  写道:
> 
> On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
>> When non-imported BOs are resurrected for delayed delete we replace
>> the dma_resv object to allow for easy reclaiming of the resources.
>> 
>> v2: move that to ttm_bo_individualize_resv
>> 
>> Signed-off-by: Christian König 
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 10 +-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index d0624685f5d2..4d161038de98 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct 
>> ttm_buffer_object *bo)
>> 
>>  r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>  dma_resv_unlock(&bo->base._resv);
>> +if (r)
>> +return r;
>> +
>> +if (bo->type != ttm_bo_type_sg) {
>> +spin_lock(&ttm_bo_glob.lru_lock);
>> +bo->base.resv = &bo->base._resv;
> 
> Having the dma_resv pointer be protected by the lru_lock for ttm internal
> stuff, but invariant everywhere else is really confusing. Not sure that's

I think this is a reader-vs-writer problem.
To keep internal functions from using the old resv, reusing an existing spin
lock is acceptable.
Maybe RCU would be better? That would need a lot of effort, though.
Anyway, ttm sucks; we have done a lot of work on it to make it run better on
modern systems.
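
To sketch the RCU direction (this is not what TTM does today; the __rcu usage
below is an assumption, and dma_resv_lock() being a sleeping lock is exactly
why it would take a lot of rework):

/* writer side, when the resv is individualized */
rcu_assign_pointer(bo->base.resv, &bo->base._resv);
synchronize_rcu();      /* no reader can still see the old resv after this */

/* reader side */
struct dma_resv *resv;

rcu_read_lock();
resv = rcu_dereference(bo->base.resv);
/* only non-sleeping accesses are valid in here; dma_resv_lock() sleeps */
rcu_read_unlock();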


> a great idea, I've just chased some ttm code around freaking out about
> that.
> -Daniel
> 
>> +spin_unlock(&ttm_bo_glob.lru_lock);
>> +}
>> 
>>  return r;
>> }
>> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct 
>> ttm_buffer_object *bo,
>> 
>>  if (bo->base.resv == ctx->resv) {
>>  dma_resv_assert_held(bo->base.resv);
>> -if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
>> +if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>>  ret = true;
>>  *locked = false;
>>  if (busy)
>> -- 
>> 2.17.1
>> 
>> ___
>> dri-devel mailing list
>> dri-de...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3

2020-02-11 Thread Pan, Xinhui


> 2020年2月11日 23:43,Christian König  写道:
> 
> When non-imported BOs are resurrected for delayed delete we replace
> the dma_resv object to allow for easy reclaiming of the resources.
> 
> v2: move that to ttm_bo_individualize_resv
> v3: add a comment to explain what's going on
> 
> Signed-off-by: Christian König 
> Reviewed-by: xinhui pan 
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
> 1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index bfc42a9e4fb4..8174603d390f 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct 
> ttm_buffer_object *bo)
> 
>   r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>   dma_resv_unlock(&bo->base._resv);
> + if (r)
> + return r;
> +
> + if (bo->type != ttm_bo_type_sg) {
> + /* This works because the BO is about to be destroyed and nobody
> +  * reference it any more. The only tricky case is the trylock on
> +  * the resv object while holding the lru_lock.
> +  */
> + spin_lock(&ttm_bo_glob.lru_lock);
> + bo->base.resv = &bo->base._resv;
> + spin_unlock(&ttm_bo_glob.lru_lock);
> + }
> 

How about something like the following? The basic idea is to do the BO cleanup
work in BO release first and avoid any race with eviction, since while a BO is
dying, eviction also just does cleanup work on it.

If the BO is busy, neither bo_release nor evict can do the cleanup work on it.
For the release case we just add the BO back to the LRU list, so we can still
clean it up both in the workqueue and in the shrinker, as before.

@@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct 
ttm_buffer_object *bo)
 
if (bo->type != ttm_bo_type_sg) {
spin_lock(&ttm_bo_glob.lru_lock);
-   bo->base.resv = &bo->base._resv;
+   ttm_bo_del_from_lru(bo);
spin_unlock(&ttm_bo_glob.lru_lock);
+   bo->base.resv = &bo->base._resv;
}   
 
return r;
@@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref)
 * shrinkers, now that they are queued for 
 * destruction.
 */  
-   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
+   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
-   ttm_bo_move_to_lru_tail(bo, NULL);
-   }
+   ttm_bo_add_mem_to_lru(bo, &bo->mem);
 
kref_init(&bo->kref);
list_add_tail(&bo->ddestroy, &bdev->ddestroy);

thanks
xinhui


>   return r;
> }
> @@ -724,7 +736,7 @@ static bool ttm_bo_evict_swapout_allowable(struct 
> ttm_buffer_object *bo,
> 
>   if (bo->base.resv == ctx->resv) {
>   dma_resv_assert_held(bo->base.resv);
> - if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
> + if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>   ret = true;
>   *locked = false;
>   if (busy)
> -- 
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3

2020-02-12 Thread Pan, Xinhui



> 2020年2月12日 19:53,Christian König  写道:
> 
> Am 12.02.20 um 07:23 schrieb Pan, Xinhui:
>> 
>>> 2020年2月11日 23:43,Christian König  写道:
>>> 
>>> When non-imported BOs are resurrected for delayed delete we replace
>>> the dma_resv object to allow for easy reclaiming of the resources.
>>> 
>>> v2: move that to ttm_bo_individualize_resv
>>> v3: add a comment to explain what's going on
>>> 
>>> Signed-off-by: Christian König 
>>> Reviewed-by: xinhui pan 
>>> ---
>>> drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index bfc42a9e4fb4..8174603d390f 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct 
>>> ttm_buffer_object *bo)
>>> 
>>> r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>> dma_resv_unlock(&bo->base._resv);
>>> +   if (r)
>>> +   return r;
>>> +
>>> +   if (bo->type != ttm_bo_type_sg) {
>>> +   /* This works because the BO is about to be destroyed and nobody
>>> +* reference it any more. The only tricky case is the trylock on
>>> +* the resv object while holding the lru_lock.
>>> +*/
>>> +   spin_lock(&ttm_bo_glob.lru_lock);
>>> +   bo->base.resv = &bo->base._resv;
>>> +   spin_unlock(&ttm_bo_glob.lru_lock);
>>> +   }
>>> 
>> how about something like that.
>> the basic idea is to do the bo cleanup work in bo release first and avoid 
>> any race with evict.
>> As in bo dieing progress, evict also just do bo cleanup work.
>> 
>> If bo is busy, neither bo_release nor evict  can do cleanupwork  on it. For 
>> the bo release case, we just add bo back to lru list.
>> So we can clean it up  both in workqueue and shrinker as the past way  did.
>> 
>> @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct 
>> ttm_buffer_object *bo)
>>   if (bo->type != ttm_bo_type_sg) {
>> spin_lock(&ttm_bo_glob.lru_lock);
>> -   bo->base.resv = &bo->base._resv;
>> +   ttm_bo_del_from_lru(bo);
>> spin_unlock(&ttm_bo_glob.lru_lock);
>> +   bo->base.resv = &bo->base._resv;
>> }
>>   return r;
>> @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref)
>>  * shrinkers, now that they are queued for
>>  * destruction.
>>  */
>> -   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>> +   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
>> bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>> -   ttm_bo_move_to_lru_tail(bo, NULL);
>> -   }
>> +   ttm_bo_add_mem_to_lru(bo, &bo->mem);
>>   kref_init(&bo->kref);
>> list_add_tail(&bo->ddestroy, &bdev->ddestroy);
> 
> Yeah, thought about that as well. But this has the major drawback that the 
> deleted BO moves to the end of the LRU, which is something we don't want.

Well, as the BO is busy it looks like it needs some time to become idle, so
putting it at the tail seems fair.

> I think the real solution to this problem is to go a completely different way 
> and remove the delayed delete feature from TTM altogether. Instead this 
> should be part of some DRM domain handler component.
> 

Yes, completely agree. As long as we can shrink BOs on OOM, the workqueue is
not necessary; it does not help in a heavy workload case anyway.

Please see my patches below: I remove the workqueue and, what's more, we can
clean up the BO without holding the lru lock.
That reduces the lock contention. I ran kfdtest and got a good performance
result.


> In other words it should not matter if a BO is evicted, moved or freed. 
> Whenever a piece of memory becomes available again we keep around a fence 
> which marks the end of using this piece of memory.
> 
> When then somebody asks for new memory we work through the LRU and test if 
> using a certain piece of memory makes sense or not. If we find that a BO 
> needs to be evicted for this we return a reference to the BO in question to 
> the upper level handling.
> 
> If we find that we can do the allocation but only with recently freed up 
> memory we gather the fences and say you can only use the newl

Re: [PATCH] drm/ttm: replace dma_resv object on deleted BOs v3

2020-02-13 Thread Pan, Xinhui


> 2020年2月13日 18:01,Koenig, Christian  写道:
> 
> Am 13.02.20 um 05:11 schrieb Pan, Xinhui:
>> 
>> 
>>> 2020年2月12日 19:53,Christian König  写道:
>>> 
>>> Am 12.02.20 um 07:23 schrieb Pan, Xinhui:
>>>>> 2020年2月11日 23:43,Christian König  写道:
>>>>> 
>>>>> When non-imported BOs are resurrected for delayed delete we replace
>>>>> the dma_resv object to allow for easy reclaiming of the resources.
>>>>> 
>>>>> v2: move that to ttm_bo_individualize_resv
>>>>> v3: add a comment to explain what's going on
>>>>> 
>>>>> Signed-off-by: Christian König 
>>>>> Reviewed-by: xinhui pan 
>>>>> ---
>>>>> drivers/gpu/drm/ttm/ttm_bo.c | 14 +-
>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>> 
>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> index bfc42a9e4fb4..8174603d390f 100644
>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct 
>>>>> ttm_buffer_object *bo)
>>>>> 
>>>>>   r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>>>>   dma_resv_unlock(&bo->base._resv);
>>>>> + if (r)
>>>>> + return r;
>>>>> +
>>>>> + if (bo->type != ttm_bo_type_sg) {
>>>>> + /* This works because the BO is about to be destroyed and nobody
>>>>> +  * reference it any more. The only tricky case is the trylock on
>>>>> +  * the resv object while holding the lru_lock.
>>>>> +  */
>>>>> + spin_lock(&ttm_bo_glob.lru_lock);
>>>>> + bo->base.resv = &bo->base._resv;
>>>>> + spin_unlock(&ttm_bo_glob.lru_lock);
>>>>> + }
>>>>> 
>>>> how about something like that.
>>>> the basic idea is to do the bo cleanup work in bo release first and avoid 
>>>> any race with evict.
>>>> As in bo dieing progress, evict also just do bo cleanup work.
>>>> 
>>>> If bo is busy, neither bo_release nor evict  can do cleanupwork  on it. 
>>>> For the bo release case, we just add bo back to lru list.
>>>> So we can clean it up  both in workqueue and shrinker as the past way  did.
>>>> 
>>>> @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct 
>>>> ttm_buffer_object *bo)
>>>>   if (bo->type != ttm_bo_type_sg) {
>>>> spin_lock(&ttm_bo_glob.lru_lock);
>>>> -   bo->base.resv = &bo->base._resv;
>>>> +   ttm_bo_del_from_lru(bo);
>>>> spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +   bo->base.resv = &bo->base._resv;
>>>> }
>>>>   return r;
>>>> @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref)
>>>>  * shrinkers, now that they are queued for
>>>>  * destruction.
>>>>  */
>>>> -   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>>>> +   if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
>>>> bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>>>> -   ttm_bo_move_to_lru_tail(bo, NULL);
>>>> -   }
>>>> +   ttm_bo_add_mem_to_lru(bo, &bo->mem);
>>>>   kref_init(&bo->kref);
>>>> list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>>> Yeah, thought about that as well. But this has the major drawback that the 
>>> deleted BO moves to the end of the LRU, which is something we don't want.
>> well, as the bo is busy, looks like it needs time to being idle. putting it 
>> to tail seems fair.
> 
> No, see BOs should move to the tail of the LRU whenever they are used. 
> Freeing up a BO is basically the opposite of using it.
> 
> So what would happen on the next memory contention is that the MM would evict 
> BOs which are still used and only after come to the delete BO which could 
> have been removed long ago.
> 
>>> I think the real solution to this problem is to go a completely different 
>>> way and remove the delayed delete feature from TTM altogether. Instead this 
>>> should be part of some DRM domain ha

Re: [RFC PATCH v5] drm/amdgpu: Remove kfd eviction fence before release bo

2020-02-18 Thread Pan, Xinhui


> 2020年2月19日 07:10,Kuehling, Felix  写道:
> 
> Hi Xinhui,
> 
> Two suggestions inline. Looks good to me otherwise.
> 
> On 2020-02-17 10:36 p.m., xinhui pan wrote:
>> No need to trigger eviction as the memory mapping will not be used
>> anymore.
>> 
>> All pt/pd bos share same resv, hence the same shared eviction fence.
>> Everytime page table is freed, the fence will be signled and that cuases
>> kfd unexcepted evictions.
>> 
>> Signed-off-by: xinhui pan 
>> CC: Christian König 
>> CC: Felix Kuehling 
>> CC: Alex Deucher 
>> ---
>> change from v4:
>> based on new ttm code.
>> 
>> change from v3:
>> fix a coding error
>> 
>> change from v2:
>> based on Chris' drm/ttm: rework BO delayed delete patchset.
>> 
>> ---
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  1 +
>>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 37 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  4 ++
>>  3 files changed, 42 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index 9e8db702d878..0ee8aae6c519 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -96,6 +96,7 @@ struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 
>> context,
>> struct mm_struct *mm);
>>  bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
>>  struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
>> +int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
>>struct amdkfd_process_info {
>>  /* List head of all VMs that belong to a KFD process */
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index ef721cb65868..6aa20aa82bd3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -276,6 +276,41 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
>> amdgpu_bo *bo,
>>  return 0;
>>  }
>>  +int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo)
>> +{
>> +struct amdgpu_bo *root = bo;
>> +struct amdgpu_vm_bo_base *vm_bo;
>> +struct amdgpu_vm *vm;
>> +struct amdkfd_process_info *info;
>> +struct amdgpu_amdkfd_fence *ef;
>> +int ret;
>> +
>> +while (root->parent)
>> +root = root->parent;
> 
> This should not be necessary. Every page table BO has a pointer to a vm_bo 
> that has a pointer to the vm. So you don't need to find the root.
> 
> This should do the trick:
> 
>   if (!bo->vm_bo || !bo->vm_bo->vm)
>   return 0;
>   vm = bo->vm_bo->vm;
> 
> 
Well, when the page tables are freed, bo->vm_bo is cleared first and the pt/pd
BO is released afterwards.
We could also change that sequence, like I did in v2, but that looked like it
hit some weird issues.

>> +
>> +vm_bo = root->vm_bo;
>> +if (!vm_bo)
>> +return 0;
>> +
>> +vm = vm_bo->vm;
>> +if (!vm)
>> +return 0;
>> +
>> +info = vm->process_info;
>> +if (!info || !info->eviction_fence)
>> +return 0;
>> +
>> +ef = container_of(dma_fence_get(&info->eviction_fence->base),
>> +struct amdgpu_amdkfd_fence, base);
>> +
>> +dma_resv_lock(bo->tbo.base.resv, NULL);
>> +ret = amdgpu_amdkfd_remove_eviction_fence(bo, ef);
>> +dma_resv_unlock(bo->tbo.base.resv);
>> +
>> +dma_fence_put(&ef->base);
>> +return ret;
>> +}
>> +
>>  static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
>>   bool wait)
>>  {
>> @@ -1045,6 +1080,8 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct 
>> amdgpu_device *adev,
>>  list_del(&vm->vm_list_node);
>>  mutex_unlock(&process_info->lock);
>>  +   vm->process_info = NULL;
>> +
>>  /* Release per-process resources when last compute VM is destroyed */
>>  if (!process_info->n_vms) {
>>  WARN_ON(!list_empty(&process_info->kfd_bo_list));
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 6f60a581e3ba..16586651020f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -1307,6 +1307,10 @@ void amdgpu_bo_release_notify(struct 
>> ttm_buffer_object *bo)
>>  if (abo->kfd_bo)
>>  amdgpu_amdkfd_unreserve_memory_limit(abo);
>>  +   /* We only remove the fence if the resv has individualized. */
>> +if (bo->base.resv == &bo->base._resv)
> 
> Should this be a WARN_ON? We expect this condition to be always true. If it's 
> not, there should be a noisy warning that something is wrong.

good point.

thanks
xinhui

> 
> Regards,
>   Felix
> 
> 
>> +amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
>> +
>>  if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
>>  !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_

Re: [PATCH] drm/amdgpu: add VM update fences back to the root PD v2

2020-02-25 Thread Pan, Xinhui
Reviewed-by: xinhui pan 

> 2020年2月25日 20:45,Christian König  写道:
> 
> Am 19.02.20 um 16:02 schrieb Christian König:
>> Add update fences to the root PD while mapping BOs.
>> 
>> Otherwise PDs freed during the mapping won't wait for
>> updates to finish and can cause corruptions.
>> 
>> v2: rebased on drm-misc-next
>> 
>> Signed-off-by: Christian König 
>> Fixes: 90b69cdc5f159 drm/amdgpu: stop adding VM updates fences to the resv 
>> obj
> 
> Felix and Xinhui can I get an rb or at least Acked-by for this patch? It is a 
> major problem for testing which needs to get fixed.
> 
> Thanks,
> Christian.
> 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 --
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index d16231d6a790..ef73fa94f357 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -588,8 +588,8 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>>  {
>>  entry->priority = 0;
>>  entry->tv.bo = &vm->root.base.bo->tbo;
>> -/* One for TTM and one for the CS job */
>> -entry->tv.num_shared = 2;
>> +/* Two for VM updates, one for TTM and one for the CS job */
>> +entry->tv.num_shared = 4;
>>  entry->user_pages = NULL;
>>  list_add(&entry->tv.head, validated);
>>  }
>> @@ -1591,6 +1591,16 @@ static int amdgpu_vm_bo_update_mapping(struct 
>> amdgpu_device *adev,
>>  goto error_unlock;
>>  }
>>  +   if (flags & AMDGPU_PTE_VALID) {
>> +struct amdgpu_bo *root = vm->root.base.bo;
>> +
>> +if (!dma_fence_is_signaled(vm->last_direct))
>> +amdgpu_bo_fence(root, vm->last_direct, true);
>> +
>> +if (!dma_fence_is_signaled(vm->last_delayed))
>> +amdgpu_bo_fence(root, vm->last_delayed, true);
>> +}
>> +
>>  r = vm->update_funcs->prepare(¶ms, owner, exclusive);
>>  if (r)
>>  goto error_unlock;
> 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

2023-11-09 Thread Pan, Xinhui
[AMD Official Use Only - General]

I once replaced the queue with a freezable one, but got a hang in flush.
Looks like Felix has fixed that here.

Acked-and-tested-by: xinhui pan 


-Original Message-
From: Kuehling, Felix 
Sent: Wednesday, November 8, 2023 6:06 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily ; Pan, Xinhui ; Koenig, 
Christian 
Subject: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

Make restore workers freezable so we don't have to explicitly flush them in 
suspend and GPU reset code paths, and we don't accidentally try to restore BOs 
while the GPU is suspended. Not having to flush restore_work also helps avoid 
lock/fence dependencies in the GPU reset case where we're not allowed to wait 
for fences.

A side effect of this is, that we can now have multiple concurrent threads 
trying to signal the same eviction fence. Rework eviction fence signaling and 
replacement to account for that.

The GPU reset path can no longer rely on restore_process_worker to resume 
queues because evict/restore workers can run independently of it. Instead call 
a new restore_process_helper directly.

This is an RFC and request for testing.

v2:
- Reworked eviction fence signaling
- Introduced restore_process_helper

Signed-off-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 34 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 87 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  4 +-
 3 files changed, 81 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 54f31a420229..1b33ddc0512e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1405,7 +1405,6 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
  amdgpu_amdkfd_restore_userptr_worker);

*process_info = info;
-   *ef = dma_fence_get(&info->eviction_fence->base);
}

vm->process_info = *process_info;
@@ -1436,6 +1435,8 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
list_add_tail(&vm->vm_list_node,
&(vm->process_info->vm_list_head));
vm->process_info->n_vms++;
+
+   *ef = dma_fence_get(&vm->process_info->eviction_fence->base);
mutex_unlock(&vm->process_info->lock);

return 0;
@@ -1447,10 +1448,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
 reserve_pd_fail:
vm->process_info = NULL;
if (info) {
-   /* Two fence references: one in info and one in *ef */
dma_fence_put(&info->eviction_fence->base);
-   dma_fence_put(*ef);
-   *ef = NULL;
*process_info = NULL;
put_pid(info->pid);
 create_evict_fence_fail:
@@ -1644,7 +1642,8 @@ int amdgpu_amdkfd_criu_resume(void *p)
goto out_unlock;
}
WRITE_ONCE(pinfo->block_mmu_notifications, false);
-   schedule_delayed_work(&pinfo->restore_userptr_work, 0);
+   queue_delayed_work(system_freezable_wq,
+  &pinfo->restore_userptr_work, 0);

 out_unlock:
mutex_unlock(&pinfo->lock);
@@ -2458,7 +2457,8 @@ int amdgpu_amdkfd_evict_userptr(struct 
mmu_interval_notifier *mni,
   KFD_QUEUE_EVICTION_TRIGGER_USERPTR);
if (r)
pr_err("Failed to quiesce KFD\n");
-   schedule_delayed_work(&process_info->restore_userptr_work,
+   queue_delayed_work(system_freezable_wq,
+   &process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
}
mutex_unlock(&process_info->notifier_lock);
@@ -2793,7 +2793,8 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct 
work_struct *work)

/* If validation failed, reschedule another attempt */
if (evicted_bos) {
-   schedule_delayed_work(&process_info->restore_userptr_work,
+   queue_delayed_work(system_freezable_wq,
+   &process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));

kfd_smi_event_queue_restore_rescheduled(mm);
@@ -2802,6 +2803,23 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct 
work_struct *work)
put_task_struct(usertask);
 }

+static void replace_eviction_fence(struct dma_fence **ef,
+  struct dma_fence *new_ef)
+{
+   struct dma_fence *old_ef = rcu_replace_pointer(*ef, new_ef, true
+   /* protected by process_info->lock */);
+
+   /* If we'r

RE: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

2023-11-10 Thread Pan, Xinhui
[AMD Official Use Only - General]

Wait, I think we need a small fix below.

--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2036,6 +2036,7 @@ int kfd_resume_all_processes(void)
int ret = 0, idx = srcu_read_lock(&kfd_processes_srcu);

hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+   cancel_delayed_work_sync(&p->restore_work);
if (restore_process_helper(p)) {
pr_err("Restore process %d failed during resume\n",
   p->pasid);

Felix,
  restore_process_helper is called both in resume and in restore_work, and both
paths call into amdgpu_amdkfd_gpuvm_restore_process_bos, which creates a new
eviction fence.
So there is a race: resume creates a new ef, and soon after, the restore_work
that was frozen during suspend creates another new ef.
That triggers the warning below when replace_eviction_fence is called.

[   83.865870] Replacing unsignaled eviction fence
[   83.872452] WARNING: CPU: 5 PID: 9 at 
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2838 
amdgpu_amdkfd_gpuvm_restore_pro
cess_bos+0xa9e/0xfe0 [amdgpu]
[snip]
[   83.896776] Workqueue: kfd_restore_wq restore_process_worker [amdgpu]
[   83.989171] e1000e :00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, 
Flow Control: Rx/Tx
[   84.004699]
[   84.004701] RIP: 0010:amdgpu_amdkfd_gpuvm_restore_process_bos+0xa9e/0xfe0 
[amdgpu]
[   84.046060] Code: 48 83 05 8c aa ea 00 01 44 89 8d 08 fe ff ff 48 89 95 18 
fe ff ff c6 05 3a 82 d9 00 01 e8 ba 80 d1 d0 48
83 05 72 aa ea 00 01 <0f> 0b 48 83 05 70 aa ea 00 01 44 8b 8d 08 fe ff ff 48 8b 
95 18 fe
[   84.046062] RSP: 0018:a1e2c00c7bf0 EFLAGS: 00010202
[   84.046064] RAX:  RBX: 8c58558d9c00 RCX: 
[   84.046066] RDX: 0002 RSI: 9370d98a RDI: 
[   84.046067] RBP: a1e2c00c7e00 R08: 0003 R09: 0001
[   84.046069] R10: 0001 R11: 0001 R12: 8c58555ad008
[   84.046070] R13: 0400 R14: 8c58542f9510 R15: 8c5854cbeea8
[   84.046071] FS:  () GS:8c676dc8() 
knlGS:
[   84.046073] CS:  0010 DS:  ES:  CR0: 80050033
[   84.046074] CR2: 7ffd279bb0c8 CR3: 000fde856003 CR4: 003706e0
[   84.046076] DR0:  DR1:  DR2: 
[   84.046077] DR3:  DR6: fffe0ff0 DR7: 0400
[   84.046078] Call Trace:
[   84.046079]  
[   84.046081]  ? show_regs+0x6a/0x80
[   84.046085]  ? amdgpu_amdkfd_gpuvm_restore_process_bos+0xa9e/0xfe0 [amdgpu]
[   84.156138]  ? __warn+0x8d/0x180
[   84.156142]  ? amdgpu_amdkfd_gpuvm_restore_process_bos+0xa9e/0xfe0 [amdgpu]
[   84.166431]  ? report_bug+0x1e8/0x240
[   84.166435]  ? __wake_up_klogd.part.0+0x64/0xa0
[   84.166440]  ? handle_bug+0x46/0x80
[   84.166444]  ? exc_invalid_op+0x19/0x70
[   84.166447]  ? asm_exc_invalid_op+0x1b/0x20
[   84.166457]  ? amdgpu_amdkfd_gpuvm_restore_process_bos+0xa9e/0xfe0 [amdgpu]
[   84.166917]  ? __lock_acquire+0x5f3/0x28c0
[   84.166921]  ? __this_cpu_preempt_check+0x13/0x20
[   84.166938]  restore_process_helper+0x33/0x110 [amdgpu]
[   84.167292]  restore_process_worker+0x40/0x130 [amdgpu]
[   84.167644]  process_one_work+0x26a/0x550
[   84.167654]  worker_thread+0x58/0x3c0
[   84.167659]  ? __pfx_worker_thread+0x10/0x10
[   84.167661]  kthread+0x105/0x130
[   84.167664]  ? __pfx_kthread+0x10/0x10
[   84.167669]  ret_from_fork+0x29/0x50
[   84.167681]  
[   84.167683] irq event stamp: 1343665
[   84.167684] hardirqs last  enabled at (1343671): [] 
vprintk_emit+0x37b/0x3a0
[   84.167687] hardirqs last disabled at (1343676): [] 
vprintk_emit+0x367/0x3a0
[   84.167689] softirqs last  enabled at (1342680): [] 
__irq_exit_rcu+0xd3/0x140
[   84.167691] softirqs last disabled at (1342671): [] 
__irq_exit_rcu+0xd3/0x140
[   84.167692] ---[ end trace  ]---
[   84.189957] PM: suspe

Thanks
xinhui

-----Original Message-
From: Pan, Xinhui
Sent: Friday, November 10, 2023 12:51 PM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
Cc: Deng, Emily ; Koenig, Christian 

Subject: RE: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

I once replaced the queue with the freezable one, but got hang in flush.
Looks like Felix has fixed it.

Acked-and-tested-by: xinhui pan 


-Original Message-
From: Kuehling, Felix 
Sent: Wednesday, November 8, 2023 6:06 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily ; Pan, Xinhui ; Koenig, 
Christian 
Subject: [RFC PATCH v2] drm/amdkfd: Run restore_workers on freezable WQs

Make restore workers freezable so we don't have to explicitly flush them in 
suspend and GPU reset code paths, and we don't accidentally try to restore BOs 
while the GPU is suspended. Not having to flush restore_work also helps avoid 
lock/fence dependencies in the G

RE: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU active time per entity"

2023-08-16 Thread Pan, Xinhui
[AMD Official Use Only - General]

Can we just add a kref to the entity?
Or collect this per-job time usage somewhere else instead? A rough sketch of
the first option is below.
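
Sketch only; the refcount field and the release helper named here do not exist
in drm_sched_entity today, they are assumptions:

/* hypothetical: struct drm_sched_entity gains a "struct kref refcount",
 * initialized with kref_init() when the entity is set up
 */

/* when a job is attached to the entity (e.g. at drm_sched_job_arm() time),
 * pin the entity for the lifetime of that job
 */
kref_get(&job->entity->refcount);

/* in drm_sched_get_cleanup_job(), after the elapsed_ns accounting is done */
kref_put(&job->entity->refcount, drm_sched_entity_release);

/* entity teardown (vm fini) then only drops its own reference, so the memory
 * behind job->entity stays valid until the last pending job is cleaned up
 */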

-Original Message-
From: Pan, Xinhui 
Sent: Thursday, August 17, 2023 1:05 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tuikov, Luben ; airl...@gmail.com; 
dri-de...@lists.freedesktop.org; l.st...@pengutronix.de; Koenig, Christian 
; Pan, Xinhui 
Subject: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU 
active time per entity"

This patch partially reverts commit df622729ddbf ("drm/scheduler: track GPU
active time per entity"), which touches the entity without holding any
reference on it.

I noticed a memory overwrite coming from the GPU scheduler side.
The case is like below:
A (drm_sched_main)                  B (vm fini)
drm_sched_job_begin                 drm_sched_entity_kill
  // job is now on pending_list       wait_for_completion
complete_all                          ...
...                                   kfree(entity)
drm_sched_get_cleanup_job
  // fetch job from pending_list
access job->entity                  // memory overwritten (use after free)

As long as we can NOT guarantee that the entity is alive in this case, let's
revert it for now.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/scheduler/sched_main.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 602361c690c9..1b3f1a6a8514 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -907,12 +907,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)

spin_unlock(&sched->job_list_lock);

-   if (job) {
-   job->entity->elapsed_ns += ktime_to_ns(
-   ktime_sub(job->s_fence->finished.timestamp,
- job->s_fence->scheduled.timestamp));
-   }
-
return job;
 }

--
2.34.1



RE: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-11 Thread Pan, Xinhui
[AMD Official Use Only - General]

Oh yep, pinned BOs are moved to another LRU list, so the eviction fails for
some other reason.
I will change the comments in the patch.
The problem is that eviction can fail for many reasons, say, a BO being locked.
AFAIK, KFD stops the queues and flushes some evict/restore work in its suspend
callback, so the first eviction, which runs before the KFD callback, is likely to fail.

-Original Message-
From: Christian König 
Sent: Friday, September 8, 2023 2:49 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian 
; Fan, Shikang 
Subject: Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

Am 08.09.23 um 05:39 schrieb xinhui pan:
> Some BOs might be pinned. So the first eviction's failure will abort
> the suspend sequence. These pinned BOs will be unpined afterwards
> during suspend.

That doesn't make much sense since pinned BOs don't cause eviction failure here.

What exactly is the error code you see?

Christian.

>
> Actaully it has evicted most BOs, so that should stil work fine in
> sriov full access mode.
>
> Fixes: 47ea20762bb7 ("drm/amdgpu: Add an extra evict_resource call
> during device_suspend.")
> Signed-off-by: xinhui pan 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5c0e2b766026..39af526cdbbe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4148,10 +4148,11 @@ int amdgpu_device_suspend(struct drm_device
> *dev, bool fbcon)
>
>   adev->in_suspend = true;
>
> - /* Evict the majority of BOs before grabbing the full access */
> - r = amdgpu_device_evict_resources(adev);
> - if (r)
> - return r;
> + /* Try to evict the majority of BOs before grabbing the full access
> +  * Ignore the ret val at first place as we will unpin some BOs if any
> +  * afterwards.
> +  */
> + (void)amdgpu_device_evict_resources(adev);
>
>   if (amdgpu_sriov_vf(adev)) {
>   amdgpu_virt_fini_data_exchange(adev);



回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-12 Thread Pan, Xinhui
[AMD Official Use Only - General]

I notice that only user space processes are frozen on my side; kthreads and
workqueues keep running. Maybe some kernel configs are not enabled.
I made a small module that just prints an incrementing counter under a mutex
from both a workqueue and a kthread (a rough sketch of it follows the log
excerpt). I paste some logs below.
[438619.696196] XH: 14 from workqueue
[438619.700193] XH: 15 from kthread
[438620.394335] PM: suspend entry (deep)
[438620.399619] Filesystems sync: 0.001 seconds
[438620.403887] PM: Preparing system for sleep (deep)
[438620.409299] Freezing user space processes
[438620.414862] Freezing user space processes completed (elapsed 0.001 seconds)
[438620.421881] OOM killer disabled.
[438620.425197] Freezing remaining freezable tasks
[438620.430890] Freezing remaining freezable tasks completed (elapsed 0.001 
seconds)
[438620.438348] PM: Suspending system (deep)
.
[438623.746038] PM: suspend of devices complete after 3303.137 msecs
[438623.752125] PM: start suspend of devices complete after 3309.713 msecs
[438623.758722] PM: suspend debug: Waiting for 5 second(s).
[438623.792166] XH: 22 from kthread
[438623.824140] XH: 23 from workqueue
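
A minimal sketch of the kind of test module described above (the names and the
one-second period are assumptions, only the structure matters):

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
#include <linux/delay.h>
#include <linux/err.h>

static DEFINE_MUTEX(xh_lock);
static int xh_counter;
static struct task_struct *xh_thread;
static struct delayed_work xh_work;

static void xh_work_fn(struct work_struct *work)
{
        /* log from a (delayed) work item, then re-arm it */
        mutex_lock(&xh_lock);
        pr_info("XH: %d from workqueue\n", xh_counter++);
        mutex_unlock(&xh_lock);
        schedule_delayed_work(&xh_work, HZ);
}

static int xh_thread_fn(void *data)
{
        /* log from a kthread once per second until stopped */
        while (!kthread_should_stop()) {
                mutex_lock(&xh_lock);
                pr_info("XH: %d from kthread\n", xh_counter++);
                mutex_unlock(&xh_lock);
                msleep(1000);
        }
        return 0;
}

static int __init xh_init(void)
{
        INIT_DELAYED_WORK(&xh_work, xh_work_fn);
        schedule_delayed_work(&xh_work, HZ);
        xh_thread = kthread_run(xh_thread_fn, NULL, "xh_test");
        if (IS_ERR(xh_thread)) {
                cancel_delayed_work_sync(&xh_work);
                return PTR_ERR(xh_thread);
        }
        return 0;
}

static void __exit xh_exit(void)
{
        kthread_stop(xh_thread);
        cancel_delayed_work_sync(&xh_work);
}

module_init(xh_init);
module_exit(xh_exit);
MODULE_LICENSE("GPL");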


So BOs definitely can be in use during suspend.
Even if kthreads or workqueues could be stopped by some special kernel config,
I think suspend can only stop a work item once its callback has finished;
otherwise a sequence like the one below would go wrong:

LOCK BO
do something
  -> schedule or wait; any code might sleep here. Can it be stopped by suspend
     at this point? No, I don't think so.
UNLOCK BO

I do the tests with the commands below.
echo devices  > /sys/power/pm_test
echo 0  > /sys/power/pm_async
echo 1  > /sys/power/pm_print_times
echo 1 > /sys/power/pm_debug_messages
echo 1 > /sys/module/amdgpu/parameters/debug_evictions
./kfd.sh --gtest_filter=KFDEvictTest.BasicTest
pm-suspend

thanks
xinhui



From: Christian König 
Sent: Tuesday, September 12, 2023 5:01 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 

Cc: Deucher, Alexander ; Koenig, Christian 
; Fan, Shikang 
Subject: Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

When amdgpu_device_suspend() is called processes should be frozen
already. In other words KFD queues etc... should already be idle.

So when the eviction fails here we missed something previously and that
in turn can cause tons amount of problems.

So ignoring those errors is most likely not a good idea at all.

Regards,
Christian.

Am 12.09.23 um 02:21 schrieb Pan, Xinhui:
> [AMD Official Use Only - General]
>
> Oh yep, Pinned BO is moved to other LRU list, So eviction fails because of 
> other reason.
> I will change the comments in the patch.
> The problem is eviction fails as many reasons, say, BO is locked.
> ASAIK, kfd will stop the queues and flush some evict/restore work in its 
> suspend callback. SO the first eviction before kfd callback likely fails.
>
> -Original Message-
> From: Christian König 
> Sent: Friday, September 8, 2023 2:49 PM
> To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Koenig, Christian 
> ; Fan, Shikang 
> Subject: Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend
>
> Am 08.09.23 um 05:39 schrieb xinhui pan:
>> Some BOs might be pinned. So the first eviction's failure will abort
>> the suspend sequence. These pinned BOs will be unpined afterwards
>> during suspend.
> That doesn't make much sense since pinned BOs don't cause eviction failure 
> here.
>
> What exactly is the error code you see?
>
> Christian.
>
>> Actaully it has evicted most BOs, so that should stil work fine in
>> sriov full access mode.
>>
>> Fixes: 47ea20762bb7 ("drm/amdgpu: Add an extra evict_resource call
>> during device_suspend.")
>> Signed-off-by: xinhui pan 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +
>>1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 5c0e2b766026..39af526cdbbe 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4148,10 +4148,11 @@ int amdgpu_device_suspend(struct drm_device
>> *dev, bool fbcon)
>>
>>adev->in_suspend = true;
>>
>> - /* Evict the majority of BOs before grabbing the full access */
>> - r = amdgpu_device_evict_resources(adev);
>> - if (r)
>> - return r;
>> + /* Try to evict the majority of BOs before grabbing the full access
>> +  * Ignore the ret val at first place as we will unpin some BOs if any
>> +  * afterwards.
>> +  */
>> + (void)amdgpu_device_evict_resources(adev);
>>
>>if (amdgpu_sriov_vf(adev)) {
>>amdgpu_virt_fini_data_exchange(adev);



RE: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

2023-09-13 Thread Pan, Xinhui
[AMD Official Use Only - General]

Chris,
I can dump these busy BOs with their alloc/free stacks later today.

BTW, the two evictions and the KFD suspend are all called before hw_fini, in
other words between phase 1 and phase 2, and SDMA is only turned off in
phase 2. So the current code maybe works fine anyway.

From: Koenig, Christian 
Sent: Wednesday, September 13, 2023 10:29 PM
To: Kuehling, Felix ; Christian König 
; Pan, Xinhui ; 
amd-gfx@lists.freedesktop.org; Wentland, Harry 
Cc: Deucher, Alexander ; Fan, Shikang 

Subject: Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

[+Harry]
Am 13.09.23 um 15:54 schrieb Felix Kuehling:
On 2023-09-13 4:07, Christian König wrote:
[+Fleix]

Well that looks like quite a serious bug.

If I'm not completely mistaken the KFD work item tries to restore the process 
by moving BOs into memory even after the suspend freeze. Normally work items 
are frozen together with the user space processes unless explicitly marked as 
not freezable.

That this causes problem during the first eviction phase is just the tip of the 
iceberg here. If a BO is moved into invisible memory during this we wouldn't be 
able to get it out of that in the second phase because SDMA and hw is already 
turned off.

@Felix any idea how that can happen? Have you guys marked a work item / work 
queue as not freezable?

We don't set anything to non-freezable in KFD.



Regards,
  Felix


Or maybe the display guys?

Do you guys in the display do any delayed update in a work item which is marked 
as not-freezable?

Otherwise I have absolutely no idea what's going on here.

Thanks,
Christian.



@Xinhui please investigate what work item that is and where that is coming 
from. Something like "if (adev->in_suspend) dump_stack();" in the right place 
should probably do it.

Thanks,
Christian.
Am 13.09.23 um 07:13 schrieb Pan, Xinhui:

[AMD Official Use Only - General]

I notice that only user space process are frozen on my side.  kthread and 
workqueue  keeps running. Maybe some kernel configs are not enabled.
I made one module which just prints something like i++ with mutex lock both in 
workqueue and kthread. I paste some logs below.
[438619.696196] XH: 14 from workqueue
[438619.700193] XH: 15 from kthread
[438620.394335] PM: suspend entry (deep)
[438620.399619] Filesystems sync: 0.001 seconds
[438620.403887] PM: Preparing system for sleep (deep)
[438620.409299] Freezing user space processes
[438620.414862] Freezing user space processes completed (elapsed 0.001 seconds)
[438620.421881] OOM killer disabled.
[438620.425197] Freezing remaining freezable tasks
[438620.430890] Freezing remaining freezable tasks completed (elapsed 0.001 
seconds)
[438620.438348] PM: Suspending system (deep)
.
[438623.746038] PM: suspend of devices complete after 3303.137 msecs
[438623.752125] PM: start suspend of devices complete after 3309.713 msecs
[438623.758722] PM: suspend debug: Waiting for 5 second(s).
[438623.792166] XH: 22 from kthread
[438623.824140] XH: 23 from workqueue


So BOs definitely can be in use during suspend.
Even if kthread or workqueue can be stopped with one special kernel config. I 
think suspend can only stop the workqueue with its callback finish.
otherwise something like below makes things crazy.
LOCK BO
do something
-> schedule or wait, anycode might sleep.  Stopped by suspend now? no, i 
think.
UNLOCK BO

I do tests  with  cmds below.
echo devices  > /sys/power/pm_test
echo 0  > /sys/power/pm_async
echo 1  > /sys/power/pm_print_times
echo 1 > /sys/power/pm_debug_messages
echo 1 > /sys/module/amdgpu/parameters/debug_evictions
./kfd.sh --gtest_filter=KFDEvictTest.BasicTest
pm-suspend

thanks
xinhui



From: Christian König 
Sent: Tuesday, September 12, 2023 5:01 PM
To: Pan, Xinhui ; amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Koenig, Christian ; Fan, Shikang 
Subject: Re: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

When amdgpu_device_suspend() is called processes should be frozen
already. In other words KFD queues etc... should already be idle.

So when the eviction fails here we missed something previously and that
in turn can cause tons amount of problems.

So ignoring those errors is most likely not a good idea at all.

Regards,
Christian.

Am 12.09.23 um 02:21 schrieb Pan, Xinhui:
> [AMD Official Use Only - General]
>
> Oh yep, Pinned BO is moved to other LRU list, So eviction fails because of 
> other reason.
> I will change the comments in the patch.
> The problem is eviction fails as many reasons, say, BO is locked.
> ASAIK, kfd will stop the queues and flush some evict/restore work in its 

Re: [PATCH] drm/amdgpu: further lower VRAM allocation overhead

2021-07-14 Thread Pan, Xinhui


> 2021年7月14日 16:33,Christian König  写道:
> 
> Hi Eric,
> 
> feel free to push into amd-staging-dkms-5.11, but please don't push it into 
> amd-staging-drm-next.
> 
> The later will just cause a merge failure which Alex needs to resolve 
> manually.
> 
> I can take care of pushing to amd-staging-drm-next as soon as that is rebased 
> on latest upstream.
> 
> Regards,
> Christian.
> 
> Am 13.07.21 um 21:19 schrieb Eric Huang:
>> Hi Christian/Felix,
>> 
>> If you don't have objection, it will be pushed into amd-staging-dkms-5.11 
>> and amd-staging-drm-next.
>> 
>> Thanks,
>> Eric
>> 
>> On 2021-07-13 3:17 p.m., Eric Huang wrote:
>>> For allocations larger than 48MiB we need more than a page for the
>>> housekeeping in the worst case resulting in the usual vmalloc overhead.
>>> 
>>> Try to avoid this by assuming the good case and only falling back to the
>>> worst case if this didn't worked.
>>> 
>>> Signed-off-by: Christian König 
>>> Signed-off-by: Eric Huang 
>>> Reviewed-by: Felix Kuehling 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 71 +++-
>>>   1 file changed, 53 insertions(+), 18 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> index be4261c4512e..ecbe05e1db66 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> @@ -361,9 +361,11 @@ static void amdgpu_vram_mgr_virt_start(struct 
>>> ttm_resource *mem,
>>>   static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
>>>  struct ttm_buffer_object *tbo,
>>>  const struct ttm_place *place,
>>> +   unsigned long num_nodes,
>>> +   unsigned long pages_per_node,
>>>  struct ttm_resource *mem)
>>>   {
>>> -unsigned long lpfn, num_nodes, pages_per_node, pages_left, pages;
>>> +unsigned long lpfn, pages_left, pages;
>>>   struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
>>>   struct amdgpu_device *adev = to_amdgpu_device(mgr);
>>>   uint64_t vis_usage = 0, mem_bytes, max_bytes;
>>> @@ -393,21 +395,6 @@ static int amdgpu_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>   return -ENOSPC;
>>>   }
>>>   -if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
>>> -pages_per_node = ~0ul;
>>> -num_nodes = 1;
>>> -} else {
>>> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>> -pages_per_node = HPAGE_PMD_NR;
>>> -#else
>>> -/* default to 2MB */
>>> -pages_per_node = 2UL << (20UL - PAGE_SHIFT);
>>> -#endif
>>> -pages_per_node = max_t(uint32_t, pages_per_node,
>>> -   mem->page_alignment);
>>> -num_nodes = DIV_ROUND_UP(mem->num_pages, pages_per_node);
>>> -}
>>> -
>>>   nodes = kvmalloc_array((uint32_t)num_nodes, sizeof(*nodes),
>>>  GFP_KERNEL | __GFP_ZERO);
>>>   if (!nodes) {
>>> @@ -435,7 +422,12 @@ static int amdgpu_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>   i = 0;
>>>   spin_lock(&mgr->lock);
>>>   while (pages_left) {
>>> -uint32_t alignment = mem->page_alignment;
>>> +unsigned long alignment = mem->page_alignment;
>>> +
>>> +if (i >= num_nodes) {
>>> +r = -E2BIG;
>>> +goto error;
>>> +}
>>> if (pages >= pages_per_node)
>>>   alignment = pages_per_node;
>>> @@ -492,6 +484,49 @@ static int amdgpu_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>   return r;
>>>   }
>>>   +/**
>>> + * amdgpu_vram_mgr_alloc - allocate new range
>>> + *
>>> + * @man: TTM memory type manager
>>> + * @tbo: TTM BO we need this range for
>>> + * @place: placement flags and restrictions
>>> + * @mem: the resulting mem object
>>> + *
>>> + * Allocate VRAM for the given BO.
>>> + */
>>> +static int amdgpu_vram_mgr_alloc(struct ttm_resource_manager *man,
>>> + struct ttm_buffer_object *tbo,
>>> + const struct ttm_place *place,
>>> + struct ttm_resource *mem)
>>> +{
>>> +unsigned long num_nodes, pages_per_node;
>>> +int r;
>>> +
>>> +if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
>>> +return amdgpu_vram_mgr_new(man, tbo, place, 1, ~0ul, mem);
>>> +
>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>> +pages_per_node = HPAGE_PMD_NR;
>>> +#else
>>> +/* default to 2MB */
>>> +pages_per_node = 2UL << (20UL - PAGE_SHIFT);
>>> +#endif
>>> +pages_per_node = max_t(uint32_t, pages_per_node,
>>> +   mem->page_alignment);
>>> +num_nodes = DIV_ROUND_UP(mem->num_pages, pages_per_node);
>>> +
>>> +if (sizeof(struct drm_mm_node) * num_nodes > PAGE_SIZE) {

I think this should be < PAGE_SIZE? Otherwise amdgpu_vram_mgr_new always
returns -E2BIG. Or am I missing something?

But you want one page to hold all the drm_mm nodes in the good case. What if
the user just creates a bunch of small VRAM BO, s
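
(For rough scale, assuming struct drm_mm_node was on the order of 160-170 bytes
in that kernel, which lines up with the 48 MiB figure in the commit message:
one 4 KiB page holds about 4096 / 168, i.e. roughly 24 nodes, and at 2 MiB per
node that covers about 48 MiB before the allocation needs more than one page of
nodes and has to take the fallback path the commit message describes.)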
