Re: [PATCH] drm/amdkfd: Restore all process on post reset

2021-08-03 Thread Felix Kuehling
On 2021-08-03 at 11:02 a.m., Eric Huang wrote:
> On 2021-07-30 5:26 p.m., Felix Kuehling wrote:
>> On 2021-07-28 at 1:31 p.m., Eric Huang wrote:
>>> It is to fix a bug of gpu_recovery on multiple GPUs,
>>> When one gpu is reset, the application running on other
>>> gpu hangs, because kfd post

Re: [PATCH] drm/amdkfd: Restore all process on post reset

2021-08-03 Thread Eric Huang
On 2021-07-30 5:26 p.m., Felix Kuehling wrote:
> On 2021-07-28 at 1:31 p.m., Eric Huang wrote:
>> It is to fix a bug of gpu_recovery on multiple GPUs,
>> When one gpu is reset, the application running on other
>> gpu hangs, because kfd post reset doesn't restore the
>> running process.
> This will resume a

Re: [PATCH] drm/amdkfd: Restore all process on post reset

2021-07-30 Thread Felix Kuehling
On 2021-07-28 at 1:31 p.m., Eric Huang wrote:
> It is to fix a bug of gpu_recovery on multiple GPUs,
> When one gpu is reset, the application running on other
> gpu hangs, because kfd post reset doesn't restore the
> running process.
This will resume all processes, even those that were affected b

[PATCH] drm/amdkfd: Restore all process on post reset

2021-07-28 Thread Eric Huang
This fixes a gpu_recovery bug on systems with multiple GPUs: when one GPU is reset, the application running on another GPU hangs, because kfd post reset doesn't restore the running process. It also fixes a bug in the function kfd_process_evict_queues: when one GPU hangs, process running on other GPUs can