RE: [PATCH v2 04/10] drm/amdgpu/kfd: remove is_hws_hang and is_resetting

2024-05-29 Thread Li, Yunxiang (Teddy)
[AMD Official Use Only - AMD Internal Distribution Only] > One thing I could see going wrong is that down_read_trylock(&dqm->dev->adev->reset_domain->sem) will not fail immediately when the reset is scheduled. So there may be multiple attempts at HW access that detect an error or time out,

Re: [PATCH v2 04/10] drm/amdgpu/kfd: remove is_hws_hang and is_resetting

2024-05-29 Thread Felix Kuehling
On 2024-05-28 13:23, Yunxiang Li wrote: > is_hws_hang and is_resetting serve pretty much the same purpose, and both duplicate the work of the reset_domain lock; just check that directly instead. This also eliminates a few bugs listed below and gets rid of dqm->ops.pre_reset. > > kfd_hw

Re: [PATCH v2 04/10] drm/amdgpu/kfd: remove is_hws_hang and is_resetting

2024-05-28 Thread Christian König
On 28.05.24 at 19:23, Yunxiang Li wrote: is_hws_hang and is_resetting serve pretty much the same purpose, and both duplicate the work of the reset_domain lock; just check that directly instead. This also eliminates a few bugs listed below and gets rid of dqm->ops.pre_reset. kfd_hws_hang did

[PATCH v2 04/10] drm/amdgpu/kfd: remove is_hws_hang and is_resetting

2024-05-28 Thread Yunxiang Li
is_hws_hang and is_resetting serve pretty much the same purpose, and both duplicate the work of the reset_domain lock; just check that directly instead. This also eliminates a few bugs listed below and gets rid of dqm->ops.pre_reset. kfd_hws_hang did not need to avoid scheduling another reset.