Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread shaoyunl
I agree this patch is a big improvement. I think we need this patch so SRIOV can put amdkfd_pre_reset in the right place, as in bare metal mode. Further improvement can be in a separate change. This series is reviewed by shaoyun.liu <shaoyun@amd.com> Regards, shaoyun.liu On

Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Felix Kuehling
On 2019-12-20 14:31, shaoyunl wrote: Can we use the dqm_lock when we try to get dqm->is_hws_hang and dqm->is_resetting inside function kq_uninitialize? Spreading the DQM lock around is probably not a good idea. Then I'd rather do more refactoring to move hqd_load and hqd_destroy out of

Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread shaoyunl
Can we use the dqm_lock when we try to get dqm->is_hws_hang and dqm->is_resetting inside function kq_uninitialize? I think the closer to hqd_destroy we check the status, the more accurate it will be. It does look better with this logic if the status is changed after the dqm unmap_queue call

RE: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Zeng, Oak
[AMD Official Use Only - Internal Distribution Only] I see. Thank you Felix for the explanation. Regards, Oak -----Original Message----- From: Kuehling, Felix Sent: Friday, December 20, 2019 12:28 PM To: Zeng, Oak; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 4/4] drm/amdkfd: Avoid

Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Felix Kuehling
On 2019-12-20 12:22, Zeng, Oak wrote: [AMD Official Use Only - Internal Distribution Only] Regards, Oak -----Original Message----- From: amd-gfx On Behalf Of Felix Kuehling Sent: Friday, December 20, 2019 3:30 AM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 4/4] drm/amdkfd: Avoid

RE: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Zeng, Oak
[AMD Official Use Only - Internal Distribution Only] Regards, Oak -----Original Message----- From: amd-gfx On Behalf Of Felix Kuehling Sent: Friday, December 20, 2019 3:30 AM To: amd-gfx@lists.freedesktop.org Subject: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch Don't use

Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Felix Kuehling
dqm->is_hws_hang is protected by the DQM lock. kq_uninitialize runs outside that lock protection. Therefore I opted to pass in the hanging flag as a parameter. It also keeps the logic that decides all of that inside the device queue manager, which I think is cleaner. I was trying to clean
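
The pattern described in the message above (read the lock-protected flags once while holding the DQM lock, then pass the result as a parameter to code that runs outside the lock) can be sketched roughly as follows. This is a minimal userspace illustration, not the actual amdkfd code: the struct names, the `_sketch` functions, and the use of a pthread mutex in place of the DQM lock are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device queue manager (not the real struct). */
struct dqm_sketch {
    pthread_mutex_t lock;   /* plays the role of the DQM lock */
    bool is_hws_hang;       /* only read or written with the lock held */
    bool is_resetting;      /* only read or written with the lock held */
};

/* Hypothetical stand-in for a kernel queue. */
struct kq_sketch {
    bool hqd_destroyed;     /* set when the "hardware" was touched */
    bool freed;             /* set when software state was released */
};

/*
 * Runs without the lock. It never reads the dqm flags itself; it relies
 * on the snapshot the caller took under the lock.
 */
static void kq_uninitialize_sketch(struct kq_sketch *kq, bool hanging)
{
    assert(kq != NULL);

    if (!hanging)
        kq->hqd_destroyed = true;   /* stand-in for the hqd_destroy step */

    kq->freed = true;               /* software cleanup always happens */
}

/* The decision is made in one place, inside the queue manager. */
static void stop_sketch(struct dqm_sketch *dqm, struct kq_sketch *kq)
{
    bool hanging;

    pthread_mutex_lock(&dqm->lock);
    hanging = dqm->is_hws_hang || dqm->is_resetting;
    pthread_mutex_unlock(&dqm->lock);

    /* Outside the lock: pass the snapshot instead of re-reading the flags. */
    kq_uninitialize_sketch(kq, hanging);
}
```

With this shape the teardown function needs no knowledge of the lock, at the cost of acting on a snapshot that could be stale by the time the hardware step runs, which is the trade-off raised elsewhere in the thread.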

Re: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread shaoyunl
Looks like patch 2 is not related to this series, but anyway. Patches 1, 2 and 3 are reviewed by shaoyunl. For patch 4, is it possible to directly check dqm->is_hws_hang || dqm->is_resetting inside function kq_uninitialize, so we don't need another interface change? I think even inside that

RE: [PATCH 4/4] drm/amdkfd: Avoid hanging hardware in stop_cpsch

2019-12-20 Thread Deng, Emily
[AMD Official Use Only - Internal Distribution Only] Series Tested-by: Emily Deng on SRIOV environment with vega10 for the TDR-1, TDR-2 and TDR-3 test cases. Best wishes, Emily Deng >-----Original Message----- >From: amd-gfx On Behalf Of Felix Kuehling >Sent: Friday, December 20, 2019 4:30