Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-11 Thread JingWen Chen
ee without the new patch-set ? >>>> >>>> Andrey >>>> >>> I think this page fault issue can be seen even on the original tree. It's >>> just that dropping the concurrent GPU reset protection makes it easier to hit. >>> >>> We may need a new way to prote

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-07 Thread Andrey Grodzovsky
rey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementatio

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread JingWen Chen
ch job timeout on each >>>>>>>>>> queue. Otherwise you have a race condition between the hypervisor >>>>>>>>>> and the scheduler. >>>>>>>>>> >>>>>>>>>> Properly setting in_gpu_reset i

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread JingWen Chen
We may need a new way to protect the reset in SRIOV. > >>>>>>> Andrey >>>>>>> >>>>>>> >>>>>>>> Regards, >>>>>>>> Christian. >>>>>>>> >>>>>>>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread Andrey Grodzovsky
tian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that w

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread Christian König
anuary 4, 2022 6:19 PM To: Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection f

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
signaling the need for a reset, similar to each job timeout on each >>>>>>>>> queue. Otherwise you have a race condition between the hypervisor and >>>>>>>>> the scheduler. >>>>>>> No it's not, FLR from hypervisor is j

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
lready executed, but host will do FLR anyway >>>>>> without waiting for guest too long >>>>>> >>>>>>>> In other words I strongly think that the current SRIOV reset >>>>>>>> implementation is severely broken and wha

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread Andrey Grodzovsky
Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actually match the requirements. Could be that the reset sequence is question

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actually match the requirements. Could be that the reset sequence is que

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
hes look good to me. JingWen will pull these patches and do >>>>>>>> some basic TDR test on sriov environment, and give feedback. >>>>>>>> >>>>>>>> Best wishes >>>>>>>> Emily Deng >>>>>>>> >>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
we are hiring software manager for CVS core team >> --------------------------- >> >> -----Original Message----- >> From: Koenig, Christian >> Sent: Tuesday, January 4, 2022 6:19 PM >> To: Chen, JingWen ; Christian Kö

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Shaoyun
Liu, Monk ; Chen, JingWen > ; Christian König > ; Grodzovsky, Andrey > ; Deng, Emily ; > dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, > Horace > Cc: dan...@ffwll.ch > Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset > protectio

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Andrey Grodzovsky
Andrey ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Offic

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Shaoyun
:19 PM > To: Chen, JingWen ; Christian König > ; Grodzovsky, Andrey > ; Deng, Emily ; Liu, > Monk ; dri-de...@lists.freedesktop.org; > amd-gfx@lists.freedesktop.org; Chen, Horace ; > Chen, JingWen > Cc: dan...@ffwll.ch > Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Andrey Grodzovsky
ktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWe

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
4, 2022 6:19 PM To: Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Monk
t: Tuesday, January 4, 2022 6:19 PM To: Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset p

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
ngWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review on Andrey's patch Thanks ---

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
gWen @Deng, Emily >>>>> >>>>> Please take a review on Andrey's patch >>>>> >>>>> Thanks >>>>> --- >>>>> Monk Liu | Cloud GPU & Virtualization Solut

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-03 Thread Christian König
Christian ; Grodzovsky, Andrey ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Che

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-30 Thread Andrey Grodzovsky
; dri- de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV On 22.12.21 at 23:14, Andrey Grodzovsky wrote: Since now flr work is serialized against GPU resets there is no need

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-30 Thread Andrey Grodzovsky
: Thursday, December 23, 2021 6:14 PM To: Koenig, Christian ; Grodzovsky, Andrey ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-24 Thread JingWen Chen
Thursday, December 23, 2021 6:14 PM >> To: Koenig, Christian ; Grodzovsky, Andrey >> ; dri-de...@lists.freedesktop.org; amd- >> g...@lists.freedesktop.org; Chen, Horace ; Chen, >> JingWen ; Deng, Emily >> Cc: dan...@ffwll.ch >> Subject: RE: [RFC v2 8/8] drm/amd/virt:

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-24 Thread Deng, Emily
ndrey >; dri-de...@lists.freedesktop.org; amd- >g...@lists.freedesktop.org; Chen, Horace ; Chen, >JingWen ; Deng, Emily >Cc: dan...@ffwll.ch >Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection >for SRIOV > >[AMD Official Use Only] > >@Chen, Ho

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Liu, Shaoyun
: Liu, Monk ; Grodzovsky, Andrey ; Chen, Horace ; Koenig, Christian ; dan...@ffwll.ch Subject: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Since now flr work is serialized against GPU resets there is no need for this. Signed-off-by: Andrey Grodzovsky --- drivers

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Liu, Monk
; amd-gfx@lists.freedesktop.org Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV On 22.12.21 at 23:14, Andrey Grodzovsky wrote: > Since now flr work is serialized against GPU resets there is no need > for this.

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Christian König
On 22.12.21 at 23:14, Andrey Grodzovsky wrote: Since now flr work is serialized against GPU resets there is no need for this. Signed-off-by: Andrey Grodzovsky Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 --- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c |

[RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-22 Thread Andrey Grodzovsky
Since now flr work is serialized against GPU resets there is no need for this. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 --- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 --- 2 files changed, 22 deletions(-) diff --git a/drivers/gpu/drm/amd/