Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-11 Thread JingWen Chen
>>>> Andrey >>>> >>> I think this page fault issue can be seen even on the original tree. It's >>> just that dropping the concurrent GPU reset protection will hit it more easily. >>> >>> We may need a new way to protect the reset in SRIOV. >>> >> Hi Andrey

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-07 Thread Andrey Grodzovsky
amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actually match the requirements. Could be that t

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread JingWen Chen
queue. Otherwise you have a race condition between the hypervisor >>>>>>>>>> and the scheduler. >>>>>>>>>> >>>>>>>>>> Properly setting in_gpu_reset is indeed mandatory, but should happen >>>>>>>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread JingWen Chen
> >>>>>>> Andrey >>>>>>> >>>>>>> >>>>>>>> Regards, >>>>>>>> Christian. >>>>>>>> >>>>>>>> On 04.01.22 at 11:49, Liu, Mon

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread Andrey Grodzovsky
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actual

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-06 Thread Christian König
an König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
similar to each job timeout on each >>>>>>>>> queue. Otherwise you have a race condition between the hypervisor and >>>>>>>>> the scheduler. >>>>>>> No it's not, FLR from hypervisor is just to notify guest the hw VF FLR &

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
yway >>>>>> without waiting for guest too long >>>>>> >>>>>>>> In other words I strongly think that the current SRIOV reset >>>>>>>> implementation is severely broken and what Andrey is doing is actually >>>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread Andrey Grodzovsky
s.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actually match the requirements. Could be that the reset sequence is questionable in general, but I doub

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
patches and do >>>>>>>> some basic TDR test on sriov environment, and give feedback. >>>>>>>> >>>>>>>> Best wishes >>>>>>>> Emily Deng >>>>>>>> >>>>>>>> >>>>>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
ware manager for CVS core team >> --------------- >> >> -----Original Message----- >> From: Koenig, Christian >> Sent: Tuesday, January 4, 2022 6:19 PM >> To: Chen, JingWen ; Christian König >> ;

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Shaoyun
> ; Grodzovsky, Andrey > ; Deng, Emily ; > dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, > Horace > Cc: dan...@ffwll.ch > Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset > protection for SRIOV > > On 04.01.22 at 11:49

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Andrey Grodzovsky
hen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review on Andrey's patch Thanks

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Shaoyun
istian König > ; Grodzovsky, Andrey > ; Deng, Emily ; Liu, > Monk ; dri-de...@lists.freedesktop.org; > amd-gfx@lists.freedesktop.org; Chen, Horace ; > Chen, JingWen > Cc: dan...@ffwll.ch > Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset > prot

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Andrey Grodzovsky
en, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review on Andre

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen,

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Monk
, January 4, 2022 6:19 PM To: Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Christian König
mily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review on Andrey's patch Thanks --- Mon

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread JingWen Chen
ly >>>>> >>>>> Please take a review on Andrey's patch >>>>> >>>>> Thanks >>>>> --- >>>>> Monk Liu | Cloud GPU & Virtualization Solution | AMD >>>

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-03 Thread Christian König
; Grodzovsky, Andrey ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV [AMD Official Use Only] @Chen, Horace @Chen, JingWen

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-30 Thread Andrey Grodzovsky
- de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV On 22.12.21 at 23:14, Andrey Grodzovsky wrote: Since now flr work is serialized against GPU resets there is no

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-30 Thread Andrey Grodzovsky
: Thursday, December 23, 2021 6:14 PM To: Koenig, Christian ; Grodzovsky, Andrey ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen ; Deng, Emily Cc: dan...@ffwll.ch Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-24 Thread JingWen Chen
Sent: Thursday, December 23, 2021 6:14 PM >> To: Koenig, Christian ; Grodzovsky, Andrey >> ; dri-de...@lists.freedesktop.org; amd- >> g...@lists.freedesktop.org; Chen, Horace ; Chen, >> JingWen ; Deng, Emily >> Cc: dan...@ffwll.ch >> Subject: RE: [RFC v2 8/8] drm/amd/virt:

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-24 Thread Deng, Emily
ndrey >; dri-de...@lists.freedesktop.org; amd- >g...@lists.freedesktop.org; Chen, Horace ; Chen, >JingWen ; Deng, Emily >Cc: dan...@ffwll.ch >Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection >for SRIOV > >[AMD Official Use Only] > >@Chen, Ho

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Liu, Shaoyun
[AMD Official Use Only] I had a discussion with Andrey about this offline. It seems dangerous to remove the in_gpu_reset and reset_sem directly inside the flr_work. In the case where the reset is triggered from the host side, the GPU needs to be locked while the host performs the reset after flr_work

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Liu, Monk
-gfx@lists.freedesktop.org Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV On 22.12.21 at 23:14, Andrey Grodzovsky wrote: > Since now flr work is serialized against GPU resets there is no need > for this. >

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Christian König
On 22.12.21 at 23:14, Andrey Grodzovsky wrote: Since now flr work is serialized against GPU resets there is no need for this. Signed-off-by: Andrey Grodzovsky Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 --- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c |