>>>> Andrey
>>>>
>>> I think this page fault issue can be seen even on the original tree;
>>> dropping the concurrent GPU reset protection just makes it easier to hit.
>>>
>>> We may need a new way to protect the reset in SRIOV.
>>>
>> Hi Andrey
amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Hi Jingwen,
well what I mean is that we need to adjust the implementation in amdgpu to
actually match the requirements.
Could be that t
queue. Otherwise you have a race condition between the hypervisor
>>>>>>>>>> and the scheduler.
>>>>>>>>>>
>>>>>>>>>> Properly setting in_gpu_reset is indeed mandatory, but should happen
>>>>>>>
>
>>>>>>> Andrey
>>>>>>>
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> Am 04.01.22 um 11:49 schrieb Liu, Mon
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Hi Jingwen,
well what I mean is that we need to adjust the implementation in amdgpu to
actual
an König ; Grodzovsky,
Andrey ; Deng, Emily ; Liu, Monk ;
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Hi Jingwen,
well what I mean is that
similar to each job timeout on each
>>>>>>>>> queue. Otherwise you have a race condition between the hypervisor and
>>>>>>>>> the scheduler.
>>>>>>> No it's not, FLR from hypervisor is just to notify guest the hw VF FLR
>
yway
>>>>>> without waiting for guest too long
>>>>>>
>>>>>>>> In other words I strongly think that the current SRIOV reset
>>>>>>>> implementation is severely broken and what Andrey is doing is actually
>>>
s.freedesktop.org; amd-
g...@lists.freedesktop.org;
Chen, Horace ; Chen, JingWen
; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU
reset
protection for SRIOV
[AMD Official Use Only]
@Chen, Horace @Chen, JingWen @Deng, Emily
Please take a review
dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Hi Jingwen,
well what I mean is that we need to adjust the implementation in amdgpu to
actually match the requirements.
Could be that the reset sequence is questionable in general, but I doub
patches and do
>>>>>>>> some basic TDR test on sriov environment, and give feedback.
>>>>>>>>
>>>>>>>> Best wishes
>>>>>>>> Emily Deng
>>>>>>>>
>>>>>>>>
>>>>>
ware manager for CVS core team
>> ---------------
>>
>> -----Original Message-----
>> From: Koenig, Christian
>> Sent: Tuesday, January 4, 2022 6:19 PM
>> To: Chen, JingWen ; Christian König
>> ;
> ; Grodzovsky, Andrey
> ; Deng, Emily ;
> dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen,
> Horace
> Cc: dan...@ffwll.ch
> Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset
> protection for SRIOV
>
> Am 04.01.22 um 11:49 schri
hen, JingWen
; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU
reset protection for SRIOV
[AMD Official Use Only]
@Chen, Horace @Chen, JingWen @Deng, Emily
Please review Andrey's patch
Thanks
istian König
> ; Grodzovsky, Andrey
> ; Deng, Emily ; Liu,
> Monk ; dri-de...@lists.freedesktop.org;
> amd-gfx@lists.freedesktop.org; Chen, Horace ;
> Chen, JingWen
> Cc: dan...@ffwll.ch
> Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset
> prot
en, Horace ; Chen, JingWen
; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset
protection for SRIOV
[AMD Official Use Only]
@Chen, Horace @Chen, JingWen @Deng, Emily
Please take a review on Andre
Chen, JingWen ; Christian König ; Grodzovsky,
Andrey ; Deng, Emily ; Liu, Monk ;
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen, JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Hi Jingwen,
, January 4, 2022 6:19 PM
To: Chen, JingWen ; Christian König
; Grodzovsky, Andrey
; Deng, Emily ; Liu, Monk
; dri-de...@lists.freedesktop.org;
amd-gfx@lists.freedesktop.org; Chen, Horace ; Chen,
JingWen
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
mily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
[AMD Official Use Only]
@Chen, Horace @Chen, JingWen @Deng, Emily
Please review Andrey's patch
Thanks
---
Mon
ly
>>>>>
>>>>> Please review Andrey's patch
>>>>>
>>>>> Thanks
>>>>> ---
>>>>> Monk Liu | Cloud GPU & Virtualization Solution | AMD
>>>
; Grodzovsky, Andrey
; dri-de...@lists.freedesktop.org; amd-
g...@lists.freedesktop.org; Chen, Horace ; Chen,
JingWen ; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset
protection
for SRIOV
[AMD Official Use Only]
@Chen, Horace @Chen, JingWen
-
de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Am 22.12.21 um 23:14 schrieb Andrey Grodzovsky:
Since now flr work is serialized against GPU resets there is no
: Thursday, December 23, 2021 6:14 PM
To: Koenig, Christian ; Grodzovsky, Andrey
; dri-de...@lists.freedesktop.org; amd-
g...@lists.freedesktop.org; Chen, Horace ; Chen,
JingWen ; Deng, Emily
Cc: dan...@ffwll.ch
Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Sent: Thursday, December 23, 2021 6:14 PM
>> To: Koenig, Christian ; Grodzovsky, Andrey
>> ; dri-de...@lists.freedesktop.org; amd-
>> g...@lists.freedesktop.org; Chen, Horace ; Chen,
>> JingWen ; Deng, Emily
>> Cc: dan...@ffwll.ch
>> Subject: RE: [RFC v2 8/8] drm/amd/virt:
ndrey
>; dri-de...@lists.freedesktop.org; amd-
>g...@lists.freedesktop.org; Chen, Horace ; Chen,
>JingWen ; Deng, Emily
>Cc: dan...@ffwll.ch
>Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
>for SRIOV
>
>[AMD Official Use Only]
>
>@Chen, Ho
[AMD Official Use Only]
I had a discussion with Andrey about this offline. It seems dangerous to
remove in_gpu_reset and reset_sem directly inside flr_work. When the
reset is triggered from the host side, the GPU needs to be locked while the
host performs the reset after flr_work
-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection
for SRIOV
Am 22.12.21 um 23:14 schrieb Andrey Grodzovsky:
> Since now flr work is serialized against GPU resets there is no need
> for this.
>
Am 22.12.21 um 23:14 schrieb Andrey Grodzovsky:
Since now flr work is serialized against GPU resets
there is no need for this.
Signed-off-by: Andrey Grodzovsky
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 ---
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c |