RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-04 Thread Liu, Monk
[AMD Official Use Only]

>> See the FLR request from the hypervisor is just another source of signaling 
>> the need for a reset, similar to each job timeout on each queue. Otherwise 
>> you have a race condition between the hypervisor and the scheduler.
No it's not; the FLR from the hypervisor just notifies the guest that the hw VF 
FLR is about to start or was already executed. The host will do the FLR anyway 
without waiting too long for the guest.

>> In other words I strongly think that the current SRIOV reset implementation 
>> is severely broken and what Andrey is doing is actually fixing it.
It makes the code crash ... how could that be a fix?

I'm afraid this patch is a NAK from me, but a cleanup that does not ruin the 
logic is welcome; Andrey or Jingwen can try it if needed.

Thanks 
---
Monk Liu | Cloud GPU & Virtualization Solution | AMD
---
we are hiring software manager for CVS core team
---

-Original Message-
From: Koenig, Christian  
Sent: Tuesday, January 4, 2022 6:19 PM
To: Chen, JingWen ; Christian König 
; Grodzovsky, Andrey 
; Deng, Emily ; Liu, Monk 
; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; Chen, Horace ; Chen, 
JingWen 
Cc: dan...@ffwll.ch
Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection 
for SRIOV

Hi Jingwen,

well what I mean is that we need to adjust the implementation in amdgpu to 
actually match the requirements.

Could be that the reset sequence is questionable in general, but I doubt so at 
least for now.

See the FLR request from the hypervisor is just another source of signaling the 
need for a reset, similar to each job timeout on each queue. Otherwise you have 
a race condition between the hypervisor and the scheduler.

Properly setting in_gpu_reset is indeed mandatory, but should happen at a 
central place and not in the SRIOV specific code.

In other words I strongly think that the current SRIOV reset implementation is 
severely broken and what Andrey is doing is actually fixing it.

Regards,
Christian.

On 04.01.22 at 10:07, JingWen Chen wrote:
> Hi Christian,
> I'm not sure what you mean by "we need to change SRIOV, not the driver".
>
> Do you mean we should change the reset sequence in SRIOV? This will be a huge 
> change for our SRIOV solution.
>
>  From my point of view, we can directly use amdgpu_device_lock_adev 
> and amdgpu_device_unlock_adev in flr_work instead of try_lock, since no one 
> will conflict with this thread once reset_domain is introduced.
> But we do need reset_sem and adev->in_gpu_reset to keep the device untouched 
> by user space.
>
> Best Regards,
> Jingwen Chen
>
> On 2022/1/3 at 6:17 PM, Christian König wrote:
>> Please don't. This patch is vital to the cleanup of the reset procedure.
>>
>> If SRIOV doesn't work with that we need to change SRIOV and not the driver.
>>
>> Christian.
>>
>> On 30.12.21 at 19:45, Andrey Grodzovsky wrote:
>>> Sure, I guess i can drop this patch then.
>>>
>>> Andrey
>>>
>>> On 2021-12-24 4:57 a.m., JingWen Chen wrote:
>>>> I do agree with Shaoyun: if the host finds the GPU engine hang first and 
>>>> does the FLR, the guest-side thread may not know this and may still try to 
>>>> access the HW (e.g. KFD uses amdgpu_in_reset and reset_sem heavily to 
>>>> identify the reset status), and this may lead to very bad results.
>>>>
>>>> On 2021/12/24 at 4:58 PM, Deng, Emily wrote:
>>>>> These patches look good to me. JingWen will pull these patches, do some 
>>>>> basic TDR testing in an SRIOV environment, and give feedback.
>>>>>
>>>>> Best wishes
>>>>> Emily Deng
>>>>>
>>>>>
>>>>>
>>>>>> -Original Message-
>>>>>> From: Liu, Monk 
>>>>>> Sent: Thursday, December 23, 2021 6:14 PM
>>>>>> To: Koenig, Christian ; Grodzovsky, 
>>>>>> Andrey ; 
>>>>>> dri-devel@lists.freedesktop.org; amd- g...@lists.freedesktop.org; 
>>>>>> Chen, Horace ; Chen, JingWen 
>>>>>> ; Deng, Emily 
>>>>>> Cc: dan...@ffwll.ch
>>>>>> Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset 
>>>>>> protection for SRIOV
>>>>>>
>>>>>> [AMD Official Use Only]
>>>>>>
>>>>>> @Chen, Horace @Chen, JingWen @Deng, Emily
>>>>>>
>>>>>> Please take a review on Andrey's patch 

RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2021-12-23 Thread Liu, Monk
[AMD Official Use Only]

@Chen, Horace @Chen, JingWen @Deng, Emily

Please take a review on Andrey's patch 

Thanks 
---
Monk Liu | Cloud GPU & Virtualization Solution | AMD
---
we are hiring software manager for CVS core team
---

-Original Message-
From: Koenig, Christian  
Sent: Thursday, December 23, 2021 4:42 PM
To: Grodzovsky, Andrey ; 
dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace 

Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection 
for SRIOV

On 22.12.21 at 23:14, Andrey Grodzovsky wrote:
> Since flr work is now serialized against GPU resets there is no need 
> for this.
>
> Signed-off-by: Andrey Grodzovsky 

Acked-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 ---
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 ---
>   2 files changed, 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
> b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> index 487cd654b69e..7d59a66e3988 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> @@ -248,15 +248,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct 
> *work)
>   struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, 
> virt);
>   int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT;
>   
> - /* block amdgpu_gpu_recover till msg FLR COMPLETE received,
> -  * otherwise the mailbox msg will be ruined/reseted by
> -  * the VF FLR.
> -  */
> - if (!down_write_trylock(&adev->reset_sem))
> - return;
> -
>   amdgpu_virt_fini_data_exchange(adev);
> - atomic_set(&adev->in_gpu_reset, 1);
>   
>   xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
>   
> @@ -269,9 +261,6 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct 
> *work)
>   } while (timeout > 1);
>   
>   flr_done:
> - atomic_set(&adev->in_gpu_reset, 0);
> - up_write(&adev->reset_sem);
> -
>   /* Trigger recovery for world switch failure if no TDR */
>   if (amdgpu_device_should_recover_gpu(adev)
>   && (!amdgpu_device_has_job_running(adev) ||
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
> b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> index e3869067a31d..f82c066c8e8d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> @@ -277,15 +277,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct 
> *work)
>   struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, 
> virt);
>   int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT;
>   
> - /* block amdgpu_gpu_recover till msg FLR COMPLETE received,
> -  * otherwise the mailbox msg will be ruined/reseted by
> -  * the VF FLR.
> -  */
> - if (!down_write_trylock(&adev->reset_sem))
> - return;
> -
>   amdgpu_virt_fini_data_exchange(adev);
> - atomic_set(&adev->in_gpu_reset, 1);
>   
>   xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
>   
> @@ -298,9 +290,6 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct 
> *work)
>   } while (timeout > 1);
>   
>   flr_done:
> - atomic_set(&adev->in_gpu_reset, 0);
> - up_write(&adev->reset_sem);
> -
>   /* Trigger recovery for world switch failure if no TDR */
>   if (amdgpu_device_should_recover_gpu(adev)
>   && (!amdgpu_device_has_job_running(adev) ||


RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-09-06 Thread Liu, Monk
[AMD Official Use Only]

> I'm fearing that just repeating what Alex said, but to make it clear 
> once more: That is *not* necessary!
>
> The shared repository is owned by upstream maintainers and they are 
> usually free to do restructuring work without getting acknowledge from 
> every single driver maintainer.

Hi Daniel

Anyway, thanks for officially confirming the community's working model and 
policy to me. I won't put my own opinion here, since it's not my call to 
change it either way.
I only want this diagnostic TDR scheme to come to a good end, for AMD or even 
for all DRM vendors.

How about this: we still have a final patch that has not landed in the DRM 
scheduler, and I would like Jingwen to present it to you and 
AlexD/Christian/Andrey. I believe you will have concerns or objections 
regarding this patch, but that's fine; let's figure out together how to make 
it acceptable to you and the other vendors working with the DRM scheduler.

P.S.: I have to repeat myself again: we are not suddenly popping up with a new 
idea, this is a disconnection issue. We didn't have changes (or plans for 
changes) in the DRM scheduler before, but eventually we found we must make 
job_timeout and sched_main work serialized; otherwise it won't work with the 
current scheduler's code structure.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Friday, September 3, 2021 12:11 AM
To: Koenig, Christian 
Cc: Liu, Monk ; Dave Airlie ; Alex Deucher 
; Grodzovsky, Andrey ; Chen, 
JingWen ; DRI Development 
; amd-...@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution 
opinions/suggestions in one thread

On Thu, Sep 2, 2021 at 1:00 PM Christian König  wrote:
>
> Hi Monk,
>
> On 02.09.21 at 07:52, Liu, Monk wrote:
> > [AMD Official Use Only]
> >
> > I'm not sure I can add much to help this along, I'm sure Alex has 
> > some internal training, Once your driver is upstream, it belongs to 
> > upstream, you can maintain it, but you no longer control it 100%, it's a 
> > tradeoff, it's not one companies always understand.
> > Usually people are fine developing away internally, but once interaction 
> > with other parts of the kernel/subsystem is required they have the 
> > realisation that they needed to work upstream 6 months earlier.
> > The best time to interact with upstream was 6 months ago, the second best 
> > time is now.
> > <<<
> >
> > Daniel/AlexD
> >
> > I didn't mean that your changes to the AMD driver need my personal 
> > approval or review ... and I'm already quite used to our driver not being 
> > 100% under AMDers' control, but supposedly anyone from the community 
> > (including you) who intends to change AMD's driver needs at least to get 
> > approval from someone at AMD, e.g. AlexD or Christian; isn't that 
> > reasonable?
>
> I'm fearing that just repeating what Alex said, but to make it clear 
> once more: That is *not* necessary!
>
> The shared repository is owned by upstream maintainers and they are 
> usually free to do restructuring work without getting acknowledge from 
> every single driver maintainer.
>
> Anybody can of course technically object to upstream design decisions, 
> but that means that you need to pay attention to the mailing lists in 
> the first place.
>
> > just like we need your approval if we try to modify DRM-sched, or need 
> > panfrost's approval if we need to change panfrost code ...
> >
> > Only CCing AMD's engineers doesn't look quite proper; how do you know 
> > whether your changes (to the AMD code part) conflict with AMD's ongoing 
> > internal features/refactoring or not?
>
> Well because AMD is supposed to work in public as much as possible and 
> ask upstream before doing changes to the code base.
>
> Additional to that design decisions are supposed to be discussed on 
> the mailing list and *not* internally.

Yeah I'm honestly really surprised about the course of this discussion here. 
With Alex, Christian and others amd has a lot of folks with years/decades of 
experience in how to collaborate in upstream, when to pull in others 
proactively and when that's not needed, and in general how to plan upstream 
work with the least amount of risk and surprises.

I think step zero here needs to be some training at amd and then re-planning 
this effort, before we get back to technical stuff.
Otherwise we'll just get bogged down in pain because expectations about the 
process don't pan out.
-Daniel

>
> Regards,
> Christian.
>
> >
> > Thanks
> >
> > --
> > Monk Liu | Cloud-GPU Core team
> > --

RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-09-01 Thread Liu, Monk
[AMD Official Use Only]

>>>
I'm not sure I can add much to help this along, I'm sure Alex has some internal 
training,
Once your driver is upstream, it belongs to upstream, you can maintain it, but 
you no longer control it 100%, it's a tradeoff, it's not one companies always 
understand.
Usually people are fine developing away internally, but once interaction with 
other parts of the kernel/subsystem is required they have the realisation that 
they needed to work upstream 6 months earlier.
The best time to interact with upstream was 6 months ago, the second best time 
is now.
<<<

Daniel/AlexD

I didn't mean that your changes to the AMD driver need my personal approval or 
review ... and I'm already quite used to our driver not being 100% under 
AMDers' control, 
but supposedly anyone from the community (including you) who intends to change 
AMD's driver needs at least to get approval from someone at AMD, e.g. AlexD or 
Christian; isn't that reasonable?
just like we need your approval if we try to modify DRM-sched, or need 
panfrost's approval if we need to change panfrost code ...

Only CCing AMD's engineers doesn't look quite proper; how do you know whether 
your changes (to the AMD code part) conflict with AMD's ongoing internal 
features/refactoring or not?

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Dave Airlie  
Sent: Thursday, September 2, 2021 2:51 AM
To: Alex Deucher 
Cc: Liu, Monk ; Daniel Vetter ; Koenig, 
Christian ; Grodzovsky, Andrey 
; Chen, JingWen ; DRI 
Development ; amd-...@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution 
opinions/suggestions in one thread

On Thu, 2 Sept 2021 at 01:20, Alex Deucher  wrote:
>
> On Wed, Sep 1, 2021 at 6:19 AM Liu, Monk  wrote:
> >
> > [AMD Official Use Only]
> >
> > Daniel
> >
> > From the link you shared it looks like you (or someone else) have quite a 
> > bunch of patches that change DRM_SCHED or even amdgpu; in that case, before 
> > they are merged to the kernel tree, I'm wondering whether any AMD developer 
> > reviewed them?
> >
> > They look to me to conflict somehow with what we changed in our repo.
> >
> > It is really chaos for AMDers if someone outside of AMD changes our kernel 
> > driver (or/and scheduler) without review by an AMDer, just as we require 
> > your review if we intend to change the scheduler's logic here.
> >
> > This one changes AMD's code: 
> > https://lore.kernel.org/dri-devel/20210625133327.2598825-2-boris.brezillon@collabora.com/
> > And I didn't see any reviewed-by from AMDers ...
> >
> > This one also touches AMD's code: 
> > https://lore.kernel.org/dri-devel/20200604081224.863494-12-daniel.vetter@ffwll.ch/
> > which conflicts with one patch we submitted (in our repo right now), 
> > and I didn't see an AMDer give a Reviewed-by on this one either (let 
> > me know if I missed it)
> >
>
> Monk, this is not how upstream works.  You need to participate.
> That's how communities work.  There's a reason all these discussions 
> happen on public mailing lists.  The patch author can't be expected to 
> know every person on every vendor team to CC with a patch.  If you 
> have concerns, you need to raise them when the patches are being 
> discussed.
>

I'm not sure I can add much to help this along, I'm sure Alex has some internal 
training,

Once your driver is upstream, it belongs to upstream, you can maintain it, but 
you no longer control it 100%, it's a tradeoff, it's not one companies always 
understand.

Usually people are fine developing away internally, but once interaction with 
other parts of the kernel/subsystem is required they have the realisation that 
they needed to work upstream 6 months earlier.

The best time to interact with upstream was 6 months ago, the second best time 
is now.

Dave.


RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-09-01 Thread Liu, Monk
[AMD Official Use Only]

>> For me your project exists since a few weeks at most, because that is when 
>> your team showed up on dri-devel. That you already spent 6 months on this 
>> within amd, on a code area that very much affects shared code, without 
>> kicking of any thread on dri-devel isn't great, but also not something we 
>> can fix, since time machines don't exist.
This is partially true: in the past months our changes resided only in the AMD 
driver; it is only now that we found we had to make changes at the SCHED level.

>> Your team hasn't been active in any of these discussions, but now suddenly 
>> pops up out of nowhere and demands that your approach needs to land asap. 
>> That's really not how upstream works.
If our changes at the DRI level cannot get merged soon that's fine, we can 
discuss more; but this didn't suddenly pop up from nowhere, we have already 
worked on it for months inside the AMD driver.

>> I think the best way forward is to go through the above process again and 
>> essentially restart. So submit a complete patch series with problem 
>> descriptions, solution you picked, why you picked that, all the amdgpu 
>> patches to get there and the core patches too. Since it sounds like a bunch 
>> of this has all landed already you probably need a patch 1 that goes back to 
>> 6 months ago so that we can see the overall direction, and review whether 
>> that's the right one or not.

We are not objecting to this approach. We didn't do that before because 
everything we needed to do resided inside the AMD driver ... because we tried 
to avoid changing the DRI/DRM interface part ... 

For the patches you showed us via links, I'm sorry that due to some IT 
infrastructure reasons my team and I didn't see them before (we mostly work in 
the AMD repo ... the patches you showed haven't been merged into our repo 
yet ...).
One thing I also want to emphasize here: if any code needs to change inside 
the AMD driver, please always let us know and let us review it.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: amd-gfx  On Behalf Of Daniel Vetter
Sent: Wednesday, September 1, 2021 4:18 PM
To: Liu, Monk 
Cc: Koenig, Christian ; Grodzovsky, Andrey 
; Chen, JingWen ; DRI 
Development ; amd-...@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution 
opinions/suggestions in one thread

Hi Monk,

On Wed, Sep 1, 2021 at 3:23 AM Liu, Monk  wrote:
>
> [AMD Official Use Only]
>
>
> Hi Daniel/Christian/Andrey
>
>
>
> It looks like the voices from you three are spread over those email floods, 
> and the feature we are working on (the diagnostic TDR scheme) has been 
> pending for more than 6 months (we started it in Feb 2021).

For me your project exists since a few weeks at most, because that is when your 
team showed up on dri-devel. That you already spent 6 months on this within 
amd, on a code area that very much affects shared code, without kicking of any 
thread on dri-devel isn't great, but also not something we can fix, since time 
machines don't exist.

So we have to make the best out of the situation and move ahead where we are. 
From my understanding you've done a bunch of changes to the scheduler code. As 
far as I can see there's been two related things your team has done:

- remove some allocations from scheduler code, because that can lead to 
deadlocks. I've kicked up this topic quite a while ago here

https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffwll.ch/

This is just one patch of the entire series. This is an area where we really 
need a consistent solution across all drm/sched drivers, not something that 
individual drivers just fix in their own way.

- the other one is the timeout issue for the patches you cite here.
Again there's been discussion on this on dri-devel with Boris from panfrost 
about how we can handle at least some of the races in tdr.
That resulted in lots of discussions and documentation improvements.
Those patches are merged now, link
https://lore.kernel.org/dri-devel/20210625133327.2598825-2-boris.brezillon@collabora.com/

RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-09-01 Thread Liu, Monk
[AMD Official Use Only]

Daniel 

From the link you shared it looks like you (or someone else) have quite a 
bunch of patches that change DRM_SCHED or even amdgpu; in that case, before 
they are merged to the kernel tree, I'm wondering whether any AMD developer 
reviewed them?

They look to me to conflict somehow with what we changed in our repo.

It is really chaos for AMDers if someone outside of AMD changes our kernel 
driver (or/and scheduler) without review by an AMDer, just as we require your 
review if we intend to change the scheduler's logic here.

This one changes AMD's code: 
https://lore.kernel.org/dri-devel/20210625133327.2598825-2-boris.brezil...@collabora.com/
And I didn't see any reviewed-by from AMDers ...

This one also touches AMD's code: 
https://lore.kernel.org/dri-devel/20200604081224.863494-12-daniel.vet...@ffwll.ch/
which conflicts with one patch we submitted (in our repo right now), and I 
didn't see an AMDer give a Reviewed-by on this one either (let me know if I 
missed it).

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: amd-gfx  On Behalf Of Daniel Vetter
Sent: Wednesday, September 1, 2021 4:18 PM
To: Liu, Monk 
Cc: Koenig, Christian ; Grodzovsky, Andrey 
; Chen, JingWen ; DRI 
Development ; amd-...@lists.freedesktop.org
Subject: Re: [diagnostic TDR mode patches] unify our solution 
opinions/suggestions in one thread

Hi Monk,

On Wed, Sep 1, 2021 at 3:23 AM Liu, Monk  wrote:
>
> [AMD Official Use Only]
>
>
> Hi Daniel/Christian/Andrey
>
>
>
> It looks like the voices from you three are spread over those email floods, 
> and the feature we are working on (the diagnostic TDR scheme) has been 
> pending for more than 6 months (we started it in Feb 2021).

For me your project exists since a few weeks at most, because that is when your 
team showed up on dri-devel. That you already spent 6 months on this within 
amd, on a code area that very much affects shared code, without kicking of any 
thread on dri-devel isn't great, but also not something we can fix, since time 
machines don't exist.

So we have to make the best out of the situation and move ahead where we are. 
From my understanding you've done a bunch of changes to the scheduler code. As 
far as I can see there's been two related things your team has done:

- remove some allocations from scheduler code, because that can lead to 
deadlocks. I've kicked up this topic quite a while ago here

https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffwll.ch/

This is just one patch of the entire series. This is an area where we really 
need a consistent solution across all drm/sched drivers, not something that 
individual drivers just fix in their own way.

- the other one is the timeout issue for the patches you cite here.
Again there's been discussion on this on dri-devel with Boris from panfrost 
about how we can handle at least some of the races in tdr.
That resulted in lots of discussions and documentation improvements.
Those patches are merged now, link
https://lore.kernel.org/dri-devel/20210625133327.2598825-2-boris.brezillon@collabora.com/

There's been more than just this, also quite some doc patches from Boris that 
explain how it's all supposed to work and be race-free.
Again your driver isn't the only one with interesting TDR races.

Your team hasn't been active in any of these discussions, but now suddenly pops 
up out of nowhere and demands that your approach needs to land asap. That's 
really not how upstream works.

The other thing where I'm struggling is that there's a lot of missing context 
for outsiders. The patches sometimes come with zero commit message, for tricky 
concurrency bugs. And there's no context with what you've done already on the 
amdgpu side (since that never showed up on dri-devel), which makes constructive 
discussions here really hard.

Now fixing these bugs is obviously good, but the way this is supposed to work 
when touching shared infrastructure is:

- Before you start merging anything kick off an RFC thread on dri-devel (or 
whatever the topic really is about) about the problem you have and how your 
trying to solve it.

RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

In the previous discussion, you stated that we should drop the 
"kthread_should_park" check in cleanup_job.

@@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
{
struct drm_sched_job *job, *next;

-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
-   return NULL;

But I suddenly have a question here: if we return the timed-out job regardless 
of kthread_should_park(), then we are back to the original problem: the 
timed-out job suddenly signals, cleanup_job still returns it to sched_main, 
and the job is freed while it is still being handled by the vendor's timeout 
callback.

If we return NULL when kthread_should_park() in cleanup_job, we can prevent 
the above scenario from happening: once a job is processed by job_timedout we 
can stop its scheduler, and after that, even if this job suddenly signals, 
cleanup_job won't return it, so sched_main won't free it in parallel ...

What do you think ?
Thanks

--
Monk Liu | Cloud-GPU Core team
----------

From: Liu, Monk
Sent: Wednesday, September 1, 2021 9:23 AM
To: Koenig, Christian ; Grodzovsky, Andrey 
; Daniel Vetter ; Chen, JingWen 

Cc: DRI Development ; 
amd-...@lists.freedesktop.org
Subject: [diagnostic TDR mode patches] unify our solution opinions/suggestions 
in one thread


[AMD Official Use Only]

Hi Daniel/Christian/Andrey

It looks like the voices from you three are spread over those email floods, 
and the feature we are working on (the diagnostic TDR scheme) has been pending 
for more than 6 months (we started it in Feb 2021).

Honestly speaking, the email-based way we are working now is not friendly and 
is quite painful to me.
Can we try to put all our opinions, suggestions, or even objections here 
together and go through them one by one? It's too hard for us to reply to each 
email on different questions.

For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

This is a fix for the timeout timer in the scheduler; can we complete this one 
first? It should already resolve all the questions and suggestions.

For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

I think I already explained the questions raised by Daniel in the other 
thread, regarding why I use __kthread_should_park().
For the other aspects, can we collect all our opinions here?

Thanks !

--
Monk Liu | Cloud-GPU Core team
--



RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

That really matters in practice: when two jobs from different processes are 
scheduled to the ring close to each other, if we don't discriminate A from B, 
then B will be considered a bad job due to A's timeout, which will force B's 
process to exit (e.g. the X server).

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:09 PM
To: Koenig, Christian 
Cc: Grodzovsky, Andrey ; Christian König 
; Liu, Monk ; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote:
> Yeah, that's what I meant with that the start of processing a job is a 
> bit swampy defined.
> 
> Jobs overload, but we simply don't have another good indicator that a 
> job started except that the previous one completed.
> 
> It's still better than starting the timer when pushing the job to the 
> ring buffer, because that is completely off.

Not sure this matters that much really in practice, and in some cases we might 
want to actually just reset it all if the built up backlog is way too long.

For anything that really runs way too long you want preempt-ctx/revoke fences 
anyway, not end-of-cs fences with tdr.
-Daniel

> 
> Christian.
> 
> On 27.08.21 at 20:22, Andrey Grodzovsky wrote:
> > As I mentioned to Monk before - what about cases such as in this 
> > test - 
> > https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
> > 
> > Here you don't have a serialized sequence where one job finishes 
> > processing and the second starts; they execute together concurrently - 
> > for those cases it seems to me that restarting the timer for the second 
> > job from scratch will let it hang much longer than allowed by the TO value.
> > 
> > Andrey
> > 
> > On 2021-08-27 10:29 a.m., Christian König wrote:
> > > I don't think that makes sense.
> > > 
> > > See we don't want to start the time when the job is inserted into 
> > > the ring buffer, but rather when it starts processing.
> > > 
> > > Starting processing is a bit swampy defined, but just starting the 
> > > timer when the previous job completes should be fine enough.
> > > 
> > > Christian.
> > > 
> > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
> > > > The TS represents the point in time when the job was inserted 
> > > > into the pending list.
> > > > I don't think it matters when it actually starts to be 
> > > > processed; what matters is when this job was inserted into the 
> > > > pending list, because right at that point you arm the TO timer 
> > > > (when no other is already running). So when the previous job 
> > > > completes and you cancel and re-arm the timer, you can use that 
> > > > TS from the next job in the pending list to calculate how much 
> > > > time it actually has left to run before TDR must be initiated, 
> > > > instead of giving it the full TO value again even though it has 
> > > > already been running for a while.
> > > > 
> > > > Also, I am not sure about the assumption that what we measure is 
> > > > processing by the HW; what we measure is from the moment the job 
> > > > was scheduled to the ring to the moment it completed (EOP 
> > > > event). At least that is what our TDR timer measures, so it 
> > > > makes sense to set the TS at this point.
> > > > 
> > > > Andrey
> > > > 
> > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> > > > > [AMD Official Use Only]
> > > > > 
> > > > > What does that 'ts' represent? It looks to me like the jiffies 
> > > > > at which the job gets scheduled to the ring, but a job 
> > > > > scheduled to the ring is not necessarily being processed by 
> > > > > the hw.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > --
> > > > > M

RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Okay, I will reprepare this patch

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk 
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed 
> explanation of the race/bug/whatever, how you fix it and why this is 
> the best option?
> 
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 24 
> > 
> >  1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, 
> > work_tdr.work);
> >  
> > /* Protects against concurrent deletion in 
> > drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> 
> This is a __ function, i.e. considered internal, and it's lockless 
> atomic, i.e. unordered. And you're not explaining why this works.
> 
> Iow it's probably buggy, and just unconditionally parking the 
> kthread is probably the right thing to do. If it's not the right thing 
> to do, there's a bug here for sure.

Also why don't we reuse the function drivers already have to stop a scheduler 
thread? We seem to have two kthread_park now, that's probably one too much.
-Daniel

> > +   kthread_park(sched->thread);
> > +
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> >  
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> >  
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> >  
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> >  
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to  earlier one and either deactive
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.
> > --
> > 2.7.4
> > 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Hi Daniel/Christian/Andrey

It looks to me like the opinions from you three are spread across this flood of 
emails; the feature we are working on (the diagnostic TDR scheme) has been 
pending for more than 6 months (we started it in Feb 2021).

Honestly speaking, the email workflow we are using now is not friendly and is 
quite painful to me.
Can we put all our opinions, suggestions, or even objections here together and 
go through them one by one? It's too hard for us to reply to each email on a 
different question.

For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

This is a fix for the timeout timer in the scheduler. Can we complete this one 
first? It should already have resolved all the questions and suggestions.

For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

I think I already explained, in the other thread, the questions raised by 
Daniel regarding why I use __kthread_should_park().
For the other aspects, can we gather all our opinions here?

Thanks !

--
Monk Liu | Cloud-GPU Core team
--



RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

>> Also why don't we reuse the function drivers already have to stop a 
>> scheduler thread? We seem to have two kthread_park now, that's probably one 
>> too much.
Are you referring to drm_sched_stop ?

That's different; we don't need the logic from it. It goes through the pending 
list and removes all callbacks, etc., and meanwhile the vendor's timeout 
callback will call drm_sched_stop() in a proper way.
All we want in my patch is to simply park the scheduler.
Besides, even if you call drm_sched_stop() in job_timeout you still run into 
the warning issue I hit.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk 
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed 
> explanation of the race/bug/whatever, how you fix it and why this is 
> the best option?
> 
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 24 
> > 
> >  1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, 
> > work_tdr.work);
> >  
> > /* Protects against concurrent deletion in 
> > drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> 
> This is a __ function, i.e. considered internal, and it's lockless 
> atomic, i.e. unordered. And you're not explaining why this works.
> 
> Iow it's probably buggy, and just unconditionally parking the 
> kthread is probably the right thing to do. If it's not the right thing 
> to do, there's a bug here for sure.

Also why don't we reuse the function drivers already have to stop a scheduler 
thread? We seem to have two kthread_park now, that's probably one too much.
-Daniel

> > +   kthread_park(sched->thread);
> > +
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> >  
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> >  
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> >  
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> >  
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to  earlier one and either deactive
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.
> > --
> > 2.7.4
> > 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

>> This is a __ function, i.e. considered internal, and it's lockless atomic, 
>> i.e. unordered. And you're not explaining why this works.

It's not a traditional habit, from what I can see, to put the explanation in 
the code, but we can do that in the mails.
We want to park the scheduler in job_timeout to serialize job access between 
the scheduler and the TO handler, but inside the vendor's timeout_job callback, 
at least both panfrost and amd call drm_sched_stop() on all schedulers.

If we unconditionally call kthread_park() in job_timedout, then the bailing 
job's timeout handler will try to call kthread_park() again on its scheduler 
and trigger a warning.

The scenario is:
1. Job1 on sched1 triggers a timeout, and sched1 is parked.
2. The vendor callback runs; it will usually stop all schedulers.
3. Job2 on sched2 triggers a timeout, so job_timedout also tries to park 
sched2, but sched2 was already stopped by the step above (job2's timeout is 
induced by job1, or by another VF).
   ---So there will be a "warning" in the kernel log from the step above; with 
the "__kthread_should_park()" check we can avoid the warning, and that's the 
only reason I need this __ function.
4. Before the vendor callback exits, it unparks all schedulers.

On the other hand, if we don't do the kthread_park() and still delete the job 
here (dropping the deleting/reinserting of the job from the pending_list is 
what we want), we still have a small window to hit the race issue:
cleanup_job from the sched thread frees the job while the job is still being 
processed by job_timedout or the vendor's callback.

And the reason we want to avoid deleting/reinserting the timed-out job is that 
we (the amd vendor) have our own way to do a diagnostic on all jobs in the 
pending lists of all schedulers; we want to cherry-pick, from all schedulers' 
pending lists, the real bad job that caused this job timeout.

Besides, it is also much more reasonable to park the scheduler while 
job_timedout is running; they should access common members like sched_job 
exclusively (since the spin_lock is released before running into the vendor's 
callback).

I hope I explained our reasoning well enough.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-----
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 8:59 PM
To: Liu, Monk 
Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

Can we please have some actual commit message here, with detailed explanation 
of the race/bug/whatever, how you fix it and why this is the best option?

On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> tested-by: jingwen chen 
> Signed-off-by: Monk Liu 
> Signed-off-by: jingwen chen 
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 24 
>  1 file changed, 4 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index ecf8140..894fdb24 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>   sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>  
>   /* Protects against concurrent deletion in drm_sched_get_cleanup_job 
> */
> + if (!__kthread_should_park(sched->thread))

This is a __ function, i.e. considered internal, and it's lockless atomic, i.e. 
unordered. And you're not explaining why this works.

Iow it's probably buggy, and just unconditionally parking the kthread is 
probably the right thing to do. If it's not the right thing to do, there's a 
bug here for sure.
-Daniel

> + kthread_park(sched->thread);
> +
>   spin_lock(&sched->job_list_lock);
>   job = list_first_entry_or_null(&sched->pending_list,
>  struct drm_sched_job, list);
>  
>   if (job) {
> - /*
> -  * Remove the bad job so it cannot be freed by concurrent
> -  * drm_sched_cleanup_jobs. It will be reinserted back after 
> sched->thread
> -  * is parked at which point it's safe.
> -  */
> - list_del_init(&job->list);
>   spin_unlock(&sched->job_list_lock);
>  
> + /* vendor's timeout_job should call drm_sched_start() */
>   status = job->sched->ops->timedout_job(job);
>  
>   /*
> @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>   kthread_park(sched->thread);
>  
>   /*
> -  * Reinsert back the bad job here - now

RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Ping Christian, Andrey

Can we merge this patch first ? this is a standalone patch for the timer 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Monk Liu  
Sent: Tuesday, August 31, 2021 6:36 PM
To: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Liu, Monk 
Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer even though its 
corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling only when the 
heading job is signaled, and if there is a "next" job we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job 
is the last one in its scheduler.

v3:
change the issue description,
remove the cancel_delayed_work at the beginning of cleanup_job, and recover the 
implementation of drm_sched_job_begin.

TODO:
1) introduce pause/resume scheduler in job_timeout to serialize the handling of 
scheduler and job_timeout.
2) drop the bad job's delete and insert in the scheduler due to the above 
serialization (no race issue anymore with the serialization).

tested-by: jingwen 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) 
 {
struct drm_sched_job *job, *next;
 
-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;
 
spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
 
spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
 
-   if (cleanup_job) {
+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
 
if (!entity)
continue;
--
2.7.4


RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
[AMD Official Use Only]

>>> I'm not sure if the work_tdr is initialized when a maximum timeout is 
>>> specified. Please double check.
Even with the timeout set to max, the work_tdr is still initialized:

int drm_sched_init(struct drm_gpu_scheduler *sched,
   const struct drm_sched_backend_ops *ops,
   unsigned hw_submission, unsigned hang_limit, long timeout,
   atomic_t *score, const char *name)
{
int i, ret;
sched->ops = ops;
sched->hw_submission_limit = hw_submission;
sched->name = name;
sched->timeout = timeout;
sched->hang_limit = hang_limit;
sched->score = score ? score : &sched->_score;
for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
drm_sched_rq_init(sched, &sched->sched_rq[i]);

init_waitqueue_head(&sched->wake_up_worker);
init_waitqueue_head(&sched->job_scheduled);
INIT_LIST_HEAD(&sched->pending_list);
spin_lock_init(&sched->job_list_lock);
atomic_set(&sched->hw_rq_count, 0);
INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
atomic_set(&sched->_score, 0);
atomic64_set(&sched->job_id_count, 0);

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König  
Sent: Thursday, August 26, 2021 8:38 PM
To: Liu, Monk ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 13:55 schrieb Liu, Monk:
> [AMD Official Use Only]
>
>>> I'm not sure if the work_tdr is initialized when a maximum timeout is 
>>> specified. Please double check.
> Ok, will do
>
>>> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
>>> timeout functionality and not related to recovery in any way.
> We even do not start hardware recovery in a lot of cases now (when wave kill 
> is successfully).
>
> Umm, sounds reasonable, I can rename it to "to" with another patch

Maybe more like job_timeout or timeout_work or something into that direction.

Christian.

>
> Thanks
>
> --
> Monk Liu | Cloud-GPU Core team
> --
>
> -Original Message-
> From: Christian König 
> Sent: Thursday, August 26, 2021 6:09 PM
> To: Liu, Monk ; amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the bug of time out 
> calculation(v3)
>
> Am 26.08.21 um 06:55 schrieb Monk Liu:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>> the its corresponding job is still running.
> Yeah, that makes a lot more sense.
>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling 
>> only when the heading job is signaled, and if there is a "next" job 
>> we start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the 
>> signaled job is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job 
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>> handling of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above 
>> serialization (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu 
>> ---
>>drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
>>1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
>> *sched)
>>{
>>  struct drm_sched_job *job, *next;
>>
>> -/*
>> - * Don't destroy jobs while the timeout worker is running  OR thread
>> - * is being parked and hence assumed to not touch pending_list
>> - */
>> -if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -!cancel_delayed_work(&sched->work_tdr)) ||
>> -kthread_should_park())
>> +if (kthread_should_park())
>>  return NULL;
>>
>>  spin_lock(&sched->job_list_lock); @@ -693,17 +687,21 @@ 
>> drm_sched_g

RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
[AMD Official Use Only]

Yeah, that "kthread_should_park" is also irrelevant looks to me as well and it 
delays the signaled job's cleanup/free

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König  
Sent: Friday, August 27, 2021 2:12 PM
To: Grodzovsky, Andrey ; Liu, Monk 
; amd-...@lists.freedesktop.org; Koenig, Christian 

Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

I don't think that this will be necessary nor desired.

See, the job should be cleaned up as soon as possible after it is finished, or 
otherwise we won't cancel the timeout quickly enough either.

Christian.

Am 26.08.21 um 22:14 schrieb Andrey Grodzovsky:
> Attached quick patch for per job TTL calculation to make more precises 
> next timer expiration. It's on top of the patch in this thread. Let me 
> know if this makes sense.
>
> Andrey
>
> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>
>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>> issue:
>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>>> the its corresponding job is still running.
>>>
>>> fix:
>>> do not cancel the timer in cleanup_job, instead do the cancelling 
>>> only when the heading job is signaled, and if there is a "next" job 
>>> we start_timeout again.
>>>
>>> v2:
>>> further cleanup the logic, and do the TDR timer cancelling if the 
>>> signaled job is the last one in its scheduler.
>>>
>>> v3:
>>> change the issue description
>>> remove the cancel_delayed_work in the begining of the cleanup_job 
>>> recover the implement of drm_sched_job_begin.
>>>
>>> TODO:
>>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>>> handling of scheduler and job_timeout.
>>> 2)drop the bad job's del and insert in scheduler due to above 
>>> serialization (no race issue anymore with the serialization)
>>>
>>> Signed-off-by: Monk Liu 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 25 
>>> ++---
>>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index a2a9536..ecf8140 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>   {
>>>   struct drm_sched_job *job, *next;
>>>   -    /*
>>> - * Don't destroy jobs while the timeout worker is running OR 
>>> thread
>>> - * is being parked and hence assumed to not touch pending_list
>>> - */
>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>> -    !cancel_delayed_work(&sched->work_tdr)) ||
>>> -    kthread_should_park())
>>> +    if (kthread_should_park())
>>>   return NULL;
>>
>>
>> I actually don't see why we need to keep the above, on the other side 
>> (in drm_sched_stop) we won't touch the pending list anyway until 
>> sched thread came to full stop (kthread_park). If you do see a reason 
>> why this needed then a comment should be here i think.
>>
>> Andrey
>>
>>
>>> spin_lock(&sched->job_list_lock);
>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>   if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>   /* remove job from pending_list */
>>>   list_del_init(&job->list);
>>> +
>>> +    /* cancel this job's TO timer */
>>> +    cancel_delayed_work(&sched->work_tdr);
>>>   /* make the scheduled timestamp more accurate */
>>>   next = list_first_entry_or_null(&sched->pending_list,
>>>   typeof(*next), list);
>>> -    if (next)
>>> +
>>> +    if (next) {
>>>   next->s_fence->scheduled.timestamp =
>>>   job->s_fence->finished.timestamp;
>>> -
>>> +    /* start TO timer for next job */
>>> +    drm_sched_start_timeout(sched);
>>> +    }
>>>   } else {
>>>   job = NULL;
>>> -    /* queue timeout for next job */
>>> -    drm_sched_start_timeout(sched);
>>>   }
>>>     spin_unlock(&sched->job_list_lock);
>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>     (entity = drm_sched_select_entity(sched))) 
>>> ||
>>>    kthread_should_stop());
>>>   -    if (cleanup_job) {
>>> +    if (cleanup_job)
>>>   sched->ops->free_job(cleanup_job);
>>> -    /* queue timeout for next job */
>>> -    drm_sched_start_timeout(sched);
>>> -    }
>>>     if (!entity)
>>>   continue;


RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-27 Thread Liu, Monk
[AMD Official Use Only]

What does that 'ts' represent? It looks to me like the jiffies at which the job 
gets scheduled to the ring, but a job scheduled to the ring is not necessarily 
being processed by the hw.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Grodzovsky, Andrey 
Sent: Friday, August 27, 2021 4:14 AM
To: Liu, Monk ; amd-...@lists.freedesktop.org; Koenig, 
Christian 
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Attached quick patch for per job TTL calculation to make more precises next 
timer expiration. It's on top of the patch in this thread. Let me know if this 
makes sense.

Andrey

On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>
> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>> the its corresponding job is still running.
>>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling 
>> only when the heading job is signaled, and if there is a "next" job 
>> we start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the 
>> signaled job is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job 
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>> handling of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above 
>> serialization (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu 
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 25
>> ++---
>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>   {
>>   struct drm_sched_job *job, *next;
>>   -    /*
>> - * Don't destroy jobs while the timeout worker is running OR 
>> thread
>> - * is being parked and hence assumed to not touch pending_list
>> - */
>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -    !cancel_delayed_work(&sched->work_tdr)) ||
>> -    kthread_should_park())
>> +    if (kthread_should_park())
>>   return NULL;
>
>
> I actually don't see why we need to keep the above, on the other side 
> (in drm_sched_stop) we won't touch the pending list anyway until sched 
> thread came to full stop (kthread_park). If you do see a reason why 
> this needed then a comment should be here i think.
>
> Andrey
>
>
>> spin_lock(&sched->job_list_lock);
>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>   if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>   /* remove job from pending_list */
>>   list_del_init(&job->list);
>> +
>> +    /* cancel this job's TO timer */
>> +    cancel_delayed_work(&sched->work_tdr);
>>   /* make the scheduled timestamp more accurate */
>>   next = list_first_entry_or_null(&sched->pending_list,
>>   typeof(*next), list);
>> -    if (next)
>> +
>> +    if (next) {
>>   next->s_fence->scheduled.timestamp =
>>   job->s_fence->finished.timestamp;
>> -
>> +    /* start TO timer for next job */
>> +    drm_sched_start_timeout(sched);
>> +    }
>>   } else {
>>   job = NULL;
>> -    /* queue timeout for next job */
>> -    drm_sched_start_timeout(sched);
>>   }
>>     spin_unlock(&sched->job_list_lock);
>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>     (entity = drm_sched_select_entity(sched))) ||
>>    kthread_should_stop());
>>   -    if (cleanup_job) {
>> +    if (cleanup_job)
>>   sched->ops->free_job(cleanup_job);
>> -    /* queue timeout for next job */
>> -    drm_sched_start_timeout(sched);
>> -    }
>>     if (!entity)
>>   continue;


RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-26 Thread Liu, Monk
[AMD Official Use Only]

>>I'm not sure if the work_tdr is initialized when a maximum timeout is 
>>specified. Please double check.

Ok, will do

>>BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
>>timeout functionality and not related to recovery in any way.
>> We even do not start hardware recovery in a lot of cases now (when the wave 
>> kill is successful).

Umm, sounds reasonable, I can rename it to "to" with another patch 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Christian König  
Sent: Thursday, August 26, 2021 6:09 PM
To: Liu, Monk ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 06:55 schrieb Monk Liu:
> issue:
> in cleanup_job, cancel_delayed_work will cancel a TO timer even when 
> its corresponding job is still running.

Yeah, that makes a lot more sense.

>
> fix:
> do not cancel the timer in cleanup_job, instead do the cancelling only 
> when the heading job is signaled, and if there is a "next" job we 
> start_timeout again.
>
> v2:
> further cleanup the logic, and do the TDR timer cancelling if the 
> signaled job is the last one in its scheduler.
>
> v3:
> change the issue description
> remove the cancel_delayed_work in the beginning of cleanup_job; 
> restore the implementation of drm_sched_job_begin.
>
> TODO:
> 1) introduce pause/resume of the scheduler in job_timeout to serialize the 
> handling of the scheduler and job_timeout.
> 2) drop the bad job's delete and re-insert in the scheduler due to the above 
> serialization (no race issue anymore with the serialization)
>
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
>   1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a2a9536..ecf8140 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> *sched)
>   {
>   struct drm_sched_job *job, *next;
>   
> - /*
> -  * Don't destroy jobs while the timeout worker is running  OR thread
> -  * is being parked and hence assumed to not touch pending_list
> -  */
> - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> - !cancel_delayed_work(&sched->work_tdr)) ||
> - kthread_should_park())
> + if (kthread_should_park())
>   return NULL;
>   
> spin_lock(&sched->job_list_lock);
> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler 
> *sched)
>   if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   /* remove job from pending_list */
>   list_del_init(&job->list);
> +
> + /* cancel this job's TO timer */
> + cancel_delayed_work(&sched->work_tdr);

I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.

BTW: Can we please drop the "tdr" naming from the scheduler? That is just a 
timeout functionality and not related to recovery in any way.

We even do not start hardware recovery in a lot of cases now (when the wave 
kill is successful).

Regards,
Christian.

>   /* make the scheduled timestamp more accurate */
>   next = list_first_entry_or_null(&sched->pending_list,
>   typeof(*next), list);
> - if (next)
> +
> + if (next) {
>   next->s_fence->scheduled.timestamp =
>   job->s_fence->finished.timestamp;
> -
> + /* start TO timer for next job */
> + drm_sched_start_timeout(sched);
> + }
>   } else {
>   job = NULL;
> - /* queue timeout for next job */
> - drm_sched_start_timeout(sched);
>   }
>   
>   spin_unlock(&sched->job_list_lock);
> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
> (entity = 
> drm_sched_select_entity(sched))) ||
>kthread_should_stop());
>   
> - if (cleanup_job) {
> + if (cleanup_job)
>   sched->ops->free_job(cleanup_job);
> - /* queue timeout for next job */
> - drm_sched_start_timeout(sched);
> - }
>   
>   if (!entity)
>   continue;


RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
[AMD Official Use Only]

>>But  for timer pending case (common case) your mod_delayed_work will 
>>effectively do exactly the same if you don't use per job TTLs - you mod it to 
>> sched->timeout value which resets the pending timer to again count from 0.

My patch will only modify the timer (restart it, actually) when the heading 
job is signaled, which means the next job on the HW ring is just about to 
start processing.
If the job is not signaled (your common case) the timer is still not touched 
at all ... 
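
For what it's worth, the flow described above - the timer restarted only when 
the heading job signals - can be sketched as a small userspace model (Python, 
with illustrative names standing in for the kernel code, not the real 
drm_sched API): the timer is armed once per head job and only re-armed on that 
job's completion, so later submissions never disturb it.

```python
TIMEOUT = 10.0  # stand-in for sched->timeout, in seconds

class Sched:
    """Toy model of one drm_gpu_scheduler with its single work_tdr timer."""
    def __init__(self):
        self.pending = []        # jobs in submission order
        self.deadline = None     # timer expiry time; None == not pending

    def start_timeout(self, now):
        # queue_delayed_work() semantics: arm only if not already pending
        if self.deadline is None and self.pending:
            self.deadline = now + TIMEOUT

    def job_begin(self, job, now):
        self.pending.append(job)
        self.start_timeout(now)  # bails out while a timer is already pending

    def cleanup_job(self, now, head_signaled):
        # v3 drm_sched_get_cleanup_job(): no cancel_delayed_work at entry
        if not self.pending or not head_signaled:
            return None          # head job still running: timer untouched
        job = self.pending.pop(0)
        self.deadline = None     # cancel this job's TO timer
        self.start_timeout(now)  # fresh timeout for the next head job, if any
        return job

s = Sched()
s.job_begin("j1", now=0.0)
s.job_begin("j2", now=1.0)                 # timer already pending: no-op
assert s.deadline == 10.0                  # j1's deadline is never reset
s.cleanup_job(now=4.0, head_signaled=True) # j1 completes at t=4
assert s.deadline == 14.0                  # j2 gets a full TIMEOUT from t=4
```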

>> I just wonder why we stopped using per job TDR timers in the first place ? 
>> Isn't the simplest way to count accurate timeouts for each job is to 
>> actually measure the timeouts for each job separately ?

I'm not sure if Christian can recall something, and I believe it is due to some 
limitations we found (or some race issue, like two jobs on the same scheduler 
timing out at the same time, which is probable if they were scheduled to the 
ring in almost the same timeframe).

Anyway I have a v3 patch; please take a look, it is working for me.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Grodzovsky, Andrey  
Sent: Thursday, August 26, 2021 11:05 AM
To: Liu, Monk ; Christian König 
; amd-...@lists.freedesktop.org; dri-devel 

Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)


On 2021-08-25 10:31 p.m., Liu, Monk wrote:
> [AMD Official Use Only]
>
> Hi Andrey
>
> I'm not quite sure if I read you correctly
>
>>> Seems to me you can only do it for empty pending list otherwise you risk 
>>> cancelling a legit new timer that was started by the next job or not 
>>> restarting timer at all since your timer was still pending when next job 
>>> tried to start it again (the common case).
> I don't understand above sentence, from my understanding for the 
> common case,  if the timer is pending, the cancel_delay_work in 
> beginning will cancel it and then we will get to the line of "queue 
> timeout for next job" since the heading job is not signaled (align 
> with the timer is pending), then the timer will be restarted (for the 
> next job)


Ignore it, I realized from looking that I missed the timer restart at the end 
of drm_sched_get_cleanup_job, or the alternative one in drm_sched_main.


>
> And the above sequence is actually wrong to me, because we cancelled a 
> pending timer and restarted the timer for a scheduler whose heading 
> job is still running; the whole counting is repeated from zero 
> and not accurate at all


But for the timer-pending case (common case) your mod_delayed_work will 
effectively do exactly the same if you don't use per-job TTLs - you mod it to 
sched->timeout, which resets the pending timer to count from 0 again.

I just wonder why we stopped using per-job TDR timers in the first place? 
Isn't the simplest way to count accurate timeouts for each job to actually 
measure the timeout for each job separately?

Andrey


>   
>
> Thanks

>
> --
> Monk Liu | Cloud-GPU Core team
> ------
>
> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: Thursday, August 26, 2021 2:20 AM
> To: Christian König ; Liu, Monk 
> ; amd-...@lists.freedesktop.org; dri-devel 
> 
> Subject: Re: [PATCH] drm/sched: fix the bug of time out 
> calculation(v2)
>
>
> On 2021-08-25 8:11 a.m., Christian König wrote:
>> No, this would break that logic here.
>>
>> See drm_sched_start_timeout() can be called multiple times, this is 
>> intentional and very important!
>>
>> The logic in queue_delayed_work() makes sure that the timer is only 
>> started once and then never again.
>>
>> All we need to take care of is to cancel_delayed_work() when we know 
>> that the job is completed.
>
> Seems to me you can only do it for empty pending list otherwise you risk 
> cancelling a legit new timer that was started by the next job or not 
> restarting timer at all since your timer was still pending when next job 
> tried to start it again (the common case).
> For a non-empty pending list you have to adjust the currently active TDR's 
> timer from your job's TTL to the TTL of the next job after you, or just restart 
> it as Monk does here, which prolongs the timeout more than required but is 
> still OK I guess.
>
> What about returning to the old scheme of timer sched_work per job so 
> each job has it's own timer and we don't share it and everything is 
> precise for each job, using the locking scheme we already have today 
> the actual TDR handler will execute only once while all the other 

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
[AMD Official Use Only]

Hi Andrey

I'm not quite sure if I read you correctly

>>Seems to me you can only do it for empty pending list otherwise you risk 
>>cancelling a legit new timer that was started by the next job or not 
>>restarting timer at all since your timer was still pending when next job 
>>tried to start it again (the common case).

I don't understand the above sentence. From my understanding, for the common 
case, if the timer is pending, the cancel_delayed_work in the beginning will 
cancel it, and then we will get to the line of "queue timeout for next job" 
since the heading job is not signaled (aligned with the timer being pending); 
then the timer will be restarted (for the next job)

And the above sequence is actually wrong to me, because we cancelled a pending 
timer and restarted the timer for a scheduler whose heading job is still 
running; the whole counting is repeated from zero and not accurate at all 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Grodzovsky, Andrey  
Sent: Thursday, August 26, 2021 2:20 AM
To: Christian König ; Liu, Monk 
; amd-...@lists.freedesktop.org; dri-devel 

Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)


On 2021-08-25 8:11 a.m., Christian König wrote:
> No, this would break that logic here.
>
> See drm_sched_start_timeout() can be called multiple times, this is 
> intentional and very important!
>
> The logic in queue_delayed_work() makes sure that the timer is only 
> started once and then never again.
>
> All we need to take care of is to cancel_delayed_work() when we know 
> that the job is completed.


Seems to me you can only do it for an empty pending list, otherwise you risk 
cancelling a legit new timer that was started by the next job, or not restarting 
the timer at all since your timer was still pending when the next job tried to 
start it again (the common case).
For a non-empty pending list you have to adjust the currently active TDR's timer 
from your job's TTL to the TTL of the next job after you, or just restart it as 
Monk does here, which prolongs the timeout more than required but is still OK I 
guess.

What about returning to the old scheme of a timer/sched_work per job, so each 
job has its own timer, we don't share it, and everything is precise for each 
job? Using the locking scheme we already have today, the actual TDR handler 
will execute only once while all the others arising from the guilty job's hang 
will be rejected (for amdgpu; for other drivers it probably requires the same 
locking, or we can move this to the scheduler layer)
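
The per-job alternative is easy to model (an illustrative Python sketch only, 
not the kernel API): each job carries its own deadline, so expiry checks are 
exact per job, at the cost of the extra serialization discussed above so that 
only one timeout handler actually runs.

```python
TIMEOUT = 10.0  # stand-in for sched->timeout, in seconds

class Job:
    """Toy model of a job carrying its own private timeout deadline."""
    def __init__(self, name, submitted):
        self.name = name
        self.deadline = submitted + TIMEOUT
        self.signaled = False

def expired(pending, now):
    # each job is measured against its own submission time
    return [j.name for j in pending if not j.signaled and j.deadline <= now]

pending = [Job("j1", 0.0), Job("j2", 3.0), Job("j3", 7.0)]
pending[0].signaled = True                   # j1 finished in time
assert expired(pending, now=11.0) == []      # j2 has until t=13, j3 until t=17
assert expired(pending, now=13.5) == ["j2"]  # only the genuinely late job fires
```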

Andrey


>
> This here works as intended as far as I can see and if you start to 
> use mod_delayed_work() you actually break it.
>
> Regards,
> Christian.
>
> Am 25.08.21 um 14:01 schrieb Liu, Monk:
>> [AMD Official Use Only]
>>
>> I think we should remove the cancel_delayed_work() in the beginning 
>> of the cleanup_job().
>>
>> Because by my patch the "mod_delayed_work" in cleanup_job is already 
>> doing its duty to retrigger the TO timer accordingly
>>
>> Thanks
>>
>> ------
>> Monk Liu | Cloud-GPU Core team
>> --
>>
>> -Original Message-
>> From: Liu, Monk
>> Sent: Wednesday, August 25, 2021 7:55 PM
>> To: 'Christian König' ;
>> amd-...@lists.freedesktop.org
>> Subject: RE: [PATCH] drm/sched: fix the bug of time out 
>> calculation(v2)
>>
>> [AMD Official Use Only]
>>
>>>> The timeout started by queue_delayed_work() in
>>>> drm_sched_start_timeout() is paired with the cancel_delayed_work() 
>>>> in drm_sched_get_cleanup_job().
>> No that's wrong, see that when we are in cleanup_job(), assume we do 
>> not have timeout on this sched (we are just keep submitting new jobs 
>> to this sched), Then the work_tdr is cancelled, and then we get the 
>> heading job, and let's assume the job is not signaled, then we run to 
>> the "queue timeout for next job" thus drm_sched_start_timeout() is 
>> called, so this heading job's TO timer is actually retriggered ...
>> which is totally wrong.
>>
>> With my patch the timer is already retriggered after previous JOB 
>> really signaled.
>>
>> Can you be more specific on the incorrect part ?
>>
>> Thanks
>> --
>> Monk Liu | Cloud-GPU Core team
>> --
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Wednesday, August 25, 2021 2:32 PM
>> To: Liu, Monk ; amd-..

RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)

2021-08-25 Thread Liu, Monk
[AMD Official Use Only]

>> All we need to take care of is to cancel_delayed_work() when we know that 
>> the job is completed.

That's why I want to remove the cancel_delayed_work in the beginning of 
cleanup_job(): at that moment we don't know if there is a completed job (sched 
could be woken up due to a new submit instead of a job signaling) until we get 
the job and acknowledge its signaling.



static struct drm_sched_job *
drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
{
struct drm_sched_job *job, *next;

/*
 * Don't destroy jobs while the timeout worker is running  OR thread
 * is being parked and hence assumed to not touch pending_list
 */
if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
!cancel_delayed_work(&sched->work_tdr)) || /* normally, if the job is 
not timed out, the cancel here is incorrect while the job is still running */ 
kthread_should_park())
return NULL;

spin_lock(&sched->job_list_lock);

job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);

if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
if (next)
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;

} else {
job = NULL;
/* queue timeout for next job */
drm_sched_start_timeout(sched); /* if the job is not signaled, 
the timer will be retriggered here (counting is restarted), which is 
wrong */
}

spin_unlock(&sched->job_list_lock);

return job;
}
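
The two inline annotations above can be demonstrated with a toy model of the 
old flow (cancel at entry, re-arm when the head job is unsignaled): with a 
single shared timer per scheduler, every wakeup of the scheduler thread pushes 
a still-running head job's deadline out again. This is an illustrative Python 
sketch with made-up names, not the kernel API.

```python
TIMEOUT = 10.0  # stand-in for sched->timeout, in seconds

class OldSched:
    """Toy model of the pre-v3 drm_sched_get_cleanup_job() timer handling."""
    def __init__(self):
        self.pending = []
        self.deadline = None   # expiry time of the shared timer

    def cleanup_job(self, now, head_signaled):
        self.deadline = None                 # cancel_delayed_work() at entry
        if self.pending and head_signaled:
            return self.pending.pop(0)
        if self.pending:
            self.deadline = now + TIMEOUT    # "queue timeout for next job"
        return None

s = OldSched()
s.pending.append("hung_job")
s.deadline = 0.0 + TIMEOUT                   # armed when the job began at t=0
# the scheduler thread wakes up repeatedly (e.g. for new submissions) while
# the head job is hung and unsignaled:
for t in (9.0, 18.0, 27.0):
    s.cleanup_job(now=t, head_signaled=False)
assert s.deadline == 37.0   # the hung job's deadline keeps sliding forward
```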



>> This here works as intended as far as I can see and if you start to use 
>> mod_delayed_work() you actually break it.
Only the place where we find the heading job signaled and there is a next job 
is the moment we should cancel the work_tdr for this scheduler, of course with 
a new work_tdr queued since the "next" job has already started on the HW... 
that's why I use mod_delayed_work. But I can change it to a "cancel and queue" 
approach if you have concerns.


Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-----Original Message-
From: Christian König  
Sent: Wednesday, August 25, 2021 8:11 PM
To: Liu, Monk ; amd-...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2)

No, this would break that logic here.

See drm_sched_start_timeout() can be called multiple times, this is intentional 
and very important!

The logic in queue_delayed_work() makes sure that the timer is only started 
once and then never again.

All we need to take care of is to cancel_delayed_work() when we know that the 
job is completed.

This here works as intended as far as I can see and if you start to use
mod_delayed_work() you actually break it.

Regards,
Christian.
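
The distinction relied on above can be made concrete with a toy userspace 
model of the three workqueue primitives (illustrative Python only; the real 
delayed_work lives in kernel/workqueue.c): queue_delayed_work() arms the timer 
only when it is not already pending, mod_delayed_work() always rewinds it, and 
cancel_delayed_work() disarms it.

```python
class DelayedWork:
    """Toy model of a struct delayed_work expiry, tracked as a timestamp."""
    def __init__(self):
        self.expires = None                # None == work not pending

    def queue_delayed_work(self, now, delay):
        # arms the timer only once; bails out while already pending
        if self.expires is not None:
            return False
        self.expires = now + delay
        return True

    def mod_delayed_work(self, now, delay):
        # unconditionally (re)arms, restarting the countdown
        self.expires = now + delay

    def cancel_delayed_work(self):
        was_pending = self.expires is not None
        self.expires = None
        return was_pending

w = DelayedWork()
assert w.queue_delayed_work(0, 10) is True
assert w.queue_delayed_work(5, 10) is False   # second caller bails out
assert w.expires == 10                        # original deadline preserved
w.mod_delayed_work(5, 10)
assert w.expires == 15                        # countdown restarted from t=5
```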

Am 25.08.21 um 14:01 schrieb Liu, Monk:
> [AMD Official Use Only]
>
> I think we should remove the cancel_delayed_work() in the beginning of the 
> cleanup_job().
>
> Because by my patch the "mod_delayed_work" in cleanup_job is already 
> doing its duty to retrigger the TO timer accordingly
>
> Thanks
>
> ----------
> Monk Liu | Cloud-GPU Core team
> --
>
> -Original Message-
> From: Liu, Monk
> Sent: Wednesday, August 25, 2021 7:55 PM
> To: 'Christian König' ; 
> amd-...@lists.freedesktop.org
> Subject: RE: [PATCH] drm/sched: fix the bug of time out 
> calculation(v2)
>
> [AMD Official Use Only]
>
>>> The timeout started by queue_delayed_work() in drm_sched_start_timeout() is 
>>> paired with the cancel_delayed_work() in drm_sched_get_cleanup_job().
> No, that's wrong. See that when we are in cleanup_job(), assume we do not have 
> a timeout on this sched (we just keep submitting new jobs to this sched); then 
> the work_tdr is cancelled, and then we get the heading job. Let's assume the 
> job is not signaled: then we run to the "queue timeout for next job" line, thus 
> drm_sched_start_timeout() is called, so this heading job's TO timer is actually 
> retriggered ... which is totally wrong.
>
> With my patch the timer is already retriggered after previous JOB really 
> signaled.
>
> Can you be more specific on the incorrect part ?
>
> Thanks
> --

RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-24 Thread Liu, Monk
[AMD Official Use Only]

Hi Andrey

Sorry that it is really hard for me to get any particular or solid potential 
bugs from your reply, can you be more specific, e.g.: what kind of race issue 
is introduced by this "kthread_stop/start" approach.

To your another question/concern: 
>> . In a constant rapid stream of jobs each new job comming will try to start 
>> the timer but most of the time this operation just bails out as there is 
>> already pending timer from one of the previous jobs which cancels out any 
>> new ones [1] so, when the TO handler does execute eventually it's not 
>> because something wrong but simply because TO has 
Expired

I totally agree with you on this point, and I think I have a patch to address 
this, but this problem is not related to our current topic at all ... our 
current topic is the bail-out bad-job handling from advanced TDR mode.

The bug here is that our current TO handler only does the counting for the 
first job on the given scheduler, and the following jobs won't recalculate the 
TO at all. I can assure you that this is a regression, because when I 
implemented TDR years ago I had already planned for such a problem.
Please check this change to resolve it:

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..7b5f99a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -235,6 +235,13 @@ static void drm_sched_start_timeout(struct 
drm_gpu_scheduler *sched)
schedule_delayed_work(&sched->work_tdr, sched->timeout);
 }
 
+static void drm_sched_restart_timeout(struct drm_gpu_scheduler *sched)
+{
+   if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
+   !list_empty(&sched->pending_list))
+   mod_delayed_work(system_wq, &sched->work_tdr, sched->timeout);
+}
+
 /**
  * drm_sched_fault - immediately start timeout handler
  *
@@ -693,6 +682,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* once the job deleted from pending list we should restart
+* the timeout calculation for the next job.
+*/
+   drm_sched_restart_timeout(sched);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);


If you guys do not have concerns I can submit this patch for review, but again, 
let's focus on the bad-job bail-out handling as our priority; we are very close 
to our goal. Let me know what your concerning race issue is and we can address 
it.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Grodzovsky, Andrey  
Sent: Friday, August 20, 2021 10:07 PM
To: Liu, Monk ; Daniel Vetter ; Koenig, 
Christian 
Cc: Alex Deucher ; Chen, JingWen 
; Maling list - DRI developers 
; amd-gfx list 
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."


On 2021-08-20 3:12 a.m., Liu, Monk wrote:
> [AMD Official Use Only]
>
> @Daniel Vetter @Grodzovsky, Andrey @Koenig, Christian
>   
> Do you have any concern on the kthread_park() approach ?
>
> Theoretically speaking, sched_main shall run exclusively with 
> job_timeout since they both touch jobs, and stopping the scheduler during 
> job_timeout won't impact performance since in that scenario there was 
> already something wrong/stuck on that ring/scheduler


Regarding the last paragraph, and specifically the claim that there was already 
something wrong if the TO handler starts execution - I'm not sure about this, 
and I wonder if we have a potential bug here: when we start the timeout timer 
in drm_sched_job_begin we do it for each new incoming job. In a constant rapid 
stream of jobs, each new job coming in will try to start the timer, but most of 
the time this operation just bails out as there is already a pending timer from 
one of the previous jobs, which cancels out any new ones [1]. So when the TO 
handler does execute eventually, it's not because something is wrong but simply 
because the TO has expired. If in this case the pending list is not empty, a 
false TDR will be triggered. I think long ago we used a TO handler per job and 
not per scheduler; this would solve this problem but hurt the serialization 
issue we are trying to solve. So not sure what to do.

[1] -
https://elixir.bootlin.com/linux/v5.14-rc1/source/kernel/workqueue.c#L1665
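
The false-timeout scenario described above can be modeled in a few lines (an 
illustrative userspace Python sketch, assuming the timer is never refreshed 
between the first arming and its expiry):

```python
TIMEOUT = 10.0  # stand-in for sched->timeout, in seconds

class Sched:
    """Toy model: one shared timer, armed per new job with queue semantics."""
    def __init__(self):
        self.pending = []
        self.deadline = None

    def job_begin(self, job, now):
        self.pending.append(job)
        if self.deadline is None:       # queue_delayed_work(): bail if pending
            self.deadline = now + TIMEOUT

s = Sched()
for t in range(8):                       # a new job every second, t = 0..7
    s.job_begin(f"j{t}", now=float(t))
# j0..j5 complete quickly; in this sketch their cleanup never touches the timer
for _ in range(6):
    s.pending.pop(0)
assert s.deadline == 10.0                # still the deadline armed for j0
# at t=10 the handler fires with j6 at the head of the list, although j6 has
# only been on the hardware for ~4 seconds: a false timeout on a healthy job
assert s.pending[0] == "j6"
```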

Andrey

>
> Thanks
>
> ----------
> Monk Liu | Cloud-GPU Core team
> --
>

RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-20 Thread Liu, Monk
[AMD Official Use Only]

@Daniel Vetter @Grodzovsky, Andrey @Koenig, Christian
 
Do you have any concern on the kthread_park() approach ?

Theoretically speaking, sched_main shall run exclusively with job_timeout since 
they both touch jobs, and stopping the scheduler during job_timeout won't 
impact performance since in that scenario there was already something 
wrong/stuck on that ring/scheduler 

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Liu, Monk 
Sent: Thursday, August 19, 2021 6:26 PM
To: Daniel Vetter ; Grodzovsky, Andrey 

Cc: Alex Deucher ; Chen, JingWen 
; Maling list - DRI developers 
; amd-gfx list 
; Koenig, Christian 
Subject: RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

[AMD Official Use Only]

Hi Daniel

>> Why can't we stop the scheduler thread first, so that there's guaranteed no 
>> race? I've recently had a lot of discussions with panfrost folks about their 
>> reset that spawns across engines, and without stopping the scheduler thread 
>> first before you touch anything it's just plain impossible.

Yeah, we had this thought in our mind as well.

Our second approach is to call kthread_stop() in the job_timedout() routine so 
that the "bad" job is guaranteed to be usable without the scheduler touching 
or freeing it. Check this sample patch as well please:

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..50a49cb 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   kthread_park(sched->thread);
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
status = job->sched->ops->timedout_job(job);
@@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
} else {
spin_unlock(&sched->job_list_lock);
}
+   kthread_unpark(sched->thread);
 
if (status != DRM_GPU_SCHED_STAT_ENODEV) {
spin_lock(&sched->job_list_lock);
@@ -393,20 +389,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
kthread_park(sched->thread);
 
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to  earlier one and either deactive
 * their HW callbacks or remove them from pending list if they already
 * signaled.


Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter 
Sent: Thursday, August 19, 2021 5:31 PM
To: Grodzovsky, Andrey 
Cc: Daniel Vetter ; Alex Deucher ; 
Chen, JingWen ; Maling list - DRI developers 
; amd-gfx list 
; Liu, Monk ; Koenig, 
Christian 
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > 
> > > > > > + dri-devel
> > > > > > 
> > > > > > Since scheduler is a shared component, please add dri-devel 
> > > > > > on all scheduler patches.

RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-19 Thread Liu, Monk
[AMD Official Use Only]

Hi Daniel

>> Why can't we stop the scheduler thread first, so that there's guaranteed no 
>> race? I've recently had a lot of discussions with panfrost folks about their 
>> reset that spawns across engines, and without stopping the scheduler thread 
>> first before you touch anything it's just plain impossible.

Yeah, we had this thought in our mind as well.

Our second approach is to call kthread_stop() in the job_timedout() routine so 
that the "bad" job is guaranteed to be usable without the scheduler touching 
or freeing it. Check this sample patch as well please:

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..50a49cb 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   kthread_park(sched->thread);
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
status = job->sched->ops->timedout_job(job);
@@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
} else {
spin_unlock(&sched->job_list_lock);
}
+   kthread_unpark(sched->thread);
 
if (status != DRM_GPU_SCHED_STAT_ENODEV) {
spin_lock(&sched->job_list_lock);
@@ -393,20 +389,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
 
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to  earlier one and either deactive
 * their HW callbacks or remove them from pending list if they already
 * signaled.


Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Thursday, August 19, 2021 5:31 PM
To: Grodzovsky, Andrey 
Cc: Daniel Vetter ; Alex Deucher ; 
Chen, JingWen ; Maling list - DRI developers 
; amd-gfx list 
; Liu, Monk ; Koenig, 
Christian 
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > 
> > > > > > + dri-devel
> > > > > > 
> > > > > > Since scheduler is a shared component, please add dri-devel 
> > > > > > on all scheduler patches.
> > > > > > 
> > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen 
> > > > > >  wrote:
> > > > > > > [Why]
> > > > > > > for bailing job, this commit will delete it from pending 
> > > > > > > list thus the bailing job will never have a chance to be 
> > > > > > > resubmitted even in advance tdr mode.
> > > > > > > 
> > > > > > > [How]
> > > > > > > after embeded hw_fence into amdgpu_job is done, the race 
> > > > > > > condition that this commit tries to work around is 
> > > > > > > completely solved.So revert this commit.
> > > > > > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.

RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-18 Thread Liu, Monk
[AMD Official Use Only]

Hi Andrey and Daniel

We worked for a really long time on this new feature for AMD that can finally 
pick out the bad job from all the timed-out ones, and the change in the 
scheduler (get/put fence in drm_sched_job_timedout, and removing the bad-job 
delete and put-back) is the last piece for us.

We understand and realize that after the "bad job list node delete logic" is 
removed from job_timedout, race issues will be introduced if a vendor's 
job_timeout callback accesses the bad job in parallel with the scheduler doing 
"sched->ops->free_job(cleanup_job)".

To introduce no impact at all on those vendors, I'd like to propose a very 
simple change (which introduces a new bool member for the scheduler to indicate 
whether the del/put-back logic is needed or not); check the patch here below:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 47ea468..5e0bdc4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -495,6 +495,8 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
return r;
}
 
+   ring->sched.keep_bad_job = true;
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 92d8de2..e7ac384 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+   struct dma_fence *f = NULL;
 
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
@@ -328,7 +329,11 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
 * drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
 * is parked at which point it's safe.
 */
-   list_del_init(&job->list);
+   if (sched->keep_bad_job == false)
+   list_del_init(&job->list);
+   else
+   /* get the parent fence here to prevent the hw_fence from
+* dropping to zero due to sched_main's cleanup_jobs; for
+* amdgpu, once the parent fence drops to zero the sched_job
+* will be kfree-ed */
+   f = dma_fence_get(job->s_fence->parent);
+
spin_unlock(&sched->job_list_lock);
 
job->sched->ops->timedout_job(job);
@@ -341,6 +346,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
			job->sched->ops->free_job(job);
			sched->free_guilty = false;
		}
+
+		dma_fence_put(f);
	} else {
		spin_unlock(&sched->job_list_lock);
	}
@@ -396,7 +403,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
	 * now until the scheduler thread is unparked.
	 */
-	if (bad && bad->sched == sched)
+	if (bad && bad->sched == sched && sched->keep_bad_job == false)
		/*
		 * Add at the head of the queue to reflect it was the earliest
		 * job extracted.
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606..5f9a640 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -301,6 +301,7 @@ struct drm_gpu_scheduler {
	atomic_t			_score;
	bool				ready;
	bool				free_guilty;
+	bool				keep_bad_job;
 };
 
 int drm_sched_init(struct drm_gpu_scheduler *sched,


Thanks 

--
Monk Liu | Cloud-GPU Core team
--
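The get/put pair in the hunk above follows the usual refcount-keeps-object-alive pattern. Below is a toy model of why the extra reference keeps the job valid across the timedout_job callback; all names are hypothetical and this is not the real dma_fence API:

```c
#include <assert.h>

/* Toy stand-in for a refcounted fence whose final put frees the job. */
struct toy_fence {
	int refcount;
	int freed;
};

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	if (--f->refcount == 0)
		f->freed = 1;	/* stand-in for the kfree() of the job */
}
```

Taking a reference before the callback runs means a concurrent cleanup dropping its own reference cannot free the object; only the matching put after the callback can.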

-Original Message-
From: Daniel Vetter  
Sent: Wednesday, August 18, 2021 10:43 PM
To: Grodzovsky, Andrey 
Cc: Daniel Vetter ; Alex Deucher ; 
Chen, JingWen ; Maling list - DRI developers 
; amd-gfx list 
; Liu, Monk ; Koenig, 
Christian 
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > 
> > > > + dri-devel
> > > > 
> > > > Since scheduler is a shared component, please add dri-devel on 
> > > > all scheduler patches.
> > > > 
> > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen  
> > > > wrote:
> > > > > [Why]
> > > > > for bailing job, this commit will delete it from pending list 
&g

RE: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-29 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Hi Christian,

We don't need to debate the design topic; each of us has our own opinion, and it 
is sometimes hard to persuade others. Again, with more and more features and 
requirements it is quite normal that an old design needs
refinement or even rework to satisfy all those needs, so I'm not trying to 
argue with you that we don't need a better rework; that would also please me.

At the moment, the more important thing I care about is the solution, because the 
SRIOV project still tries its best to put all changes into the upstream tree; we 
don't want to fork another tree unless we have no choice...

Let's have a sync in another thread 

Thanks for your help on this

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Koenig, Christian  
Sent: Friday, March 26, 2021 10:51 PM
To: Liu, Monk ; Zhang, Jack (Jian) ; 
Grodzovsky, Andrey ; Christian König 
; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; Deng, Emily ; Rob Herring 
; Tomeu Vizoso ; Steven Price 

Cc: Zhang, Andy ; Jiang, Jerry (SW) 
Subject: Re: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak

Hi Monk,

I can't disagree more.

The fundamental problem here is that we have pushed a design without validating 
if it really fits into the concepts the Linux kernel mandates here.

My mistake was that I haven't pushed back hard enough on the initial design 
resulting in numerous cycles of trying to save the design while band aiding the 
flaws which became obvious after a while.

I haven't counted them, but I think we have already had over 10 patches which 
try to work around lifetime issues of the job object because I wasn't able to 
properly explain why this isn't going to work like this.

Because of this I will hard reject any attempt to band aid this issue even more 
which isn't starting over again with a design which looks like it is going to 
work.

Regards,
Christian.

On 26.03.21 at 12:21, Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Christian
>
> This is not correct or correct perspective, any design comes with its 
> pros and cons, otherwise it wouldn't comes to kernel tree in the very 
> beginning , it is just with time passed we have more and more 
> requirement and feature need to implement And those new requirement 
> drags many new solution or idea, and some idea you prefer need to 
> based on a new infrastructure, that's all
>
> I don't why the job "should be" or not "should be" in the scheduler, 
> honestly speaking I can argue with you that the "scheduler" and the TDR 
> feature which invented by AMD developer "should" never escalate to drm layer 
> at all and by that assumption Those vendor's compatibilities headache right 
> now won't happen at all.
>
> Let's just focus on the issue so far.
>
> The solution Andrey and Jack doing right now looks good to me, and it 
> can solve our problems without introducing regression from a surface 
> look, but it is fine if you need a neat solution,  since we have our 
> project pressure (which we always have) Either we implement the first 
> version with Jack's patch and do the revise in another series of 
> patches (that also my initial suggestion) or we rework anything you 
> mentioned, but since looks to me you are from time to time asking 
> people to rework Something in the stage that people already have a 
> solution, which frustrated people a lot,
>
> I would like you do prepare a solution for us, which solves our 
> headaches ...  I really don't want to see you asked Jack to rework again and 
> again If you are out of bandwidth or no interest in doing this ,please at 
> least make your solution/proposal very detail and clear, jack told me he 
> couldn't understand your point here.
>
> Thanks very much, and please understand our painful here
>
> /Monk
>
>
> -----Original Message-----
> From: Koenig, Christian 
> Sent: March 26, 2021 17:06
> To: Zhang, Jack (Jian) ; Grodzovsky, Andrey 
> ; Christian König 
> ; dri-devel@lists.freedesktop.org; 
> amd-...@lists.freedesktop.org; Liu, Monk ; Deng, 
> Emily ; Rob Herring ; Tomeu 
> Vizoso ; Steven Price 
> 
> Subject: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid 
> memleak
>
> Hi guys,
>
> On 26.03.21 at 03:23, Zhang, Jack (Jian) wrote:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Hi, Andrey,
>>
>>>> how u handle non guilty singnaled jobs in drm_sched_stop, currently 
>>>> looks like you don't call put for them and just explicitly free 
>>>> them as before
>> Good point, I missed that place. Will cover that in my next patch.
>>
>>>> Also 

Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-26 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Hi Christian

This is not a correct perspective: any design comes with its pros and cons, 
otherwise it wouldn't have come to the kernel tree in the very beginning. It is 
just that, as time passed, we got more and more requirements and features to 
implement, and those new requirements drive many new solutions and ideas, and 
some of the ideas you prefer need to be based on a new infrastructure; that's all.

I don't see why the job "should be" or "should not be" in the scheduler; honestly 
speaking, I could argue with you that the "scheduler" and the TDR feature, which 
were invented by AMD developers, "should" never have escalated to the drm layer at 
all, and under that assumption
those vendor compatibility headaches we have right now wouldn't happen at all.

Let's just focus on the issue so far.

The solution Andrey and Jack are doing right now looks good to me, and on the 
surface it can solve our problems without introducing regressions, but it is fine 
if you need a neat solution, since we have our project pressure (which we 
always have).
Either we implement the first version with Jack's patch and do the revision in 
another series of patches (that was also my initial suggestion), or we rework 
anything you mentioned. But it looks to me like you are from time to time asking 
people to rework
something at a stage when people already have a solution, which frustrates 
people a lot.

I would like you to prepare a solution for us which solves our headaches...
I really don't want to see you ask Jack to rework again and again.
If you are out of bandwidth or have no interest in doing this, please at least 
make your solution/proposal very detailed and clear; Jack told me he couldn't 
understand your point here.

Thanks very much, and please understand our pain here

/Monk


-----Original Message-----
From: Koenig, Christian 
Sent: March 26, 2021 17:06
To: Zhang, Jack (Jian) ; Grodzovsky, Andrey 
; Christian König 
; dri-devel@lists.freedesktop.org; 
amd-...@lists.freedesktop.org; Liu, Monk ; Deng, Emily 
; Rob Herring ; Tomeu Vizoso 
; Steven Price 
Subject: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak

Hi guys,

On 26.03.21 at 03:23, Zhang, Jack (Jian) wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi, Andrey,
>
>>> how u handle non guilty singnaled jobs in drm_sched_stop, currently
>>> looks like you don't call put for them and just explicitly free them
>>> as before
> Good point, I missed that place. Will cover that in my next patch.
>
>>> Also sched->free_guilty seems useless with the new approach.
> Yes, I agree.
>
>>> Do we even need the cleanup mechanism at drm_sched_get_cleanup_job with 
>>> this approach...
> I am not quite sure about that for now, let me think about this topic today.
>
> Hi, Christian,
> should I add a fence and get/put to that fence rather than using an explicit 
> refcount?
> And another concerns?

well let me re-iterate:

For the scheduler the job is just a temporary data structure used for 
scheduling the IBs to the hardware.

While pushing the job to the hardware we get a fence structure in return which 
represents the IBs executing on the hardware.

Unfortunately we have applied a design where the job structure is rather used 
for re-submitting the jobs to the hardware after a GPU reset and karma handling 
etc etc...

All that shouldn't have been pushed into the scheduler in the first place, and 
we should now work on getting this cleaned up rather than making it an even 
bigger mess by applying half-baked solutions.

So in my opinion adding a reference count to the job is going in the 
completely wrong direction. What we should rather do is fix the incorrect 
design decision to use jobs as the vehicle in the scheduler for reset handling.

To fix this I suggest the following approach:
1. We add a pointer from the drm_sched_fence back to the drm_sched_job.
2. Instead of keeping the job around in the scheduler we keep the fence around. 
For this I suggest to replace the pending_list with a ring buffer.
3. The timedout_job callback is replaced with a timeout_fence callback.
4. The free_job callback is completely dropped. Job lifetime is now handled in 
the driver, not the scheduler.

Regards,
Christian.
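Christian's four steps above could be sketched roughly as follows — a hypothetical, self-contained toy model in C, not the real drm_sched code; every name here is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct toy_job;

/* Step 1: the fence carries a back pointer to its job. */
struct toy_fence {
	struct toy_job *job;
	int signaled;
};

/* Step 4: job lifetime/payload is driver-owned, opaque to the scheduler. */
struct toy_job {
	struct toy_fence fence;
	int id;
};

#define RING_SIZE 8

/* Step 2: the scheduler keeps pending *fences* in a ring buffer,
 * instead of keeping the jobs themselves on a pending list. */
struct toy_ring {
	struct toy_fence *slot[RING_SIZE];
	unsigned int head, tail;
};

static int ring_push(struct toy_ring *r, struct toy_fence *f)
{
	if (r->head - r->tail == RING_SIZE)
		return -1;			/* ring full */
	r->slot[r->head++ % RING_SIZE] = f;
	return 0;
}

/* Step 3: a fence-timeout handler retires signaled fences and reaches
 * the guilty job of the oldest unsignaled fence via the back pointer. */
static struct toy_job *ring_timeout_job(struct toy_ring *r)
{
	while (r->tail != r->head) {
		struct toy_fence *f = r->slot[r->tail % RING_SIZE];

		if (!f->signaled)
			return f->job;
		r->tail++;			/* retire completed fences */
	}
	return NULL;
}
```

The point of the model: nothing in the ring refers to a job except through its fence, so the scheduler never has to manage job lifetime at all.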

>
> Thanks,
> Jack
>
> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: Friday, March 26, 2021 12:32 AM
> To: Zhang, Jack (Jian) ; Christian König
> ; dri-devel@lists.freedesktop.org;
> amd-...@lists.freedesktop.org; Koenig, Christian
> ; Liu, Monk ; Deng, Emily
> ; Rob Herring ; Tomeu Vizoso
> ; Steven Price 
> Subject: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid
> memleak
>
> There are a few issues here like - how u handle non guilty singnaled jobs in 
> drm_sched_stop, currently looks like you don't call put for them and

RE: [PATCH 3/5] drm/amdgpu: Paper over the drm_driver mangling for virt

2020-10-30 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

What's the purpose of this patch set?

E.g.: what bug do those 5 patches fix, or what feature do they provide?

For this particular one (3/5) I didn't see how it is helpful; could you give 
some background?

thanks
_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Daniel Vetter 
Sent: Friday, October 30, 2020 6:11 PM
To: DRI Development 
Cc: Intel Graphics Development ; Daniel Vetter 
; Deucher, Alexander ; 
Koenig, Christian ; Quan, Evan ; 
Kuehling, Felix ; Zhang, Hawking 
; Grodzovsky, Andrey ; 
Tuikov, Luben ; Thomas Zimmermann ; 
Liu, Monk ; Yintian Tao ; Li, Dennis 
; Liu, Shaoyun ; Zhang, Bokun 
; Yang, Stanley ; Sheng, Wenhui 
; Gong, Curry ; Daniel Vetter 

Subject: [PATCH 3/5] drm/amdgpu: Paper over the drm_driver mangling for virt

Prep work to make drm_device->driver const.

Signed-off-by: Daniel Vetter 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: Evan Quan 
Cc: Felix Kuehling 
Cc: Hawking Zhang 
Cc: Andrey Grodzovsky 
Cc: Luben Tuikov 
Cc: Thomas Zimmermann 
Cc: Monk Liu 
Cc: Yintian Tao 
Cc: Dennis Li 
Cc: shaoyunl 
Cc: Bokun Zhang 
Cc: "Stanley.Yang" 
Cc: Wenhui Sheng 
Cc: chen gong 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  8 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 12 +++++++++++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 024c3b70b1aa..3d337f13ae4e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1093,7 +1093,7 @@ static const struct pci_device_id pciidlist[] = {

 MODULE_DEVICE_TABLE(pci, pciidlist);

-static struct drm_driver kms_driver;
+struct drm_driver amdgpu_kms_driver;

 static int amdgpu_pci_probe(struct pci_dev *pdev,
			    const struct pci_device_id *ent)
@@ -1164,7 +1164,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
	if (ret)
		return ret;

-	adev = devm_drm_dev_alloc(&pdev->dev, &kms_driver, typeof(*adev), ddev);
+	adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver,
+				  typeof(*adev), ddev);
	if (IS_ERR(adev))
		return PTR_ERR(adev);

@@ -1508,7 +1508,7 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
	return 0;
 }

-static struct drm_driver kms_driver = {
+struct drm_driver amdgpu_kms_driver = {
 .driver_features =
 DRIVER_ATOMIC |
 DRIVER_GEM |
@@ -1571,7 +1571,7 @@ static int __init amdgpu_init(void)
 goto error_fence;

 DRM_INFO("amdgpu kernel modesetting enabled.\n");
-kms_driver.num_ioctls = amdgpu_max_kms_ioctl;
+amdgpu_kms_driver.num_ioctls = amdgpu_max_kms_ioctl;
 amdgpu_register_atpx_handler();

	/* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index d0aea5e39531..dde4c449c284 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -45,13 +45,23 @@ bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev)
	return RREG32_NO_KIQ(0xc040) == 0xffffffff;
 }

+extern struct drm_driver amdgpu_kms_driver;
+
 void amdgpu_virt_init_setting(struct amdgpu_device *adev)
 {
 /* enable virtual display */
 if (adev->mode_info.num_crtc == 0)
 adev->mode_info.num_crtc = 1;
 adev->enable_virtual_display = true;
-adev_to_drm(adev)->driver->driver_features &= ~DRIVER_ATOMIC;
+
+/*
+ * FIXME: Either make virt support atomic or make sure you have two
+ * drm_driver structs, these kind of tricks are only ok when there's
+ * guaranteed only a single device per system. This should also be done
+ * before struct drm_device is initialized.
+ */
+amdgpu_kms_driver.driver_features &= ~DRIVER_ATOMIC;
+
 adev->cg_flags = 0;
 adev->pg_flags = 0;
 }
--
2.28.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] drm/scheduler: use job count instead of peek

2019-08-13 Thread Liu, Monk
Reviewed-by: monk@amd.com

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Christian König  
Sent: Friday, August 9, 2019 11:31 PM
To: Grodzovsky, Andrey ; 
dri-devel@lists.freedesktop.org; Liu, Monk ; Deng, Emily 

Subject: [PATCH] drm/scheduler: use job count instead of peek

The spsc_queue_peek function is accessing queue->head which belongs to the 
consumer thread and shouldn't be accessed by the producer

This is fixing a rare race condition when destroying entities.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 35ddbec1375a..671c90f34ede 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -95,7 +95,7 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
	rmb(); /* for list_empty to work without lock */

	if (list_empty(&entity->list) ||
-	    spsc_queue_peek(&entity->job_queue) == NULL)
+	    spsc_queue_count(&entity->job_queue) == 0)
		return true;

	return false;
@@ -281,7 +281,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity)
	/* Consumption of existing IBs wasn't completed. Forcefully
	 * remove them here.
	 */
-	if (spsc_queue_peek(&entity->job_queue)) {
+	if (spsc_queue_count(&entity->job_queue)) {
		if (sched) {
			/* Park the kernel for a moment to make sure it isn't processing
			 * our entity.
--
2.17.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
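The race described in the commit message — the producer inspecting queue->head, which belongs to the consumer thread — can be illustrated with a toy single-producer/single-consumer queue. This is a hypothetical sketch, not the kernel's spsc_queue; the point is only that an emptiness test made from the other side must go through a shared atomic count, never through the peer's private index:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_spsc {
	void *item[16];
	unsigned int head;	/* consumer-private: only the consumer touches it */
	unsigned int tail;	/* producer-private: only the producer touches it */
	atomic_uint count;	/* shared: safe for either side to read */
};

static void toy_push(struct toy_spsc *q, void *v)	/* producer side */
{
	q->item[q->tail++ % 16] = v;
	atomic_fetch_add(&q->count, 1);
}

static void *toy_pop(struct toy_spsc *q)		/* consumer side */
{
	void *v;

	if (atomic_load(&q->count) == 0)
		return NULL;
	v = q->item[q->head++ % 16];
	atomic_fetch_sub(&q->count, 1);
	return v;
}

/* The analogue of the patch: an idle check that may run on the "wrong"
 * thread uses the shared count, not q->item[q->head]. */
static int toy_is_idle(struct toy_spsc *q)
{
	return atomic_load(&q->count) == 0;
}
```

Peeking at `item[head]` from outside the consumer would read an index that the consumer may be updating concurrently; the atomic count has no such owner.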

RE: [PATCH] drm: should return upon the best size(v3)

2018-11-27 Thread Liu, Monk
Oh, yeah... I found one aspect: we need to consider "range_start" and 
"range_end".

Yeah, you guys are right, cool.

/Monk

-Original Message-
From: Liu, Monk 
Sent: Tuesday, November 27, 2018 10:10 PM
To: Koenig, Christian ; Chris Wilson 
; dri-de...@freedesktop.org
Subject: RE: [PATCH] drm: should return upon the best size(v3)

The logic this patch change is only for "best_hole" which is only get called 
with " DRM_MM_INSERT_BEST", In drm_mm_insert_node_in_range(), is there other 
aspect also need calculation and judge for the mode " DRM_MM_INSERT_BEST"  ??

/Monk


-Original Message-
From: Christian König 
Sent: Tuesday, November 27, 2018 9:48 PM
To: Liu, Monk ; Koenig, Christian ; 
Chris Wilson ; dri-de...@freedesktop.org
Subject: Re: [PATCH] drm: should return upon the best size(v3)

On 27.11.18 at 14:40, Liu, Monk wrote:
>> A node with the searched size is not necessary the leftmost one,
> We may have two nodes with the same size, and the one return first will be 
> sure *not* the leftmost one, I aware of that ...
> But my question is why we need the leftmost one ?

Because the code is designed to iterate over all available nodes. The size is 
just the primary criteria to judge on.

If we won't return all nodes with the same size we won't necessary find a 
fitting one.

See how the code is used in drm_mm_insert_node_in_range().

Christian.

>
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, November 27, 2018 8:54 PM
> To: Chris Wilson ; Liu, Monk 
> ; dri-de...@freedesktop.org
> Subject: Re: [PATCH] drm: should return upon the best size(v3)
>
> On 27.11.18 at 11:00, Christian König wrote:
>> On 27.11.18 at 10:20, Chris Wilson wrote:
>>> Quoting Monk Liu (2018-11-27 03:10:34)
>>>> v2:
>>>> amend description:
>>>> for RB tree traveler we don't need to travel to the bottom level if 
>>>> already found the equal size node, thus the search performance can 
>>>> get improved.
>>>>
>>>> v3:
>>>> split "<=" to "<" case and "==" case
>>>>
>>>> Tested-by: Rex Zhu 
>>>> Signed-off-by: Monk Liu 
>>> Still fundamentally broken.
>> Can you explain that further? Do we need to return the deepest hole 
>> of the right size because the following algorithm depends on that?
> Ok figured it out myself by thinking more about it.
>
> A node with the searched size is not necessary the leftmost one, so we would 
> not see all nodes with the searched size and potentially use the wrong one.
>
> Sorry Monk, but Chris is right this optimization is illegal.
>
> Regards,
> Christian.
>
>> Thanks,
>> Christian.
>>
>>> -Chris
>>> ___
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
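Why the early break is unsafe can be shown with a small model. The sketch below is hypothetical C, not the real drm_mm rb-tree code: it builds a tiny size-ordered tree with two holes of the requested size, and the proposed shortcut returns a different hole than the full descent — so a caller that must still honor range_start/range_end can be handed the wrong starting node:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for drm_mm holes: a binary tree ordered by hole
 * size, larger sizes to the left, equal sizes kept to the right —
 * mirroring the descend-right-on-fit loop in best_hole(). */
struct toy_hole {
	unsigned long size;
	unsigned long start;	/* offset, checked later against the range */
	struct toy_hole *left, *right;
};

static struct toy_hole *toy_best_hole(struct toy_hole *node,
				      unsigned long size, int stop_on_equal)
{
	struct toy_hole *best = NULL;

	while (node) {
		if (size <= node->size) {
			best = node;
			if (stop_on_equal && size == node->size)
				break;	/* the proposed shortcut */
			node = node->right;
		} else {
			node = node->left;
		}
	}
	return best;
}
```

Both walks return a hole of the right size, but only the exhaustive walk can reach every equal-size hole; the one the shortcut skips may be the only one inside the caller's allowed range.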


RE: [PATCH] drm: should return upon the best size(v3)

2018-11-27 Thread Liu, Monk
The logic this patch changes is only for "best_hole", which only gets called 
with "DRM_MM_INSERT_BEST".
In drm_mm_insert_node_in_range(), is there any other aspect that also needs 
calculation and judgment for the "DRM_MM_INSERT_BEST" mode?

/Monk


-Original Message-
From: Christian König  
Sent: Tuesday, November 27, 2018 9:48 PM
To: Liu, Monk ; Koenig, Christian ; 
Chris Wilson ; dri-de...@freedesktop.org
Subject: Re: [PATCH] drm: should return upon the best size(v3)

On 27.11.18 at 14:40, Liu, Monk wrote:
>> A node with the searched size is not necessary the leftmost one,
> We may have two nodes with the same size, and the one return first will be 
> sure *not* the leftmost one, I aware of that ...
> But my question is why we need the leftmost one ?

Because the code is designed to iterate over all available nodes. The size is 
just the primary criteria to judge on.

If we won't return all nodes with the same size we won't necessary find a 
fitting one.

See how the code is used in drm_mm_insert_node_in_range().

Christian.

>
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, November 27, 2018 8:54 PM
> To: Chris Wilson ; Liu, Monk 
> ; dri-de...@freedesktop.org
> Subject: Re: [PATCH] drm: should return upon the best size(v3)
>
> On 27.11.18 at 11:00, Christian König wrote:
>> On 27.11.18 at 10:20, Chris Wilson wrote:
>>> Quoting Monk Liu (2018-11-27 03:10:34)
>>>> v2:
>>>> amend description:
>>>> for RB tree traveler we don't need to travel to the bottom level if 
>>>> already found the equal size node, thus the search performance can 
>>>> get improved.
>>>>
>>>> v3:
>>>> split "<=" to "<" case and "==" case
>>>>
>>>> Tested-by: Rex Zhu 
>>>> Signed-off-by: Monk Liu 
>>> Still fundamentally broken.
>> Can you explain that further? Do we need to return the deepest hole 
>> of the right size because the following algorithm depends on that?
> Ok figured it out myself by thinking more about it.
>
> A node with the searched size is not necessary the leftmost one, so we would 
> not see all nodes with the searched size and potentially use the wrong one.
>
> Sorry Monk, but Chris is right this optimization is illegal.
>
> Regards,
> Christian.
>
>> Thanks,
>> Christian.
>>
>>> -Chris
>>> ___
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] drm: should return upon the best size(v3)

2018-11-27 Thread Liu, Monk
> A node with the searched size is not necessary the leftmost one,

We may have two nodes with the same size, and the one returned first will surely 
*not* be the leftmost one; I am aware of that...
But my question is: why do we need the leftmost one?


-Original Message-
From: Christian König  
Sent: Tuesday, November 27, 2018 8:54 PM
To: Chris Wilson ; Liu, Monk ; 
dri-de...@freedesktop.org
Subject: Re: [PATCH] drm: should return upon the best size(v3)

On 27.11.18 at 11:00, Christian König wrote:
> On 27.11.18 at 10:20, Chris Wilson wrote:
>> Quoting Monk Liu (2018-11-27 03:10:34)
>>> v2:
>>> amend description:
>>> for RB tree traveler we don't need to travel to the bottom level if 
>>> already found the equal size node, thus the search performance can 
>>> get improved.
>>>
>>> v3:
>>> split "<=" to "<" case and "==" case
>>>
>>> Tested-by: Rex Zhu 
>>> Signed-off-by: Monk Liu 
>> Still fundamentally broken.
>
> Can you explain that further? Do we need to return the deepest hole of 
> the right size because the following algorithm depends on that?

Ok figured it out myself by thinking more about it.

A node with the searched size is not necessary the leftmost one, so we would 
not see all nodes with the searched size and potentially use the wrong one.

Sorry Monk, but Chris is right this optimization is illegal.

Regards,
Christian.

>
> Thanks,
> Christian.
>
>> -Chris
>> ___
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


FW: [PATCH] drm: should break if already get the best size

2018-11-23 Thread Liu, Monk
Hi Chris

Please check the sanity test of the patch from Rex

/Monk
From: Zhu, Rex
Sent: Friday, November 23, 2018 5:45 PM
To: Liu, Monk ; amd-...@lists.freedesktop.org
Subject: Re: [PATCH] drm: should break if already get the best size


Tested-by: Rex Zhu <rex@amd.com>



Without this patch, this is what searching for a node via the rb tree looks like.



For example: we insert different nodes with random sizes, size range in (1-).



The key in the root node is 5587.



If we try to find the node with key equal to 5587 or 7381:


Loop:
node->key is 5587
node->key is 2273
node->key is 3706
node->key is 4892
node->key is 5296
node->key is 5461
node->key is 5519
node->key is 5549
node->key is 5570
node->key is 5581
node->key is 5584
node->key is 5585
node->key is 5586
node->key is 5586

Find the best node, key is 5587 (Loop 14 levels)

Loop:
node->key is 5587
node->key is 7381
node->key is 6474
node->key is 7034
node->key is 7228
node->key is 7314
node->key is 7339
node->key is 7349
node->key is 7372
node->key is 7377
node->key is 7378
node->key is 7379
node->key is 7379

find the best node, key is 7381. (Loop 13 levels)



With this patch:

We don't need to go further down once we have found the right node whose size 
equals what we need.



Best Regards
Rex


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Monk Liu <monk@amd.com>
Sent: Thursday, November 22, 2018 8:33 PM
To: amd-...@lists.freedesktop.org
Cc: Liu, Monk
Subject: [PATCH] drm: should break if already get the best size

Signed-off-by: Monk Liu <monk@amd.com>
---
 drivers/gpu/drm/drm_mm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 3cc5fbd..369fd9b 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, u64 size)
 		if (size <= node->hole_size) {
 			best = node;
 			rb = rb->rb_right;
+			if (size == node->hole_size)
+				break;
 		} else {
 			rb = rb->rb_left;
 		}
--
2.7.4

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: FW: [PATCH] drm: should break if already get the best size

2018-11-23 Thread Liu, Monk
There are no such checks at all in this best_hole()... can you review the patch 
again?

/Monk
-Original Message-
From: Chris Wilson  
Sent: Friday, November 23, 2018 5:34 PM
To: Liu, Monk ; dri-devel@lists.freedesktop.org
Subject: RE: FW: [PATCH] drm: should break if already get the best size

Quoting Liu, Monk (2018-11-23 09:11:02)
> What do you mean the first in the chain ? and also can you explain the 
> " perfect match." ? thanks
> 
> Assume there is couple nodes equal to the size you requested, without 
> this patch it will traveler to the bottom level of the RB tree and 
> gives you the node that close to the bottom level, which takes more 
> time compared with break on the first node, but anyway you eventually 
> get the node with the same size

Size is but of one many checks the node must pass before being returned.
-Chris
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: FW: [PATCH] drm: should break if already get the best size

2018-11-23 Thread Liu, Monk
What do you mean by "the first in the chain"? And can you explain the 
"perfect match"? Thanks.

Assume there are a couple of nodes equal to the size you requested: without this 
patch the search will traverse down to the bottom level of the RB tree and give 
you a node close to the bottom level, which takes more time compared with 
breaking on the first node, but either way you eventually get a node with the 
same size.

If there is no node equal to the size you requested, the logic is totally the 
same with or without my patch.

/Monk
-Original Message-
From: Chris Wilson  
Sent: Friday, November 23, 2018 5:03 PM
To: Liu, Monk ; dri-devel@lists.freedesktop.org
Subject: Re: FW: [PATCH] drm: should break if already get the best size

Quoting Liu, Monk (2018-11-23 08:02:11)
> 
> 
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Monk Liu
> Sent: Thursday, November 22, 2018 8:33 PM
> To: amd-...@lists.freedesktop.org
> Cc: Liu, Monk 
> Subject: [PATCH] drm: should break if already get the best size
> 
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/drm_mm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 
> 3cc5fbd..369fd9b 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, 
> u64 size)
> if (size <= node->hole_size) {
> best = node;
> rb = rb->rb_right;
> +   if (size == node->hole_size)
> +   break;

No. The point is to find the first in the chain that matches because not every 
node is suitable. By not checking all best_sizes you may end up skipping the 
perfect match.
-Chris
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


FW: [PATCH] drm: should break if already get the best size

2018-11-23 Thread Liu, Monk


-Original Message-
From: amd-gfx  On Behalf Of Monk Liu
Sent: Thursday, November 22, 2018 8:33 PM
To: amd-...@lists.freedesktop.org
Cc: Liu, Monk 
Subject: [PATCH] drm: should break if already get the best size

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/drm_mm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 
3cc5fbd..369fd9b 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, u64 
size)
if (size <= node->hole_size) {
best = node;
rb = rb->rb_right;
+   if (size == node->hole_size)
+   break;
} else {
rb = rb->rb_left;
}
--
2.7.4

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel




Re: reservation questions

2018-03-06 Thread Liu, Monk
Yeah, I got the point: use the dependency/scheduler mechanism to make sure the
resulting fence never signals before the ones hooked in as dependencies


thanks

/Monk


From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Tuesday, March 6, 2018 8:46:57 PM
To: Liu, Monk; Koenig, Christian
Cc: dri-devel@lists.freedesktop.org
Subject: Re: reservation questions


e.g. the excl fence is on the same ctx as (but later than) the one in the shared list.

Well it's not on the same context, but it is guaranteed to not complete before 
all shared fences.

See for example how it is used in amdgpu_copy_buffer(). We first sync to all 
fences in the reservation object:
	r = amdgpu_sync_resv(adev, &job->sync, resv,
			     AMDGPU_FENCE_OWNER_UNDEFINED,
			     false);
	if (r) {
		DRM_ERROR("sync failed (%d).\n", r);
		goto error_free;
	}
This way the resulting fence will never signal before anything else and so can 
safely be used as exclusive fence in the reservation object.

Regards,
Christian.

On 06.03.2018 13:32, Liu, Monk wrote:

You mean the caller must guarantee the excl fence will only signal after all
shared fences have signaled, so you can just ignore all shared fences when
add_excl_fence() is invoked.

e.g. the excl fence is on the same ctx as (but later than) the one in the shared list.


thanks for the explanation


/Monk






From: Koenig, Christian
Sent: Tuesday, March 6, 2018 7:11:50 PM
To: Liu, Monk
Cc: dri-devel@lists.freedesktop.org; 
Chris Wilson
Subject: Re: reservation questions

Hi Monk,

your check isn't correct because you still haven't understood the semantics 
here.

the assumption that all shared fences should be signaled before adding excl 
fence looks not 100% guaranteed in LKG,
The semantic is NOT that all shared fences are signaled when the exclusive 
fence is added.

Instead the requirement is that the exclusive fence signals after all shared 
fences signaled. In other words that is an asynchronous handling here.

I honestly don't know how else to explain it.

Regards,
Christian.

On 06.03.2018 12:03, Liu, Monk wrote:

Hi Christian


I use the below patch to capture the incorrect case:


@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
 	write_seqcount_end(&obj->seq);
 	preempt_enable();
 
-	/* inplace update, no shared fences */
-	while (i--)
-		dma_fence_put(rcu_dereference_protected(old->shared[i],
-						reservation_object_held(obj)));
+	/* inplace update, no shared fences continue after all shared signaled */
+	while (i--) {
+		struct dma_fence *f = rcu_dereference_protected(old->shared[i],
+						reservation_object_held(obj));
+		if (!dma_fence_is_signaled(f))
+			BUG();
+
+		dma_fence_put(f);
+		/* better assign shared[i] with NULL for sure */
+		rcu_assign_pointer(old->shared[i], NULL);
+	}
 
 	dma_fence_put(old_fence);
+
+
 }
 EXPORT_SYMBOL(reservation_object_add_excl_fence);

and I hit this BUG() during test:

[  105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for :00:08.0 on 
minor 0
[  105.623332] [ cut here ]
[  105.623335] kernel BUG at drivers/dma-buf/reservation.c:275!
[  105.624470] invalid opcode:  [#1] SMP
[  105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm 
i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep 
crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi 
snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper 
cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 
mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy 
pata_acpi
[  105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1
[  105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  105.633528] task: 8f8a6a165a00 task.stack: b1204159c000
[  105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0
[  105.635824] RSP: 0018:b1204159f9f0 EFLAGS: 00010246
[  105.636805] RAX:  RBX: 8f8a64bee760 RCX: 8f8a6bfa2f50
[  105.638123] RDX: 8f8a6bfa6770 RSI: 8f8a64bee660 RDI: 8f8a6635f628
[  105.639440] RBP: b1204159fa18 R08:  R09: 0001
[  105.640702] R10: b1204159f808 R11: 0003 R12: 
[  105.641947] R13:  R14: 8f8a6d0f0200 R15: 8f8a64beee60
[  1

Re: reservation questions

2018-03-06 Thread Liu, Monk
You mean the caller must guarantee the excl fence will only signal after all
shared fences have signaled, so you can just ignore all shared fences when
add_excl_fence() is invoked.

e.g. the excl fence is on the same ctx as (but later than) the one in the shared list.


thanks for the explanation


/Monk






From: Koenig, Christian
Sent: Tuesday, March 6, 2018 7:11:50 PM
To: Liu, Monk
Cc: dri-devel@lists.freedesktop.org; Chris Wilson
Subject: Re: reservation questions

Hi Monk,

your check isn't correct because you still haven't understood the semantics 
here.

the assumption that all shared fences should be signaled before adding excl 
fence looks not 100% guaranteed in LKG,
The semantic is NOT that all shared fences are signaled when the exclusive 
fence is added.

Instead the requirement is that the exclusive fence signals after all shared 
fences signaled. In other words that is an asynchronous handling here.

I honestly don't know how else to explain it.

Regards,
Christian.

On 06.03.2018 12:03, Liu, Monk wrote:

Hi Christian


I use the below patch to capture the incorrect case:


@@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
 	write_seqcount_end(&obj->seq);
 	preempt_enable();
 
-	/* inplace update, no shared fences */
-	while (i--)
-		dma_fence_put(rcu_dereference_protected(old->shared[i],
-						reservation_object_held(obj)));
+	/* inplace update, no shared fences continue after all shared signaled */
+	while (i--) {
+		struct dma_fence *f = rcu_dereference_protected(old->shared[i],
+						reservation_object_held(obj));
+		if (!dma_fence_is_signaled(f))
+			BUG();
+
+		dma_fence_put(f);
+		/* better assign shared[i] with NULL for sure */
+		rcu_assign_pointer(old->shared[i], NULL);
+	}
 
 	dma_fence_put(old_fence);
+
+
 }
 EXPORT_SYMBOL(reservation_object_add_excl_fence);

and I hit this BUG() during test:

[  105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for :00:08.0 on 
minor 0
[  105.623332] [ cut here ]
[  105.623335] kernel BUG at drivers/dma-buf/reservation.c:275!
[  105.624470] invalid opcode:  [#1] SMP
[  105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm 
i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep 
crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi 
snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper 
cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 
mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy 
pata_acpi
[  105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1
[  105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  105.633528] task: 8f8a6a165a00 task.stack: b1204159c000
[  105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0
[  105.635824] RSP: 0018:b1204159f9f0 EFLAGS: 00010246
[  105.636805] RAX:  RBX: 8f8a64bee760 RCX: 8f8a6bfa2f50
[  105.638123] RDX: 8f8a6bfa6770 RSI: 8f8a64bee660 RDI: 8f8a6635f628
[  105.639440] RBP: b1204159fa18 R08:  R09: 0001
[  105.640702] R10: b1204159f808 R11: 0003 R12: 
[  105.641947] R13:  R14: 8f8a6d0f0200 R15: 8f8a64beee60
[  105.643165] FS:  7fd13c73d940() GS:8f8a76d8() 
knlGS:
[  105.644573] CS:  0010 DS:  ES:  CR0: 80050033
[  105.646482] CR2: 7fd13c6fd000 CR3: 0001a2a58000 CR4: 001406e0
[  105.648467] Call Trace:
[  105.652480]  amdgpu_bo_do_create+0x3a1/0x540 [amdgpu]
[  105.654233]  amdgpu_bo_create+0x3a/0x220 [amdgpu]
[  105.655956]  amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu]
[  105.657641]  amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu]
[  105.659155]  amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu]
[  105.660698]  ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu]
[  105.662515]  ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
[  105.664203]  drm_ioctl_kernel+0x69/0xb0 [drm]
[  105.665491]  ? drm_ioctl_kernel+0x69/0xb0 [drm]
[  105.666959]  drm_ioctl+0x2d2/0x390 [drm]
[  105.668373]  ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
[  105.670056]  ? call_rcu_sched+0x1d/0x20
[  105.671516]  ? put_object+0x26/0x30
[  105.672741]  ? __delete_object+0x39/0x50
[  105.674048]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
[  105.675551]  do_vfs_ioctl+0x92/0x5a0
[  105.676874]  ? kvm_sched_clock_read+0x1e/0x30
[  105.678276]  ? sched_clock+0x9/0x10
[  105.679553]  ? get_vtime_delta+0x99/0xc0
[  105.681007]  SyS_ioctl+0x79/0x90
[

Re: reservation questions

2018-03-06 Thread Liu, Monk
o take a deep look ...


/Monk


________
From: Liu, Monk
Sent: Tuesday, March 6, 2018 6:47 PM
To: Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.org
Subject: Re: reservation questions


ok, that's a good point ...


From: Koenig, Christian
Sent: Tuesday, March 6, 2018 6:42:44 PM
To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org
Subject: Re: reservation questions

Hi Monk,

that is to remove the problem that allocating memory could fail.

E.g. we only add the fence after sending the command to the hardware, so there 
is now way back and we need to add the fence or break memory management.

So reservation_object_reserve_shared() makes sure there is a free fence slot 
*before* we start to prepare things for the hardware.

Regards,
Christian.

On 06.03.2018 11:19, Liu, Monk wrote:

Hi Chris


another question is why we don't just call "reservation_object_reserve_shared"
inside the add_shared_fence function below, so the BUG_ON() could be avoided and
the caller wouldn't need to worry about when and how many times it should call
reserve_shared()?

thanks !


void reservation_object_add_shared_fence(struct reservation_object *obj,
					 struct dma_fence *fence)
{
	struct reservation_object_list *old, *fobj = obj->staged;

	old = reservation_object_get_list(obj);
	obj->staged = NULL;

	if (!fobj) {
		BUG_ON(old->shared_count >= old->shared_max);
		reservation_object_add_shared_inplace(obj, old, fence);
	} else
		reservation_object_add_shared_replace(obj, old, fobj, fence);
}
EXPORT_SYMBOL(reservation_object_add_shared_fence);

From: Chris Wilson <ch...@chris-wilson.co.uk>
Sent: Tuesday, March 6, 2018 6:10:21 PM
To: Liu, Monk; 
dri-devel@lists.freedesktop.org; 
Koenig, Christian
Subject: Re: reservation questions

Quoting Liu, Monk (2018-03-06 09:45:19)
> call reservation_object_add_excl_fence,
> it set obj->fence->shared_count to 0, and put all shared fence from obj->fence
> without waiting signaling.
> (this action looks inappropriate, I think at least before put all those shared
> fences
> we should dma_wait_fence() on them to make sure they are signaled)

No. Serialisation of resv updates are handled by the caller, the fences
are ordered asynchronously so the wait is implicit in the construction.
(I.e. before the excl fence can be signaled, all of the earlier shared
fences must be signaled. You can even say before the operation that the
excl fence signals completion of can begin, all the shared fences must
have been signaled. But that is all implicit so that we can do it
asynchronously.)

> call reservation_object_reserve_shared,
> this time obj->staged isn't NULL, and it is freed (nothing bad now
> since obj->fence points to other place),
> and obj->staged set to NULL,
>
> call reservation_object_add_shared_fence,
> this time should going through reservation_object_add_shared_inplace,
> But BUG_ON(old->shared_count >= old->shared_max) will hit !

How? You only free staged iff shared_count < shared_max.

You've reminded me that we should cover all this with a bunch of
selftests.
-Chris



Re: reservation questions

2018-03-06 Thread Liu, Monk
ok, that's a good point ...


From: Koenig, Christian
Sent: Tuesday, March 6, 2018 6:42:44 PM
To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org
Subject: Re: reservation questions

Hi Monk,

that is to remove the problem that allocating memory could fail.

E.g. we only add the fence after sending the command to the hardware, so there 
is now way back and we need to add the fence or break memory management.

So reservation_object_reserve_shared() makes sure there is a free fence slot 
*before* we start to prepare things for the hardware.

Regards,
Christian.

On 06.03.2018 11:19, Liu, Monk wrote:

Hi Chris


another question is why we don't just call "reservation_object_reserve_shared"
inside the add_shared_fence function below, so the BUG_ON() could be avoided and
the caller wouldn't need to worry about when and how many times it should call
reserve_shared()?

thanks !


void reservation_object_add_shared_fence(struct reservation_object *obj,
					 struct dma_fence *fence)
{
	struct reservation_object_list *old, *fobj = obj->staged;

	old = reservation_object_get_list(obj);
	obj->staged = NULL;

	if (!fobj) {
		BUG_ON(old->shared_count >= old->shared_max);
		reservation_object_add_shared_inplace(obj, old, fence);
	} else
		reservation_object_add_shared_replace(obj, old, fobj, fence);
}
EXPORT_SYMBOL(reservation_object_add_shared_fence);

From: Chris Wilson <ch...@chris-wilson.co.uk>
Sent: Tuesday, March 6, 2018 6:10:21 PM
To: Liu, Monk; 
dri-devel@lists.freedesktop.org; 
Koenig, Christian
Subject: Re: reservation questions

Quoting Liu, Monk (2018-03-06 09:45:19)
> call reservation_object_add_excl_fence,
> it set obj->fence->shared_count to 0, and put all shared fence from obj->fence
> without waiting signaling.
> (this action looks inappropriate, I think at least before put all those shared
> fences
> we should dma_wait_fence() on them to make sure they are signaled)

No. Serialisation of resv updates are handled by the caller, the fences
are ordered asynchronously so the wait is implicit in the construction.
(I.e. before the excl fence can be signaled, all of the earlier shared
fences must be signaled. You can even say before the operation that the
excl fence signals completion of can begin, all the shared fences must
have been signaled. But that is all implicit so that we can do it
asynchronously.)

> call reservation_object_reserve_shared,
> this time obj->staged isn't NULL, and it is freed (nothing bad now
> since obj->fence points to other place),
> and obj->staged set to NULL,
>
> call reservation_object_add_shared_fence,
> this time should going through reservation_object_add_shared_inplace,
> But BUG_ON(old->shared_count >= old->shared_max) will hit !

How? You only free staged iff shared_count < shared_max.

You've reminded me that we should cover all this with a bunch of
selftests.
-Chris



Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)

2018-03-06 Thread Liu, Monk
okay


From: Chris Wilson <ch...@chris-wilson.co.uk>
Sent: Tuesday, March 6, 2018 4:24:21 PM
To: Liu, Monk; dri-de...@freedesktop.org
Cc: Liu, Monk
Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)

Quoting Monk Liu (2018-03-06 03:53:10)
> v2:
> still check context first to avoid warning from dma_fence_is_later
> apply this fix in add_shared_replace as well
>
> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767
> Signed-off-by: Monk Liu <monk@amd.com>
> ---
>  drivers/dma-buf/reservation.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 314eb10..c6e3c86 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> old_fence = rcu_dereference_protected(fobj->shared[i],
> reservation_object_held(obj));
>
> -   if (old_fence->context == fence->context) {
> +   if (old_fence->context == fence->context &&
> +   dma_fence_is_later(fence, old_fence)) {

This should be true by construction. Adding an older fence on the same
context is a programming bug, imo. Between different callers the resv
should have been locked and the fenced operations serialised, from the
same caller, you shouldn't be handling more than one output fence?
-Chris


Re: reservation questions

2018-03-06 Thread Liu, Monk
Hi Chris


another question is why we don't just call "reservation_object_reserve_shared"
inside the add_shared_fence function below, so the BUG_ON() could be avoided and
the caller wouldn't need to worry about when and how many times it should call
reserve_shared()?


thanks !


void reservation_object_add_shared_fence(struct reservation_object *obj,
					 struct dma_fence *fence)
{
	struct reservation_object_list *old, *fobj = obj->staged;

	old = reservation_object_get_list(obj);
	obj->staged = NULL;

	if (!fobj) {
		BUG_ON(old->shared_count >= old->shared_max);
		reservation_object_add_shared_inplace(obj, old, fence);
	} else
		reservation_object_add_shared_replace(obj, old, fobj, fence);
}
EXPORT_SYMBOL(reservation_object_add_shared_fence);




From: Chris Wilson <ch...@chris-wilson.co.uk>
Sent: Tuesday, March 6, 2018 6:10:21 PM
To: Liu, Monk; dri-devel@lists.freedesktop.org; Koenig, Christian
Subject: Re: reservation questions

Quoting Liu, Monk (2018-03-06 09:45:19)
> call reservation_object_add_excl_fence,
> it set obj->fence->shared_count to 0, and put all shared fence from obj->fence
> without waiting signaling.
> (this action looks inappropriate, I think at least before put all those shared
> fences
> we should dma_wait_fence() on them to make sure they are signaled)

No. Serialisation of resv updates are handled by the caller, the fences
are ordered asynchronously so the wait is implicit in the construction.
(I.e. before the excl fence can be signaled, all of the earlier shared
fences must be signaled. You can even say before the operation that the
excl fence signals completion of can begin, all the shared fences must
have been signaled. But that is all implicit so that we can do it
asynchronously.)

> call reservation_object_reserve_shared,
> this time obj->staged isn't NULL, and it is freed (nothing bad now
> since obj->fence points to other place),
> and obj->staged set to NULL,
>
> call reservation_object_add_shared_fence,
> this time should going through reservation_object_add_shared_inplace,
> But BUG_ON(old->shared_count >= old->shared_max) will hit !

How? You only free staged iff shared_count < shared_max.

You've reminded me that we should cover all this with a bunch of
selftests.
-Chris


Re: reservation questions

2018-03-06 Thread Liu, Monk
Yeah, it should be 0; typo, sorry


I used a 3DMark test and successfully triggered a case in reserve_shared:


	if (obj->staged != NULL) {
		BUG();
	}

	kfree(obj->staged);


Previously I could not figure out why this BUG() could be hit; finally the
scenario came up: you only need one add_excl_fence() after reserve_shared(),
and the next reserve_shared() will hit that BUG(). But that's okay, since it
only frees the obj->staged that was allocated right before this call by the
previous reserve_shared(), and nothing refers to it now.





From: Koenig, Christian
Sent: Tuesday, March 6, 2018 6:05:27 PM
To: Liu, Monk; dri-devel@lists.freedesktop.org; Chris Wilson
Subject: Re: reservation questions

On 06.03.2018 10:56, Liu, Monk wrote:

Sorry, I made some mistakes in the previous thread; corrected as follows.

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

The exclusive fence replaces all shared fences, e.g. the exclusive operation 
needs to wait for all shared fences before it can access the protected object. 
Because of this we can drop all shared fences when a new exclusive fence is set.


call reservation_object_reserve_shared,
this time obj->staged isn't NULL, and it is freed (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace,
since old->shared_count is 1 now, during reservation_object_add_shared_inplace()

Why would old->shared_count be 1 now? As far as I can see it should be zero.

Christian.

it would go through the shared list, but the fences in the shared list are now
wild pointers:
	for (i = 0; i < fobj->shared_count; ++i) {
		struct dma_fence *old_fence;

		old_fence = rcu_dereference_protected(fobj->shared[i],
						reservation_object_held(obj));

		if (old_fence->context == fence->context &&
		    dma_fence_is_later(fence, old_fence)) {
			/* memory barrier is added by write_seqcount_begin */
			RCU_INIT_POINTER(fobj->shared[i], fence);
			write_seqcount_end(&obj->seq);
			preempt_enable();

			dma_fence_put(old_fence);
			return;
		}

		if (!signaled && dma_fence_is_signaled(old_fence)) {
			signaled = old_fence;
			signaled_idx = i;
		}
	}

See that old_fence is taken from fobj, and fobj comes from
reservation_object_get_list(obj) outside, which is obj->fence; and in
add_excl_fence, every dma_fence in obj->fence has already been put.

/Monk



From: Liu, Monk
Sent: Tuesday, March 6, 2018 5:45:19 PM
To: dri-devel@lists.freedesktop.org; 
Chris Wilson; Koenig, Christian
Subject: reservation questions

Hi Christian & Chris

two question regarding resv:

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

call reservation_object_reserve_shared,
this time obj->staged isn't NULL, and it is freed (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace,
But BUG_ON(old->shared_count >= old->shared_max) will hit !

This looks like a design flaw in the reservation object; shouldn't we fix it?


2) in add_excl_fence(), it simply sets old->shared_count to 0 and puts all
shared fences of old.
Is that correct? If the excl fence is really used exclusively, why do we still
consider both the shared fences and the
excl fence in the wait_timeout_rcu() routine? See the description of this routine below:

/**
* reservation_object_wait_timeout_rcu - Wait on reservation's objects
* shared and/or exclusi

RE: reservation questions

2018-03-06 Thread Liu, Monk
Oh, I see:

	for (i = 0; i < fobj->shared_count; ++i) {
		struct dma_fence *old_fence;
		...

Since it uses i < fobj->shared_count, the wild pointer access won't be hit;
please ignore question 1.

The only remaining one is question 2: how do we treat the excl fence and the shared fences?
e.g. when we add an excl fence to the resv, how do we deal with the shared ones?
The current logic just puts them all.
Do we need to wait on their signaling before putting them?

thanks

/Monk


From: Liu, Monk
Sent: March 6, 2018 17:57
To: dri-devel@lists.freedesktop.org; Chris Wilson <ch...@chris-wilson.co.uk>; 
Koenig, Christian <christian.koe...@amd.com>
Subject: Re: reservation questions


Sorry, I made some mistakes in the previous thread; corrected as follows.

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

call reservation_object_reserve_shared,
this time obj->staged isn't NULL, and it is freed (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace,
since old->shared_count is 1 now, during reservation_object_add_shared_inplace()
it would go through the shared list, but the fences in the shared list are now
wild pointers:

	for (i = 0; i < fobj->shared_count; ++i) {
		struct dma_fence *old_fence;

		old_fence = rcu_dereference_protected(fobj->shared[i],
						reservation_object_held(obj));

		if (old_fence->context == fence->context &&
		    dma_fence_is_later(fence, old_fence)) {
			/* memory barrier is added by write_seqcount_begin */
			RCU_INIT_POINTER(fobj->shared[i], fence);
			write_seqcount_end(&obj->seq);
			preempt_enable();

			dma_fence_put(old_fence);
			return;
		}

		if (!signaled && dma_fence_is_signaled(old_fence)) {
			signaled = old_fence;
			signaled_idx = i;
		}
	}


See that old_fence is taken from fobj, and fobj comes from
reservation_object_get_list(obj) outside, which is obj->fence; and in
add_excl_fence, every dma_fence in obj->fence has already been put.

/Monk



From: Liu, Monk
Sent: Tuesday, March 6, 2018 5:45:19 PM
To: dri-devel@lists.freedesktop.org; 
Chris Wilson; Koenig, Christian
Subject: reservation questions

Hi Christian & Chris

two question regarding resv:

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

call reservation_object_reserve_shared,
this time obj->staged isn't NULL, and it is freed (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace,
But BUG_ON(old->shared_count >= old->shared_max) will hit !

This looks like a design flaw in the reservation object; shouldn't we fix it?


2) in add_excl_fence(), it simply sets old->shared_count to 0 and puts all
shared fences of old.
Is that correct? If the excl fence is really used exclusively, why do we still
consider both the shared fences and the
excl fence in the wait_timeout_rcu() routine? See the description of this routine below:




/**
 * reservation_object_wait_timeout_rcu - Wait on reservation's objects
 * shared and/or exclusive fences.
 * @obj: the reservation object
 * @wait_all: if true, wait on all fences, else wait on just exclusive fence
 * @intr: if true, do interruptible wait
 * @timeout: timeout value in jiffies or zero to return immediately
 *
 * RETURNS
 * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
 * greater than zero on success.
 */
long reservation_object_wait_timeou

Re: reservation questions

2018-03-06 Thread Liu, Monk

Sorry, I made some mistakes in the previous thread; corrected as follows.

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

call reservation_object_reserve_shared,
this time obj->staged isn't NULL, and it is freed (nothing bad now
since obj->fence points to other place),
and obj->staged set to NULL,

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace,
since old->shared_count is 1 now, during reservation_object_add_shared_inplace()
it would go through the shared list, but the fences in the shared list are now
wild pointers:
	for (i = 0; i < fobj->shared_count; ++i) {
		struct dma_fence *old_fence;

		old_fence = rcu_dereference_protected(fobj->shared[i],
						reservation_object_held(obj));

		if (old_fence->context == fence->context &&
		    dma_fence_is_later(fence, old_fence)) {
			/* memory barrier is added by write_seqcount_begin */
			RCU_INIT_POINTER(fobj->shared[i], fence);
			write_seqcount_end(&obj->seq);
			preempt_enable();

			dma_fence_put(old_fence);
			return;
		}

		if (!signaled && dma_fence_is_signaled(old_fence)) {
			signaled = old_fence;
			signaled_idx = i;
		}
	}

See that old_fence is taken from fobj, and fobj comes from
reservation_object_get_list(obj) outside, which is obj->fence; and in
add_excl_fence, every dma_fence in obj->fence has already been put.

/Monk




From: Liu, Monk
Sent: Tuesday, March 6, 2018 5:45:19 PM
To: dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian
Subject: reservation questions


Hi Christian & Chris

two question regarding resv:

1) considering below sequence:

call reservation_object_add_shared_fence,
now assume old->shared_count is now 3
call reservation_object_add_shared_fence,
now assume old->shared_count is now 4,

call reservation_object_reserve_shared,
now obj->staged is newly allocated, and its shared_max = 8, but it is not
used so far.

call reservation_object_add_excl_fence,
it sets obj->fence->shared_count to 0 and puts all shared fences from obj->fence
without waiting for them to signal.
(this action looks inappropriate; I think at least before putting all those shared
fences we should dma_fence_wait() on them to make sure they are signaled)

call reservation_object_reserve_shared,
this time obj->staged isn't NULL, so it is freed (nothing bad yet,
since obj->fence points elsewhere),
and obj->staged is set to NULL.

call reservation_object_add_shared_fence,
this time it should go through reservation_object_add_shared_inplace(),
but BUG_ON(old->shared_count >= old->shared_max) will be hit!

This looks like a design flaw in the reservation object; shouldn't we fix it?


2) add_excl_fence() simply sets old->shared_count to 0 and puts all
shared fences of old.
Is that correct? If the excl fence is really used exclusively, why do we still
consider both the shared fences and the excl fence in the wait_timeout_rcu()
routine? See the description of this routine below:


/**
* reservation_object_wait_timeout_rcu - Wait on reservation's objects
* shared and/or exclusive fences.
* @obj: the reservation object
* @wait_all: if true, wait on all fences, else wait on just exclusive fence
* @intr: if true, do interruptible wait
* @timeout: timeout value in jiffies or zero to return immediately
*
* RETURNS
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
* greater than zero on success.
*/
long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
 bool wait_all, bool intr,
 unsigned long timeout)


thanks
/Monk




___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel



Re: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace

2018-03-06 Thread Liu, Monk
Why? Is there a design doc that mentions this for reservation objects?


From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Tuesday, March 6, 2018 4:03:39 PM
To: Liu, Monk; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] dma-buf/reservation: should keep the new fence in 
add_shared_inplace

NAK, the newly added fence must always be newer than the existing one.

Christian.

On 06.03.2018 at 04:09, Monk Liu wrote:
> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767
> Signed-off-by: Monk Liu <monk@amd.com>
> ---
>   drivers/dma-buf/reservation.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 314eb10..29b7e45 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -118,7 +118,7 @@ reservation_object_add_shared_inplace(struct 
> reservation_object *obj,
>old_fence = rcu_dereference_protected(fobj->shared[i],
>reservation_object_held(obj));
>
> - if (old_fence->context == fence->context) {
> + if (dma_fence_is_later(fence, old_fence)) {
>/* memory barrier is added by write_seqcount_begin */
>RCU_INIT_POINTER(fobj->shared[i], fence);
>   write_seqcount_end(&obj->seq);



RE: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)

2018-03-05 Thread Liu, Monk
Make sense, will give v3

-Original Message-
From: Zhou, David(ChunMing) 
Sent: March 6, 2018 14:08
To: Liu, Monk <monk@amd.com>; Zhou, David(ChunMing) <david1.z...@amd.com>; 
dri-de...@freedesktop.org
Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)



On March 6, 2018 13:59, Liu, Monk wrote:
>>
>> -if (check->context == fence->context ||
>> +if ((check->context == fence->context &&
>> +dma_fence_is_later(fence, check)) ||
> We still need to do more for the !dma_fence_is_later(fence, check) case, in
> which we don't need to add the new fence to the resv slot.
>
> if ((check->context == fence->context) && dma_fence_is_later(fence, check)) {
>   fobj->shared_count = j;
>   RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
>   fobj->shared_count++;
> } else {
>   dma_fence_put(fence);
> }
>
> No, you cannot do that; check has changed and is not the one you want.
> Besides, we don't need to consider the case you mentioned; take only one
> fence in obj->staged for example:
>
> You add a fence whose context is equal to this fence (check), so the
> current logic will put this check into fobj->shared[++j],
Yes, if !dma_fence_is_later(fence, check), then 'check' will be put into
fobj->shared[++j], so we don't need to add the new fence to the resv slot, do we?

Regards,
David Zhou
> so in the end
> obj->fence will point to fobj, and the original old list would be
> rcu_kfree()'d.
>
> No additional action is actually needed...
>
> /Monk
>
> -Original Message-
> From: Zhou, David(ChunMing)
Sent: March 6, 2018 12:25
> To: Liu, Monk <monk@amd.com>; dri-de...@freedesktop.org
> Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add 
> fence(v2)
>
>
>
On March 6, 2018 11:53, Monk Liu wrote:
>> v2:
>> still check context first to avoid a warning from dma_fence_is_later;
>> apply this fix in add_shared_replace as well
>>
>> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767
>> Signed-off-by: Monk Liu <monk@amd.com>
>> ---
>>    drivers/dma-buf/reservation.c | 6 ++++--
>>1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/dma-buf/reservation.c 
>> b/drivers/dma-buf/reservation.c index 314eb10..c6e3c86 100644
>> --- a/drivers/dma-buf/reservation.c
>> +++ b/drivers/dma-buf/reservation.c
>> @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct 
>> reservation_object *obj,
>>  old_fence = rcu_dereference_protected(fobj->shared[i],
>>  reservation_object_held(obj));
>>
>> -if (old_fence->context == fence->context) {
>> +if (old_fence->context == fence->context &&
>> +dma_fence_is_later(fence, old_fence)) {
>>  /* memory barrier is added by write_seqcount_begin */
>>  RCU_INIT_POINTER(fobj->shared[i], fence);
>>  		write_seqcount_end(&obj->seq);
>> @@ -179,7 +180,8 @@ reservation_object_add_shared_replace(struct 
>> reservation_object *obj,
>>  check = rcu_dereference_protected(old->shared[i],
>>  reservation_object_held(obj));
>>
>> -if (check->context == fence->context ||
>> +if ((check->context == fence->context &&
>> +dma_fence_is_later(fence, check)) ||
> We still need to do more for the !dma_fence_is_later(fence, check) case, in
> which we don't need to add the new fence to the resv slot.
>
> if ((check->context == fence->context) && dma_fence_is_later(fence, check)) {
>       fobj->shared_count = j;
>       RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
>       fobj->shared_count++;
> } else {
>       dma_fence_put(fence);
> }
>
>
> Regards,
> David Zhou
>>  dma_fence_is_signaled(check))
>>  RCU_INIT_POINTER(fobj->shared[--k], check);
>>  else



RE: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)

2018-03-05 Thread Liu, Monk
>   
> - if (check->context == fence->context ||
> + if ((check->context == fence->context &&
> + dma_fence_is_later(fence, check)) ||
We still need to do more for the !dma_fence_is_later(fence, check) case, in
which we don't need to add the new fence to the resv slot.

if ((check->context == fence->context) && dma_fence_is_later(fence, check)) {
 fobj->shared_count = j;
 RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
 fobj->shared_count++;
} else {
 dma_fence_put(fence);
}

No, you cannot do that; check has changed and is not the one you want.
Besides, we don't need to consider the case you mentioned; take only one fence
in obj->staged for example:

You add a fence whose context is equal to this fence (check), so the current
logic will put this check into fobj->shared[++j], so in the end
obj->fence will point to fobj, and the original old list would be rcu_kfree()'d.

No additional action is actually needed...

/Monk

-Original Message-
From: Zhou, David(ChunMing) 
Sent: March 6, 2018 12:25
To: Liu, Monk <monk@amd.com>; dri-de...@freedesktop.org
Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)



On March 6, 2018 11:53, Monk Liu wrote:
> v2:
> still check context first to avoid a warning from dma_fence_is_later;
> apply this fix in add_shared_replace as well
>
> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767
> Signed-off-by: Monk Liu <monk@amd.com>
> ---
>   drivers/dma-buf/reservation.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c 
> b/drivers/dma-buf/reservation.c index 314eb10..c6e3c86 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct 
> reservation_object *obj,
>   old_fence = rcu_dereference_protected(fobj->shared[i],
>   reservation_object_held(obj));
>   
> - if (old_fence->context == fence->context) {
> + if (old_fence->context == fence->context &&
> + dma_fence_is_later(fence, old_fence)) {
>   /* memory barrier is added by write_seqcount_begin */
>   RCU_INIT_POINTER(fobj->shared[i], fence);
>   		write_seqcount_end(&obj->seq);
> @@ -179,7 +180,8 @@ reservation_object_add_shared_replace(struct 
> reservation_object *obj,
>   check = rcu_dereference_protected(old->shared[i],
>   reservation_object_held(obj));
>   
> - if (check->context == fence->context ||
> + if ((check->context == fence->context &&
> + dma_fence_is_later(fence, check)) ||
We still need to do more for the !dma_fence_is_later(fence, check) case, in
which we don't need to add the new fence to the resv slot.

if ((check->context == fence->context) && dma_fence_is_later(fence, check)) {
     fobj->shared_count = j;
     RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
     fobj->shared_count++;
} else {
     dma_fence_put(fence);
}


Regards,
David Zhou
>   dma_fence_is_signaled(check))
>   RCU_INIT_POINTER(fobj->shared[--k], check);
>   else



RE: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace

2018-03-05 Thread Liu, Monk
Yeah, right 

-Original Message-
From: Zhou, David(ChunMing) 
Sent: March 6, 2018 11:38
To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] dma-buf/reservation: should keep the new fence in 
add_shared_inplace



On March 6, 2018 11:09, Monk Liu wrote:
> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767
> Signed-off-by: Monk Liu <monk@amd.com>
> ---
>   drivers/dma-buf/reservation.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/reservation.c 
> b/drivers/dma-buf/reservation.c index 314eb10..29b7e45 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -118,7 +118,7 @@ reservation_object_add_shared_inplace(struct 
> reservation_object *obj,
>   old_fence = rcu_dereference_protected(fobj->shared[i],
>   reservation_object_held(obj));
>   
> - if (old_fence->context == fence->context) {
> + if (dma_fence_is_later(fence, old_fence)) {
OK, good catch. To avoid the warning on different contexts, it should be
"if ((old_fence->context == fence->context) && dma_fence_is_later(fence,
old_fence)) {",
and reservation_object_add_shared_replace needs this fix as well.

Regards,
David Zhou
>   /* memory barrier is added by write_seqcount_begin */
>   RCU_INIT_POINTER(fobj->shared[i], fence);
>   		write_seqcount_end(&obj->seq);



RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available

2018-03-05 Thread Liu, Monk
Can you give more details? Thanks.

/Monk

-Original Message-
From: Koenig, Christian 
Sent: March 5, 2018 19:39
To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot 
available

On 05.03.2018 at 12:37, Liu, Monk wrote:
> But the thing that confuses me is that, according to the design, if the
> driver keeps calling reserve_shared() prior to add_fence(), with the lock
> held of course, that BUG() shouldn't be hit. So two things look weird to me:
> 1) by design, in reserve_shared() obj->staged should already be NULL,
> so why do we kfree() it?

No, that is not correct.

> 2) in fact, amdgpu can hit the case where obj->staged is not NULL in
> reserve_shared(); I don't know how it gets there

We reserved a fence slot without using it, so it is still there when
reserve_shared() is called.

Christian.

>
>
> Any thought ?
>
> /Monk
>
> -Original Message-
> From: Koenig, Christian
Sent: March 5, 2018 19:29
> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
> linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when 
> slot available
>
On 05.03.2018 at 12:25, Liu, Monk wrote:
>> And by the way, I added "if (staged != NULL) BUG();" prior to
>> "kfree(obj->staged)" in the reserve_shared() routine, and this BUG() is
>> actually hit. The stack dump shows it is hit during vm_bo_update() in
>> gem_va_update()...
> That is expected. The staged handling just makes sure that there is room
> available; it doesn't guarantee that it is actually used.
>
> E.g. we can end up reserving a fence slot, but then find that we actually 
> don't need it.
>
> Christian.
>
>> Besides, the whole reservation logic still looks a little weird to me ...
>> especially this staged part ...
>>
>> Thanks
>>
>> /Monk
>>
>> -Original Message-
>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: March 5, 2018 19:22
>> To: Liu, Monk <monk@amd.com>; Koenig, Christian 
>> <christian.koe...@amd.com>; dri-devel@lists.freedesktop.org; 
>> linux-ker...@vger.kernel.org
>> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when 
>> slot available
>>
On 05.03.2018 at 08:55, Liu, Monk wrote:
>>> Hi Christian
>>>
>>> You are right that obj->staged is set to NULL in add_fence.
>>> So my follow-up question is: why do we kfree(obj->staged) in
>>> reserve_shared() if staged is always NULL at that point?
>> Good question, I haven't written that code so I can't fully answer.
>>
>> Maybe Chris or Maarten know more about that.
>>
>> Christian.
>>
>>> Thanks
>>> /Monk
>>>
>>> -Original Message-
>>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: February 28, 2018 16:27
>>> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
>>> linux-ker...@vger.kernel.org
>>> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged 
>>> when slot available
>>>
On 28.02.2018 at 07:44, Monk Liu wrote:
>>>> under below scenario the obj->fence would refer to a wild pointer:
>>>>
>>>> 1,call reservation_object_reserved_shared
>>>> 2,call reservation_object_add_shared_fence
>>>> 3,call reservation_object_reserved_shared
>>>> 4,call reservation_object_add_shared_fence
>>>>
>>>> in step 1, staged is allocated,
>>>>
>>>> in step 2, code path will go 
>>>> reservation_object_add_shared_replace()
>>>> and obj->fence would be assigned as staged (through
>>>> RCU_INIT_POINTER)
>>>>
>>>> in step 3, obj->staged will be freed (by a simple kfree), which makes
>>>> obj->fence point to a wild pointer...
>>> Well that explanation is still nonsense. See
>>> reservation_object_add_shared_fence:
>>>>    obj->staged = NULL;
>>> Among the first things reservation_object_add_shared_fence() does is 
>>> it sets obj->staged to NULL.
>>>
>>> So step 3 will not free anything and we never have a wild pointer.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> in step 4, code path will go 
>>>> reservation_object_add_shared_inplace()
>>>> and inside it the @fobj (which equals to @obj->staged, set by above
>>

RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available

2018-03-05 Thread Liu, Monk
But the thing that confuses me is that, according to the design, if the driver
keeps calling reserve_shared() prior to add_fence(), with the lock held of
course, that BUG() shouldn't be hit. So two things look weird to me:
1) by design, in reserve_shared() obj->staged should already be NULL, so why do
we kfree() it?
2) in fact, amdgpu can hit the case where obj->staged is not NULL in
reserve_shared(); I don't know how it gets there


Any thought ?

/Monk

-Original Message-
From: Koenig, Christian 
Sent: March 5, 2018 19:29
To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot 
available

On 05.03.2018 at 12:25, Liu, Monk wrote:
> And by the way, I added "if (staged != NULL) BUG();" prior to
> "kfree(obj->staged)" in the reserve_shared() routine, and this BUG() is
> actually hit. The stack dump shows it is hit during vm_bo_update() in
> gem_va_update()...

That is expected. The staged handling just makes sure that there is room
available; it doesn't guarantee that it is actually used.

E.g. we can end up reserving a fence slot, but then find that we actually don't 
need it.

Christian.

>
> Besides, the whole reservation logic still looks a little weird to me ...
> especially this staged part ...
>
> Thanks
>
> /Monk
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: March 5, 2018 19:22
> To: Liu, Monk <monk@amd.com>; Koenig, Christian 
> <christian.koe...@amd.com>; dri-devel@lists.freedesktop.org; 
> linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when 
> slot available
>
On 05.03.2018 at 08:55, Liu, Monk wrote:
>> Hi Christian
>>
>> You are right that obj->staged is set to NULL in add_fence.
>> So my follow-up question is: why do we kfree(obj->staged) in
>> reserve_shared() if staged is always NULL at that point?
> Good question, I haven't written that code so I can't fully answer.
>
> Maybe Chris or Maarten know more about that.
>
> Christian.
>
>> Thanks
>> /Monk
>>
>> -Original Message-
>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: February 28, 2018 16:27
>> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
>> linux-ker...@vger.kernel.org
>> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when 
>> slot available
>>
On 28.02.2018 at 07:44, Monk Liu wrote:
>>> under below scenario the obj->fence would refer to a wild pointer:
>>>
>>> 1,call reservation_object_reserved_shared
>>> 2,call reservation_object_add_shared_fence
>>> 3,call reservation_object_reserved_shared
>>> 4,call reservation_object_add_shared_fence
>>>
>>> in step 1, staged is allocated,
>>>
>>> in step 2, code path will go reservation_object_add_shared_replace()
>>> and obj->fence would be assigned as staged (through 
>>> RCU_INIT_POINTER)
>>>
>>> in step 3, obj->staged will be freed (by a simple kfree), which makes
>>> obj->fence point to a wild pointer...
>> Well that explanation is still nonsense. See
>> reservation_object_add_shared_fence:
>>>       obj->staged = NULL;
>> Among the first things reservation_object_add_shared_fence() does is 
>> it sets obj->staged to NULL.
>>
>> So step 3 will not free anything and we never have a wild pointer.
>>
>> Regards,
>> Christian.
>>
>>> in step 4, code path will go reservation_object_add_shared_inplace()
>>> and inside it the @fobj (which equals to @obj->staged, set by above
>>> steps) is already a wild pointer
>>>
>>> should remove the kfree on staged in
>>> reservation_object_reserve_shared()
>>>
>>> Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c
>>> Signed-off-by: Monk Liu <monk@amd.com>
>>> ---
>>> drivers/dma-buf/reservation.c | 7 ++-----
>>> 1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/reservation.c 
>>> b/drivers/dma-buf/reservation.c index 375de41..b473ccc 100644
>>> --- a/drivers/dma-buf/reservation.c
>>> +++ b/drivers/dma-buf/reservation.c
>>> @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct 
>>> reservation_object *obj)
>>> old = reservation_object_get_list(obj);
>>> 
>>> if (old && old->shared_max) {
>

RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available

2018-03-05 Thread Liu, Monk
And by the way, I added "if (staged != NULL) BUG();" prior to "kfree(obj->staged)"
in the reserve_shared() routine, and this BUG() is actually hit.
The stack dump shows it is hit during vm_bo_update() in gem_va_update()...

Besides, the whole reservation logic still looks a little weird to me ...
especially this staged part ...

Thanks

/Monk

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] 
Sent: March 5, 2018 19:22
To: Liu, Monk <monk@amd.com>; Koenig, Christian <christian.koe...@amd.com>; 
dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot 
available

On 05.03.2018 at 08:55, Liu, Monk wrote:
> Hi Christian
>
> You are right that obj->staged is set to NULL in add_fence.
> So my follow-up question is: why do we kfree(obj->staged) in
> reserve_shared() if staged is always NULL at that point?

Good question, I haven't written that code so I can't fully answer.

Maybe Chris or Maarten know more about that.

Christian.

>
> Thanks
> /Monk
>
> -Original Message-
> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com]
Sent: February 28, 2018 16:27
> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
> linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when 
> slot available
>
On 28.02.2018 at 07:44, Monk Liu wrote:
>> under below scenario the obj->fence would refer to a wild pointer:
>>
>> 1,call reservation_object_reserved_shared
>> 2,call reservation_object_add_shared_fence
>> 3,call reservation_object_reserved_shared
>> 4,call reservation_object_add_shared_fence
>>
>> in step 1, staged is allocated,
>>
>> in step 2, code path will go reservation_object_add_shared_replace()
>> and obj->fence would be assigned as staged (through RCU_INIT_POINTER)
>>
>> in step 3, obj->staged will be freed (by a simple kfree), which makes
>> obj->fence point to a wild pointer...
>
> Well that explanation is still nonsense. See
> reservation_object_add_shared_fence:
>>      obj->staged = NULL;
> Among the first things reservation_object_add_shared_fence() does is 
> it sets obj->staged to NULL.
>
> So step 3 will not free anything and we never have a wild pointer.
>
> Regards,
> Christian.
>
>> in step 4, code path will go reservation_object_add_shared_inplace()
>> and inside it the @fobj (which equals to @obj->staged, set by above 
>> steps) is already a wild pointer
>>
>> should remove the kfree on staged in
>> reservation_object_reserve_shared()
>>
>> Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c
>> Signed-off-by: Monk Liu <monk@amd.com>
>> ---
>>    drivers/dma-buf/reservation.c | 7 ++-----
>>1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/dma-buf/reservation.c 
>> b/drivers/dma-buf/reservation.c index 375de41..b473ccc 100644
>> --- a/drivers/dma-buf/reservation.c
>> +++ b/drivers/dma-buf/reservation.c
>> @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct 
>> reservation_object *obj)
>>  old = reservation_object_get_list(obj);
>>
>>  if (old && old->shared_max) {
>> -if (old->shared_count < old->shared_max) {
>> -/* perform an in-place update */
>> -kfree(obj->staged);
>> -obj->staged = NULL;
>> +if (old->shared_count < old->shared_max)
>>  return 0;
>> -} else
>> +else
>>  max = old->shared_max * 2;
>>  } else
>>  max = 4;


RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available

2018-03-04 Thread Liu, Monk
Hi Christian

You are right that obj->staged is set to NULL in add_fence.
So my follow-up question is: why do we kfree(obj->staged) in reserve_shared()
if staged is always NULL at that point?

Thanks 
/Monk

-Original Message-
From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] 
Sent: February 28, 2018 16:27
To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; 
linux-ker...@vger.kernel.org
Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot 
available

On 28.02.2018 at 07:44, Monk Liu wrote:
> under below scenario the obj->fence would refer to a wild pointer:
>
> 1,call reservation_object_reserved_shared
> 2,call reservation_object_add_shared_fence
> 3,call reservation_object_reserved_shared
> 4,call reservation_object_add_shared_fence
>
> in step 1, staged is allocated,
>
> in step 2, code path will go reservation_object_add_shared_replace()
> and obj->fence would be assigned as staged (through RCU_INIT_POINTER)
>
> in step 3, obj->staged will be freed (by a simple kfree), which makes
> obj->fence point to a wild pointer...


Well that explanation is still nonsense. See
reservation_object_add_shared_fence:
>     obj->staged = NULL;

Among the first things reservation_object_add_shared_fence() does is it 
sets obj->staged to NULL.

So step 3 will not free anything and we never have a wild pointer.

Regards,
Christian.

>
> in step 4, code path will go reservation_object_add_shared_inplace()
> and inside it the @fobj (which equals to @obj->staged, set by above steps)
> is already a wild pointer
>
> should remove the kfree on staged in reservation_object_reserve_shared()
>
> Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c
> Signed-off-by: Monk Liu <monk@amd.com>
> ---
>   drivers/dma-buf/reservation.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 375de41..b473ccc 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct 
> reservation_object *obj)
>   old = reservation_object_get_list(obj);
>   
>   if (old && old->shared_max) {
> - if (old->shared_count < old->shared_max) {
> - /* perform an in-place update */
> - kfree(obj->staged);
> - obj->staged = NULL;
> + if (old->shared_count < old->shared_max)
>   return 0;
> - } else
> + else
>   max = old->shared_max * 2;
>   } else
>   max = 4;



RE: [PATCH 0/2] Move scheduler out of AMDGPU

2017-12-02 Thread Liu, Monk
I'm wondering if GPU reset still works after this move...



-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Friday, December 1, 2017 11:55 PM
To: Lucas Stach ; Deucher, Alexander 

Cc: Grodzovsky, Andrey ; David Airlie 
; amd-...@lists.freedesktop.org; 
patchwork-...@pengutronix.de; dri-devel@lists.freedesktop.org; 
ker...@pengutronix.de
Subject: Re: [PATCH 0/2] Move scheduler out of AMDGPU

Am 01.12.2017 um 16:28 schrieb Lucas Stach:
> Hi all,
>
> so this is the first step to make the marvelous AMDGPU scheduler 
> usable for other drivers. I have a (mostly) working prototype of 
> Etnaviv using the scheduler, but those patches need to keep baking for a 
> while.
>
> I'm sending this out as I want to avoid rebasing this change too much 
> and don't want to take people by surprise when the Etnaviv 
> implementation surfaces. Also this might need some coordination 
> between AMDGPU and Etnaviv, which might be good to get going now.
>
> Please speak up now if you have any objections or comments.

Looks good to me, but the question is what this is based upon.

I strongly assume drm-next, so the question now is whether we have any patches
inside the amd branches that we should apply before doing this.

CCing Andrey as well because he has some tasks assigned around the scheduler.

Regards,
Christian.

>
> Regards,
> Lucas
>
> Lucas Stach (2):
>drm: move amd_gpu_scheduler into common location
>drm/sched: move fence slab handling to module init/exit
>
>   drivers/gpu/drm/Kconfig|   5 +
>   drivers/gpu/drm/Makefile   |   1 +
>   drivers/gpu/drm/amd/amdgpu/Makefile|   5 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  16 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c|  38 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  12 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   8 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |   4 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|  22 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |  14 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  12 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c  |  20 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sched.h  |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c   |   6 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|   4 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h|   4 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c|   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h|   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c|  14 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h|   4 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  10 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |   4 +-
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  |   4 +-
>   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c  |   8 +-
>   drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |   8 +-
>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h  | 185 --
>   drivers/gpu/drm/scheduler/Makefile |   4 +
>   .../gpu/drm/{amd => }/scheduler/gpu_scheduler.c| 281 
> +++--
>   drivers/gpu/drm/{amd => }/scheduler/sched_fence.c  | 122 +
>   include/drm/gpu_scheduler.h| 171 +
>   .../drm/gpu_scheduler_trace.h  |  14 +-
>   34 files changed, 525 insertions(+), 511 deletions(-)
>   delete mode 100644 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>   create mode 100644 drivers/gpu/drm/scheduler/Makefile
>   rename drivers/gpu/drm/{amd => }/scheduler/gpu_scheduler.c (64%)
>   rename drivers/gpu/drm/{amd => }/scheduler/sched_fence.c (58%)
>   create mode 100644 include/drm/gpu_scheduler.h
>   rename drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h => 
> include/drm/gpu_scheduler_trace.h (83%)
>

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] drm/amdgpu/virt: remove redundant variable pf2vf_ver

2017-11-12 Thread Liu, Monk
+ Horace

-Original Message-
From: Colin King [mailto:colin.k...@canonical.com] 
Sent: 11 November 2017 19:51
To: Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian 
<christian.koe...@amd.com>; David Airlie <airl...@linux.ie>; Liu, Monk 
<monk@amd.com>; Yu, Xiangliang <xiangliang...@amd.com>; 
amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH] drm/amdgpu/virt: remove redundant variable pf2vf_ver

From: Colin Ian King <colin.k...@canonical.com>

Variable pf2vf_ver is assigned but never read; it is redundant and can be
removed.

Cleans up clang warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:310:3: warning: Value stored to 
'pf2vf_ver' is never read

Signed-off-by: Colin Ian King <colin.k...@canonical.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 6738df836a70..b1cc179512fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -296,7 +296,6 @@ int amdgpu_virt_fw_reserve_get_checksum(void *obj,
 
 void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev)
 {
-   uint32_t pf2vf_ver = 0;
uint32_t pf2vf_size = 0;
uint32_t checksum = 0;
uint32_t checkval;
@@ -309,7 +308,6 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev)
adev->virt.fw_reserve.p_pf2vf =
(struct amdgim_pf2vf_info_header *)(
adev->fw_vram_usage.va + AMDGIM_DATAEXCHANGE_OFFSET);
-   pf2vf_ver = adev->virt.fw_reserve.p_pf2vf->version;
	AMDGPU_FW_VRAM_PF2VF_READ(adev, header.size, &pf2vf_size);
	AMDGPU_FW_VRAM_PF2VF_READ(adev, checksum, &checksum);
 
--
2.14.1



Re: [PATCH] drm/ttm: fix memory leak while individualizing BOs

2017-09-13 Thread Liu, Monk
Verified working.


Reviewed-by: Monk Liu <monk@amd.com>


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Christian 
König <deathsim...@vodafone.de>
Sent: Wednesday, September 13, 2017 4:47:34 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: [PATCH] drm/ttm: fix memory leak while individualizing BOs

From: Christian König <christian.koe...@amd.com>

We need to free the reservation object before we take the BO
from the delayed delete list.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bee77d3..d79607a 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -557,6 +557,8 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
 }

 ttm_bo_del_from_lru(bo);
+	if (!list_empty(&bo->ddestroy) && (bo->resv != &bo->ttm_resv))
+		reservation_object_fini(&bo->ttm_resv);
 list_del_init(>ddestroy);
 kref_put(>list_kref, ttm_bo_ref_bug);

--
2.7.4



Re: [PATCH 1/1] drm/ttm: Fix accounting error when fail to get pages for pool

2017-08-16 Thread Liu, Monk
Reviewed-by: Monk Liu <monk@amd.com>


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of 
Xiangliang.Yu <xiangliang...@amd.com>
Sent: Wednesday, August 16, 2017 3:20:46 PM
To: a...@linux-foundation.org; labb...@redhat.com; 
dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
Cc: Yu, Xiangliang
Subject: [PATCH 1/1] drm/ttm: Fix accounting error when fail to get pages for 
pool

When we fail to get the needed pages for the pool, we need to put the
pages already allocated back into the pool. But the current code
miscounts those allocated pages; correct it.

Signed-off-by: Xiangliang.Yu <xiangliang...@amd.com>
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index eeddc1e..8715998 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -615,7 +615,7 @@ static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool,
 } else {
 pr_err("Failed to fill pool (%p)\n", pool);
 /* If we have any pages left put them to the pool. */
-		list_for_each_entry(p, &pool->list, lru) {
+		list_for_each_entry(p, &new_pages, lru) {
 ++cpages;
 }
 		list_splice(&new_pages, &pool->list);
--
2.7.4



Re: [PATCH 2/7] drm/qxl: fix incorrect use of the lru_lock

2017-08-08 Thread Liu, Monk
Acked-by: Monk Liu 


From: amd-gfx  on behalf of Christian 
König 
Sent: Tuesday, August 8, 2017 4:14:46 PM
To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Dave 
Airlie; Gerd Hoffmann
Subject: Re: [PATCH 2/7] drm/qxl: fix incorrect use of the lru_lock

Hi guys,

can I get an rb or at least an Acked-by for that one?

The code was obviously copied over from radeon where it wasn't correct
in the first place.

Thanks,
Christian.

Am 07.08.2017 um 17:48 schrieb Christian König:
> From: Christian König 
>
> The BO manager has its own lock and doesn't use the lru_lock.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/qxl/qxl_ttm.c | 13 -
>   1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
> index 0fdedee..07dc08d 100644
> --- a/drivers/gpu/drm/qxl/qxl_ttm.c
> +++ b/drivers/gpu/drm/qxl/qxl_ttm.c
> @@ -454,15 +454,10 @@ void qxl_ttm_fini(struct qxl_device *qdev)
>   static int qxl_mm_dump_table(struct seq_file *m, void *data)
>   {
>struct drm_info_node *node = (struct drm_info_node *)m->private;
> - struct drm_mm *mm = (struct drm_mm *)node->info_ent->data;
> - struct drm_device *dev = node->minor->dev;
> - struct qxl_device *rdev = dev->dev_private;
> - struct ttm_bo_global *glob = rdev->mman.bdev.glob;
> + struct ttm_mem_type_manager *man = node->info_ent->data;
>struct drm_printer p = drm_seq_file_printer(m);
>
> -	spin_lock(&glob->lru_lock);
> -	drm_mm_print(mm, &p);
> -	spin_unlock(&glob->lru_lock);
> +	man->func->debug(man, &p);
>return 0;
>   }
>   #endif
> @@ -483,9 +478,9 @@ int qxl_ttm_debugfs_init(struct qxl_device *qdev)
> 	qxl_mem_types_list[i].show = &qxl_mm_dump_table;
>qxl_mem_types_list[i].driver_features = 0;
>if (i == 0)
> -		qxl_mem_types_list[i].data = qdev->mman.bdev.man[TTM_PL_VRAM].priv;
> +		qxl_mem_types_list[i].data = &qdev->mman.bdev.man[TTM_PL_VRAM];
> 	else
> -		qxl_mem_types_list[i].data = qdev->mman.bdev.man[TTM_PL_PRIV].priv;
> +		qxl_mem_types_list[i].data = &qdev->mman.bdev.man[TTM_PL_PRIV];
>
>}
>return qxl_debugfs_add_files(qdev, qxl_mem_types_list, i);




RE: [PATCH] drm/amdgpu/virt: fix double kfree on bo_va

2017-02-05 Thread Liu, Monk
thanks for this catch!


Reviewed-by: Monk Liu <monk@amd.com>


From: Colin King <colin.k...@canonical.com>
Sent: February 4, 2017 4:23:42
To: Deucher, Alexander; Koenig, Christian; David Airlie; Liu, Monk; Yu, 
Xiangliang; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH] drm/amdgpu/virt: fix double kfree on bo_va

From: Colin Ian King <colin.k...@canonical.com>

bo_va is being kfree'd twice, once in the call to amdgpu_vm_bo_rmv
and then a short while later. Fix this double free by removing
the 2nd kfree.

Detected by CoverityScan, CID#1399524 ("Double Free")

Signed-off-by: Colin Ian King <colin.k...@canonical.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 3fd951c..dcfb7df 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -83,7 +83,6 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 DRM_ERROR("failed to do bo_map on static CSA, err=%d\n", r);
 amdgpu_vm_bo_rmv(adev, bo_va);
 	ttm_eu_backoff_reservation(&ticket, &list);
-   kfree(bo_va);
 return r;
 }

--
2.10.2



[PATCH 1/4] drm/amdgpu: clear RB at ring init

2016-06-07 Thread Liu, Monk
> We should only call memset if amdgpu_bo_kmap succeeded. Same issue in patch 2.
[ml] agreed 

>There's something else about these two patches I'm a bit worried about:

>The GPU should only read data from ring buffers and IBs that we previously 
>explicitly wrote there. I'm afraid these patches might just paper over bugs 
>elsewhere, which might still bite us under different circumstances.

[ml] So do you mean we should fill the ring buffer with NOPs instead of zeros?

-Original Message-
From: Michel Dänzer [mailto:mic...@daenzer.net] 
Sent: Tuesday, June 07, 2016 3:34 PM
To: Alex Deucher 
Cc: dri-devel at lists.freedesktop.org; Liu, Monk 
Subject: Re: [PATCH 1/4] drm/amdgpu: clear RB at ring init

On 02.06.2016 07:27, Alex Deucher wrote:
> From: Monk Liu 
> 
> This help fix reloading driver hang issue of SDMA ring.
> 
> Signed-off-by: Monk Liu 
> Reviewed-by: Alex Deucher 
> Reviewed-by: Christian König 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 3b02272..a4b3f44 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -310,6 +310,9 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>   }
>   r = amdgpu_bo_kmap(ring->ring_obj,
>  			   (void **)&ring->ring);
> +
> + memset((void *)ring->ring, 0, ring->ring_size);
> +
>   amdgpu_bo_unreserve(ring->ring_obj);
>   if (r) {
>   dev_err(adev->dev, "(%d) ring map failed\n", r);
> 

We should only call memset if amdgpu_bo_kmap succeeded. Same issue in patch 2.


There's something else about these two patches I'm a bit worried about:

The GPU should only read data from ring buffers and IBs that we previously 
explicitly wrote there. I'm afraid these patches might just paper over bugs 
elsewhere, which might still bite us under different circumstances.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer