RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV
[AMD Official Use Only] >> See the FLR request from the hypervisor is just another source of signaling >> the need for a reset, similar to each job timeout on each queue. Otherwise >> you have a race condition between the hypervisor and the scheduler. No it's not, FLR from hypervisor is just to notify guest the hw VF FLR is about to start or was already executed, but host will do FLR anyway without waiting for guest too long >> In other words I strongly think that the current SRIOV reset implementation >> is severely broken and what Andrey is doing is actually fixing it. It makes the code to crash ... how could it be a fix ? I'm afraid the patch is NAK from me, but it is welcome if the cleanup do not ruin the logic, Andry or jingwen can try it if needed. Thanks --- Monk Liu | Cloud GPU & Virtualization Solution | AMD --- we are hiring software manager for CVS core team --- -Original Message- From: Koenig, Christian Sent: Tuesday, January 4, 2022 6:19 PM To: Chen, JingWen ; Christian König ; Grodzovsky, Andrey ; Deng, Emily ; Liu, Monk ; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; Chen, Horace ; Chen, JingWen Cc: dan...@ffwll.ch Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Hi Jingwen, well what I mean is that we need to adjust the implementation in amdgpu to actually match the requirements. Could be that the reset sequence is questionable in general, but I doubt so at least for now. See the FLR request from the hypervisor is just another source of signaling the need for a reset, similar to each job timeout on each queue. Otherwise you have a race condition between the hypervisor and the scheduler. Properly setting in_gpu_reset is indeed mandatory, but should happen at a central place and not in the SRIOV specific code. In other words I strongly think that the current SRIOV reset implementation is severely broken and what Andrey is doing is actually fixing it. Regards, Christian. Am 04.01.22 um 10:07 schrieb JingWen Chen: > Hi Christian, > I'm not sure what do you mean by "we need to change SRIOV not the driver". > > Do you mean we should change the reset sequence in SRIOV? This will be a huge > change for our SRIOV solution. > > From my point of view, we can directly use amdgpu_device_lock_adev > and amdgpu_device_unlock_adev in flr_work instead of try_lock since no one > will conflict with this thread with reset_domain introduced. > But we do need the reset_sem and adev->in_gpu_reset to keep device untouched > via user space. > > Best Regards, > Jingwen Chen > > On 2022/1/3 下午6:17, Christian König wrote: >> Please don't. This patch is vital to the cleanup of the reset procedure. >> >> If SRIOV doesn't work with that we need to change SRIOV and not the driver. >> >> Christian. >> >> Am 30.12.21 um 19:45 schrieb Andrey Grodzovsky: >>> Sure, I guess i can drop this patch then. >>> >>> Andrey >>> >>> On 2021-12-24 4:57 a.m., JingWen Chen wrote: >>>> I do agree with shaoyun, if the host find the gpu engine hangs first, and >>>> do the flr, guest side thread may not know this and still try to access >>>> HW(e.g. kfd is using a lot of amdgpu_in_reset and reset_sem to identify >>>> the reset status). And this may lead to very bad result. >>>> >>>> On 2021/12/24 下午4:58, Deng, Emily wrote: >>>>> These patches look good to me. JingWen will pull these patches and do >>>>> some basic TDR test on sriov environment, and give feedback. >>>>> >>>>> Best wishes >>>>> Emily Deng >>>>> >>>>> >>>>> >>>>>> -Original Message- >>>>>> From: Liu, Monk >>>>>> Sent: Thursday, December 23, 2021 6:14 PM >>>>>> To: Koenig, Christian ; Grodzovsky, >>>>>> Andrey ; >>>>>> dri-devel@lists.freedesktop.org; amd- g...@lists.freedesktop.org; >>>>>> Chen, Horace ; Chen, JingWen >>>>>> ; Deng, Emily >>>>>> Cc: dan...@ffwll.ch >>>>>> Subject: RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset >>>>>> protection for SRIOV >>>>>> >>>>>> [AMD Official Use Only] >>>>>> >>>>>> @Chen, Horace @Chen, JingWen @Deng, Emily >>>>>> >>>>>> Please ta
RE: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV
[AMD Official Use Only] @Chen, Horace @Chen, JingWen @Deng, Emily Please take a review on Andrey's patch Thanks --- Monk Liu | Cloud GPU & Virtualization Solution | AMD --- we are hiring software manager for CVS core team --- -Original Message- From: Koenig, Christian Sent: Thursday, December 23, 2021 4:42 PM To: Grodzovsky, Andrey ; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org Cc: dan...@ffwll.ch; Liu, Monk ; Chen, Horace Subject: Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV Am 22.12.21 um 23:14 schrieb Andrey Grodzovsky: > Since now flr work is serialized against GPU resets there is no need > for this. > > Signed-off-by: Andrey Grodzovsky Acked-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 --- > drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 --- > 2 files changed, 22 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c > b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c > index 487cd654b69e..7d59a66e3988 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c > +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c > @@ -248,15 +248,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct > *work) > struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, > virt); > int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT; > > - /* block amdgpu_gpu_recover till msg FLR COMPLETE received, > - * otherwise the mailbox msg will be ruined/reseted by > - * the VF FLR. > - */ > - if (!down_write_trylock(>reset_sem)) > - return; > - > amdgpu_virt_fini_data_exchange(adev); > - atomic_set(>in_gpu_reset, 1); > > xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0); > > @@ -269,9 +261,6 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct > *work) > } while (timeout > 1); > > flr_done: > - atomic_set(>in_gpu_reset, 0); > - up_write(>reset_sem); > - > /* Trigger recovery for world switch failure if no TDR */ > if (amdgpu_device_should_recover_gpu(adev) > && (!amdgpu_device_has_job_running(adev) || diff --git > a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c > b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c > index e3869067a31d..f82c066c8e8d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c > +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c > @@ -277,15 +277,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct > *work) > struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, > virt); > int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT; > > - /* block amdgpu_gpu_recover till msg FLR COMPLETE received, > - * otherwise the mailbox msg will be ruined/reseted by > - * the VF FLR. > - */ > - if (!down_write_trylock(>reset_sem)) > - return; > - > amdgpu_virt_fini_data_exchange(adev); > - atomic_set(>in_gpu_reset, 1); > > xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0); > > @@ -298,9 +290,6 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct > *work) > } while (timeout > 1); > > flr_done: > - atomic_set(>in_gpu_reset, 0); > - up_write(>reset_sem); > - > /* Trigger recovery for world switch failure if no TDR */ > if (amdgpu_device_should_recover_gpu(adev) > && (!amdgpu_device_has_job_running(adev) ||
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] > I'm fearing that just repeating what Alex said, but to make it clear > once more: That is *not* necessary! > > The shared repository is owned by upstream maintainers and they are > usually free to do restructuring work without getting acknowledge from > every single driver maintainer. Hi Daniel Anyway thanks for officially confirm to me of working model & policy in community, I don't want to put my opinion here due to that's not my call to change no matter how. I only want to let this diagnostic TDR scheme going to a good end for AMD or even for all DRM vendor. How about this way, we still have a final patch not landed in DRM scheduler and I would like jingwen to present it to you and AlexD/Christian/Andrey, I believe you will have concerns or objections regarding this patch, but that's fine, let us figure it out together, how to make it acceptable by you and other vendors that working with DRM scheduler. P.S.: I had to repeat myself again, we are not popping up new idea suddenly, it is disconnection issue, we didn't have changes (or plan to have changes) in DRM scheduler before, but eventually we found we must make job_timeout and sched_main to work in a serialized otherwise it won't work based on current scheduler's code structure. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Friday, September 3, 2021 12:11 AM To: Koenig, Christian Cc: Liu, Monk ; Dave Airlie ; Alex Deucher ; Grodzovsky, Andrey ; Chen, JingWen ; DRI Development ; amd-...@lists.freedesktop.org Subject: Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread On Thu, Sep 2, 2021 at 1:00 PM Christian König wrote: > > Hi Monk, > > Am 02.09.21 um 07:52 schrieb Liu, Monk: > > [AMD Official Use Only] > > > > I'm not sure I can add much to help this along, I'm sure Alex has > > some internal training, Once your driver is upstream, it belongs to > > upstream, you can maintain it, but you no longer control it 100%, it's a > > tradeoff, it's not one companies always understand. > > Usually people are fine developing away internally, but once interaction > > with other parts of the kernel/subsystem is required they have the > > realisation that they needed to work upstream 6 months earlier. > > The best time to interact with upstream was 6 months ago, the second best > > time is now. > > <<< > > > > Daniel/AlexD > > > > I didn't mean your changes on AMD driver need my personal approval > > or review ... and I'm totally already get used that our driver is not 100% > > under control by AMDers, but supposedly any one from community (including > > you) who tend to change AMD's driver need at least to get approvement from > > someone in AMD, e.g.: AlexD or Christian, doesn't that reasonable? > > I'm fearing that just repeating what Alex said, but to make it clear > once more: That is *not* necessary! > > The shared repository is owned by upstream maintainers and they are > usually free to do restructuring work without getting acknowledge from > every single driver maintainer. > > Anybody can of course technically object to upstream design decisions, > but that means that you need to pay attention to the mailing lists in > the first place. > > > just like we need your approve if we try to modify DRM-sched, or need > > panfrost's approval if we need to change panfrost code ... > > > > by only CC AMD's engineers looks not quite properly, how do you know if > > your changes (on AMD code part) are conflicting with AMD's on-going > > internal features/refactoring or not ? > > Well because AMD is supposed to work in public as much as possible and > ask upstream before doing changes to the code base. > > Additional to that design decisions are supposed to be discussed on > the mailing list and *not* internally. Yeah I'm honestly really surprised about the course of this discussion here. With Alex, Christian and others amd has a lot of folks with years/decades of experience in how to collaborate in upstream, when to pull in others proactively and when that's not needed, and in general how to plan upstream work with the lest amount of risk and surprises. I think step zero here needs to be some training at amd and then re-planning this effort, before we get back to technical stuff. Otherwise we'll just get bogged down in pain because expectations about the process don't pan out. -Daniel > > Regards, > Christian. > > > > > Thanks > > > > -- > > Monk Liu | Cloud-GPU Core team > > --
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] >>> I'm not sure I can add much to help this along, I'm sure Alex has some internal training, Once your driver is upstream, it belongs to upstream, you can maintain it, but you no longer control it 100%, it's a tradeoff, it's not one companies always understand. Usually people are fine developing away internally, but once interaction with other parts of the kernel/subsystem is required they have the realisation that they needed to work upstream 6 months earlier. The best time to interact with upstream was 6 months ago, the second best time is now. <<< Daniel/AlexD I didn't mean your changes on AMD driver need my personal approval or review ... and I'm totally already get used that our driver is not 100% under control by AMDers, but supposedly any one from community (including you) who tend to change AMD's driver need at least to get approvement from someone in AMD, e.g.: AlexD or Christian, doesn't that reasonable? just like we need your approve if we try to modify DRM-sched, or need panfrost's approval if we need to change panfrost code ... by only CC AMD's engineers looks not quite properly, how do you know if your changes (on AMD code part) are conflicting with AMD's on-going internal features/refactoring or not ? Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Dave Airlie Sent: Thursday, September 2, 2021 2:51 AM To: Alex Deucher Cc: Liu, Monk ; Daniel Vetter ; Koenig, Christian ; Grodzovsky, Andrey ; Chen, JingWen ; DRI Development ; amd-...@lists.freedesktop.org Subject: Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread On Thu, 2 Sept 2021 at 01:20, Alex Deucher wrote: > > On Wed, Sep 1, 2021 at 6:19 AM Liu, Monk wrote: > > > > [AMD Official Use Only] > > > > Daniel > > > > From the link you share it looks you(or someone else) have quite a bunch > > patches that changes DRM_SCHED or even amdgpu, by that case before they are > > merged to kernel tree I'm wondering if any AMD develop reviewed them ? > > > > They looks to me somehow conflicting with what we changed in our repo > > > > It is really a chaos for AMDer if someone else out side of AMD changes our > > kernel driver (or/and scheduler) without reviewed by AMDer, just like we > > are requiring your review if we tend to change scheduler's logic here > > > > This one changes AMD's code: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flo > > re.kernel.org%2Fdri-devel%2F20210625133327.2598825-2-boris.brezillon > > %40collabora.com%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C6c507d18 > > d65341ef53bb08d96d7976e6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0% > > 7C637661190727875969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJ > > QIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=BWJSkKN > > y2%2BwjxbQrfxGPzuJ5PBpBwB4aV0ZH6QoJGEg%3Dreserved=0 > > And I didn't see any reviewed-by from AMDers ... > > > > This one also touches AMD's code: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flo > > re.kernel.org%2Fdri-devel%2F20200604081224.863494-12-daniel.vetter%4 > > 0ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C6c507d18d65341e > > f53bb08d96d7976e6%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63766 > > 1190727885929%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2 > > luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=%2F8vIVXCWjHkM > > 56pcYI9EvuzhbsZhV9WczkKaBJE67KQ%3Dreserved=0 > > Which is conflicting with one patch we submitted (in our repo > > rightnow), and neither see AMDder gave a review-by on this one (let > > me know if I missed it) > > > > Monk, this is not how upstream works. You need to participate. > That's how communities work. There's a reason all these discussions > happen on public mailing lists. The patch author can't be expected to > know every person on every vendor team to CC with a patch. If you > have concerns, you need to raise them when the patches are being > discussed. > I'm not sure I can add much to help this along, I'm sure Alex has some internal training, Once your driver is upstream, it belongs to upstream, you can maintain it, but you no longer control it 100%, it's a tradeoff, it's not one companies always understand. Usually people are fine developing away internally, but once interaction with other parts of the kernel/subsystem is required they have the realisation that they needed to work upstream 6 months earlier. The best time to interact with upstream was 6 months ago, the second best time is now. Dave.
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] >> For me your project exists since a few weeks at most, because that is when >> your team showed up on dri-devel. That you already spent 6 months on this >> within amd, on a code area that very much affects shared code, without >> kicking of any thread on dri-devel isn't great, but also not something we >> can fix, since time machines don't exist. This is partially true, because in the past months our change only resident in AMD driver, it is till now that we found we had to make changes in SCHED level >> Your team hasn't been active in any of these discussions, but now suddenly >> pops up out of nowhere and demands that your approach needs to land asap. >> That's really not how upstream works. if our changes on DRI level part cannot get merged soon that's fine, we can discuss more, but that's not suddenly pops up from nowhere, we already worked on it for months inside of AMD drivers. >> I think the best way forward is to go through the above process again and >> essentially restart. So submit a complete patch series with problem >> descriptions, solution you picked, why you picked that, all the amdgpu >> patches to get there and the core patches too. Since it sounds like a bunch >> of this has all landed already you probably need a patch 1 that goes back to >> 6 months ago so that we can see the overall direction, and review whether >> that's the right one or not. We are not objecting this approach, we didn't do that because previously all what we need to do is resident inside AMD driver ... because we try to avoid change DRI/DRM interface part ... For the patches you shows to us with links I'm sorry that due to some IT infrastructure reason me and my team didn't see it before (we kind of work in AMD repo ... the patches you shows didn't get merged in our repo yet...) One thing I also want to emphasis here: if any code need change inside AMD driver please always let us know and review. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: amd-gfx On Behalf Of Daniel Vetter Sent: Wednesday, September 1, 2021 4:18 PM To: Liu, Monk Cc: Koenig, Christian ; Grodzovsky, Andrey ; Chen, JingWen ; DRI Development ; amd-...@lists.freedesktop.org Subject: Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread Hi Monk, On Wed, Sep 1, 2021 at 3:23 AM Liu, Monk wrote: > > [AMD Official Use Only] > > > Hi Daniel/Christian/Andrey > > > > It looks the voice from you three are spread over those email floods to me, > the feature we are working on (diagnostic TDR scheme) is pending there for > more than 6 month (we started it from feb 2021). For me your project exists since a few weeks at most, because that is when your team showed up on dri-devel. That you already spent 6 months on this within amd, on a code area that very much affects shared code, without kicking of any thread on dri-devel isn't great, but also not something we can fix, since time machines don't exist. So we have to make the best out of the situation and move ahead where we are. From my understanding you've done a bunch of changes to the scheduler code. As far as I can see there's been two related things your team has done: - remove some allocations from scheduler code, because that can lead to deadlocks. I've kicked up this topic quite a while ago here https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20200604081224.863494-10-daniel.vetter%40ffwll.ch%2Fdata=04%7C01%7Cmonk.liu%40amd.com%7Cd90ad990ac1a499c266208d96d21138d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660811106940372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=pG5sG5pjVXEAMaahvfNS11VwbHkYWRuWrtHFXM9mEyo%3Dreserved=0 This is just one patch of the entire series. This is an area where we really need a consistent solution across all drm/sched drivers, not something that individual drivers just fix in their own way. - the other one is the timeout issue for the patches you cite here. Again there's been discussion on this on dri-devel with Boris from panfrost about how we can handle at least some of the races in tdr. That resulted in lots of discussions and documentation improvements. Those patches are merged now, link https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20210625133327.2598825-2-boris.brezillon%40collabora.com%2Fdata=04%7C01%7Cmonk.liu%40amd.com%7Cd90ad990ac1a499c266208d96d21138d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660811106940372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=m6U6tJbX2x38xiwQXE1oV0sz2BxXZfPlcouyqIqPZNU%3Dreser
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] Daniel >From the link you share it looks you(or someone else) have quite a bunch >patches that changes DRM_SCHED or even amdgpu, by that case before they are >merged to kernel tree I'm wondering if any AMD develop reviewed them ? They looks to me somehow conflicting with what we changed in our repo It is really a chaos for AMDer if someone else out side of AMD changes our kernel driver (or/and scheduler) without reviewed by AMDer, just like we are requiring your review if we tend to change scheduler's logic here This one changes AMD's code: https://lore.kernel.org/dri-devel/20210625133327.2598825-2-boris.brezil...@collabora.com/ And I didn't see any reviewed-by from AMDers ... This one also touches AMD's code: https://lore.kernel.org/dri-devel/20200604081224.863494-12-daniel.vet...@ffwll.ch/ Which is conflicting with one patch we submitted (in our repo rightnow), and neither see AMDder gave a review-by on this one (let me know if I missed it) Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: amd-gfx On Behalf Of Daniel Vetter Sent: Wednesday, September 1, 2021 4:18 PM To: Liu, Monk Cc: Koenig, Christian ; Grodzovsky, Andrey ; Chen, JingWen ; DRI Development ; amd-...@lists.freedesktop.org Subject: Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread Hi Monk, On Wed, Sep 1, 2021 at 3:23 AM Liu, Monk wrote: > > [AMD Official Use Only] > > > Hi Daniel/Christian/Andrey > > > > It looks the voice from you three are spread over those email floods to me, > the feature we are working on (diagnostic TDR scheme) is pending there for > more than 6 month (we started it from feb 2021). For me your project exists since a few weeks at most, because that is when your team showed up on dri-devel. That you already spent 6 months on this within amd, on a code area that very much affects shared code, without kicking of any thread on dri-devel isn't great, but also not something we can fix, since time machines don't exist. So we have to make the best out of the situation and move ahead where we are. From my understanding you've done a bunch of changes to the scheduler code. As far as I can see there's been two related things your team has done: - remove some allocations from scheduler code, because that can lead to deadlocks. I've kicked up this topic quite a while ago here https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20200604081224.863494-10-daniel.vetter%40ffwll.ch%2Fdata=04%7C01%7Cmonk.liu%40amd.com%7Cd90ad990ac1a499c266208d96d21138d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660811106940372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=pG5sG5pjVXEAMaahvfNS11VwbHkYWRuWrtHFXM9mEyo%3Dreserved=0 This is just one patch of the entire series. This is an area where we really need a consistent solution across all drm/sched drivers, not something that individual drivers just fix in their own way. - the other one is the timeout issue for the patches you cite here. Again there's been discussion on this on dri-devel with Boris from panfrost about how we can handle at least some of the races in tdr. That resulted in lots of discussions and documentation improvements. Those patches are merged now, link https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20210625133327.2598825-2-boris.brezillon%40collabora.com%2Fdata=04%7C01%7Cmonk.liu%40amd.com%7Cd90ad990ac1a499c266208d96d21138d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660811106940372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=m6U6tJbX2x38xiwQXE1oV0sz2BxXZfPlcouyqIqPZNU%3Dreserved=0 There's been more than just this, also quite some doc patches from Boris that explain how it's all supposed to work and be race-free. Again your driver isn't the only one with interesting TDR races. Your team hasn't been active in any of these discussions, but now suddenly pops up out of nowhere and demands that your approach needs to land asap. That's really not how upstream works. The other thing where I'm struggling is that there's a lot of missing context for outsiders. The patches sometimes come with zero commit message, for tricky concurrency bugs. And there's no context with what you've done already on the amdgpu side (since that never showed up on dri-devel), which makes constructive discussions here really hard. Now fixing these bugs is obviously good, but the way this is supposed to work when touching shared infrastructure is: - Before you start merging anything kick off an RFC thread on dri-devel (or whatever the topic really is about) about the problem you have and how your trying to solve it.
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] In the previous discussion, you guys stated that we should drop the "kthread_should_park" in cleanup_job. @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; But I suddenly have a question here: if return the timedout job no matter kthread_should_park() or not, then we are backing to the original problem again: that the timedout_job is suddenly signaling and cleanup_job still returns it to sched_main and job is freed while it is still handling by vendor's timeout callback If we return NULL when kthread_should_park() in cleanup_job, we can prevent above scenario from happening: once a job is processed by job_timedout we can stop its scheduler, and after that even this job suddenly signaled the cleanup_job won't return it so sched_main won't free it in parallel ... What do you think ? Thanks -- Monk Liu | Cloud-GPU Core team ---------- From: Liu, Monk Sent: Wednesday, September 1, 2021 9:23 AM To: Koenig, Christian ; Grodzovsky, Andrey ; Daniel Vetter ; Chen, JingWen Cc: DRI Development ; amd-...@lists.freedesktop.org Subject: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread [AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me Can we try to put all our opinions, suggestions, or even objects here together, let's go through them one by one, it's too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() For other aspects, can we put all our opinion synthesized here ? Thanks ! -- Monk Liu | Cloud-GPU Core team --
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] That' really matter in practice, when two jobs from different process scheduled to the ring close to each other, if we don't discriminate A from B then B will be considered a bad job due to A's timeout, which will force B's process to exit (e.g.: X server) Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:09 PM To: Koenig, Christian Cc: Grodzovsky, Andrey ; Christian König ; Liu, Monk ; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote: > Yeah, that's what I meant with that the start of processing a job is a > bit swampy defined. > > Jobs overload, but we simply don't have another good indicator that a > job started except that the previous one completed. > > It's still better than starting the timer when pushing the job to the > ring buffer, because that is completely off. Not sure this matters that much really in practice, and in some cases we might want to actually just reset it all if the built up backlog is way too long. For anything that really runs way too long you want preempt-ctx/revoke fences anyway, not end-of-cs fences with tdr. -Daniel > > Christian. > > Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky: > > As I mentioned to Monk before - what about cases such as in this > > test - > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi > > tlab.freedesktop.org%2Fmesa%2Fdrm%2F-%2Fcommit%2Fbc21168fa924d3fc4a0 > > 00492e861f50a1a135b25data=04%7C01%7CMonk.Liu%40amd.com%7Cbd1847 > > 4429e34f8eaac208d96c80710e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C > > 0%7C637660121179715855%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL > > CJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=1WTD3 > > opiBtT29bbbqJub5nfhWgX5vMNppiFKgWDe%2FoQ%3Dreserved=0 > > > > Here you don't have serialized sequence where when jobs finishes > > processing and second starts, they execute together concurrently - > > for those cases seems to me restarting the timer for the second job > > from scratch will let it hang much longer then allowed by TO value. > > > > Andrey > > > > On 2021-08-27 10:29 a.m., Christian König wrote: > > > I don't think that makes sense. > > > > > > See we don't want to start the time when the job is inserted into > > > the ring buffer, but rather when it starts processing. > > > > > > Starting processing is a bit swampy defined, but just starting the > > > timer when the previous job completes should be fine enough. > > > > > > Christian. > > > > > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky: > > > > The TS represents the point in time when the job was inserted > > > > into the pending list. > > > > I don't think it matters when it actually starts to be > > > > processed, what matters is when this job was inserted into > > > > pending list because right at that point you arm the TO timer > > > > (when no other is running already) and so when the previous job > > > > completes and you cancel and rearm again you can use that TS > > > > from the next job in pending list to calculate how much time has > > > > actually left for it to run before TDR must be initiated and not > > > > just give it again full TO value to run even if it has already > > > > been running for a while. > > > > > > > > Also, i am not sure also about the assumption that what we > > > > measure is processing by HW, what we measure is from the moment > > > > it was scheduled to ring to the moment the job completed (EOP > > > > event). At least that what our TDR timer measures and so it > > > > makes sense to set the TS at this point. > > > > > > > > Andrey > > > > > > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote: > > > > > [AMD Official Use Only] > > > > > > > > > > what is that 'ts' representing for ? it looks to me the > > > > > jiffies that it get scheduled to the ring, but a job > > > > > scheduled to the ring doesn't represent it's being processed > > > > > by hw. > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > M
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] Okay, I will reprepare this patch Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:02 PM To: Liu, Monk Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is > the best option? > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, > > work_tdr.work); > > > > /* Protects against concurrent deletion in > > drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > This is a __ function, i.e. considered internal, and it's lockless > atomic, i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the > kthread is probably the right thing to do. If it's not the right thing > to do, there's a bug here for sure. Also why don't we reuse the function drivers already have to stop a scheduler thread? We seem to have two kthread_park now, that's probably one too much. -Daniel > > + kthread_park(sched->thread); > > + > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. > > -- > > 2.7.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog. > ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76 > b308d96c7f7a8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C6376601170 > 51194614%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL > CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=QzgCU7%2BPdA0aWL5%2BJLg > KeKbGaMMGqeGI9KE0P0LXlN4%3Dreserved=0 -- Daniel Vetter Software Engineer, Intel Corporation https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76b308d96c7f7a8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660117051194614%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=QzgCU7%2BPdA0aWL5%2BJLgKeKbGaMMGqeGI9KE0P0LXlN4%3Dreserved=0
[diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me Can we try to put all our opinions, suggestions, or even objects here together, let's go through them one by one, it's too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() For other aspects, can we put all our opinion synthesized here ? Thanks ! -- Monk Liu | Cloud-GPU Core team --
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] >> Also why don't we reuse the function drivers already have to stop a >> scheduler thread? We seem to have two kthread_park now, that's probably one >> too much. Are you referring to drm_sched_stop ? That's different, we don't need the logic from it, see that it go through pending list and remove all callbacks , etc... meanwhile vendor's timeout callback will call drm_sched_stop in a proper way, All we want in my patch is to simply park scheduler, Besides, even you call drm_sched_stop in job_timeout you still run into the warning issue I hit. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:02 PM To: Liu, Monk Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is > the best option? > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, > > work_tdr.work); > > > > /* Protects against concurrent deletion in > > drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > This is a __ function, i.e. considered internal, and it's lockless > atomic, i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the > kthread is probably the right thing to do. If it's not the right thing > to do, there's a bug here for sure. Also why don't we reuse the function drivers already have to stop a scheduler thread? We seem to have two kthread_park now, that's probably one too much. -Daniel > > + kthread_park(sched->thread); > > + > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. > > -- > > 2.7.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog. > ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76 >
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] >> This is a __ function, i.e. considered internal, and it's lockless atomic, >> i.e. unordered. And you're not explaining why this works. It's not a traditional habit from what I can see that put explain in code, but we can do that in mails , We want to park the scheduler in job_timeout to serialize the job accessing from both sched and TO handler , but inside vendor's callback timeout_job at least both panfrost and amd They both call drm_sched_stop() on all schedulers. If we unconditionally call "kthread_park" in job_timedout then the bailing job's timedout will try to call "kthread_park" again on its scheduler and introduce "warning" The scenario is : 1,Job1 on sched1 triggers timedout, and sched1 is parked, 2,vendor callback runs, it will usually stop all schedulers. 3,Job2 on sched2 triggers timedout, so the job_timedout also try to park sched2, but sched2 was stopped already by above step. (job2's timeout is introduced by job1, or by other VF) ---So there will be "warning" in kernel log from above step... after this "__kthread_should_park()" here we can avoid the warning, that's the only reason I need this __function. 4,Before vendor callback exit, it will unpark all schedulers. >From another hand if we don't do the kthread_park() and still delete the job >here (drop deleting/reinserting the job from pending_list is what we want), >we still have a small windows to hit the race issue: That cleanup_job from sched thread is freeing the job while job is under processing by job_timedout or vendor's call back. And the reason we want to avoid deleting/reinserting the timedout job is because we (amd vendor) have our own way to do a diagnostic on all jobs in pending list from all scheduler, we want to cherry-pick the real bad job >From all scheduler's pending list that causes this JOB TIMEOUT. Besides, it is also much reasonable to park scheduler when job_timedout is running there, they should exclusively access those common members like sched_job. (due to spin_lock is off before running into vendor's calback) Hope I explained ourselves well enough. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message----- From: Daniel Vetter Sent: Tuesday, August 31, 2021 8:59 PM To: Liu, Monk Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler Can we please have some actual commit message here, with detailed explanation of the race/bug/whatever, how you fix it and why this is the best option? On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > tested-by: jingwen chen > Signed-off-by: Monk Liu > Signed-off-by: jingwen chen > --- > drivers/gpu/drm/scheduler/sched_main.c | 24 > 1 file changed, 4 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index ecf8140..894fdb24 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > *work) > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); > > /* Protects against concurrent deletion in drm_sched_get_cleanup_job > */ > + if (!__kthread_should_park(sched->thread)) This is a __ function, i.e. considered internal, and it's lockless atomic, i.e. unordered. And you're not explaining why this works. Iow it's probably buggy, and an just unconditionally parking the kthread is probably the right thing to do. If it's not the right thing to do, there's a bug here for sure. -Daniel > + kthread_park(sched->thread); > + > spin_lock(>job_list_lock); > job = list_first_entry_or_null(>pending_list, > struct drm_sched_job, list); > > if (job) { > - /* > - * Remove the bad job so it cannot be freed by concurrent > - * drm_sched_cleanup_jobs. It will be reinserted back after > sched->thread > - * is parked at which point it's safe. > - */ > - list_del_init(>list); > spin_unlock(>job_list_lock); > > + /* vendor's timeout_job should call drm_sched_start() */ > status = job->sched->ops->timedout_job(job); > > /* > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > struct drm_sched_job *bad) > kthread_park(sched->thread); > > /* > - * Reinsert back the bad job here - now
RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] Ping Christian, Andrey Can we merge this patch first ? this is a standalone patch for the timer Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Monk Liu Sent: Tuesday, August 31, 2021 6:36 PM To: amd-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Liu, Monk Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3) issue: in cleanup_job the cancle_delayed_work will cancel a TO timer even the its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next" job we start_timeout again. v2: further cleanup the logic, and do the TDR timer cancelling if the signaled job is the last one in its scheduler. v3: change the issue description remove the cancel_delayed_work in the begining of the cleanup_job recover the implement of drm_sched_job_begin. TODO: 1)introduce pause/resume scheduler in job_timeout to serial the handling of scheduler and job_timeout. 2)drop the bad job's del and insert in scheduler due to above serialization (no race issue anymore with the serialization) tested-by: jingwen Signed-off-by: Monk Liu --- drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..ecf8140 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) + if (kthread_should_park()) return NULL; spin_lock(>job_list_lock); @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* cancel this job's TO timer */ + cancel_delayed_work(>work_tdr); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); - if (next) + + if (next) { next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; - + /* start TO timer for next job */ + drm_sched_start_timeout(sched); + } } else { job = NULL; - /* queue timeout for next job */ - drm_sched_start_timeout(sched); } spin_unlock(>job_list_lock); @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) (entity = drm_sched_select_entity(sched))) || kthread_should_stop()); - if (cleanup_job) { + if (cleanup_job) sched->ops->free_job(cleanup_job); - /* queue timeout for next job */ - drm_sched_start_timeout(sched); - } if (!entity) continue; -- 2.7.4
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] >>> I'm not sure if the work_tdr is initialized when a maximum timeout is >>> specified. Please double check. Even timeout set to max the work_tdr is still initialized: int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_backend_ops *ops, unsigned hw_submission, unsigned hang_limit, long timeout, atomic_t *score, const char *name) { int i, ret; sched->ops = ops; sched->hw_submission_limit = hw_submission; sched->name = name; sched->timeout = timeout; sched->hang_limit = hang_limit; sched->score = score ? score : >_score; for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++) drm_sched_rq_init(sched, >sched_rq[i]); init_waitqueue_head(>wake_up_worker); init_waitqueue_head(>job_scheduled); INIT_LIST_HEAD(>pending_list); spin_lock_init(>job_list_lock); atomic_set(>hw_rq_count, 0); INIT_DELAYED_WORK(>work_tdr, drm_sched_job_timedout); atomic_set(>_score, 0); atomic64_set(>job_id_count, 0); Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Christian König Sent: Thursday, August 26, 2021 8:38 PM To: Liu, Monk ; amd-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Am 26.08.21 um 13:55 schrieb Liu, Monk: > [AMD Official Use Only] > >>> I'm not sure if the work_tdr is initialized when a maximum timeout is >>> specified. Please double check. > Ok, will do > >>> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a >>> timeout functionality and not related to recovery in any way. > We even do not start hardware recovery in a lot of cases now (when wave kill > is successfully). > > Umm, sounds reasonable, I can rename it to "to" with another patch Maybe more like job_timeout or timeout_work or something into that direction. Christian. > > Thanks > > -- > Monk Liu | Cloud-GPU Core team > -- > > -Original Message- > From: Christian König > Sent: Thursday, August 26, 2021 6:09 PM > To: Liu, Monk ; amd-...@lists.freedesktop.org > Cc: dri-devel@lists.freedesktop.org > Subject: Re: [PATCH] drm/sched: fix the bug of time out > calculation(v3) > > Am 26.08.21 um 06:55 schrieb Monk Liu: >> issue: >> in cleanup_job the cancle_delayed_work will cancel a TO timer even >> the its corresponding job is still running. > Yeah, that makes a lot more sense. > >> fix: >> do not cancel the timer in cleanup_job, instead do the cancelling >> only when the heading job is signaled, and if there is a "next" job >> we start_timeout again. >> >> v2: >> further cleanup the logic, and do the TDR timer cancelling if the >> signaled job is the last one in its scheduler. >> >> v3: >> change the issue description >> remove the cancel_delayed_work in the begining of the cleanup_job >> recover the implement of drm_sched_job_begin. >> >> TODO: >> 1)introduce pause/resume scheduler in job_timeout to serial the >> handling of scheduler and job_timeout. >> 2)drop the bad job's del and insert in scheduler due to above >> serialization (no race issue anymore with the serialization) >> >> Signed-off-by: Monk Liu >> --- >>drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- >>1 file changed, 10 insertions(+), 15 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >> b/drivers/gpu/drm/scheduler/sched_main.c >> index a2a9536..ecf8140 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler >> *sched) >>{ >> struct drm_sched_job *job, *next; >> >> -/* >> - * Don't destroy jobs while the timeout worker is running OR thread >> - * is being parked and hence assumed to not touch pending_list >> - */ >> -if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && >> -!cancel_delayed_work(>work_tdr)) || >> -kthread_should_park()) >> +if (kthread_should_park()) >> return NULL; >> >> spin_lock(>job_list_lock); @@ -693,17 +687,21 @@ >> drm_sched_g
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] Yeah, that "kthread_should_park" is also irrelevant looks to me as well and it delays the signaled job's cleanup/free Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Christian König Sent: Friday, August 27, 2021 2:12 PM To: Grodzovsky, Andrey ; Liu, Monk ; amd-...@lists.freedesktop.org; Koenig, Christian Cc: dri-devel@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) I don't think that this will be necessary nor desired. See the job should be cleaned up as soon as possible after it is finished or otherwise we won't cancel the timeout quick enough either. Christian. Am 26.08.21 um 22:14 schrieb Andrey Grodzovsky: > Attached quick patch for per job TTL calculation to make more precises > next timer expiration. It's on top of the patch in this thread. Let me > know if this makes sense. > > Andrey > > On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote: >> >> On 2021-08-26 12:55 a.m., Monk Liu wrote: >>> issue: >>> in cleanup_job the cancle_delayed_work will cancel a TO timer even >>> the its corresponding job is still running. >>> >>> fix: >>> do not cancel the timer in cleanup_job, instead do the cancelling >>> only when the heading job is signaled, and if there is a "next" job >>> we start_timeout again. >>> >>> v2: >>> further cleanup the logic, and do the TDR timer cancelling if the >>> signaled job is the last one in its scheduler. >>> >>> v3: >>> change the issue description >>> remove the cancel_delayed_work in the begining of the cleanup_job >>> recover the implement of drm_sched_job_begin. >>> >>> TODO: >>> 1)introduce pause/resume scheduler in job_timeout to serial the >>> handling of scheduler and job_timeout. >>> 2)drop the bad job's del and insert in scheduler due to above >>> serialization (no race issue anymore with the serialization) >>> >>> Signed-off-by: Monk Liu >>> --- >>> drivers/gpu/drm/scheduler/sched_main.c | 25 >>> ++--- >>> 1 file changed, 10 insertions(+), 15 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>> b/drivers/gpu/drm/scheduler/sched_main.c >>> index a2a9536..ecf8140 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct >>> drm_gpu_scheduler *sched) >>> { >>> struct drm_sched_job *job, *next; >>> - /* >>> - * Don't destroy jobs while the timeout worker is running OR >>> thread >>> - * is being parked and hence assumed to not touch pending_list >>> - */ >>> - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && >>> - !cancel_delayed_work(>work_tdr)) || >>> - kthread_should_park()) >>> + if (kthread_should_park()) >>> return NULL; >> >> >> I actually don't see why we need to keep the above, on the other side >> (in drm_sched_stop) we won't touch the pending list anyway until >> sched thread came to full stop (kthread_park). If you do see a reason >> why this needed then a comment should be here i think. >> >> Andrey >> >> >>> spin_lock(>job_list_lock); >>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct >>> drm_gpu_scheduler *sched) >>> if (job && dma_fence_is_signaled(>s_fence->finished)) { >>> /* remove job from pending_list */ >>> list_del_init(>list); >>> + >>> + /* cancel this job's TO timer */ >>> + cancel_delayed_work(>work_tdr); >>> /* make the scheduled timestamp more accurate */ >>> next = list_first_entry_or_null(>pending_list, >>> typeof(*next), list); >>> - if (next) >>> + >>> + if (next) { >>> next->s_fence->scheduled.timestamp = >>> job->s_fence->finished.timestamp; >>> - >>> + /* start TO timer for next job */ >>> + drm_sched_start_timeout(sched); >>> + } >>> } else { >>> job = NULL; >>> - /* queue timeout for next job */ >>> - drm_sched_start_timeout(sched); >>> } >>> spin_unlock(>job_list_lock); >>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) >>> (entity = drm_sched_select_entity(sched))) >>> || >>> kthread_should_stop()); >>> - if (cleanup_job) { >>> + if (cleanup_job) >>> sched->ops->free_job(cleanup_job); >>> - /* queue timeout for next job */ >>> - drm_sched_start_timeout(sched); >>> - } >>> if (!entity) >>> continue;
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] what is that 'ts' representing for ? it looks to me the jiffies that it get scheduled to the ring, but a job scheduled to the ring doesn't represent it's being processed by hw. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Friday, August 27, 2021 4:14 AM To: Liu, Monk ; amd-...@lists.freedesktop.org; Koenig, Christian Cc: dri-devel@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Attached quick patch for per job TTL calculation to make more precises next timer expiration. It's on top of the patch in this thread. Let me know if this makes sense. Andrey On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote: > > On 2021-08-26 12:55 a.m., Monk Liu wrote: >> issue: >> in cleanup_job the cancle_delayed_work will cancel a TO timer even >> the its corresponding job is still running. >> >> fix: >> do not cancel the timer in cleanup_job, instead do the cancelling >> only when the heading job is signaled, and if there is a "next" job >> we start_timeout again. >> >> v2: >> further cleanup the logic, and do the TDR timer cancelling if the >> signaled job is the last one in its scheduler. >> >> v3: >> change the issue description >> remove the cancel_delayed_work in the begining of the cleanup_job >> recover the implement of drm_sched_job_begin. >> >> TODO: >> 1)introduce pause/resume scheduler in job_timeout to serial the >> handling of scheduler and job_timeout. >> 2)drop the bad job's del and insert in scheduler due to above >> serialization (no race issue anymore with the serialization) >> >> Signed-off-by: Monk Liu >> --- >> drivers/gpu/drm/scheduler/sched_main.c | 25 >> ++--- >> 1 file changed, 10 insertions(+), 15 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >> b/drivers/gpu/drm/scheduler/sched_main.c >> index a2a9536..ecf8140 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct >> drm_gpu_scheduler *sched) >> { >> struct drm_sched_job *job, *next; >> - /* >> - * Don't destroy jobs while the timeout worker is running OR >> thread >> - * is being parked and hence assumed to not touch pending_list >> - */ >> - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && >> - !cancel_delayed_work(>work_tdr)) || >> - kthread_should_park()) >> + if (kthread_should_park()) >> return NULL; > > > I actually don't see why we need to keep the above, on the other side > (in drm_sched_stop) we won't touch the pending list anyway until sched > thread came to full stop (kthread_park). If you do see a reason why > this needed then a comment should be here i think. > > Andrey > > >> spin_lock(>job_list_lock); >> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct >> drm_gpu_scheduler *sched) >> if (job && dma_fence_is_signaled(>s_fence->finished)) { >> /* remove job from pending_list */ >> list_del_init(>list); >> + >> + /* cancel this job's TO timer */ >> + cancel_delayed_work(>work_tdr); >> /* make the scheduled timestamp more accurate */ >> next = list_first_entry_or_null(>pending_list, >> typeof(*next), list); >> - if (next) >> + >> + if (next) { >> next->s_fence->scheduled.timestamp = >> job->s_fence->finished.timestamp; >> - >> + /* start TO timer for next job */ >> + drm_sched_start_timeout(sched); >> + } >> } else { >> job = NULL; >> - /* queue timeout for next job */ >> - drm_sched_start_timeout(sched); >> } >> spin_unlock(>job_list_lock); >> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) >> (entity = drm_sched_select_entity(sched))) || >> kthread_should_stop()); >> - if (cleanup_job) { >> + if (cleanup_job) >> sched->ops->free_job(cleanup_job); >> - /* queue timeout for next job */ >> - drm_sched_start_timeout(sched); >> - } >> if (!entity) >> continue;
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] >>I'm not sure if the work_tdr is initialized when a maximum timeout is >>specified. Please double check. Ok, will do >>BTW: Can we please drop the "tdr" naming from the scheduler? That is just a >>timeout functionality and not related to recovery in any way. We even do not start hardware recovery in a lot of cases now (when wave kill is successfully). Umm, sounds reasonable, I can rename it to "to" with another patch Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Christian König Sent: Thursday, August 26, 2021 6:09 PM To: Liu, Monk ; amd-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) Am 26.08.21 um 06:55 schrieb Monk Liu: > issue: > in cleanup_job the cancle_delayed_work will cancel a TO timer even the > its corresponding job is still running. Yeah, that makes a lot more sense. > > fix: > do not cancel the timer in cleanup_job, instead do the cancelling only > when the heading job is signaled, and if there is a "next" job we > start_timeout again. > > v2: > further cleanup the logic, and do the TDR timer cancelling if the > signaled job is the last one in its scheduler. > > v3: > change the issue description > remove the cancel_delayed_work in the begining of the cleanup_job > recover the implement of drm_sched_job_begin. > > TODO: > 1)introduce pause/resume scheduler in job_timeout to serial the > handling of scheduler and job_timeout. > 2)drop the bad job's del and insert in scheduler due to above > serialization (no race issue anymore with the serialization) > > Signed-off-by: Monk Liu > --- > drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- > 1 file changed, 10 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index a2a9536..ecf8140 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler > *sched) > { > struct drm_sched_job *job, *next; > > - /* > - * Don't destroy jobs while the timeout worker is running OR thread > - * is being parked and hence assumed to not touch pending_list > - */ > - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && > - !cancel_delayed_work(>work_tdr)) || > - kthread_should_park()) > + if (kthread_should_park()) > return NULL; > > spin_lock(>job_list_lock); > @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler > *sched) > if (job && dma_fence_is_signaled(>s_fence->finished)) { > /* remove job from pending_list */ > list_del_init(>list); > + > + /* cancel this job's TO timer */ > + cancel_delayed_work(>work_tdr); I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check. BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way. We even do not start hardware recovery in a lot of cases now (when wave kill is successfully). Regards, Christian. > /* make the scheduled timestamp more accurate */ > next = list_first_entry_or_null(>pending_list, > typeof(*next), list); > - if (next) > + > + if (next) { > next->s_fence->scheduled.timestamp = > job->s_fence->finished.timestamp; > - > + /* start TO timer for next job */ > + drm_sched_start_timeout(sched); > + } > } else { > job = NULL; > - /* queue timeout for next job */ > - drm_sched_start_timeout(sched); > } > > spin_unlock(>job_list_lock); > @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) > (entity = > drm_sched_select_entity(sched))) || >kthread_should_stop()); > > - if (cleanup_job) { > + if (cleanup_job) > sched->ops->free_job(cleanup_job); > - /* queue timeout for next job */ > - drm_sched_start_timeout(sched); > - } > > if (!entity) > continue;
RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)
[AMD Official Use Only] >>But for timer pending case (common case) your mod_delayed_work will >>effectively do exactly the same if you don't use per job TTLs - you mod it to >> sched->timeout value which resets the pending timer to again count from 0. Ny patch will only modify the timer (restart it , actually) when the heading job is signaled, which means on HW ring the next job is just about start processing. If the job is not signaled (your common case) the timer is still not touched at all ... >> I just wonder why we stopped using per job TDR timers in the first place ? >> Isn't the simplest way to count accurate timeouts for each job is to >> actually measure the timeouts for each job separately ? I'm not sure if Christian can recall something, and I believe it is due to some limitations we found (or some race issue like two job on the same scheduler TO in the same time, which is probably if they are scheduled to the ring almost in the same timeframe) Anyway I have a V3 version patch, please take a look, it looks working for me Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Thursday, August 26, 2021 11:05 AM To: Liu, Monk ; Christian König ; amd-...@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) On 2021-08-25 10:31 p.m., Liu, Monk wrote: > [AMD Official Use Only] > > Hi Andrey > > I'm not quite sure if I read you correctly > >>> Seems to me you can only do it for empty pending list otherwise you risk >>> cancelling a legit new timer that was started by the next job or not >>> restarting timer at all since your timer was still pending when next job >>> tried to start it again (the common case). > I don't understand above sentence, from my understanding for the > common case, if the timer is pending, the cancel_delay_work in > beginning will cancel it and then we will get to the line of "queue > timeout for next job" since the heading job is not signaled (align > with the timer is pending), then the timer will be restarted (for the > next job) Ignore it, i realized from looking that i missed the timer restart in then end of drm_sched_get_cleanup_job or the alternative one in drm_sched_main > > And above sequence is actually wrong to me, because we cancelled a > pending timer and restart the timer for the scheduler that its heading > job is still running there, the whole counting is repeated from zero > and inaccurate at all But for timer pending case (common case) your mod_delayed_work will effectively do exactly the same if you don't use per job TTLs - you mod it to sched->timeout value which resets the pending timer to again count from 0. I just wonder why we stopped using per job TDR timers in the first place ? Isn't the simplest way to count accurate timeouts for each job is to actually measure the timeouts for each job separately ? Andrey > > > Thanks > > -- > Monk Liu | Cloud-GPU Core team > ------ > > -Original Message- > From: Grodzovsky, Andrey > Sent: Thursday, August 26, 2021 2:20 AM > To: Christian König ; Liu, Monk > ; amd-...@lists.freedesktop.org; dri-devel > > Subject: Re: [PATCH] drm/sched: fix the bug of time out > calculation(v2) > > > On 2021-08-25 8:11 a.m., Christian König wrote: >> No, this would break that logic here. >> >> See drm_sched_start_timeout() can be called multiple times, this is >> intentional and very important! >> >> The logic in queue_delayed_work() makes sure that the timer is only >> started once and then never again. >> >> All we need to take care of is to cancel_delayed_work() when we know >> that the job is completed. > > Seems to me you can only do it for empty pending list otherwise you risk > cancelling a legit new timer that was started by the next job or not > restarting timer at all since your timer was still pending when next job > tried to start it again (the common case). > For non empty pending list you have to adjust the currently active TDR's > timer from your's job TTL to TTL to the next job after you or just restart it > as Monk does it here which prolongs the timeout more then required but still > ok i guess. > > What about returning to the old scheme of timer sched_work per job so > each job has it's own timer and we don't share it and everything is > precise for each job, using the locking scheme we already have today > the actual TDR handler will execute only once while all the other
RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)
[AMD Official Use Only] Hi Andrey I'm not quite sure if I read you correctly >>Seems to me you can only do it for empty pending list otherwise you risk >>cancelling a legit new timer that was started by the next job or not >>restarting timer at all since your timer was still pending when next job >>tried to start it again (the common case). I don't understand above sentence, from my understanding for the common case, if the timer is pending, the cancel_delay_work in beginning will cancel it and then we will get to the line of "queue timeout for next job" since the heading job is not signaled (align with the timer is pending), then the timer will be restarted (for the next job) And above sequence is actually wrong to me, because we cancelled a pending timer and restart the timer for the scheduler that its heading job is still running there, the whole counting is repeated from zero and inaccurate at all Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Thursday, August 26, 2021 2:20 AM To: Christian König ; Liu, Monk ; amd-...@lists.freedesktop.org; dri-devel Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) On 2021-08-25 8:11 a.m., Christian König wrote: > No, this would break that logic here. > > See drm_sched_start_timeout() can be called multiple times, this is > intentional and very important! > > The logic in queue_delayed_work() makes sure that the timer is only > started once and then never again. > > All we need to take care of is to cancel_delayed_work() when we know > that the job is completed. Seems to me you can only do it for empty pending list otherwise you risk cancelling a legit new timer that was started by the next job or not restarting timer at all since your timer was still pending when next job tried to start it again (the common case). For non empty pending list you have to adjust the currently active TDR's timer from your's job TTL to TTL to the next job after you or just restart it as Monk does it here which prolongs the timeout more then required but still ok i guess. What about returning to the old scheme of timer sched_work per job so each job has it's own timer and we don't share it and everything is precise for each job, using the locking scheme we already have today the actual TDR handler will execute only once while all the other arising from the guilty job hang will be rejected (for amdgpu, for other drivers it probably requires same locking or we can move this to the scheduler layer) Andrey > > This here works as intended as far as I can see and if you start to > use mod_delayed_work() you actually break it. > > Regards, > Christian. > > Am 25.08.21 um 14:01 schrieb Liu, Monk: >> [AMD Official Use Only] >> >> I think we should remove the cancel_delayed_work() in the beginning >> of the cleanup_job(). >> >> Because by my patch the "mode_delayed_work" in cleanup_job is already >> doing its duty to retrigger the TO timer accordingly >> >> Thanks >> >> ------ >> Monk Liu | Cloud-GPU Core team >> -- >> >> -Original Message- >> From: Liu, Monk >> Sent: Wednesday, August 25, 2021 7:55 PM >> To: 'Christian König' ; >> amd-...@lists.freedesktop.org >> Subject: RE: [PATCH] drm/sched: fix the bug of time out >> calculation(v2) >> >> [AMD Official Use Only] >> >>>> The timeout started by queue_delayed_work() in >>>> drm_sched_start_timeout() is paired with the cancel_delayed_work() >>>> in drm_sched_get_cleanup_job(). >> No that's wrong, see that when we are in cleanup_job(), assume we do >> not have timeout on this sched (we are just keep submitting new jobs >> to this sched), Then the work_tdr is cancelled, and then we get the >> heading job, and let's assume the job is not signaled, then we run to >> the "queue timeout for next job" thus drm_sched_start_timeout() is >> called, so this heading job's TO timer is actually retriggered ... >> which is totally wrong. >> >> With my patch the timer is already retriggered after previous JOB >> really signaled. >> >> Can you be more specific on the incorrect part ? >> >> Thanks >> -- >> Monk Liu | Cloud-GPU Core team >> -- >> >> -Original Message- >> From: Christian König >> Sent: Wednesday, August 25, 2021 2:32 PM >> To: Liu, Monk ; amd-..
RE: [PATCH] drm/sched: fix the bug of time out calculation(v2)
[AMD Official Use Only] >> All we need to take care of is to cancel_delayed_work() when we know that >> the job is completed. That's why I want to remove the cancel_delayed_work in the beginning of cleanup_job(), because in that moment we don't know if There is a job completed (sched could be wake up due to new submit, instead of a job signaled) , until we get the job and acknowledged of its signaling. static struct drm_sched_job * drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; /* * Don't destroy jobs while the timeout worker is running OR thread * is being parked and hence assumed to not touch pending_list */ if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && !cancel_delayed_work(>work_tdr)) || //normally if the job is not TO, then he cancel here is incorrect if the job is still running , kthread_should_park()) return NULL; spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); if (next) next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; } else { job = NULL; /* queue timeout for next job */ drm_sched_start_timeout(sched); //if the job is not signaled, the timer will be retriggered here (counting is restarted ) , which is wrong } spin_unlock(>job_list_lock); return job; } >> This here works as intended as far as I can see and if you start to use >> mod_delayed_work() you actually break it. Only in the place we find heading job is signaled and there is a next job is the moment that we should cancel the work_tdr for this scheduler , of cause with A new work_tdr queued as the "next" job is already started on HW... that's why I use mod_delayed_work. But I can change it to "cancel and queue" approach if you have concern. Thanks -- Monk Liu | Cloud-GPU Core team -- -----Original Message- From: Christian König Sent: Wednesday, August 25, 2021 8:11 PM To: Liu, Monk ; amd-...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v2) No, this would break that logic here. See drm_sched_start_timeout() can be called multiple times, this is intentional and very important! The logic in queue_delayed_work() makes sure that the timer is only started once and then never again. All we need to take care of is to cancel_delayed_work() when we know that the job is completed. This here works as intended as far as I can see and if you start to use mod_delayed_work() you actually break it. Regards, Christian. Am 25.08.21 um 14:01 schrieb Liu, Monk: > [AMD Official Use Only] > > I think we should remove the cancel_delayed_work() in the beginning of the > cleanup_job(). > > Because by my patch the "mode_delayed_work" in cleanup_job is already > doing its duty to retrigger the TO timer accordingly > > Thanks > > ---------- > Monk Liu | Cloud-GPU Core team > -- > > -Original Message- > From: Liu, Monk > Sent: Wednesday, August 25, 2021 7:55 PM > To: 'Christian König' ; > amd-...@lists.freedesktop.org > Subject: RE: [PATCH] drm/sched: fix the bug of time out > calculation(v2) > > [AMD Official Use Only] > >>> The timeout started by queue_delayed_work() in drm_sched_start_timeout() is >>> paired with the cancel_delayed_work() in drm_sched_get_cleanup_job(). > No that's wrong, see that when we are in cleanup_job(), assume we do not have > timeout on this sched (we are just keep submitting new jobs to this sched), > Then the work_tdr is cancelled, and then we get the heading job, and let's > assume the job is not signaled, then we run to the "queue timeout for next > job" thus drm_sched_start_timeout() is called, so this heading job's TO timer > is actually retriggered ... which is totally wrong. > > With my patch the timer is already retriggered after previous JOB really > signaled. > > Can you be more specific on the incorrect part ? > > Thanks > --
RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
[AMD Official Use Only] Hi Andrey Sorry that it is really hard for me to get any particular or solid potential bugs from your reply, can you be more specific, e.g.: what kind of race issue is introduced by this "kthread_stop/start" approach. To your another question/concern: >> . In a constant rapid stream of jobs each new job comming will try to start >> the timer but most of the time this operation just bails out as there is >> already pending timer from one of the previous jobs which cancels out any >> new ones [1] so, when the TO handler does execute eventually it's not >> because something wrong but simply because TO has Expired I totally agree withy you on this point, and I think I have a patch to address this, but this problem is not related with our current topic at all ... our current topic is the bailout bad job handling from advanced TDR mode. The bug here is our current TO handler only do the counting on the first job to the given scheduler, and the following coming job won't recalculate the TO at all, and I can assure you that this is a regression because when I implement TDR years ago I already considered planned for such problem. Please check this change to resolve it: diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..7b5f99a 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -235,6 +235,13 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched) schedule_delayed_work(>work_tdr, sched->timeout); } +static void drm_sched_restart_timeout(struct drm_gpu_scheduler *sched) +{ + if (sched->timeout != MAX_SCHEDULE_TIMEOUT && + !list_empty(>pending_list)) + mod_delayed_work(system_wq, >work_tdr, sched->timeout); +} + /** * drm_sched_fault - immediately start timeout handler * @@ -693,6 +682,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* once the job deleted from pending list we should restart +* the timeout calculation for the next job. +*/ + drm_sched_restart_timeout(sched); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); if you guys do not have concerns I can submit this patch for review, but again, let's focus on bailing out had job handling as our priority, we are very close to our purpose, let me know what's your concerned race issue and we can address it. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Grodzovsky, Andrey Sent: Friday, August 20, 2021 10:07 PM To: Liu, Monk ; Daniel Vetter ; Koenig, Christian Cc: Alex Deucher ; Chen, JingWen ; Maling list - DRI developers ; amd-gfx list Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job." On 2021-08-20 3:12 a.m., Liu, Monk wrote: > [AMD Official Use Only] > > @Daniel Vetter @Grodzovsky, Andrey @Koenig, Christian > > Do you have any concern on the kthread_park() approach ? > > Theoretically speaking sched_main shall run there exclusively with > job_timeout since they both touches jobs, and stop scheduler during > job_timeout won't impact performance since in that scenario There was > already something wrong/stuck on that ring/scheduler Regarding last paragraph, and specifically the claim that there was already something wrong if the TO handler starts execution - Not sure about this and I wonder if we have a potential bug here - when we start the timeout timer in drm_sched_job_begin we do it for each new incoming job. In a constant rapid stream of jobs each new job comming will try to start the timer but most of the time this operation just bails out as there is already pending timer from one of the previous jobs which cancels out any new ones [1] so, when the TO handler does execute eventually it's not because something wrong but simply because TO has expired. If in this case the pending list not empty a false TDR will be triggered. I think long ago we used TO handler per job and not per scheduler, this would solve this problem but hurt the serialization issue we are trying to solve. So not sure what to do. [1] - https://elixir.bootlin.com/linux/v5.14-rc1/source/kernel/workqueue.c#L1665 Andrey > > Thanks > > ---------- > Monk Liu | Cloud-GPU Core team > -- > > -Original Me
RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
[AMD Official Use Only] @Daniel Vetter @Grodzovsky, Andrey @Koenig, Christian Do you have any concern on the kthread_park() approach ? Theoretically speaking sched_main shall run there exclusively with job_timeout since they both touches jobs, and stop scheduler during job_timeout won't impact performance since in that scenario There was already something wrong/stuck on that ring/scheduler Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Liu, Monk Sent: Thursday, August 19, 2021 6:26 PM To: Daniel Vetter ; Grodzovsky, Andrey Cc: Alex Deucher ; Chen, JingWen ; Maling list - DRI developers ; amd-gfx list ; Koenig, Christian Subject: RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job." [AMD Official Use Only] Hi Daniel >> Why can't we stop the scheduler thread first, so that there's guaranteed no >> race? I've recently had a lot of discussions with panfrost folks about their >> reset that spawns across engines, and without stopping the scheduler thread >> first before you touch anything it's just plain impossible. Yeah we had this though as well in our mind. Our second approach is to call ktrhead_stop() in job_timedout() routine so that the "bad" job is guaranteed to be used without scheduler's touching or freeing, Check this sample patch one as well please: diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..50a49cb 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + kthread_park(sched->thread); spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); status = job->sched->ops->timedout_job(job); @@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct *work) } else { spin_unlock(>job_list_lock); } + kthread_unpark(sched->thread); if (status != DRM_GPU_SCHED_STAT_ENODEV) { spin_lock(>job_list_lock); @@ -393,20 +389,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Thursday, August 19, 2021 5:31 PM To: Grodzovsky, Andrey Cc: Daniel Vetter ; Alex Deucher ; Chen, JingWen ; Maling list - DRI developers ; amd-gfx list ; Liu, Monk ; Koenig, Christian Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job." On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote: > > On 2021-08-18 10:42 a.m., Daniel Vetter wrote: > > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote: > > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote: > > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote: > > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote: > > > > > > > > > > > + dri-devel > > > > > > > > > > > > Since scheduler is a shared component, please add dri-devel > > > > > > on all scheduler patches. &g
RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
[AMD Official Use Only] Hi Daniel >> Why can't we stop the scheduler thread first, so that there's guaranteed no >> race? I've recently had a lot of discussions with panfrost folks about their >> reset that spawns across engines, and without stopping the scheduler thread >> first before you touch anything it's just plain impossible. Yeah we had this though as well in our mind. Our second approach is to call ktrhead_stop() in job_timedout() routine so that the "bad" job is guaranteed to be used without scheduler's touching or freeing, Check this sample patch one as well please: diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..50a49cb 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + kthread_park(sched->thread); spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); status = job->sched->ops->timedout_job(job); @@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct *work) } else { spin_unlock(>job_list_lock); } + kthread_unpark(sched->thread); if (status != DRM_GPU_SCHED_STAT_ENODEV) { spin_lock(>job_list_lock); @@ -393,20 +389,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Thursday, August 19, 2021 5:31 PM To: Grodzovsky, Andrey Cc: Daniel Vetter ; Alex Deucher ; Chen, JingWen ; Maling list - DRI developers ; amd-gfx list ; Liu, Monk ; Koenig, Christian Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job." On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote: > > On 2021-08-18 10:42 a.m., Daniel Vetter wrote: > > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote: > > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote: > > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote: > > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote: > > > > > > > > > > > + dri-devel > > > > > > > > > > > > Since scheduler is a shared component, please add dri-devel > > > > > > on all scheduler patches. > > > > > > > > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen > > > > > > wrote: > > > > > > > [Why] > > > > > > > for bailing job, this commit will delete it from pending > > > > > > > list thus the bailing job will never have a chance to be > > > > > > > resubmitted even in advance tdr mode. > > > > > > > > > > > > > > [How] > > > > > > > after embeded hw_fence into amdgpu_job is done, the race > > > > > > > condition that this commit tries to work around is > > > > > > > completely solved.So revert this commit. > > > > > > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
[AMD Official Use Only] Hi Andrey and Daniel We worked for a really long time on this new feature to AMD that finally can pick up the bad job from all timedout ones, and the change in scheduler (get/put fence in drm_sched_job_timedout, and remove the bad job delete and put back) is the last piece for us. While we understand and realized that after the "bad job list node delete logic" being removed from job_timedout, there will be race issues introduced if vendor's job_timeout calback is accessing the bad job in parallel of scheduler doing "sched->ops->free_job(leanup_job)". And to not introduce impact at all on those vendors I'd like to proposal a very simple change (which introduced a new bool member for scheduler to indicate if the del/put-back logic is needed or not) , check patch here below: diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 47ea468..5e0bdc4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -495,6 +495,8 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, return r; } + ring->sched.keep_bad_job = true; + return 0; } diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 92d8de2..e7ac384 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work) { struct drm_gpu_scheduler *sched; struct drm_sched_job *job; + struct dma_fence *f = NULL; sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); @@ -328,7 +329,11 @@ static void drm_sched_job_timedout(struct work_struct *work) * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread * is parked at which point it's safe. */ - list_del_init(>list); + if (sched->keep_bad_job == false) + list_del_init(>list); + else + f = dma_fence_get(job->s_fence->parent);//get parent fence here to prevent hw_fence dropping to zero due to sched-main's cleanup_jobs, for amdgpu once parent fence drop to zero the sched_job will be kfree-ed + spin_unlock(>job_list_lock); job->sched->ops->timedout_job(job); @@ -341,6 +346,8 @@ static void drm_sched_job_timedout(struct work_struct *work) job->sched->ops->free_job(job); sched->free_guilty = false; } + + dma_fence_put(f); } else { spin_unlock(>job_list_lock); } @@ -396,7 +403,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) * (earlier) cleanups and drm_sched_get_cleanup_job will not be called * now until the scheduler thread is unparked. */ - if (bad && bad->sched == sched) + if (bad && bad->sched == sched && sched->keep_bad_job == false) /* * Add at the head of the queue to reflect it was the earliest * job extracted. diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 4ea8606..5f9a640 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -301,6 +301,7 @@ struct drm_gpu_scheduler { atomic_t_score; boolready; boolfree_guilty; + bool keep_bad_job; }; int drm_sched_init(struct drm_gpu_scheduler *sched, Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Wednesday, August 18, 2021 10:43 PM To: Grodzovsky, Andrey Cc: Daniel Vetter ; Alex Deucher ; Chen, JingWen ; Maling list - DRI developers ; amd-gfx list ; Liu, Monk ; Koenig, Christian Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job." On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote: > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote: > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote: > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote: > > > > > > > + dri-devel > > > > > > > > Since scheduler is a shared component, please add dri-devel on > > > > all scheduler patches. > > > > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen > > > > wrote: > > > > > [Why] > > > > > for bailing job, this commit will delete it from pending list &g
RE: 回复: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak
[AMD Official Use Only - Internal Distribution Only] Hi Christian, We don't need to debate on the design's topic, each of us have our own opinion, it is hard to persuade others sometimes, again with more and more features and requirements it is pretty normal that an old design need to Refine and or even rework to satisfy all those needs, so I'm not trying to argue with you that we don't need a better rework, that's also pleasure me . In the moment, the more important thing I care is the solution because SRIOV project still try best to put all changes into upstreaming tree, we don't want to fork another tree unless no choice ... Let's have a sync in another thread Thanks for you help on this -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Koenig, Christian Sent: Friday, March 26, 2021 10:51 PM To: Liu, Monk ; Zhang, Jack (Jian) ; Grodzovsky, Andrey ; Christian König ; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; Deng, Emily ; Rob Herring ; Tomeu Vizoso ; Steven Price Cc: Zhang, Andy ; Jiang, Jerry (SW) Subject: Re: 回复: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak Hi Monk, I can't disagree more. The fundamental problem here is that we have pushed a design without validating if it really fits into the concepts the Linux kernel mandates here. My mistake was that I haven't pushed back hard enough on the initial design resulting in numerous cycles of trying to save the design while band aiding the flaws which became obvious after a while. I haven't counted them but I think we are now already had over 10 patches which try to work around lifetime issues of the job object because I wasn't able to properly explain why this isn't going to work like this. Because of this I will hard reject any attempt to band aid this issue even more which isn't starting over again with a design which looks like it is going to work. Regards, Christian. Am 26.03.21 um 12:21 schrieb Liu, Monk: > [AMD Official Use Only - Internal Distribution Only] > > Hi Christian > > This is not correct or correct perspective, any design comes with its > pros and cons, otherwise it wouldn't comes to kernel tree in the very > beginning , it is just with time passed we have more and more > requirement and feature need to implement And those new requirement > drags many new solution or idea, and some idea you prefer need to > based on a new infrastructure, that's all > > I don't why the job "should be" or not "should be" in the scheduler, > honestly speaking I can argue with you that the "scheduler" and the TDR > feature which invented by AMD developer "should" never escalate to drm layer > at all and by that assumption Those vendor's compatibilities headache right > now won't happen at all. > > Let's just focus on the issue so far. > > The solution Andrey and Jack doing right now looks good to me, and it > can solve our problems without introducing regression from a surface > look, but it is fine if you need a neat solution, since we have our > project pressure (which we always have) Either we implement the first > version with Jack's patch and do the revise in another series of > patches (that also my initial suggestion) or we rework anything you > mentioned, but since looks to me you are from time to time asking > people to rework Something in the stage that people already have a > solution, which frustrated people a lot, > > I would like you do prepare a solution for us, which solves our > headaches ... I really don't want to see you asked Jack to rework again and > again If you are out of bandwidth or no interest in doing this ,please at > least make your solution/proposal very detail and clear, jack told me he > couldn't understand your point here. > > Thanks very much, and please understand our painful here > > /Monk > > > -邮件原件- > 发件人: Koenig, Christian > 发送时间: 2021年3月26日 17:06 > 收件人: Zhang, Jack (Jian) ; Grodzovsky, Andrey > ; Christian König > ; dri-devel@lists.freedesktop.org; > amd-...@lists.freedesktop.org; Liu, Monk ; Deng, > Emily ; Rob Herring ; Tomeu > Vizoso ; Steven Price > > 主题: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid > memleak > > Hi guys, > > Am 26.03.21 um 03:23 schrieb Zhang, Jack (Jian): >> [AMD Official Use Only - Internal Distribution Only] >> >> Hi, Andrey, >> >>>> how u handle non guilty singnaled jobs in drm_sched_stop, currently >>>> looks like you don't call put for them and just explicitly free >>>> them as before >> Good point, I missed that place. Will cover that in my next patch. >> >>>> Also
回复: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak
[AMD Official Use Only - Internal Distribution Only] Hi Christian This is not correct or correct perspective, any design comes with its pros and cons, otherwise it wouldn't comes to kernel tree in the very beginning , it is just with time passed we have more and more requirement and feature need to implement And those new requirement drags many new solution or idea, and some idea you prefer need to based on a new infrastructure, that's all I don't why the job "should be" or not "should be" in the scheduler, honestly speaking I can argue with you that the "scheduler" and the TDR feature which invented by AMD developer "should" never escalate to drm layer at all and by that assumption Those vendor's compatibilities headache right now won't happen at all. Let's just focus on the issue so far. The solution Andrey and Jack doing right now looks good to me, and it can solve our problems without introducing regression from a surface look, but it is fine if you need a neat solution, since we have our project pressure (which we always have) Either we implement the first version with Jack's patch and do the revise in another series of patches (that also my initial suggestion) or we rework anything you mentioned, but since looks to me you are from time to time asking people to rework Something in the stage that people already have a solution, which frustrated people a lot, I would like you do prepare a solution for us, which solves our headaches ... I really don't want to see you asked Jack to rework again and again If you are out of bandwidth or no interest in doing this ,please at least make your solution/proposal very detail and clear, jack told me he couldn't understand your point here. Thanks very much, and please understand our painful here /Monk -邮件原件- 发件人: Koenig, Christian 发送时间: 2021年3月26日 17:06 收件人: Zhang, Jack (Jian) ; Grodzovsky, Andrey ; Christian König ; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; Liu, Monk ; Deng, Emily ; Rob Herring ; Tomeu Vizoso ; Steven Price 主题: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid memleak Hi guys, Am 26.03.21 um 03:23 schrieb Zhang, Jack (Jian): > [AMD Official Use Only - Internal Distribution Only] > > Hi, Andrey, > >>> how u handle non guilty singnaled jobs in drm_sched_stop, currently >>> looks like you don't call put for them and just explicitly free them >>> as before > Good point, I missed that place. Will cover that in my next patch. > >>> Also sched->free_guilty seems useless with the new approach. > Yes, I agree. > >>> Do we even need the cleanup mechanism at drm_sched_get_cleanup_job with >>> this approach... > I am not quite sure about that for now, let me think about this topic today. > > Hi, Christian, > should I add a fence and get/put to that fence rather than using an explicit > refcount? > And another concerns? well let me re-iterate: For the scheduler the job is just a temporary data structure used for scheduling the IBs to the hardware. While pushing the job to the hardware we get a fence structure in return which represents the IBs executing on the hardware. Unfortunately we have applied a design where the job structure is rather used for re-submitting the jobs to the hardware after a GPU reset and karma handling etc etc... All that shouldn't have been pushed into the scheduler into the first place and we should now work on getting this cleaned up rather than making it an even bigger mess by applying halve backed solutions. So in my opinion adding a reference count to the job is going into the completely wrong directly. What we should rather do is to fix the incorrect design decision to use jobs as vehicle in the scheduler for reset handling. To fix this I suggest the following approach: 1. We add a pointer from the drm_sched_fence back to the drm_sched_job. 2. Instead of keeping the job around in the scheduler we keep the fence around. For this I suggest to replace the pending_list with a ring buffer. 3. The timedout_job callback is replaced with a timeout_fence callback. 4. The free_job callback is completed dropped. Job lifetime is now handled in the driver, not the scheduler. Regards, Christian. > > Thanks, > Jack > > -Original Message- > From: Grodzovsky, Andrey > Sent: Friday, March 26, 2021 12:32 AM > To: Zhang, Jack (Jian) ; Christian König > ; dri-devel@lists.freedesktop.org; > amd-...@lists.freedesktop.org; Koenig, Christian > ; Liu, Monk ; Deng, Emily > ; Rob Herring ; Tomeu Vizoso > ; Steven Price > Subject: Re: [PATCH v3] drm/scheduler re-insert Bailing job to avoid > memleak > > There are a few issues here like - how u handle non guilty singnaled jobs in > drm_sched_stop, currently looks like you don't call put for them and
RE: [PATCH 3/5] drm/amdgpu: Paper over the drm_driver mangling for virt
[AMD Official Use Only - Internal Distribution Only] What's the purpose of the patch sets e.g.: what bug can those 5 patches fix or what feature provided for this particular one (3/5) I didn't see how it helpful, could you give a background ? thanks _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Daniel Vetter Sent: Friday, October 30, 2020 6:11 PM To: DRI Development Cc: Intel Graphics Development ; Daniel Vetter ; Deucher, Alexander ; Koenig, Christian ; Quan, Evan ; Kuehling, Felix ; Zhang, Hawking ; Grodzovsky, Andrey ; Tuikov, Luben ; Thomas Zimmermann ; Liu, Monk ; Yintian Tao ; Li, Dennis ; Liu, Shaoyun ; Zhang, Bokun ; Yang, Stanley ; Sheng, Wenhui ; Gong, Curry ; Daniel Vetter Subject: [PATCH 3/5] drm/amdgpu: Paper over the drm_driver mangling for virt Prep work to make drm_device->driver const. Signed-off-by: Daniel Vetter Cc: Alex Deucher Cc: "Christian König" Cc: Evan Quan Cc: Felix Kuehling Cc: Hawking Zhang Cc: Andrey Grodzovsky Cc: Luben Tuikov Cc: Thomas Zimmermann Cc: Monk Liu Cc: Yintian Tao Cc: Dennis Li Cc: shaoyunl Cc: Bokun Zhang Cc: "Stanley.Yang" Cc: Wenhui Sheng Cc: chen gong Signed-off-by: Daniel Vetter --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 12 +++- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 024c3b70b1aa..3d337f13ae4e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1093,7 +1093,7 @@ static const struct pci_device_id pciidlist[] = { MODULE_DEVICE_TABLE(pci, pciidlist); -static struct drm_driver kms_driver; +struct drm_driver amdgpu_kms_driver; static int amdgpu_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) @@ -1164,7 +1164,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, if (ret) return ret; -adev = devm_drm_dev_alloc(>dev, _driver, typeof(*adev), ddev); +adev = devm_drm_dev_alloc(>dev, _kms_driver, +typeof(*adev), ddev); if (IS_ERR(adev)) return PTR_ERR(adev); @@ -1508,7 +1508,7 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv) return 0; } -static struct drm_driver kms_driver = { +struct drm_driver amdgpu_kms_driver = { .driver_features = DRIVER_ATOMIC | DRIVER_GEM | @@ -1571,7 +1571,7 @@ static int __init amdgpu_init(void) goto error_fence; DRM_INFO("amdgpu kernel modesetting enabled.\n"); -kms_driver.num_ioctls = amdgpu_max_kms_ioctl; +amdgpu_kms_driver.num_ioctls = amdgpu_max_kms_ioctl; amdgpu_register_atpx_handler(); /* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index d0aea5e39531..dde4c449c284 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -45,13 +45,23 @@ bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev) return RREG32_NO_KIQ(0xc040) == 0x; } +extern struct drm_driver amdgpu_kms_driver; + void amdgpu_virt_init_setting(struct amdgpu_device *adev) { /* enable virtual display */ if (adev->mode_info.num_crtc == 0) adev->mode_info.num_crtc = 1; adev->enable_virtual_display = true; -adev_to_drm(adev)->driver->driver_features &= ~DRIVER_ATOMIC; + +/* + * FIXME: Either make virt support atomic or make sure you have two + * drm_driver structs, these kind of tricks are only ok when there's + * guaranteed only a single device per system. This should also be done + * before struct drm_device is initialized. + */ +amdgpu_kms_driver.driver_features &= ~DRIVER_ATOMIC; + adev->cg_flags = 0; adev->pg_flags = 0; } -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] drm/scheduler: use job count instead of peek
Reviewed-by: monk@amd.com _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Christian König Sent: Friday, August 9, 2019 11:31 PM To: Grodzovsky, Andrey ; dri-devel@lists.freedesktop.org; Liu, Monk ; Deng, Emily Subject: [PATCH] drm/scheduler: use job count instead of peek The spsc_queue_peek function is accessing queue->head which belongs to the consumer thread and shouldn't be accessed by the producer This is fixing a rare race condition when destroying entities. Signed-off-by: Christian König --- drivers/gpu/drm/scheduler/sched_entity.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 35ddbec1375a..671c90f34ede 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -95,7 +95,7 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity) rmb(); /* for list_empty to work without lock */ if (list_empty(>list) || - spsc_queue_peek(>job_queue) == NULL) + spsc_queue_count(>job_queue) == 0) return true; return false; @@ -281,7 +281,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity) /* Consumption of existing IBs wasn't completed. Forcefully * remove them here. */ - if (spsc_queue_peek(>job_queue)) { + if (spsc_queue_count(>job_queue)) { if (sched) { /* Park the kernel for a moment to make sure it isn't processing * our enity. -- 2.17.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] drm: should return upon the best size(v3)
Oh, yeah ... I find one aspect, we need to consider "range_start" and "range_end" Yeah, you guys are right, cool /Monk -Original Message- From: Liu, Monk Sent: Tuesday, November 27, 2018 10:10 PM To: Koenig, Christian ; Chris Wilson ; dri-de...@freedesktop.org Subject: RE: [PATCH] drm: should return upon the best size(v3) The logic this patch change is only for "best_hole" which is only get called with " DRM_MM_INSERT_BEST", In drm_mm_insert_node_in_range(), is there other aspect also need calculation and judge for the mode " DRM_MM_INSERT_BEST" ?? /Monk -Original Message- From: Christian König Sent: Tuesday, November 27, 2018 9:48 PM To: Liu, Monk ; Koenig, Christian ; Chris Wilson ; dri-de...@freedesktop.org Subject: Re: [PATCH] drm: should return upon the best size(v3) Am 27.11.18 um 14:40 schrieb Liu, Monk: >> A node with the searched size is not necessary the leftmost one, > We may have two nodes with the same size, and the one return first will be > sure *not* the leftmost one, I aware of that ... > But my question is why we need the leftmost one ? Because the code is designed to iterate over all available nodes. The size is just the primary criteria to judge on. If we won't return all nodes with the same size we won't necessary find a fitting one. See how the code is used in drm_mm_insert_node_in_range(). Christian. > > > -Original Message- > From: Christian König > Sent: Tuesday, November 27, 2018 8:54 PM > To: Chris Wilson ; Liu, Monk > ; dri-de...@freedesktop.org > Subject: Re: [PATCH] drm: should return upon the best size(v3) > > Am 27.11.18 um 11:00 schrieb Christian König: >> Am 27.11.18 um 10:20 schrieb Chris Wilson: >>> Quoting Monk Liu (2018-11-27 03:10:34) >>>> v2: >>>> amend description: >>>> for RB tree traveler we don't need to travel to the bottom level if >>>> already found the equal size node, thus the search performance can >>>> get improved. >>>> >>>> v3: >>>> split "<=" to "<" case and "==" case >>>> >>>> Tested-by: Rex Zhu >>>> Signed-off-by: Monk Liu >>> Still fundamentally broken. >> Can you explain that further? Do we need to return the deepest hole >> of the right size because the following algorithm depends on that? > Ok figured it out myself by thinking more about it. > > A node with the searched size is not necessary the leftmost one, so we would > not see all nodes with the searched size and potentially use the wrong one. > > Sorry Monk, but Chris is right this optimization is illegal. > > Regards, > Christian. > >> Thanks, >> Christian. >> >>> -Chris >>> ___ >>> dri-devel mailing list >>> dri-devel@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] drm: should return upon the best size(v3)
The logic this patch change is only for "best_hole" which is only get called with " DRM_MM_INSERT_BEST", In drm_mm_insert_node_in_range(), is there other aspect also need calculation and judge for the mode " DRM_MM_INSERT_BEST" ?? /Monk -Original Message- From: Christian König Sent: Tuesday, November 27, 2018 9:48 PM To: Liu, Monk ; Koenig, Christian ; Chris Wilson ; dri-de...@freedesktop.org Subject: Re: [PATCH] drm: should return upon the best size(v3) Am 27.11.18 um 14:40 schrieb Liu, Monk: >> A node with the searched size is not necessary the leftmost one, > We may have two nodes with the same size, and the one return first will be > sure *not* the leftmost one, I aware of that ... > But my question is why we need the leftmost one ? Because the code is designed to iterate over all available nodes. The size is just the primary criteria to judge on. If we won't return all nodes with the same size we won't necessary find a fitting one. See how the code is used in drm_mm_insert_node_in_range(). Christian. > > > -Original Message- > From: Christian König > Sent: Tuesday, November 27, 2018 8:54 PM > To: Chris Wilson ; Liu, Monk > ; dri-de...@freedesktop.org > Subject: Re: [PATCH] drm: should return upon the best size(v3) > > Am 27.11.18 um 11:00 schrieb Christian König: >> Am 27.11.18 um 10:20 schrieb Chris Wilson: >>> Quoting Monk Liu (2018-11-27 03:10:34) >>>> v2: >>>> amend description: >>>> for RB tree traveler we don't need to travel to the bottom level if >>>> already found the equal size node, thus the search performance can >>>> get improved. >>>> >>>> v3: >>>> split "<=" to "<" case and "==" case >>>> >>>> Tested-by: Rex Zhu >>>> Signed-off-by: Monk Liu >>> Still fundamentally broken. >> Can you explain that further? Do we need to return the deepest hole >> of the right size because the following algorithm depends on that? > Ok figured it out myself by thinking more about it. > > A node with the searched size is not necessary the leftmost one, so we would > not see all nodes with the searched size and potentially use the wrong one. > > Sorry Monk, but Chris is right this optimization is illegal. > > Regards, > Christian. > >> Thanks, >> Christian. >> >>> -Chris >>> ___ >>> dri-devel mailing list >>> dri-devel@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] drm: should return upon the best size(v3)
> A node with the searched size is not necessary the leftmost one, We may have two nodes with the same size, and the one return first will be sure *not* the leftmost one, I aware of that ... But my question is why we need the leftmost one ? -Original Message- From: Christian König Sent: Tuesday, November 27, 2018 8:54 PM To: Chris Wilson ; Liu, Monk ; dri-de...@freedesktop.org Subject: Re: [PATCH] drm: should return upon the best size(v3) Am 27.11.18 um 11:00 schrieb Christian König: > Am 27.11.18 um 10:20 schrieb Chris Wilson: >> Quoting Monk Liu (2018-11-27 03:10:34) >>> v2: >>> amend description: >>> for RB tree traveler we don't need to travel to the bottom level if >>> already found the equal size node, thus the search performance can >>> get improved. >>> >>> v3: >>> split "<=" to "<" case and "==" case >>> >>> Tested-by: Rex Zhu >>> Signed-off-by: Monk Liu >> Still fundamentally broken. > > Can you explain that further? Do we need to return the deepest hole of > the right size because the following algorithm depends on that? Ok figured it out myself by thinking more about it. A node with the searched size is not necessary the leftmost one, so we would not see all nodes with the searched size and potentially use the wrong one. Sorry Monk, but Chris is right this optimization is illegal. Regards, Christian. > > Thanks, > Christian. > >> -Chris >> ___ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
FW: [PATCH] drm: should break if already get the best size
Hi Chris Please check the sanity test of the patch from Rex /Monk From: Zhu, Rex Sent: Friday, November 23, 2018 5:45 PM To: Liu, Monk ; amd-...@lists.freedesktop.org Subject: Re: [PATCH] drm: should break if already get the best size Tested-by: Rex Zhu mailto:rex@amd.com>> Without this patch, if we search node via rb tree. For example: we insert different node with rand size, size range in (1-). the key in root node is 5587. if we try to find the node with key equal to 5587 or 7381, Loop: node->key is 5587 node->key is 2273 node->key is 3706 node->key is 4892 node->key is 5296 node->key is 5461 node->key is 5519 node->key is 5549 node->key is 5570 node->key is 5581 node->key is 5584 node->key is 5585 node->key is 5586 node->key is 5586 Find the best node, key is 5587 (Loop 14 levels) Loop: node->key is 5587 node->key is 7381 node->key is 6474 node->key is 7034 node->key is 7228 node->key is 7314 node->key is 7339 node->key is 7349 node->key is 7372 node->key is 7377 node->key is 7378 node->key is 7379 node->key is 7379 find the best node, key is 7381. (Loop 13 levels) With this patch: we don't need to go down if we found the right node that size equal to we needed. Best Regards Rex From: amd-gfx mailto:amd-gfx-boun...@lists.freedesktop.org>> on behalf of Monk Liu mailto:monk@amd.com>> Sent: Thursday, November 22, 2018 8:33 PM To: amd-...@lists.freedesktop.org<mailto:amd-...@lists.freedesktop.org> Cc: Liu, Monk Subject: [PATCH] drm: should break if already get the best size Signed-off-by: Monk Liu mailto:monk@amd.com>> --- drivers/gpu/drm/drm_mm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 3cc5fbd..369fd9b 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, u64 size) if (size <= node->hole_size) { best = node; rb = rb->rb_right; + if (size == node->hole_size) + break; } else { rb = rb->rb_left; } -- 2.7.4 ___ amd-gfx mailing list amd-...@lists.freedesktop.org<mailto:amd-...@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/amd-gfx amd-gfx Info Page - freedesktop.org<https://lists.freedesktop.org/mailman/listinfo/amd-gfx> lists.freedesktop.org To see the collection of prior postings to the list, visit the amd-gfx Archives.. Using amd-gfx: To post a message to all the list members, send email to amd-...@lists.freedesktop.org<mailto:amd-...@lists.freedesktop.org>. You can subscribe to the list, or change your existing subscription, in the sections below. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: FW: [PATCH] drm: should break if already get the best size
There is no checks at all in this best_hole() ... can you review the patch again ? /Monk -Original Message- From: Chris Wilson Sent: Friday, November 23, 2018 5:34 PM To: Liu, Monk ; dri-devel@lists.freedesktop.org Subject: RE: FW: [PATCH] drm: should break if already get the best size Quoting Liu, Monk (2018-11-23 09:11:02) > What do you mean the first in the chain ? and also can you explain the > " perfect match." ? thanks > > Assume there is couple nodes equal to the size you requested, without > this patch it will traveler to the bottom level of the RB tree and > gives you the node that close to the bottom level, which takes more > time compared with break on the first node, but anyway you eventually > get the node with the same size Size is but of one many checks the node must pass before being returned. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: FW: [PATCH] drm: should break if already get the best size
What do you mean the first in the chain ? and also can you explain the " perfect match." ? thanks Assume there is couple nodes equal to the size you requested, without this patch it will traveler to the bottom level of the RB tree and gives you the node that close to the bottom level, which takes more time compared with break on the first node, but anyway you eventually get the node with the same size if there is no node that equal to the size you requested, without or with my patch the logic is totally the same. /Monk -Original Message- From: Chris Wilson Sent: Friday, November 23, 2018 5:03 PM To: Liu, Monk ; dri-devel@lists.freedesktop.org Subject: Re: FW: [PATCH] drm: should break if already get the best size Quoting Liu, Monk (2018-11-23 08:02:11) > > > -Original Message- > From: amd-gfx On Behalf Of > Monk Liu > Sent: Thursday, November 22, 2018 8:33 PM > To: amd-...@lists.freedesktop.org > Cc: Liu, Monk > Subject: [PATCH] drm: should break if already get the best size > > Signed-off-by: Monk Liu > --- > drivers/gpu/drm/drm_mm.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index > 3cc5fbd..369fd9b 100644 > --- a/drivers/gpu/drm/drm_mm.c > +++ b/drivers/gpu/drm/drm_mm.c > @@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, > u64 size) > if (size <= node->hole_size) { > best = node; > rb = rb->rb_right; > + if (size == node->hole_size) > + break; No. The point is to find the first in the chain that matches because not every node is suitable. By not checking all best_sizes you may end up skipping the perfect match. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
FW: [PATCH] drm: should break if already get the best size
-Original Message- From: amd-gfx On Behalf Of Monk Liu Sent: Thursday, November 22, 2018 8:33 PM To: amd-...@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH] drm: should break if already get the best size Signed-off-by: Monk Liu --- drivers/gpu/drm/drm_mm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 3cc5fbd..369fd9b 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, u64 size) if (size <= node->hole_size) { best = node; rb = rb->rb_right; + if (size == node->hole_size) + break; } else { rb = rb->rb_left; } -- 2.7.4 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
FW: [PATCH] drm: should break if already get the best size
-Original Message- From: amd-gfx On Behalf Of Monk Liu Sent: Thursday, November 22, 2018 8:33 PM To: amd-...@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH] drm: should break if already get the best size Signed-off-by: Monk Liu --- drivers/gpu/drm/drm_mm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 3cc5fbd..369fd9b 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -318,6 +318,8 @@ static struct drm_mm_node *best_hole(struct drm_mm *mm, u64 size) if (size <= node->hole_size) { best = node; rb = rb->rb_right; + if (size == node->hole_size) + break; } else { rb = rb->rb_left; } -- 2.7.4 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: reservation questions
Yeah, I got the point, using dependency/scheduler to make sure the resulting fence always not signal before the ones hooked in dependencies thanks /Monk From: Christian K?nig <ckoenig.leichtzumer...@gmail.com> Sent: Tuesday, March 6, 2018 8:46:57 PM To: Liu, Monk; Koenig, Christian Cc: dri-devel@lists.freedesktop.org Subject: Re: reservation questions e.g. the excl fence is with the same ctx (but later) of the one in shared list. Well it's not on the same context, but it is guaranteed to not complete before all shared fences. See for example how it is used in amdgpu_copy_buffer(). We first sync to all fences in the reservation object: r = amdgpu_sync_resv(adev, >sync, resv, AMDGPU_FENCE_OWNER_UNDEFINED, false); if (r) { DRM_ERROR("sync failed (%d).\n", r); goto error_free; } This way the resulting fence will never signal before anything else and so can safely be used as exclusive fence in the reservation object. Regards, Christian. Am 06.03.2018 um 13:32 schrieb Liu, Monk: You mean the caller must guarantees the excl fence will only signal till all shared fence signaled, so you can just ignores all shared fence if add_excl_fence() is invoked. e.g. the excl fence is with the same ctx (but later) of the one in shared list. thanks for the explanation /Monk From: Koenig, Christian Sent: Tuesday, March 6, 2018 7:11:50 PM To: Liu, Monk Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; Chris Wilson Subject: Re: reservation questions Hi Monk, your check isn't correct because you still haven't understood the semantics here. the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG, The semantic is NOT that all shared fences are signaled when the exclusive fence is added. Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here. I honestly don't know how else to explain it. Regards, Christian. Am 06.03.2018 um 12:03 schrieb Liu, Monk: Hi Christian I use blow patch to capture the incorrect case : @@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(>seq); preempt_enable(); - /* inplace update, no shared fences */ - while (i--) - dma_fence_put(rcu_dereference_protected(old->shared[i], - reservation_object_held(obj))); + /* inplace update, no shared fences continue after all shared signaled */ + while (i--) { + struct dma_fence *f = rcu_dereference_protected(old->shared[i], + reservation_object_held(obj)); + if (!dma_fence_is_signaled(f)) + BUG(); + + dma_fence_put(f); + /* better assign shared[i] with NULL for sure */ + rcu_assign_pointer(old->shared[i], NULL); + } dma_fence_put(old_fence); + + } EXPORT_SYMBOL(reservation_object_add_excl_fence); and I hit this BUG() during test: [ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for :00:08.0 on minor 0 [ 105.623332] [ cut here ] [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: 8f8a6a165a00 task.stack: b1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:b1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: RBX: 8f8a64bee760 RCX: 8f8a6bfa2f50 [ 105.638123] RDX: 8f8a6bfa6770 RSI: 8f8a64bee660 RDI: 8f8a6635f628 [ 105.639440] RBP: b1204159fa18 R08: R09: 0001 [ 105.640702] R10: b1204159f808 R11: 0003 R12: [ 105.641947] R13: R14: 8f8a6d0f0200 R15: 8f8a64beee60 [ 1
Re: reservation questions
You mean the caller must guarantees the excl fence will only signal till all shared fence signaled, so you can just ignores all shared fence if add_excl_fence() is invoked. e.g. the excl fence is with the same ctx (but later) of the one in shared list. thanks for the explanation /Monk From: Koenig, Christian Sent: Tuesday, March 6, 2018 7:11:50 PM To: Liu, Monk Cc: dri-devel@lists.freedesktop.org; Chris Wilson Subject: Re: reservation questions Hi Monk, your check isn't correct because you still haven't understood the semantics here. the assumption that all shared fences should be signaled before adding excl fence looks not 100% guaranteed in LKG, The semantic is NOT that all shared fences are signaled when the exclusive fence is added. Instead the requirement is that the exclusive fence signals after all shared fences signaled. In other words that is an asynchronous handling here. I honestly don't know how else to explain it. Regards, Christian. Am 06.03.2018 um 12:03 schrieb Liu, Monk: Hi Christian I use blow patch to capture the incorrect case : @@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj, write_seqcount_end(>seq); preempt_enable(); - /* inplace update, no shared fences */ - while (i--) - dma_fence_put(rcu_dereference_protected(old->shared[i], - reservation_object_held(obj))); + /* inplace update, no shared fences continue after all shared signaled */ + while (i--) { + struct dma_fence *f = rcu_dereference_protected(old->shared[i], + reservation_object_held(obj)); + if (!dma_fence_is_signaled(f)) + BUG(); + + dma_fence_put(f); + /* better assign shared[i] with NULL for sure */ + rcu_assign_pointer(old->shared[i], NULL); + } dma_fence_put(old_fence); + + } EXPORT_SYMBOL(reservation_object_add_excl_fence); and I hit this BUG() during test: [ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for :00:08.0 on minor 0 [ 105.623332] [ cut here ] [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275! [ 105.624470] invalid opcode: [#1] SMP [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy pata_acpi [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1 [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 105.633528] task: 8f8a6a165a00 task.stack: b1204159c000 [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0 [ 105.635824] RSP: 0018:b1204159f9f0 EFLAGS: 00010246 [ 105.636805] RAX: RBX: 8f8a64bee760 RCX: 8f8a6bfa2f50 [ 105.638123] RDX: 8f8a6bfa6770 RSI: 8f8a64bee660 RDI: 8f8a6635f628 [ 105.639440] RBP: b1204159fa18 R08: R09: 0001 [ 105.640702] R10: b1204159f808 R11: 0003 R12: [ 105.641947] R13: R14: 8f8a6d0f0200 R15: 8f8a64beee60 [ 105.643165] FS: 7fd13c73d940() GS:8f8a76d8() knlGS: [ 105.644573] CS: 0010 DS: ES: CR0: 80050033 [ 105.646482] CR2: 7fd13c6fd000 CR3: 0001a2a58000 CR4: 001406e0 [ 105.648467] Call Trace: [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu] [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu] [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu] [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu] [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu] [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu] [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm] [ 105.666959] drm_ioctl+0x2d2/0x390 [drm] [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu] [ 105.670056] ? call_rcu_sched+0x1d/0x20 [ 105.671516] ? put_object+0x26/0x30 [ 105.672741] ? __delete_object+0x39/0x50 [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu] [ 105.675551] do_vfs_ioctl+0x92/0x5a0 [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30 [ 105.678276] ? sched_clock+0x9/0x10 [ 105.679553] ? get_vtime_delta+0x99/0xc0 [ 105.681007] SyS_ioctl+0x79/0x90 [
Re: reservation questions
o take a deep look ... /Monk ________ From: Liu, Monk Sent: Tuesday, March 6, 2018 6:47 PM To: Koenig, Christian; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions ok, that's good point ... From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions Hi Monk, that is to remove the problem that allocating memory could fail. E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management. So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware. Regards, Christian. Am 06.03.2018 um 11:19 schrieb Liu, Monk: Hi Chris another question is why we not just call "reservation_object_reserve_shared" during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need to worry when and how much time it should call reserve_shared() ? thanks ! void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); From: Chris Wilson <ch...@chris-wilson.co.uk><mailto:ch...@chris-wilson.co.uk> Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; Koenig, Christian Subject: Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19) > call reservation_object_add_excl_fence, > it set obj->fence->shared_count to 0, and put all shared fence from obj->fence > without waiting signaling. > (this action looks inappropriate, I think at least before put all those shared > fences > we should dma_wait_fence() on them to make sure they are signaled) No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.) > call reservation_object_reserve_shared, > this time obj->staged isn't NULL, and it is freed (nothing bad now > since obj->fence points to other place), > and obj->staged set to NULL, > > call reservation_object_add_shared_fence, > this time should going through reservation_object_add_shared_inplace, > But BUG_ON(old->shared_count >= old->shared_max) will hit ! How? You only free staged iff shared_count < shared_max. You've reminded me that we should cover all this with a bunch of selftests. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: reservation questions
ok, that's good point ... From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:42:44 PM To: Liu, Monk; Chris Wilson; dri-devel@lists.freedesktop.org Subject: Re: reservation questions Hi Monk, that is to remove the problem that allocating memory could fail. E.g. we only add the fence after sending the command to the hardware, so there is now way back and we need to add the fence or break memory management. So reservation_object_reserve_shared() makes sure there is a free fence slot *before* we start to prepare things for the hardware. Regards, Christian. Am 06.03.2018 um 11:19 schrieb Liu, Monk: Hi Chris another question is why we not just call "reservation_object_reserve_shared" during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need to worry when and how much time it should call reserve_shared() ? thanks ! void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); From: Chris Wilson <ch...@chris-wilson.co.uk><mailto:ch...@chris-wilson.co.uk> Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; Koenig, Christian Subject: Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19) > call reservation_object_add_excl_fence, > it set obj->fence->shared_count to 0, and put all shared fence from obj->fence > without waiting signaling. > (this action looks inappropriate, I think at least before put all those shared > fences > we should dma_wait_fence() on them to make sure they are signaled) No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.) > call reservation_object_reserve_shared, > this time obj->staged isn't NULL, and it is freed (nothing bad now > since obj->fence points to other place), > and obj->staged set to NULL, > > call reservation_object_add_shared_fence, > this time should going through reservation_object_add_shared_inplace, > But BUG_ON(old->shared_count >= old->shared_max) will hit ! How? You only free staged iff shared_count < shared_max. You've reminded me that we should cover all this with a bunch of selftests. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)
okay From: Chris Wilson <ch...@chris-wilson.co.uk> Sent: Tuesday, March 6, 2018 4:24:21 PM To: Liu, Monk; dri-de...@freedesktop.org Cc: Liu, Monk Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2) Quoting Monk Liu (2018-03-06 03:53:10) > v2: > still check context first to avoid warning from dma_fence_is_later > apply this fix in add_shared_replace as well > > Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767 > Signed-off-by: Monk Liu <monk@amd.com> > --- > drivers/dma-buf/reservation.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index 314eb10..c6e3c86 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct > reservation_object *obj, > old_fence = rcu_dereference_protected(fobj->shared[i], > reservation_object_held(obj)); > > - if (old_fence->context == fence->context) { > + if (old_fence->context == fence->context && > + dma_fence_is_later(fence, old_fence)) { This should be true by construction. Adding an older fence on the same context is a programming bug, imo. Between different callers the resv should have been locked and the fenced operations serialised, from the same caller, you shouldn't be handling more than one output fence? -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: reservation questions
Hi Chris another question is why we not just call "reservation_object_reserve_shared" during below add_shared_fence function, so the BUG_ON() could be avoided and caller won't need to worry when and how much time it should call reserve_shared() ? thanks ! void reservation_object_add_shared_fence(struct reservation_object *obj, struct dma_fence *fence) { struct reservation_object_list *old, *fobj = obj->staged; old = reservation_object_get_list(obj); obj->staged = NULL; if (!fobj) { BUG_ON(old->shared_count >= old->shared_max); reservation_object_add_shared_inplace(obj, old, fence); } else reservation_object_add_shared_replace(obj, old, fobj, fence); } EXPORT_SYMBOL(reservation_object_add_shared_fence); From: Chris Wilson <ch...@chris-wilson.co.uk> Sent: Tuesday, March 6, 2018 6:10:21 PM To: Liu, Monk; dri-devel@lists.freedesktop.org; Koenig, Christian Subject: Re: reservation questions Quoting Liu, Monk (2018-03-06 09:45:19) > call reservation_object_add_excl_fence, > it set obj->fence->shared_count to 0, and put all shared fence from obj->fence > without waiting signaling. > (this action looks inappropriate, I think at least before put all those shared > fences > we should dma_wait_fence() on them to make sure they are signaled) No. Serialisation of resv updates are handled by the caller, the fences are ordered asynchronously so the wait is implicit in the construction. (I.e. before the excl fence can be signaled, all of the earlier shared fences must be signaled. You can even say before the operation that the excl fence signals completion of can begin, all the shared fences must have been signaled. But that is all implicit so that we can do it asynchronously.) > call reservation_object_reserve_shared, > this time obj->staged isn't NULL, and it is freed (nothing bad now > since obj->fence points to other place), > and obj->staged set to NULL, > > call reservation_object_add_shared_fence, > this time should going through reservation_object_add_shared_inplace, > But BUG_ON(old->shared_count >= old->shared_max) will hit ! How? You only free staged iff shared_count < shared_max. You've reminded me that we should cover all this with a bunch of selftests. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: reservation questions
Yeah should be 0, typo sorry I use 3dmark test and successfully triggered a case in reserve_shared: if (obj->staged!=NULL) { BUG(); } kfree(obj->staged) Previously I cannot figure out why the hell this BUG() could be hit, finally the scenario comes up, you only need one add_excl_fence() after reserved_shared(), and next time the reserve_shared will go hit that BUG(), but that's okay, since it only frees the obj->staged that allocated right before this calling in the previous reserve_shared(), and no one refers to it now. From: Koenig, Christian Sent: Tuesday, March 6, 2018 6:05:27 PM To: Liu, Monk; dri-devel@lists.freedesktop.org; Chris Wilson Subject: Re: reservation questions Am 06.03.2018 um 10:56 schrieb Liu, Monk: sorry, I have some mistake in previous thread, correct it as followings. 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) The exclusive fence replaces all shared fences, e.g. the exclusive operation needs to wait for all shared fences before it can access the protected object. Because of this we can drop all shared fences when a new exclusive fence is set. call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace() Why would old->shared_count be 1 now? As far as I can see it should be zero. Christian. it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; old_fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(obj)); if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(>seq); preempt_enable(); dma_fence_put(old_fence); return; } if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } } see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put. /Monk From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; Chris Wilson; Koenig, Christian Subject: reservation questions Hi Christian & Chris two question regarding resv: 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit ! This looks a design flaw in reservation object, shouldn't we fix it ? 2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine /** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusi
RE: reservation questions
Oh, I see for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; Since it use I < fobj->shared_count, then the wild pointer access won’t hit, please ignore question 1 The only valuable is question 2: how we treat excl fence and shared fence? e.g. when we add an excl fence to resv, how to deal with shared ? current logic is just put the all do we need to wait on their signaling before putting them ? thanks /Monk From: Liu, Monk Sent: 2018年3月6日 17:57 To: dri-devel@lists.freedesktop.org; Chris Wilson <ch...@chris-wilson.co.uk>; Koenig, Christian <christian.koe...@amd.com> Subject: Re: reservation questions sorry, I have some mistake in previous thread, correct it as followings. 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace() it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; old_fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(obj)); if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(>seq); preempt_enable(); dma_fence_put(old_fence); return; } if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } } see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put. /Monk From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; Chris Wilson; Koenig, Christian Subject: reservation questions Hi Christian & Chris two question regarding resv: 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit ! This looks a design flaw in reservation object, shouldn't we fix it ? 2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine /** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeou
Re: reservation questions
sorry, I have some mistake in previous thread, correct it as followings. 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, since old->shared_count is 1 now, during reservation_object_add_shared_inplace() it would go through the shared list, but the fence in shared list is now wild pointer: for (i = 0; i < fobj->shared_count; ++i) { struct dma_fence *old_fence; old_fence = rcu_dereference_protected(fobj->shared[i], reservation_object_held(obj)); if (old_fence->context == fence->context && dma_fence_is_later(fence, old_fence)) { /* memory barrier is added by write_seqcount_begin */ RCU_INIT_POINTER(fobj->shared[i], fence); write_seqcount_end(>seq); preempt_enable(); dma_fence_put(old_fence); return; } if (!signaled && dma_fence_is_signaled(old_fence)) { signaled = old_fence; signaled_idx = i; } } see that old_fence is get from fobj, and fobj is from reservation_object_get_list(obj) in outside, which is obj->fence, and in add_excl_fence, all dma_fence in obj->fence is already put. /Monk From: Liu, Monk Sent: Tuesday, March 6, 2018 5:45:19 PM To: dri-devel@lists.freedesktop.org; Chris Wilson; Koenig, Christian Subject: reservation questions Hi Christian & Chris two question regarding resv: 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit ! This looks a design flaw in reservation object, shouldn't we fix it ? 2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine /** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout) thanks /Monk ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
reservation questions
Hi Christian & Chris two question regarding resv: 1) considering below sequence: call reservation_object_add_shared_fence, now assume old->shared_count is now 3 call reservation_object_add_shared_fence, now assume old->shared_count is now 4, call reservation_object_reserve_shared, now obj->staged is new allocated, and its shared_max = 8, but not used by far. call reservation_object_add_excl_fence, it set obj->fence->shared_count to 0, and put all shared fence from obj->fence without waiting signaling. (this action looks inappropriate, I think at least before put all those shared fences we should dma_wait_fence() on them to make sure they are signaled) call reservation_object_reserve_shared, this time obj->staged isn't NULL, and it is freed (nothing bad now since obj->fence points to other place), and obj->staged set to NULL, call reservation_object_add_shared_fence, this time should going through reservation_object_add_shared_inplace, But BUG_ON(old->shared_count >= old->shared_max) will hit ! This looks a design flaw in reservation object, shouldn't we fix it ? 2) in add_excl_fence(), it simply set old->shared_count to 0, and put all shared fences of old is that correct? if excl fence is really exclusively used, why we still consider both shared fence and excl fence on wait_timeout_rcu() routine, see blew description of this routine /** * reservation_object_wait_timeout_rcu - Wait on reservation's objects * shared and/or exclusive fences. * @obj: the reservation object * @wait_all: if true, wait on all fences, else wait on just exclusive fence * @intr: if true, do interruptible wait * @timeout: timeout value in jiffies or zero to return immediately * * RETURNS * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or * greater than zer on success. */ long reservation_object_wait_timeout_rcu(struct reservation_object *obj, bool wait_all, bool intr, unsigned long timeout) thanks /Monk ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace
why? is there a design doc mentioned for this on reservation ? From: Christian K?nig <ckoenig.leichtzumer...@gmail.com> Sent: Tuesday, March 6, 2018 4:03:39 PM To: Liu, Monk; dri-devel@lists.freedesktop.org Subject: Re: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace NAK, the newly added fence must always be newer than the existing one. Christian. Am 06.03.2018 um 04:09 schrieb Monk Liu: > Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767 > Signed-off-by: Monk Liu <monk@amd.com> > --- > drivers/dma-buf/reservation.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index 314eb10..29b7e45 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -118,7 +118,7 @@ reservation_object_add_shared_inplace(struct > reservation_object *obj, >old_fence = rcu_dereference_protected(fobj->shared[i], >reservation_object_held(obj)); > > - if (old_fence->context == fence->context) { > + if (dma_fence_is_later(fence, old_fence)) { >/* memory barrier is added by write_seqcount_begin */ >RCU_INIT_POINTER(fobj->shared[i], fence); >write_seqcount_end(>seq); ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)
Make sense, will give v3 -Original Message- From: Zhou, David(ChunMing) Sent: 2018年3月6日 14:08 To: Liu, Monk <monk@amd.com>; Zhou, David(ChunMing) <david1.z...@amd.com>; dri-de...@freedesktop.org Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2) On 2018年03月06日 13:59, Liu, Monk wrote: >> >> -if (check->context == fence->context || >> +if ((check->context == fence->context && >> +dma_fence_is_later(fence, check)) || > We still need do more for !dma_fence_is_later(fence, check) case, in which, > we will don't need add new fence to resv slot. > > if ((check->context == fence->context) && dma_fence_is_later(fence, check)) > fobj->shared_count = j; > RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); > fobj->shared_count++; > } else { > dma_fence_put(fence); > } > > No you cannot do that, check is changed and not the one you want, > Besides, we don't need to consider the case you mentioned, take only one > fence in obj->staged for example: > > You add a fence whose context is equal to this fence (check), so > current logic will put this check into fobj->shared[++j], Yes, if !dma_fence_is_later(fence, check), then 'check' will be put to fobj->shared[++j], so we don't need add new fence to resv slot, don't we? Regards, David Zhou > so in the end > Obj->fence will point to fobj, and original old would be rcu_kfree() > > No additional action actually needed... > > /Monk > > -Original Message- > From: Zhou, David(ChunMing) > Sent: 2018年3月6日 12:25 > To: Liu, Monk <monk@amd.com>; dri-de...@freedesktop.org > Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add > fence(v2) > > > > On 2018年03月06日 11:53, Monk Liu wrote: >> v2: >> still check context first to avoid warning from dma_fence_is_later >> apply this fix in add_shared_replace as well >> >> Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767 >> Signed-off-by: Monk Liu <monk@amd.com> >> --- >>drivers/dma-buf/reservation.c | 6 -- >>1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/dma-buf/reservation.c >> b/drivers/dma-buf/reservation.c index 314eb10..c6e3c86 100644 >> --- a/drivers/dma-buf/reservation.c >> +++ b/drivers/dma-buf/reservation.c >> @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct >> reservation_object *obj, >> old_fence = rcu_dereference_protected(fobj->shared[i], >> reservation_object_held(obj)); >> >> -if (old_fence->context == fence->context) { >> +if (old_fence->context == fence->context && >> +dma_fence_is_later(fence, old_fence)) { >> /* memory barrier is added by write_seqcount_begin */ >> RCU_INIT_POINTER(fobj->shared[i], fence); >> write_seqcount_end(>seq); >> @@ -179,7 +180,8 @@ reservation_object_add_shared_replace(struct >> reservation_object *obj, >> check = rcu_dereference_protected(old->shared[i], >> reservation_object_held(obj)); >> >> -if (check->context == fence->context || >> +if ((check->context == fence->context && >> +dma_fence_is_later(fence, check)) || > We still need do more for !dma_fence_is_later(fence, check) case, in which, > we will don't need add new fence to resv slot. > > if ((check->context == fence->context) && dma_fence_is_later(fence, check)) > fobj->shared_count = j; > RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); > fobj->shared_count++; > } else { > dma_fence_put(fence); > } > > > Regards, > David Zhou >> dma_fence_is_signaled(check)) >> RCU_INIT_POINTER(fobj->shared[--k], check); >> else ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] dma-buf/reservation: should keep later one in add fence(v2)
> > - if (check->context == fence->context || > + if ((check->context == fence->context && > + dma_fence_is_later(fence, check)) || We still need do more for !dma_fence_is_later(fence, check) case, in which, we will don't need add new fence to resv slot. if ((check->context == fence->context) && dma_fence_is_later(fence, check)) fobj->shared_count = j; RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); fobj->shared_count++; } else { dma_fence_put(fence); } No you cannot do that, check is changed and not the one you want, Besides, we don't need to consider the case you mentioned, take only one fence in obj->staged for example: You add a fence whose context is equal to this fence (check), so current logic will put this check into fobj->shared[++j], so in the end Obj->fence will point to fobj, and original old would be rcu_kfree() No additional action actually needed... /Monk -Original Message----- From: Zhou, David(ChunMing) Sent: 2018年3月6日 12:25 To: Liu, Monk <monk@amd.com>; dri-de...@freedesktop.org Subject: Re: [PATCH] dma-buf/reservation: should keep later one in add fence(v2) On 2018年03月06日 11:53, Monk Liu wrote: > v2: > still check context first to avoid warning from dma_fence_is_later > apply this fix in add_shared_replace as well > > Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767 > Signed-off-by: Monk Liu <monk@amd.com> > --- > drivers/dma-buf/reservation.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/dma-buf/reservation.c > b/drivers/dma-buf/reservation.c index 314eb10..c6e3c86 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -118,7 +118,8 @@ reservation_object_add_shared_inplace(struct > reservation_object *obj, > old_fence = rcu_dereference_protected(fobj->shared[i], > reservation_object_held(obj)); > > - if (old_fence->context == fence->context) { > + if (old_fence->context == fence->context && > + dma_fence_is_later(fence, old_fence)) { > /* memory barrier is added by write_seqcount_begin */ > RCU_INIT_POINTER(fobj->shared[i], fence); > write_seqcount_end(>seq); > @@ -179,7 +180,8 @@ reservation_object_add_shared_replace(struct > reservation_object *obj, > check = rcu_dereference_protected(old->shared[i], > reservation_object_held(obj)); > > - if (check->context == fence->context || > + if ((check->context == fence->context && > + dma_fence_is_later(fence, check)) || We still need do more for !dma_fence_is_later(fence, check) case, in which, we will don't need add new fence to resv slot. if ((check->context == fence->context) && dma_fence_is_later(fence, check)) fobj->shared_count = j; RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); fobj->shared_count++; } else { dma_fence_put(fence); } Regards, David Zhou > dma_fence_is_signaled(check)) > RCU_INIT_POINTER(fobj->shared[--k], check); > else ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace
Yeah, right -Original Message- From: Zhou, David(ChunMing) Sent: 2018年3月6日 11:38 To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org Subject: Re: [PATCH] dma-buf/reservation: should keep the new fence in add_shared_inplace On 2018年03月06日 11:09, Monk Liu wrote: > Change-Id: If6a979ba9fd6c923b82212f35f07a9ff31c86767 > Signed-off-by: Monk Liu <monk@amd.com> > --- > drivers/dma-buf/reservation.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/reservation.c > b/drivers/dma-buf/reservation.c index 314eb10..29b7e45 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -118,7 +118,7 @@ reservation_object_add_shared_inplace(struct > reservation_object *obj, > old_fence = rcu_dereference_protected(fobj->shared[i], > reservation_object_held(obj)); > > - if (old_fence->context == fence->context) { > + if (dma_fence_is_later(fence, old_fence)) { OK, good catch, to avoid warning of different context, which should be "if ((old_fence->context == fence->context) && dma_fence_is_later(fence, old_fence)) { " and reservation_object_add_shared_replace need this fix as well. Regards, David Zhou > /* memory barrier is added by write_seqcount_begin */ > RCU_INIT_POINTER(fobj->shared[i], fence); > write_seqcount_end(>seq); ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available
Can you give more details ? thanks /Monk -Original Message- From: Koenig, Christian Sent: 2018年3月5日 19:39 To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available Am 05.03.2018 um 12:37 schrieb Liu, Monk: > But the thing confuse me is according to the design, if driver keep > calling reserve_shared() prior to add_fence(), and with lock held of cause, > That BUG() shouldn't hit, so there are two things in face looks weired to me: > 1) by design in reserve_shared(), obj->staged should be already NULL, > so why we kfree on it No, that is not correct. > 2) in fact, amdgpu can hit the case that obj->staged is not NULL in > reserved_shared(), don't know how it lead here We reserved a fence slot without using it, so it is still there when reserve_shared() is called. Christian. > > > Any thought ? > > /Monk > > -Original Message- > From: Koenig, Christian > Sent: 2018年3月5日 19:29 > To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; > linux-ker...@vger.kernel.org > Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when > slot available > > Am 05.03.2018 um 12:25 schrieb Liu, Monk: >> And by the way, I add "if (staged!=NULL) BUG();" prior to >> "kfree(obj->staged)" in reserve_shared() routine, and this BUG() is actually >> hit, The stack dump shows it is hit during the vm_bo_update() in >> gem_va_update()... > That is expected. The staged handling just makes sure that there is room > available, it doesn't guarantee that it is actually used. > > E.g. we can end up reserving a fence slot, but then find that we actually > don't need it. > > Christian. > >> Besides, the whole reservation logic still looks a little weired to me ... >> especially this staged part ... >> >> Thanks >> >> /Monk >> >> -Original Message- >> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] >> Sent: 2018年3月5日 19:22 >> To: Liu, Monk <monk@amd.com>; Koenig, Christian >> <christian.koe...@amd.com>; dri-devel@lists.freedesktop.org; >> linux-ker...@vger.kernel.org >> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when >> slot available >> >> Am 05.03.2018 um 08:55 schrieb Liu, Monk: >>> Hi Christian >>> >>> You are right on that part of obj-staged is set to NULL in >>> add_fence, So my following question will be why we kfree(obj->staged) in >>> reserve_shared() if staged is always NULL in that point ? >> Good question, I haven't wrote code that so I can't fully answer. >> >> Maybe Chris or Maarten know more about that. >> >> Christian. >> >>> Thanks >>> /Monk >>> >>> -Original Message- >>> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] >>> Sent: 2018年2月28日 16:27 >>> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; >>> linux-ker...@vger.kernel.org >>> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged >>> when slot available >>> >>> Am 28.02.2018 um 07:44 schrieb Monk Liu: >>>> under below scenario the obj->fence would refer to a wild pointer: >>>> >>>> 1,call reservation_object_reserved_shared >>>> 2,call reservation_object_add_shared_fence >>>> 3,call reservation_object_reserved_shared >>>> 4,call reservation_object_add_shared_fence >>>> >>>> in step 1, staged is allocated, >>>> >>>> in step 2, code path will go >>>> reservation_object_add_shared_replace() >>>> and obj->fence would be assigned as staged (through >>>> RCU_INIT_POINTER) >>>> >>>> in step 3, obj->staged will be freed(by simple kfree), which make >>>> obj->fence point to a wild pointer... >>> Well that explanation is still nonsense. See >>> reservation_object_add_shared_fence: >>>> obj->staged = NULL; >>> Among the first things reservation_object_add_shared_fence() does is >>> it sets obj->staged to NULL. >>> >>> So step 3 will not free anything and we never have a wild pointer. >>> >>> Regards, >>> Christian. >>> >>>> in step 4, code path will go >>>> reservation_object_add_shared_inplace() >>>> and inside it the @fobj (which equals to @obj->staged, set by above >>
RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available
But the thing confuse me is according to the design, if driver keep calling reserve_shared() prior to add_fence(), and with lock held of cause, That BUG() shouldn't hit, so there are two things in face looks weired to me: 1) by design in reserve_shared(), obj->staged should be already NULL, so why we kfree on it 2) in fact, amdgpu can hit the case that obj->staged is not NULL in reserved_shared(), don't know how it lead here Any thought ? /Monk -Original Message- From: Koenig, Christian Sent: 2018年3月5日 19:29 To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available Am 05.03.2018 um 12:25 schrieb Liu, Monk: > And by the way, I add "if (staged!=NULL) BUG();" prior to > "kfree(obj->staged)" in reserve_shared() routine, and this BUG() is actually > hit, The stack dump shows it is hit during the vm_bo_update() in > gem_va_update()... That is expected. The staged handling just makes sure that there is room available, it doesn't guarantee that it is actually used. E.g. we can end up reserving a fence slot, but then find that we actually don't need it. Christian. > > Besides, the whole reservation logic still looks a little weired to me ... > especially this staged part ... > > Thanks > > /Monk > > -Original Message- > From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] > Sent: 2018年3月5日 19:22 > To: Liu, Monk <monk@amd.com>; Koenig, Christian > <christian.koe...@amd.com>; dri-devel@lists.freedesktop.org; > linux-ker...@vger.kernel.org > Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when > slot available > > Am 05.03.2018 um 08:55 schrieb Liu, Monk: >> Hi Christian >> >> You are right on that part of obj-staged is set to NULL in add_fence, >> So my following question will be why we kfree(obj->staged) in >> reserve_shared() if staged is always NULL in that point ? > Good question, I haven't wrote code that so I can't fully answer. > > Maybe Chris or Maarten know more about that. > > Christian. > >> Thanks >> /Monk >> >> -Original Message- >> From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] >> Sent: 2018年2月28日 16:27 >> To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; >> linux-ker...@vger.kernel.org >> Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when >> slot available >> >> Am 28.02.2018 um 07:44 schrieb Monk Liu: >>> under below scenario the obj->fence would refer to a wild pointer: >>> >>> 1,call reservation_object_reserved_shared >>> 2,call reservation_object_add_shared_fence >>> 3,call reservation_object_reserved_shared >>> 4,call reservation_object_add_shared_fence >>> >>> in step 1, staged is allocated, >>> >>> in step 2, code path will go reservation_object_add_shared_replace() >>> and obj->fence would be assigned as staged (through >>> RCU_INIT_POINTER) >>> >>> in step 3, obj->staged will be freed(by simple kfree), which make >>> obj->fence point to a wild pointer... >> Well that explanation is still nonsense. See >> reservation_object_add_shared_fence: >>> obj->staged = NULL; >> Among the first things reservation_object_add_shared_fence() does is >> it sets obj->staged to NULL. >> >> So step 3 will not free anything and we never have a wild pointer. >> >> Regards, >> Christian. >> >>> in step 4, code path will go reservation_object_add_shared_inplace() >>> and inside it the @fobj (which equals to @obj->staged, set by above >>> steps) is already a wild pointer >>> >>> should remov the kfree on staged in >>> reservation_object_reserve_shared() >>> >>> Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c >>> Signed-off-by: Monk Liu <monk@amd.com> >>> --- >>> drivers/dma-buf/reservation.c | 7 ++- >>> 1 file changed, 2 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/dma-buf/reservation.c >>> b/drivers/dma-buf/reservation.c index 375de41..b473ccc 100644 >>> --- a/drivers/dma-buf/reservation.c >>> +++ b/drivers/dma-buf/reservation.c >>> @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct >>> reservation_object *obj) >>> old = reservation_object_get_list(obj); >>> >>> if (old && old->shared_max) { >
RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available
And by the way, I add "if (staged!=NULL) BUG();" prior to "kfree(obj->staged)" in reserve_shared() routine, and this BUG() is actually hit, The stack dump shows it is hit during the vm_bo_update() in gem_va_update()... Besides, the whole reservation logic still looks a little weired to me ... especially this staged part ... Thanks /Monk -Original Message- From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] Sent: 2018年3月5日 19:22 To: Liu, Monk <monk@amd.com>; Koenig, Christian <christian.koe...@amd.com>; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available Am 05.03.2018 um 08:55 schrieb Liu, Monk: > Hi Christian > > You are right on that part of obj-staged is set to NULL in add_fence, > So my following question will be why we kfree(obj->staged) in > reserve_shared() if staged is always NULL in that point ? Good question, I haven't wrote code that so I can't fully answer. Maybe Chris or Maarten know more about that. Christian. > > Thanks > /Monk > > -Original Message- > From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] > Sent: 2018年2月28日 16:27 > To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; > linux-ker...@vger.kernel.org > Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when > slot available > > Am 28.02.2018 um 07:44 schrieb Monk Liu: >> under below scenario the obj->fence would refer to a wild pointer: >> >> 1,call reservation_object_reserved_shared >> 2,call reservation_object_add_shared_fence >> 3,call reservation_object_reserved_shared >> 4,call reservation_object_add_shared_fence >> >> in step 1, staged is allocated, >> >> in step 2, code path will go reservation_object_add_shared_replace() >> and obj->fence would be assigned as staged (through RCU_INIT_POINTER) >> >> in step 3, obj->staged will be freed(by simple kfree), which make >> obj->fence point to a wild pointer... > > Well that explanation is still nonsense. See > reservation_object_add_shared_fence: >> obj->staged = NULL; > Among the first things reservation_object_add_shared_fence() does is > it sets obj->staged to NULL. > > So step 3 will not free anything and we never have a wild pointer. > > Regards, > Christian. > >> in step 4, code path will go reservation_object_add_shared_inplace() >> and inside it the @fobj (which equals to @obj->staged, set by above >> steps) is already a wild pointer >> >> should remov the kfree on staged in >> reservation_object_reserve_shared() >> >> Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c >> Signed-off-by: Monk Liu <monk@amd.com> >> --- >>drivers/dma-buf/reservation.c | 7 ++- >>1 file changed, 2 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/dma-buf/reservation.c >> b/drivers/dma-buf/reservation.c index 375de41..b473ccc 100644 >> --- a/drivers/dma-buf/reservation.c >> +++ b/drivers/dma-buf/reservation.c >> @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct >> reservation_object *obj) >> old = reservation_object_get_list(obj); >> >> if (old && old->shared_max) { >> -if (old->shared_count < old->shared_max) { >> -/* perform an in-place update */ >> -kfree(obj->staged); >> -obj->staged = NULL; >> +if (old->shared_count < old->shared_max) >> return 0; >> -} else >> +else >> max = old->shared_max * 2; >> } else >> max = 4; > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available
Hi Christian You are right on that part of obj-staged is set to NULL in add_fence, So my following question will be why we kfree(obj->staged) in reserve_shared() if staged is always NULL in that point ? Thanks /Monk -Original Message- From: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] Sent: 2018年2月28日 16:27 To: Liu, Monk <monk@amd.com>; dri-devel@lists.freedesktop.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH] dma-buf/reservation: shouldn't kfree staged when slot available Am 28.02.2018 um 07:44 schrieb Monk Liu: > under below scenario the obj->fence would refer to a wild pointer: > > 1,call reservation_object_reserved_shared > 2,call reservation_object_add_shared_fence > 3,call reservation_object_reserved_shared > 4,call reservation_object_add_shared_fence > > in step 1, staged is allocated, > > in step 2, code path will go reservation_object_add_shared_replace() > and obj->fence would be assigned as staged (through RCU_INIT_POINTER) > > in step 3, obj->staged will be freed(by simple kfree), which make > obj->fence point to a wild pointer... Well that explanation is still nonsense. See reservation_object_add_shared_fence: > obj->staged = NULL; Among the first things reservation_object_add_shared_fence() does is it sets obj->staged to NULL. So step 3 will not free anything and we never have a wild pointer. Regards, Christian. > > in step 4, code path will go reservation_object_add_shared_inplace() > and inside it the @fobj (which equals to @obj->staged, set by above steps) > is already a wild pointer > > should remov the kfree on staged in reservation_object_reserve_shared() > > Change-Id: If7c01f1b4be3d3d8a81efa90216841f79ab1fc1c > Signed-off-by: Monk Liu <monk@amd.com> > --- > drivers/dma-buf/reservation.c | 7 ++- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index 375de41..b473ccc 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -74,12 +74,9 @@ int reservation_object_reserve_shared(struct > reservation_object *obj) > old = reservation_object_get_list(obj); > > if (old && old->shared_max) { > - if (old->shared_count < old->shared_max) { > - /* perform an in-place update */ > - kfree(obj->staged); > - obj->staged = NULL; > + if (old->shared_count < old->shared_max) > return 0; > - } else > + else > max = old->shared_max * 2; > } else > max = 4; ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH 0/2] Move scheduler out of AMDGPU
I'm wondering if GPU reset still work after this home move ... -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian K?nig Sent: Friday, December 1, 2017 11:55 PM To: Lucas Stach; Deucher, Alexander Cc: Grodzovsky, Andrey ; David Airlie ; amd-...@lists.freedesktop.org; patchwork-...@pengutronix.de; dri-devel@lists.freedesktop.org; ker...@pengutronix.de Subject: Re: [PATCH 0/2] Move scheduler out of AMDGPU Am 01.12.2017 um 16:28 schrieb Lucas Stach: > Hi all, > > so this is the first step to make the marvelous AMDGPU scheduler > useable for other drivers. I have a (mostly) working prototype of > Etnaviv using the scheduler, but those patches need to keep baking for a > while. > > I'm sending this out as I want to avoid rebasing this change too much > and don't want to take people by surprise when the Etnaviv > implementation surfaces. Also this might need some coordination > between AMDGPU and Etnaviv, which might be good to get going now. > > Please speak up now if you have any objections or comments. Looks good to me, but question is what is this based upon? I strongly assume drm-next, so question is now if we have any patches inside amd branches we should apply before doing this. CCing Andrey as well cause he has some tasks assigned around the scheduler as well. Regards, Christian. > > Regards, > Lucas > > Lucas Stach (2): >drm: move amd_gpu_scheduler into common location >drm/sched: move fence slab handling to module init/exit > > drivers/gpu/drm/Kconfig| 5 + > drivers/gpu/drm/Makefile | 1 + > drivers/gpu/drm/amd/amdgpu/Makefile| 5 +- > drivers/gpu/drm/amd/amdgpu/amdgpu.h| 16 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 38 +-- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 8 - > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 22 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 12 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 20 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_sched.h | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 6 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h| 4 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c| 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h| 4 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c| 8 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h| 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c| 14 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h| 4 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 10 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 4 +- > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 +- > drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 8 +- > drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 8 +- > drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 185 -- > drivers/gpu/drm/scheduler/Makefile | 4 + > .../gpu/drm/{amd => }/scheduler/gpu_scheduler.c| 281 > +++-- > drivers/gpu/drm/{amd => }/scheduler/sched_fence.c | 122 + > include/drm/gpu_scheduler.h| 171 + > .../drm/gpu_scheduler_trace.h | 14 +- > 34 files changed, 525 insertions(+), 511 deletions(-) > delete mode 100644 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h > create mode 100644 drivers/gpu/drm/scheduler/Makefile > rename drivers/gpu/drm/{amd => }/scheduler/gpu_scheduler.c (64%) > rename drivers/gpu/drm/{amd => }/scheduler/sched_fence.c (58%) > create mode 100644 include/drm/gpu_scheduler.h > rename drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h => > include/drm/gpu_scheduler_trace.h (83%) > ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH] drm/amdgpu/virt: remove redundant variable pf2vf_ver
+ Horace -Original Message- From: Colin King [mailto:colin.k...@canonical.com] Sent: 2017年11月11日 19:51 To: Deucher, Alexander <alexander.deuc...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; David Airlie <airl...@linux.ie>; Liu, Monk <monk@amd.com>; Yu, Xiangliang <xiangliang...@amd.com>; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org Subject: [PATCH] drm/amdgpu/virt: remove redundant variable pf2vf_ver From: Colin Ian King <colin.k...@canonical.com> Variable pf2vf_ver is assigned but never read, it is redundant and hence can be removed. Cleans up clang warning: drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:310:3: warning: Value stored to 'pf2vf_ver' is never read Signed-off-by: Colin Ian King <colin.k...@canonical.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 6738df836a70..b1cc179512fa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -296,7 +296,6 @@ int amdgpu_virt_fw_reserve_get_checksum(void *obj, void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev) { - uint32_t pf2vf_ver = 0; uint32_t pf2vf_size = 0; uint32_t checksum = 0; uint32_t checkval; @@ -309,7 +308,6 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev) adev->virt.fw_reserve.p_pf2vf = (struct amdgim_pf2vf_info_header *)( adev->fw_vram_usage.va + AMDGIM_DATAEXCHANGE_OFFSET); - pf2vf_ver = adev->virt.fw_reserve.p_pf2vf->version; AMDGPU_FW_VRAM_PF2VF_READ(adev, header.size, _size); AMDGPU_FW_VRAM_PF2VF_READ(adev, checksum, ); -- 2.14.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/ttm: fix memory leak while individualizing BOs
verified work, Reviewed-by: Monk Liu <monk@amd.com> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Christian König <deathsim...@vodafone.de> Sent: Wednesday, September 13, 2017 4:47:34 PM To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org Subject: [PATCH] drm/ttm: fix memory leak while individualizing BOs From: Christian König <christian.koe...@amd.com> We need to free the reservation object before we take the BO from the delayed delete list. Signed-off-by: Christian König <christian.koe...@amd.com> --- drivers/gpu/drm/ttm/ttm_bo.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index bee77d3..d79607a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -557,6 +557,8 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo, } ttm_bo_del_from_lru(bo); + if (!list_empty(>ddestroy) && (bo->resv != >ttm_resv)) + reservation_object_fini(>ttm_resv); list_del_init(>ddestroy); kref_put(>list_kref, ttm_bo_ref_bug); -- 2.7.4 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/1] drm/ttm: Fix accounting error when fail to get pages for pool
Reviewed-by: Monk Liu <monk@amd.com> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Xiangliang.Yu <xiangliang...@amd.com> Sent: Wednesday, August 16, 2017 3:20:46 PM To: a...@linux-foundation.org; labb...@redhat.com; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org Cc: Yu, Xiangliang Subject: [PATCH 1/1] drm/ttm: Fix accounting error when fail to get pages for pool When fail to get needed page for pool, need to put allocated pages into pool. But current code has a miscalculation of allocated pages, correct it. Signed-off-by: Xiangliang.Yu <xiangliang...@amd.com> --- drivers/gpu/drm/ttm/ttm_page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c index eeddc1e..8715998 100644 --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c @@ -615,7 +615,7 @@ static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool, } else { pr_err("Failed to fill pool (%p)\n", pool); /* If we have any pages left put them to the pool. */ - list_for_each_entry(p, >list, lru) { + list_for_each_entry(p, _pages, lru) { ++cpages; } list_splice(_pages, >list); -- 2.7.4 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/7] drm/qxl: fix incorrect use of the lru_lock
Ack-by: Monk.LiuFrom: amd-gfx on behalf of Christian König Sent: Tuesday, August 8, 2017 4:14:46 PM To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Dave Airlie; Gerd Hoffmann Subject: Re: [PATCH 2/7] drm/qxl: fix incorrect use of the lru_lock Hi guys, can I get an rb or at least an Acked-by for that one? The code was obviously copied over from radeon where it wasn't correct in the first place. Thanks, Christian. Am 07.08.2017 um 17:48 schrieb Christian König: > From: Christian König > > The BO manager has its own lock and doesn't use the lru_lock. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/qxl/qxl_ttm.c | 13 - > 1 file changed, 4 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c > index 0fdedee..07dc08d 100644 > --- a/drivers/gpu/drm/qxl/qxl_ttm.c > +++ b/drivers/gpu/drm/qxl/qxl_ttm.c > @@ -454,15 +454,10 @@ void qxl_ttm_fini(struct qxl_device *qdev) > static int qxl_mm_dump_table(struct seq_file *m, void *data) > { >struct drm_info_node *node = (struct drm_info_node *)m->private; > - struct drm_mm *mm = (struct drm_mm *)node->info_ent->data; > - struct drm_device *dev = node->minor->dev; > - struct qxl_device *rdev = dev->dev_private; > - struct ttm_bo_global *glob = rdev->mman.bdev.glob; > + struct ttm_mem_type_manager *man = node->info_ent->data; >struct drm_printer p = drm_seq_file_printer(m); > > - spin_lock(>lru_lock); > - drm_mm_print(mm, ); > - spin_unlock(>lru_lock); > + man->func->debug(man, ); >return 0; > } > #endif > @@ -483,9 +478,9 @@ int qxl_ttm_debugfs_init(struct qxl_device *qdev) >qxl_mem_types_list[i].show = _mm_dump_table; >qxl_mem_types_list[i].driver_features = 0; >if (i == 0) > - qxl_mem_types_list[i].data = > qdev->mman.bdev.man[TTM_PL_VRAM].priv; > + qxl_mem_types_list[i].data = > >mman.bdev.man[TTM_PL_VRAM]; >else > - qxl_mem_types_list[i].data = > qdev->mman.bdev.man[TTM_PL_PRIV].priv; > + qxl_mem_types_list[i].data = > >mman.bdev.man[TTM_PL_PRIV]; > >} >return qxl_debugfs_add_files(qdev, qxl_mem_types_list, i); ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
答复: [PATCH] drm/amdgpu/virt: fix double kfree on bo_va
thanks for this catch! Reviewed-by: Monk Liu <monk@amd.com> 发件人: Colin King <colin.k...@canonical.com> 发送时间: 2017年2月4日 4:23:42 收件人: Deucher, Alexander; Koenig, Christian; David Airlie; Liu, Monk; Yu, Xiangliang; amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org 抄送: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org 主题: [PATCH] drm/amdgpu/virt: fix double kfree on bo_va From: Colin Ian King <colin.k...@canonical.com> bo_va is being kfree'd twice, once in the call to amdgpu_vm_bo_rmv and then a short while later. Fix this double free by removing the 2nd kfree. Detected by CoverityScan, CID#1399524 ("Double Free") Signed-off-by: Colin Ian King <colin.k...@canonical.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 3fd951c..dcfb7df 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -83,7 +83,6 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm) DRM_ERROR("failed to do bo_map on static CSA, err=%d\n", r); amdgpu_vm_bo_rmv(adev, bo_va); ttm_eu_backoff_reservation(, ); - kfree(bo_va); return r; } -- 2.10.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 1/4] drm/amdgpu: clear RB at ring init
> We should only call memset if amdgpu_bo_kmap succeeded. Same issue in patch 2. [ml] agreed >There's something else about these two patches I'm a bit worried about: >The GPU should only read data from ring buffers and IBs that we previously >explicitly wrote there. I'm afraid these patches might just paper over bugs >elsewhere, which might still bite us under different circumstances. [ml] so do you mean we should use NOP to fulfill the ring buffer instead of 0 ? -Original Message- From: Michel Dänzer [mailto:mic...@daenzer.net] Sent: Tuesday, June 07, 2016 3:34 PM To: Alex Deucher Cc: dri-devel at lists.freedesktop.org; Liu, Monk Subject: Re: [PATCH 1/4] drm/amdgpu: clear RB at ring init On 02.06.2016 07:27, Alex Deucher wrote: > From: Monk Liu > > This help fix reloading driver hang issue of SDMA ring. > > Signed-off-by: Monk Liu > Reviewed-by: Alex Deucher > Reviewed-by: Christian König > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c > index 3b02272..a4b3f44 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c > @@ -310,6 +310,9 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct > amdgpu_ring *ring, > } > r = amdgpu_bo_kmap(ring->ring_obj, > (void **)>ring); > + > + memset((void *)ring->ring, 0, ring->ring_size); > + > amdgpu_bo_unreserve(ring->ring_obj); > if (r) { > dev_err(adev->dev, "(%d) ring map failed\n", r); > We should only call memset if amdgpu_bo_kmap succeeded. Same issue in patch 2. There's something else about these two patches I'm a bit worried about: The GPU should only read data from ring buffers and IBs that we previously explicitly wrote there. I'm afraid these patches might just paper over bugs elsewhere, which might still bite us under different circumstances. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer