[PATCH] drm/ttm: fix ttm_bo_unreserve
Since we now keep BOs on the LRU we need to make sure that they are removed when they are pinned. Signed-off-by: Christian König --- include/drm/ttm/ttm_bo_driver.h | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo, */ static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) { - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { - spin_lock(&bo->bdev->glob->lru_lock); - if (list_empty(&bo->lru)) - ttm_bo_add_to_lru(bo); - else - ttm_bo_move_to_lru_tail(bo, NULL); - spin_unlock(&bo->bdev->glob->lru_lock); - } + spin_lock(&bo->bdev->glob->lru_lock); + if (list_empty(&bo->lru)) + ttm_bo_add_to_lru(bo); + else + ttm_bo_move_to_lru_tail(bo, NULL); + spin_unlock(&bo->bdev->glob->lru_lock); reservation_object_unlock(bo->resv); } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
On 2019-06-04 11:23, Christian König wrote: > Since we now keep BOs on the LRU we need to make sure > that they are removed when they are pinned. > > Signed-off-by: Christian König > --- > include/drm/ttm/ttm_bo_driver.h | 14 ++ > 1 file changed, 6 insertions(+), 8 deletions(-) > > diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h > index 9f54cf9c60df..c9b8ba492f24 100644 > --- a/include/drm/ttm/ttm_bo_driver.h > +++ b/include/drm/ttm/ttm_bo_driver.h > @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct > ttm_buffer_object *bo, >*/ > static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) > { > - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { > - spin_lock(&bo->bdev->glob->lru_lock); > - if (list_empty(&bo->lru)) > - ttm_bo_add_to_lru(bo); > - else > - ttm_bo_move_to_lru_tail(bo, NULL); > - spin_unlock(&bo->bdev->glob->lru_lock); > - } > + spin_lock(&bo->bdev->glob->lru_lock); > + if (list_empty(&bo->lru)) > + ttm_bo_add_to_lru(bo); > + else > + ttm_bo_move_to_lru_tail(bo, NULL); Going just by the function names, this seems to do the exact opposite of what the change description says. Anway, this patch is Reviewed-by: Felix Kuehling BTW, this fix is needed for KFD. It fixes our eviction test that was broken by your previous patch series. This test specifically triggers interactions between KFD and graphics under memory pressure. It's something we rarely see in real world compute application testing without a targeted test. But when it breaks it leads to some painful intermittent failures that are hard to regress and debug. Do you have any targeted tests to trigger evictions when you work on TTM internals? Regards, Felix > + spin_unlock(&bo->bdev->glob->lru_lock); > reservation_object_unlock(bo->resv); > } > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/ttm: fix ttm_bo_unreserve
Regards, Oak -Original Message- From: amd-gfx On Behalf Of Kuehling, Felix Sent: Tuesday, June 4, 2019 2:47 PM To: Christian König ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve On 2019-06-04 11:23, Christian König wrote: > Since we now keep BOs on the LRU we need to make sure that they are > removed when they are pinned. > > Signed-off-by: Christian König > --- > include/drm/ttm/ttm_bo_driver.h | 14 ++ > 1 file changed, 6 insertions(+), 8 deletions(-) > > diff --git a/include/drm/ttm/ttm_bo_driver.h > b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 > 100644 > --- a/include/drm/ttm/ttm_bo_driver.h > +++ b/include/drm/ttm/ttm_bo_driver.h > @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct > ttm_buffer_object *bo, >*/ > static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) > { > - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { > - spin_lock(&bo->bdev->glob->lru_lock); > - if (list_empty(&bo->lru)) > - ttm_bo_add_to_lru(bo); > - else > - ttm_bo_move_to_lru_tail(bo, NULL); > - spin_unlock(&bo->bdev->glob->lru_lock); > - } > + spin_lock(&bo->bdev->glob->lru_lock); > + if (list_empty(&bo->lru)) > + ttm_bo_add_to_lru(bo); > + else > + ttm_bo_move_to_lru_tail(bo, NULL); Going just by the function names, this seems to do the exact opposite of what the change description says. [Oak] +1, when I read the description, I also get lost...So please do add a more accurate description. Anway, this patch is Reviewed-by: Felix Kuehling BTW, this fix is needed for KFD. It fixes our eviction test that was broken by your previous patch series. This test specifically triggers interactions between KFD and graphics under memory pressure. It's something we rarely see in real world compute application testing without a targeted test. But when it breaks it leads to some painful intermittent failures that are hard to regress and debug. Do you have any targeted tests to trigger evictions when you work on TTM internals? Regards, Felix > + spin_unlock(&bo->bdev->glob->lru_lock); > reservation_object_unlock(bo->resv); > } > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
Am 04.06.19 um 21:03 schrieb Zeng, Oak: Regards, Oak -Original Message- From: amd-gfx On Behalf Of Kuehling, Felix Sent: Tuesday, June 4, 2019 2:47 PM To: Christian König ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve On 2019-06-04 11:23, Christian König wrote: Since we now keep BOs on the LRU we need to make sure that they are removed when they are pinned. Signed-off-by: Christian König --- include/drm/ttm/ttm_bo_driver.h | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo, */ static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) { - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { - spin_lock(&bo->bdev->glob->lru_lock); - if (list_empty(&bo->lru)) - ttm_bo_add_to_lru(bo); - else - ttm_bo_move_to_lru_tail(bo, NULL); - spin_unlock(&bo->bdev->glob->lru_lock); - } + spin_lock(&bo->bdev->glob->lru_lock); + if (list_empty(&bo->lru)) + ttm_bo_add_to_lru(bo); + else + ttm_bo_move_to_lru_tail(bo, NULL); Going just by the function names, this seems to do the exact opposite of what the change description says. [Oak] +1, when I read the description, I also get lost...So please do add a more accurate description. I'm puzzled why you are confused. We now keep the BOs on the LRU while they are reserved, so on unreserve we now need to explicitly remove them from the LRU when they are pinned. Anway, this patch is Reviewed-by: Felix Kuehling BTW, this fix is needed for KFD. It fixes our eviction test that was broken by your previous patch series. This test specifically triggers interactions between KFD and graphics under memory pressure. It's something we rarely see in real world compute application testing without a targeted test. But when it breaks it leads to some painful intermittent failures that are hard to regress and debug. Do you have any targeted tests to trigger evictions when you work on TTM internals? Cat amdgpu_evict_gtt in debugfs is a good test for this. Christian. Regards, Felix + spin_unlock(&bo->bdev->glob->lru_lock); reservation_object_unlock(bo->resv); } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
On 2019-06-05 1:24 p.m., Christian König wrote: > Am 04.06.19 um 21:03 schrieb Zeng, Oak: >> From: amd-gfx On Behalf Of >> Kuehling, Felix >> On 2019-06-04 11:23, Christian König wrote: >> >>> Since we now keep BOs on the LRU we need to make sure that they are >>> removed when they are pinned. >>> >>> Signed-off-by: Christian König >>> --- >>> include/drm/ttm/ttm_bo_driver.h | 14 ++ >>> 1 file changed, 6 insertions(+), 8 deletions(-) >>> >>> diff --git a/include/drm/ttm/ttm_bo_driver.h >>> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 >>> 100644 >>> --- a/include/drm/ttm/ttm_bo_driver.h >>> +++ b/include/drm/ttm/ttm_bo_driver.h >>> @@ -767,14 +767,12 @@ static inline int >>> ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo, >>> */ >>> static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) >>> { >>> - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { >>> - spin_lock(&bo->bdev->glob->lru_lock); >>> - if (list_empty(&bo->lru)) >>> - ttm_bo_add_to_lru(bo); >>> - else >>> - ttm_bo_move_to_lru_tail(bo, NULL); >>> - spin_unlock(&bo->bdev->glob->lru_lock); >>> - } >>> + spin_lock(&bo->bdev->glob->lru_lock); >>> + if (list_empty(&bo->lru)) >>> + ttm_bo_add_to_lru(bo); >>> + else >>> + ttm_bo_move_to_lru_tail(bo, NULL); >> Going just by the function names, this seems to do the exact opposite >> of what the change description says. >> >> [Oak] +1, when I read the description, I also get lost...So please do >> add a more accurate description. > > I'm puzzled why you are confused. We now keep the BOs on the LRU while > they are reserved, so on unreserve we now need to explicitly remove them > from the LRU when they are pinned. I don't know about Felix and Oak, but for me "remove from the LRU" is confusing, as I don't see that in the code, only adding to the LRU or moving to its tail. -- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
On 2019-06-05 9:56, Michel Dänzer wrote: > On 2019-06-05 1:24 p.m., Christian König wrote: >> Am 04.06.19 um 21:03 schrieb Zeng, Oak: >>> From: amd-gfx On Behalf Of >>> Kuehling, Felix >>> On 2019-06-04 11:23, Christian König wrote: [snip] >>> --- a/include/drm/ttm/ttm_bo_driver.h >>> +++ b/include/drm/ttm/ttm_bo_driver.h >>> @@ -767,14 +767,12 @@ static inline int >>> ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo, >>> */ >>> static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) >>> { >>> - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { >>> - spin_lock(&bo->bdev->glob->lru_lock); >>> - if (list_empty(&bo->lru)) >>> - ttm_bo_add_to_lru(bo); >>> - else >>> - ttm_bo_move_to_lru_tail(bo, NULL); >>> - spin_unlock(&bo->bdev->glob->lru_lock); >>> - } >>> + spin_lock(&bo->bdev->glob->lru_lock); >>> + if (list_empty(&bo->lru)) >>> + ttm_bo_add_to_lru(bo); >>> + else >>> + ttm_bo_move_to_lru_tail(bo, NULL); >>> Going just by the function names, this seems to do the exact opposite >>> of what the change description says. >>> >>> [Oak] +1, when I read the description, I also get lost...So please do >>> add a more accurate description. >> I'm puzzled why you are confused. We now keep the BOs on the LRU while >> they are reserved, so on unreserve we now need to explicitly remove them >> from the LRU when they are pinned. > I don't know about Felix and Oak, but for me "remove from the LRU" is > confusing, as I don't see that in the code, only adding to the LRU or > moving to its tail. Exactly. The names of the functions being called imply that something gets added or moved on the LRU list. You have to go look at the implementation of those functions to find out that they do something else for pinned BOs (that have TTM_PL_FLAG_NO_EVICT set in their placement flags). Fixing the function names would probably be overkill: ttm_bo_add_lru_unless_pinned and ttm_bo_move_to_lru_tail_or_remove_if_pinned. But maybe a comment in ttm_bo_unreserve would help. Regards, Felix > > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/ttm: fix ttm_bo_unreserve
Regards, Oak -Original Message- From: Christian König Sent: Wednesday, June 5, 2019 7:25 AM To: Zeng, Oak ; Kuehling, Felix ; dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve Am 04.06.19 um 21:03 schrieb Zeng, Oak: > > Regards, > Oak > > -Original Message- > From: amd-gfx On Behalf Of > Kuehling, Felix > Sent: Tuesday, June 4, 2019 2:47 PM > To: Christian König ; > dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve > > On 2019-06-04 11:23, Christian König wrote: > >> Since we now keep BOs on the LRU we need to make sure that they are >> removed when they are pinned. >> >> Signed-off-by: Christian König >> --- >>include/drm/ttm/ttm_bo_driver.h | 14 ++ >>1 file changed, 6 insertions(+), 8 deletions(-) >> >> diff --git a/include/drm/ttm/ttm_bo_driver.h >> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 >> 100644 >> --- a/include/drm/ttm/ttm_bo_driver.h >> +++ b/include/drm/ttm/ttm_bo_driver.h >> @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct >> ttm_buffer_object *bo, >> */ >>static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) >>{ >> -if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { >> -spin_lock(&bo->bdev->glob->lru_lock); >> -if (list_empty(&bo->lru)) >> -ttm_bo_add_to_lru(bo); >> -else >> -ttm_bo_move_to_lru_tail(bo, NULL); >> -spin_unlock(&bo->bdev->glob->lru_lock); >> -} >> +spin_lock(&bo->bdev->glob->lru_lock); >> +if (list_empty(&bo->lru)) >> +ttm_bo_add_to_lru(bo); >> +else >> +ttm_bo_move_to_lru_tail(bo, NULL); > Going just by the function names, this seems to do the exact opposite of what > the change description says. > > [Oak] +1, when I read the description, I also get lost...So please do add a > more accurate description. I'm puzzled why you are confused. We now keep the BOs on the LRU while they are reserved, so on unreserve we now need to explicitly remove them from the LRU when they are pinned. [Oak] When I read the description, I though you meant to remove bo from LRU on a pin action, but from codes, it is done on unreserve. In other words, it is better to say "if it is pinned" than "when it is pinned". Sorry being pickyAlso from codes before your change, there was a condition "!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)". Is this condition to check whether bo is no pinned? How do you check whether bo is pinned in the new codes? To me condition " list_empty(&bo->lru)" only means this bo is currently not on LRU list, I am not sure whether this also means it is not pinned. Also, can ttm_bo_move_to_lru_tail be replaced with ttm_bo_del_from_lru - from your description, this is more like a function to remove it from LRU. Sorry too many questions. I really don't know the context here... > > Anway, this patch is Reviewed-by: Felix Kuehling > > BTW, this fix is needed for KFD. It fixes our eviction test that was broken > by your previous patch series. This test specifically triggers interactions > between KFD and graphics under memory pressure. It's something we rarely see > in real world compute application testing without a targeted test. But when > it breaks it leads to some painful intermittent failures that are hard to > regress and debug. > > Do you have any targeted tests to trigger evictions when you work on TTM > internals? Cat amdgpu_evict_gtt in debugfs is a good test for this. Christian. > > Regards, > Felix > > >> +spin_unlock(&bo->bdev->glob->lru_lock); >> reservation_object_unlock(bo->resv); >>} >> > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
Am 05.06.19 um 16:33 schrieb Kuehling, Felix: > On 2019-06-05 9:56, Michel Dänzer wrote: >> On 2019-06-05 1:24 p.m., Christian König wrote: >>> Am 04.06.19 um 21:03 schrieb Zeng, Oak: From: amd-gfx On Behalf Of Kuehling, Felix On 2019-06-04 11:23, Christian König wrote: > [snip] --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo, */ static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo) { - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) { - spin_lock(&bo->bdev->glob->lru_lock); - if (list_empty(&bo->lru)) - ttm_bo_add_to_lru(bo); - else - ttm_bo_move_to_lru_tail(bo, NULL); - spin_unlock(&bo->bdev->glob->lru_lock); - } + spin_lock(&bo->bdev->glob->lru_lock); + if (list_empty(&bo->lru)) + ttm_bo_add_to_lru(bo); + else + ttm_bo_move_to_lru_tail(bo, NULL); Going just by the function names, this seems to do the exact opposite of what the change description says. [Oak] +1, when I read the description, I also get lost...So please do add a more accurate description. >>> I'm puzzled why you are confused. We now keep the BOs on the LRU while >>> they are reserved, so on unreserve we now need to explicitly remove them >>> from the LRU when they are pinned. >> I don't know about Felix and Oak, but for me "remove from the LRU" is >> confusing, as I don't see that in the code, only adding to the LRU or >> moving to its tail. > Exactly. The names of the functions being called imply that something > gets added or moved on the LRU list. You have to go look at the > implementation of those functions to find out that they do something > else for pinned BOs (that have TTM_PL_FLAG_NO_EVICT set in their > placement flags). > > Fixing the function names would probably be overkill: > ttm_bo_add_lru_unless_pinned and > ttm_bo_move_to_lru_tail_or_remove_if_pinned. But maybe a comment in > ttm_bo_unreserve would help. Ah! Yes of course, I thought you mean the ttm_bo_unreserve function name. Going to add a comment when we start to rename the functions. Christian. > > Regards, > Felix > > >> ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx