[PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-04 Thread Christian König
Since we now keep BOs on the LRU we need to make sure
that they are removed when they are pinned.

Signed-off-by: Christian König 
---
 include/drm/ttm/ttm_bo_driver.h | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 9f54cf9c60df..c9b8ba492f24 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
  */
 static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
 {
-	if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
-		spin_lock(&bo->bdev->glob->lru_lock);
-		if (list_empty(&bo->lru))
-			ttm_bo_add_to_lru(bo);
-		else
-			ttm_bo_move_to_lru_tail(bo, NULL);
-		spin_unlock(&bo->bdev->glob->lru_lock);
-	}
+	spin_lock(&bo->bdev->glob->lru_lock);
+	if (list_empty(&bo->lru))
+		ttm_bo_add_to_lru(bo);
+	else
+		ttm_bo_move_to_lru_tail(bo, NULL);
+	spin_unlock(&bo->bdev->glob->lru_lock);
 	reservation_object_unlock(bo->resv);
 }
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-04 Thread Kuehling, Felix
On 2019-06-04 11:23, Christian König wrote:

> Since we now keep BOs on the LRU we need to make sure
> that they are removed when they are pinned.
>
> Signed-off-by: Christian König 
> ---
>   include/drm/ttm/ttm_bo_driver.h | 14 ++
>   1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index 9f54cf9c60df..c9b8ba492f24 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct 
> ttm_buffer_object *bo,
>*/
>   static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>   {
> - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> - spin_lock(&bo->bdev->glob->lru_lock);
> - if (list_empty(&bo->lru))
> - ttm_bo_add_to_lru(bo);
> - else
> - ttm_bo_move_to_lru_tail(bo, NULL);
> - spin_unlock(&bo->bdev->glob->lru_lock);
> - }
> + spin_lock(&bo->bdev->glob->lru_lock);
> + if (list_empty(&bo->lru))
> + ttm_bo_add_to_lru(bo);
> + else
> + ttm_bo_move_to_lru_tail(bo, NULL);

Going just by the function names, this seems to do the exact opposite of 
what the change description says.

Anyway, this patch is Reviewed-by: Felix Kuehling 

BTW, this fix is needed for KFD. It fixes our eviction test that was 
broken by your previous patch series. This test specifically triggers 
interactions between KFD and graphics under memory pressure. It's 
something we rarely see in real world compute application testing 
without a targeted test. But when it breaks it leads to some painful 
intermittent failures that are hard to regress and debug.

Do you have any targeted tests to trigger evictions when you work on TTM 
internals?

Regards,
   Felix


> + spin_unlock(&bo->bdev->glob->lru_lock);
>   reservation_object_unlock(bo->resv);
>   }
>   

RE: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-04 Thread Zeng, Oak


Regards,
Oak

-Original Message-
From: amd-gfx  On Behalf Of Kuehling, 
Felix
Sent: Tuesday, June 4, 2019 2:47 PM
To: Christian König ; 
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

On 2019-06-04 11:23, Christian König wrote:

> Since we now keep BOs on the LRU we need to make sure that they are 
> removed when they are pinned.
>
> Signed-off-by: Christian König 
> ---
>   include/drm/ttm/ttm_bo_driver.h | 14 ++
>   1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/include/drm/ttm/ttm_bo_driver.h 
> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24 
> 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct 
> ttm_buffer_object *bo,
>*/
>   static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>   {
> - if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> - spin_lock(&bo->bdev->glob->lru_lock);
> - if (list_empty(&bo->lru))
> - ttm_bo_add_to_lru(bo);
> - else
> - ttm_bo_move_to_lru_tail(bo, NULL);
> - spin_unlock(&bo->bdev->glob->lru_lock);
> - }
> + spin_lock(&bo->bdev->glob->lru_lock);
> + if (list_empty(&bo->lru))
> + ttm_bo_add_to_lru(bo);
> + else
> + ttm_bo_move_to_lru_tail(bo, NULL);

Going just by the function names, this seems to do the exact opposite of what 
the change description says.

[Oak] +1. When I read the description, I also got lost, so please add a 
more accurate description.

Anyway, this patch is Reviewed-by: Felix Kuehling 

BTW, this fix is needed for KFD. It fixes our eviction test that was broken by 
your previous patch series. This test specifically triggers interactions 
between KFD and graphics under memory pressure. It's something we rarely see in 
real world compute application testing without a targeted test. But when it 
breaks it leads to some painful intermittent failures that are hard to regress 
and debug.

Do you have any targeted tests to trigger evictions when you work on TTM 
internals?

Regards,
   Felix


> + spin_unlock(&bo->bdev->glob->lru_lock);
>   reservation_object_unlock(bo->resv);
>   }
>   

Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-05 Thread Christian König

On 04.06.19 at 21:03, Zeng, Oak wrote:
>
> Regards,
> Oak
>
> -Original Message-
> From: amd-gfx  On Behalf Of Kuehling, Felix
> Sent: Tuesday, June 4, 2019 2:47 PM
> To: Christian König ; 
> dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
>
> On 2019-06-04 11:23, Christian König wrote:
>
>> Since we now keep BOs on the LRU we need to make sure that they are
>> removed when they are pinned.
>>
>> Signed-off-by: Christian König 
>> ---
>>   include/drm/ttm/ttm_bo_driver.h | 14 ++++++--------
>>   1 file changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/drm/ttm/ttm_bo_driver.h
>> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24
>> 100644
>> --- a/include/drm/ttm/ttm_bo_driver.h
>> +++ b/include/drm/ttm/ttm_bo_driver.h
>> @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct 
>> ttm_buffer_object *bo,
>>    */
>>   static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>>   {
>> -	if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>> -		spin_lock(&bo->bdev->glob->lru_lock);
>> -		if (list_empty(&bo->lru))
>> -			ttm_bo_add_to_lru(bo);
>> -		else
>> -			ttm_bo_move_to_lru_tail(bo, NULL);
>> -		spin_unlock(&bo->bdev->glob->lru_lock);
>> -	}
>> +	spin_lock(&bo->bdev->glob->lru_lock);
>> +	if (list_empty(&bo->lru))
>> +		ttm_bo_add_to_lru(bo);
>> +	else
>> +		ttm_bo_move_to_lru_tail(bo, NULL);
> Going just by the function names, this seems to do the exact opposite of what 
> the change description says.
>
> [Oak] +1. When I read the description, I also got lost, so please add a 
> more accurate description.

I'm puzzled why you are confused. We now keep the BOs on the LRU while 
they are reserved, so on unreserve we now need to explicitly remove them 
from the LRU when they are pinned.

> Anyway, this patch is Reviewed-by: Felix Kuehling 
>
> BTW, this fix is needed for KFD. It fixes our eviction test that was broken by 
> your previous patch series. This test specifically triggers interactions 
> between KFD and graphics under memory pressure. It's something we rarely see in 
> real world compute application testing without a targeted test. But when it 
> breaks it leads to some painful intermittent failures that are hard to regress 
> and debug.
>
> Do you have any targeted tests to trigger evictions when you work on TTM 
> internals?

Cat amdgpu_evict_gtt in debugfs is a good test for this.

Christian.
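
For anyone who prefers to script the debugfs read mentioned above rather than run cat by hand, here is a minimal C sketch. The helper name drain_file is hypothetical, and the example path in the note below is an assumption (debugfs mounted at /sys/kernel/debug, the GPU at DRM minor 0); only the file name amdgpu_evict_gtt comes from this thread.

```c
#include <stdio.h>

/*
 * Read a file to the end and discard the contents, roughly what
 * `cat file > /dev/null` does. Returns the number of bytes read,
 * or -1 if the file cannot be opened or a read error occurs.
 * drain_file is a hypothetical helper name, not a kernel or libdrm API.
 */
static long drain_file(const char *path)
{
	char buf[256];
	long total = 0;
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	/* Keep reading until EOF or error; the data itself is ignored. */
	while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
		total += (long)n;

	if (ferror(f)) {
		fclose(f);
		return -1;
	}

	fclose(f);
	return total;
}
```

Under the path assumption above, a call would look like drain_file("/sys/kernel/debug/dri/0/amdgpu_evict_gtt"), run as root. A return value of -1 most likely means debugfs is not mounted there or the minor number differs on that system.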



> Regards,
>     Felix
>
>
>> +	spin_unlock(&bo->bdev->glob->lru_lock);
>>  	reservation_object_unlock(bo->resv);
>>   }
>>


Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-05 Thread Michel Dänzer
On 2019-06-05 1:24 p.m., Christian König wrote:
> On 04.06.19 at 21:03, Zeng, Oak wrote:
>> From: amd-gfx  On Behalf Of
>> Kuehling, Felix
>> On 2019-06-04 11:23, Christian König wrote:
>>
>>> Since we now keep BOs on the LRU we need to make sure that they are
>>> removed when they are pinned.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>    include/drm/ttm/ttm_bo_driver.h | 14 ++
>>>    1 file changed, 6 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/include/drm/ttm/ttm_bo_driver.h
>>> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24
>>> 100644
>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>> @@ -767,14 +767,12 @@ static inline int
>>> ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
>>>     */
>>>    static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>>>    {
>>> -    if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>> -    spin_lock(&bo->bdev->glob->lru_lock);
>>> -    if (list_empty(&bo->lru))
>>> -    ttm_bo_add_to_lru(bo);
>>> -    else
>>> -    ttm_bo_move_to_lru_tail(bo, NULL);
>>> -    spin_unlock(&bo->bdev->glob->lru_lock);
>>> -    }
>>> +    spin_lock(&bo->bdev->glob->lru_lock);
>>> +    if (list_empty(&bo->lru))
>>> +    ttm_bo_add_to_lru(bo);
>>> +    else
>>> +    ttm_bo_move_to_lru_tail(bo, NULL);
>> Going just by the function names, this seems to do the exact opposite
>> of what the change description says.
>>
>> [Oak] +1, when I read the description, I also get lost...So please do
>> add a more accurate description.
> 
> I'm puzzled why you are confused. We now keep the BOs on the LRU while
> they are reserved, so on unreserve we now need to explicitly remove them
> from the LRU when they are pinned.

I don't know about Felix and Oak, but for me "remove from the LRU" is
confusing, as I don't see that in the code, only adding to the LRU or
moving to its tail.


-- 
Earthling Michel Dänzer   |  https://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-05 Thread Kuehling, Felix
On 2019-06-05 9:56, Michel Dänzer wrote:
> On 2019-06-05 1:24 p.m., Christian König wrote:
>> On 04.06.19 at 21:03, Zeng, Oak wrote:
>>> From: amd-gfx  On Behalf Of
>>> Kuehling, Felix
>>> On 2019-06-04 11:23, Christian König wrote:
[snip]
>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>> @@ -767,14 +767,12 @@ static inline int
>>> ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
>>>      */
>>>     static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>>>     {
>>> -    if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>> -    spin_lock(&bo->bdev->glob->lru_lock);
>>> -    if (list_empty(&bo->lru))
>>> -    ttm_bo_add_to_lru(bo);
>>> -    else
>>> -    ttm_bo_move_to_lru_tail(bo, NULL);
>>> -    spin_unlock(&bo->bdev->glob->lru_lock);
>>> -    }
>>> +    spin_lock(&bo->bdev->glob->lru_lock);
>>> +    if (list_empty(&bo->lru))
>>> +    ttm_bo_add_to_lru(bo);
>>> +    else
>>> +    ttm_bo_move_to_lru_tail(bo, NULL);
>>> Going just by the function names, this seems to do the exact opposite
>>> of what the change description says.
>>>
>>> [Oak] +1, when I read the description, I also get lost...So please do
>>> add a more accurate description.
>> I'm puzzled why you are confused. We now keep the BOs on the LRU while
>> they are reserved, so on unreserve we now need to explicitly remove them
>> from the LRU when they are pinned.
> I don't know about Felix and Oak, but for me "remove from the LRU" is
> confusing, as I don't see that in the code, only adding to the LRU or
> moving to its tail.

Exactly. The names of the functions being called imply that something 
gets added or moved on the LRU list. You have to go look at the 
implementation of those functions to find out that they do something 
else for pinned BOs (that have TTM_PL_FLAG_NO_EVICT set in their 
placement flags).

Fixing the function names would probably be overkill: 
ttm_bo_add_lru_unless_pinned and 
ttm_bo_move_to_lru_tail_or_remove_if_pinned. But maybe a comment in 
ttm_bo_unreserve would help.
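
To make the point about the function names concrete, here is a small self-contained C model of the behaviour described above. It is only a sketch, not the TTM code: every toy_ name is hypothetical, the locking is left out, and the two key assumptions (the add helper silently skips pinned BOs, and move-to-tail is a delete followed by an add) are my reading of this discussion rather than a copy of the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer object; not struct ttm_buffer_object. */
struct toy_bo {
	bool pinned;	/* models TTM_PL_FLAG_NO_EVICT being set */
	bool on_lru;	/* models !list_empty(&bo->lru) */
	struct toy_bo *prev, *next;
};

/* Toy LRU: doubly linked list, head is oldest, tail is newest. */
struct toy_lru {
	struct toy_bo *head, *tail;
};

static void toy_del_from_lru(struct toy_lru *lru, struct toy_bo *bo)
{
	if (!bo->on_lru)
		return;
	if (bo->prev)
		bo->prev->next = bo->next;
	else
		lru->head = bo->next;
	if (bo->next)
		bo->next->prev = bo->prev;
	else
		lru->tail = bo->prev;
	bo->prev = bo->next = NULL;
	bo->on_lru = false;
}

/* Assumption: "add" does nothing for a pinned BO or one already listed. */
static void toy_add_to_lru(struct toy_lru *lru, struct toy_bo *bo)
{
	if (bo->on_lru || bo->pinned)
		return;
	bo->prev = lru->tail;
	bo->next = NULL;
	if (lru->tail)
		lru->tail->next = bo;
	else
		lru->head = bo;
	lru->tail = bo;
	bo->on_lru = true;
}

/* Assumption: "move to tail" is delete + add, so a pinned BO drops off. */
static void toy_move_to_lru_tail(struct toy_lru *lru, struct toy_bo *bo)
{
	toy_del_from_lru(lru, bo);
	toy_add_to_lru(lru, bo);
}

/* Same shape as the patched ttm_bo_unreserve(), without the locks. */
static void toy_unreserve(struct toy_lru *lru, struct toy_bo *bo)
{
	if (!bo->on_lru)
		toy_add_to_lru(lru, bo);
	else
		toy_move_to_lru_tail(lru, bo);
}

/* Usage example: pin a listed BO, unreserve it, and see it leave the LRU. */
static int toy_demo(void)
{
	struct toy_lru lru = { NULL, NULL };
	struct toy_bo bo = { false, false, NULL, NULL };

	toy_unreserve(&lru, &bo);	/* unpinned: lands on the LRU */
	assert(bo.on_lru);

	bo.pinned = true;
	toy_unreserve(&lru, &bo);	/* pinned: removed despite the "move" name */
	assert(!bo.on_lru && lru.head == NULL);
	return 0;
}
```

In this model there is no explicit pinned check in toy_unreserve() at all; the removal happens as a side effect of the helpers, which is exactly what makes the call site hard to read without a comment.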

Regards,
   Felix


>
>

RE: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-05 Thread Zeng, Oak


Regards,
Oak

-Original Message-
From: Christian König  
Sent: Wednesday, June 5, 2019 7:25 AM
To: Zeng, Oak ; Kuehling, Felix ; 
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

On 04.06.19 at 21:03, Zeng, Oak wrote:
>
> Regards,
> Oak
>
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Kuehling, Felix
> Sent: Tuesday, June 4, 2019 2:47 PM
> To: Christian König ; 
> dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/ttm: fix ttm_bo_unreserve
>
> On 2019-06-04 11:23, Christian König wrote:
>
>> Since we now keep BOs on the LRU we need to make sure that they are 
>> removed when they are pinned.
>>
>> Signed-off-by: Christian König 
>> ---
>>include/drm/ttm/ttm_bo_driver.h | 14 ++
>>1 file changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/drm/ttm/ttm_bo_driver.h 
>> b/include/drm/ttm/ttm_bo_driver.h index 9f54cf9c60df..c9b8ba492f24
>> 100644
>> --- a/include/drm/ttm/ttm_bo_driver.h
>> +++ b/include/drm/ttm/ttm_bo_driver.h
>> @@ -767,14 +767,12 @@ static inline int ttm_bo_reserve_slowpath(struct 
>> ttm_buffer_object *bo,
>> */
>>static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>>{
>> -if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>> -spin_lock(&bo->bdev->glob->lru_lock);
>> -if (list_empty(&bo->lru))
>> -ttm_bo_add_to_lru(bo);
>> -else
>> -ttm_bo_move_to_lru_tail(bo, NULL);
>> -spin_unlock(&bo->bdev->glob->lru_lock);
>> -}
>> +spin_lock(&bo->bdev->glob->lru_lock);
>> +if (list_empty(&bo->lru))
>> +ttm_bo_add_to_lru(bo);
>> +else
>> +ttm_bo_move_to_lru_tail(bo, NULL);
> Going just by the function names, this seems to do the exact opposite of what 
> the change description says.
>
> [Oak] +1, when I read the description, I also get lost...So please do add a 
> more accurate description.

I'm puzzled why you are confused. We now keep the BOs on the LRU while they are 
reserved, so on unreserve we now need to explicitly remove them from the LRU 
when they are pinned.

[Oak] When I read the description, I thought you meant to remove the BO from 
the LRU on a pin action, but from the code, it is done on unreserve. In other 
words, it is better to say "if it is pinned" than "when it is pinned". Sorry 
for being picky. Also, in the code before your change, there was a condition 
"!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)". Is this condition there to check 
whether the BO is not pinned? How do you check whether the BO is pinned in the 
new code? To me, the condition "list_empty(&bo->lru)" only means this BO is 
currently not on the LRU list; I am not sure whether it also means it is not 
pinned. Also, can ttm_bo_move_to_lru_tail be replaced with ttm_bo_del_from_lru? 
From your description, this is more like a function to remove it from the LRU. 
Sorry for so many questions. I really don't know the context here...

>
> Anyway, this patch is Reviewed-by: Felix Kuehling 
>
> BTW, this fix is needed for KFD. It fixes our eviction test that was broken 
> by your previous patch series. This test specifically triggers interactions 
> between KFD and graphics under memory pressure. It's something we rarely see 
> in real world compute application testing without a targeted test. But when 
> it breaks it leads to some painful intermittent failures that are hard to 
> regress and debug.
>
> Do you have any targeted tests to trigger evictions when you work on TTM 
> internals?

Cat amdgpu_evict_gtt in debugfs is a good test for this.

Christian.

>
> Regards,
>     Felix
>
>
>> +spin_unlock(&bo->bdev->glob->lru_lock);
>>  reservation_object_unlock(bo->resv);
>>}
>>

Re: [PATCH] drm/ttm: fix ttm_bo_unreserve

2019-06-05 Thread Koenig, Christian
On 05.06.19 at 16:33, Kuehling, Felix wrote:
> On 2019-06-05 9:56, Michel Dänzer wrote:
>> On 2019-06-05 1:24 p.m., Christian König wrote:
>>> On 04.06.19 at 21:03, Zeng, Oak wrote:
>>>> From: amd-gfx  On Behalf Of
>>>> Kuehling, Felix
>>>> On 2019-06-04 11:23, Christian König wrote:
> [snip]
>>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>>> @@ -767,14 +767,12 @@ static inline int
>>>> ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
>>>>       */
>>>>      static inline void ttm_bo_unreserve(struct ttm_buffer_object *bo)
>>>>      {
>>>> -    if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>> -    spin_lock(&bo->bdev->glob->lru_lock);
>>>> -    if (list_empty(&bo->lru))
>>>> -    ttm_bo_add_to_lru(bo);
>>>> -    else
>>>> -    ttm_bo_move_to_lru_tail(bo, NULL);
>>>> -    spin_unlock(&bo->bdev->glob->lru_lock);
>>>> -    }
>>>> +    spin_lock(&bo->bdev->glob->lru_lock);
>>>> +    if (list_empty(&bo->lru))
>>>> +    ttm_bo_add_to_lru(bo);
>>>> +    else
>>>> +    ttm_bo_move_to_lru_tail(bo, NULL);
>>>> Going just by the function names, this seems to do the exact opposite
>>>> of what the change description says.
>>>>
>>>> [Oak] +1, when I read the description, I also get lost...So please do
>>>> add a more accurate description.
>>> I'm puzzled why you are confused. We now keep the BOs on the LRU while
>>> they are reserved, so on unreserve we now need to explicitly remove them
>>> from the LRU when they are pinned.
>> I don't know about Felix and Oak, but for me "remove from the LRU" is
>> confusing, as I don't see that in the code, only adding to the LRU or
>> moving to its tail.
> Exactly. The names of the functions being called imply that something
> gets added or moved on the LRU list. You have to go look at the
> implementation of those functions to find out that they do something
> else for pinned BOs (that have TTM_PL_FLAG_NO_EVICT set in their
> placement flags).
>
> Fixing the function names would probably be overkill:
> ttm_bo_add_lru_unless_pinned and
> ttm_bo_move_to_lru_tail_or_remove_if_pinned. But maybe a comment in
> ttm_bo_unreserve would help.

Ah! Yes, of course. I thought you meant the ttm_bo_unreserve function name.

Going to add a comment when we start to rename the functions.

Christian.

>
> Regards,
>     Felix
>
>
>>
