> On Feb 13, 2020, at 18:01, Koenig, Christian <christian.koe...@amd.com> wrote:
>
> On Feb 13, 2020, at 05:11, Pan, Xinhui wrote:
>>
>>
>>> On Feb 12, 2020, at 19:53, Christian König <ckoenig.leichtzumer...@gmail.com> wrote:
>>>
>>> On Feb 12, 2020, at 07:23, Pan, Xinhui wrote:
>>>>> On Feb 11, 2020, at 23:43, Christian König <ckoenig.leichtzumer...@gmail.com> wrote:
>>>>>
>>>>> When non-imported BOs are resurrected for delayed delete we replace
>>>>> the dma_resv object to allow for easy reclaiming of the resources.
>>>>>
>>>>> v2: move that to ttm_bo_individualize_resv
>>>>> v3: add a comment to explain what's going on
>>>>>
>>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>>> Reviewed-by: xinhui pan <xinhui....@amd.com>
>>>>> ---
>>>>>  drivers/gpu/drm/ttm/ttm_bo.c | 14 +++++++++++++-
>>>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> index bfc42a9e4fb4..8174603d390f 100644
>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> @@ -393,6 +393,18 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>>>>>
>>>>>  	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>>>>  	dma_resv_unlock(&bo->base._resv);
>>>>> +	if (r)
>>>>> +		return r;
>>>>> +
>>>>> +	if (bo->type != ttm_bo_type_sg) {
>>>>> +		/* This works because the BO is about to be destroyed and nobody
>>>>> +		 * reference it any more. The only tricky case is the trylock on
>>>>> +		 * the resv object while holding the lru_lock.
>>>>> +		 */
>>>>> +		spin_lock(&ttm_bo_glob.lru_lock);
>>>>> +		bo->base.resv = &bo->base._resv;
>>>>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>>>>> +	}
>>>>>
>>>> How about something like the following? The basic idea is to do the BO
>>>> cleanup work in bo_release first and avoid any race with evict, since
>>>> while a BO is dying, evict also just does BO cleanup work.
>>>>
>>>> If the BO is busy, neither bo_release nor evict can do cleanup work on it.
>>>> For the bo_release case, we just add the BO back to the LRU list, so we
>>>> can clean it up both in the workqueue and the shrinker, the way it was
>>>> done before.
>>>>
>>>> @@ -405,8 +405,9 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>>>>  	if (bo->type != ttm_bo_type_sg) {
>>>>  		spin_lock(&ttm_bo_glob.lru_lock);
>>>> -		bo->base.resv = &bo->base._resv;
>>>> +		ttm_bo_del_from_lru(bo);
>>>>  		spin_unlock(&ttm_bo_glob.lru_lock);
>>>> +		bo->base.resv = &bo->base._resv;
>>>>  	}
>>>>  	return r;
>>>> @@ -606,10 +607,9 @@ static void ttm_bo_release(struct kref *kref)
>>>>  		 * shrinkers, now that they are queued for
>>>>  		 * destruction.
>>>>  		 */
>>>> -		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>>>> +		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
>>>>  			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>>>> -			ttm_bo_move_to_lru_tail(bo, NULL);
>>>> -		}
>>>> +		ttm_bo_add_mem_to_lru(bo, &bo->mem);
>>>>  		kref_init(&bo->kref);
>>>>  		list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>>> Yeah, thought about that as well. But this has the major drawback that the
>>> deleted BO moves to the end of the LRU, which is something we don't want.
>> Well, as the BO is busy, it looks like it needs time to become idle;
>> putting it at the tail seems fair.
>
> No, BOs should move to the tail of the LRU whenever they are used.
> Freeing up a BO is basically the opposite of using it.
>
> So what would happen on the next memory contention is that the MM would
> evict BOs which are still used and only afterwards come to the deleted BO,
> which could have been removed long ago.
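
As a side note on the locking argument: the v3 hunk quoted at the top swaps
bo->base.resv only while ttm_bo_glob.lru_lock is held, because the eviction
path dereferences and trylocks that pointer under the same lock. Below is a
minimal user-space model of that pattern, using pthread mutexes and made-up
names rather than the TTM API; it is only an illustration, not TTM code.

/* model: the "LRU walker" trylocks obj->resv under lru_lock, so the
 * release path may only redirect obj->resv while holding lru_lock too */
#include <pthread.h>
#include <stdio.h>

struct object {
	pthread_mutex_t resv_a;		/* stands in for the shared dma_resv      */
	pthread_mutex_t resv_b;		/* stands in for the individualized copy  */
	pthread_mutex_t *resv;		/* pointer the eviction path trylocks     */
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* release path: switch to the private reservation object */
static void individualize(struct object *obj)
{
	pthread_mutex_lock(&lru_lock);
	obj->resv = &obj->resv_b;	/* swap only visible under lru_lock */
	pthread_mutex_unlock(&lru_lock);
}

/* eviction path: pick the object off the LRU and try to reserve it */
static int try_evict(struct object *obj)
{
	int locked;

	pthread_mutex_lock(&lru_lock);
	locked = (pthread_mutex_trylock(obj->resv) == 0);	/* never a stale pointer */
	pthread_mutex_unlock(&lru_lock);

	if (locked)
		pthread_mutex_unlock(obj->resv);
	return locked;
}

int main(void)
{
	struct object obj;

	pthread_mutex_init(&obj.resv_a, NULL);
	pthread_mutex_init(&obj.resv_b, NULL);
	obj.resv = &obj.resv_a;

	printf("evictable before individualize: %d\n", try_evict(&obj));
	individualize(&obj);
	printf("evictable after individualize:  %d\n", try_evict(&obj));
	return 0;
}

If the swap happened outside the lock, the walker could trylock the old
reservation object while the release path is already working on the
individualized one, and the "reservation" would not actually exclude anything.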
>
>>> I think the real solution to this problem is to go a completely different
>>> way and remove the delayed delete feature from TTM altogether. Instead this
>>> should be part of some DRM domain handler component.
>>>
>> Yes, completely agree. As long as we can shrink BOs on OOM, the workqueue
>> is not necessary. The workqueue does not help in a heavy workload case.
>>
>> Please see my patches below. I remove the workqueue and, what's more, we can
>> clean up the BO without holding the lru lock.
>> That would reduce the lock contention. I ran kfdtest and got a good
>> performance result.
>
> No, that's an approach we had before as well and it also doesn't work that
> well.
>
> See, the problem is that we can only remove the BO from the LRU after it has
> released the memory it references. Otherwise we run into the issue that some
> threads can't wait for the memory to be freed any more and run into an OOM
> situation.
>
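
To make that OOM concern concrete, here is a tiny user-space model
(hypothetical names, nothing TTM-specific): a thread that needs memory walks
the LRU and either reclaims or waits on whatever it finds there. If a
deleted-but-still-busy BO is taken off the LRU before its memory is actually
released, the walker has nothing left to wait on.

#include <stdbool.h>
#include <stdio.h>

struct region {
	const char *name;
	bool busy;	/* fences not yet signaled     */
	bool on_lru;	/* still visible to the walker */
};

/* allocation under pressure: return true if we found something we can
 * either reclaim now or at least wait on */
static bool try_make_room(struct region *lru, int n)
{
	for (int i = 0; i < n; i++) {
		if (!lru[i].on_lru)
			continue;
		if (lru[i].busy) {
			printf("waiting on %s\n", lru[i].name);	/* wait, then retry */
			return true;
		}
		printf("reclaiming %s\n", lru[i].name);
		return true;
	}
	printf("nothing on the LRU -> OOM\n");
	return false;
}

int main(void)
{
	/* a deleted BO that is still busy; the question is only whether it
	 * stays visible on the LRU until its memory is really released */
	struct region dying = { "dying_bo", true, true };

	try_make_room(&dying, 1);	/* on the LRU: walker can wait on it    */

	dying.on_lru = false;		/* removed before its memory is freed   */
	try_make_room(&dying, 1);	/* walker finds nothing: spurious OOM   */
	return 0;
}
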
OK, we can keep the workqueue as it is. Now I have come up with another idea:
evict and swap can walk the destroy list first, then the LRU list. Putting a
dying BO on the LRU list looks useless (a rough sketch of that is appended at
the end of this mail).

thanks
xinhui

> Regards,
> Christian.
>
>>
>>
>>> In other words it should not matter if a BO is evicted, moved or freed.
>>> Whenever a piece of memory becomes available again we keep around a fence
>>> which marks the end of using this piece of memory.
>>>
>>> When then somebody asks for new memory we work through the LRU and test if
>>> using a certain piece of memory makes sense or not. If we find that a BO
>>> needs to be evicted for this we return a reference to the BO in question to
>>> the upper level handling.
>>>
>>> If we find that we can do the allocation but only with recently freed up
>>> memory we gather the fences and say you can only use the newly allocated
>>> memory after waiting for those.
>>>
>>> HEY! Wait a second! Did I just outline what a potential replacement to TTM
>>> would look like?
>> Yes, that is a good picture. Looks like we could do more work then. :)
>>
>> thanks
>> xinhui
>>
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index e795d5b5f8af..ac826a09b4ec 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -405,6 +405,9 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>>  	if (bo->type != ttm_bo_type_sg) {
>>  		spin_lock(&ttm_bo_glob.lru_lock);
>> +		/* it is very likely to release bo successfully.
>> +		 * if not, just adding it back.
>> +		 */
>>  		ttm_bo_del_from_lru(bo);
>>  		spin_unlock(&ttm_bo_glob.lru_lock);
>>  		bo->base.resv = &bo->base._resv;
>> @@ -466,18 +469,20 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>  		if (unlock_resv)
>>  			dma_resv_unlock(bo->base.resv);
>> -		spin_unlock(&ttm_bo_glob.lru_lock);
>>  		lret = dma_resv_wait_timeout_rcu(resv, true, interruptible,
>>  						 30 * HZ);
>>  		if (lret < 0)
>> -			return lret;
>> -		else if (lret == 0)
>> -			return -EBUSY;
>> +			goto busy;
>> +		else if (lret == 0) {
>> +			ret = -EBUSY;
>> +			goto busy;
>> +		}
>> -		spin_lock(&ttm_bo_glob.lru_lock);
>>  		if (unlock_resv && !dma_resv_trylock(bo->base.resv)) {
>> +			/* no race should be on it now */
>> +			BUG();
>>  			/*
>>  			 * We raced, and lost, someone else holds the reservation now,
>>  			 * and is probably busy in ttm_bo_cleanup_memtype_use.
>> @@ -486,20 +491,18 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>  			 * delayed destruction would succeed, so just return success
>>  			 * here.
>>  			 */
>> -			spin_unlock(&ttm_bo_glob.lru_lock);
>>  			return 0;
>>  		}
>>  		ret = 0;
>>  	}
>> -	if (ret || unlikely(list_empty(&bo->ddestroy))) {
>> +	if (ret) {
>>  		if (unlock_resv)
>>  			dma_resv_unlock(bo->base.resv);
>> -		spin_unlock(&ttm_bo_glob.lru_lock);
>> -		return ret;
>> +		goto busy;
>>  	}
>> -	ttm_bo_del_from_lru(bo);
>> +	spin_lock(&ttm_bo_glob.lru_lock);
>>  	list_del_init(&bo->ddestroy);
>>  	spin_unlock(&ttm_bo_glob.lru_lock);
>>  	ttm_bo_cleanup_memtype_use(bo);
>> @@ -510,11 +513,20 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>  	ttm_bo_put(bo);
>>  	return 0;
>> +
>> +busy:
>> +	spin_lock(&ttm_bo_glob.lru_lock);
>> +	ttm_bo_add_mem_to_lru(bo, &bo->mem);
>> +	spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>> +	return ret;
>>  }
>>  /**
>>   * Traverse the delayed list, and call ttm_bo_cleanup_refs on all
>>   * encountered buffers.
>> + *
>> + * only called bo device release
>>   */
>>  static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>  {
>> @@ -533,17 +545,17 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>  		list_move_tail(&bo->ddestroy, &removed);
>>  		if (!ttm_bo_get_unless_zero(bo))
>>  			continue;
>> +		ttm_bo_del_from_lru(bo);
>> +		spin_unlock(&glob->lru_lock);
>>  		if (remove_all || bo->base.resv != &bo->base._resv) {
>> -			spin_unlock(&glob->lru_lock);
>>  			dma_resv_lock(bo->base.resv, NULL);
>> -
>> -			spin_lock(&glob->lru_lock);
>>  			ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>> -
>>  		} else if (dma_resv_trylock(bo->base.resv)) {
>>  			ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>>  		} else {
>> +			spin_lock(&glob->lru_lock);
>> +			ttm_bo_add_mem_to_lru(bo, &bo->mem);
>>  			spin_unlock(&glob->lru_lock);
>>  		}
>> @@ -559,12 +571,8 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>  static void ttm_bo_delayed_workqueue(struct work_struct *work)
>>  {
>> -	struct ttm_bo_device *bdev =
>> -		container_of(work, struct ttm_bo_device, wq.work);
>> -
>> -	if (!ttm_bo_delayed_delete(bdev, false))
>> -		schedule_delayed_work(&bdev->wq,
>> -				      ((HZ / 100) < 1) ? 1 : HZ / 100);
>> +	WARN(true, "do not schedule it");
>> +	return;
>>  }
>>  static void ttm_bo_release(struct kref *kref)
>> @@ -595,6 +603,7 @@ static void ttm_bo_release(struct kref *kref)
>>  		ttm_mem_io_unlock(man);
>>  	}
>> +	/* if bo is busy, spend a little more time to add bo to lru and ddestroy list */
>>  	if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
>>  		/* The BO is not idle, resurrect it for delayed destroy */
>>  		ttm_bo_flush_all_fences(bo);
>> @@ -615,8 +624,6 @@ static void ttm_bo_release(struct kref *kref)
>>  		list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>>  		spin_unlock(&ttm_bo_glob.lru_lock);
>> -		schedule_delayed_work(&bdev->wq,
>> -				      ((HZ / 100) < 1) ? 1 : HZ / 100);
>>  		return;
>>  	}
>> @@ -842,6 +849,9 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>  		return ret;
>>  	}
>> +	ttm_bo_del_from_lru(bo);
>> +	spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>>  	if (bo->deleted) {
>>  		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
>>  					  ctx->no_wait_gpu, locked);
>> @@ -849,8 +859,6 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>  		return ret;
>>  	}
>> -	spin_unlock(&ttm_bo_glob.lru_lock);
>> -
>>  	ret = ttm_bo_evict(bo, ctx);
>>  	if (locked)
>>  		ttm_bo_unreserve(bo);
>> @@ -1809,14 +1817,15 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
>>  		return ret;
>>  	}
>> +	ttm_bo_del_from_lru(bo);
>> +	spin_unlock(&glob->lru_lock);
>> +
>>  	if (bo->deleted) {
>>  		ret = ttm_bo_cleanup_refs(bo, false, false, locked);
>>  		ttm_bo_put(bo);
>>  		return ret;
>>  	}
>> -	ttm_bo_del_from_lru(bo);
>> -	spin_unlock(&glob->lru_lock);
>>  	/**
>>  	 * Move to system cached
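
As mentioned at the top of this mail, here is a rough, untested sketch of the
"walk the destroy list first" idea. The helper name is made up, locking and
error handling are simplified, and it only reuses calls already visible in the
diffs above; it is an illustration, not a patch.

/*
 * Before scanning the LRU for a victim, try to reap a BO that is
 * already on the ddestroy list and can be reserved without blocking.
 */
static bool ttm_mem_reap_first_deleted(struct ttm_bo_device *bdev,
				       struct ttm_operation_ctx *ctx)
{
	struct ttm_buffer_object *bo;
	int ret;

	spin_lock(&ttm_bo_glob.lru_lock);
	list_for_each_entry(bo, &bdev->ddestroy, ddestroy) {
		/* only bother with BOs we can reserve without blocking */
		if (!dma_resv_trylock(bo->base.resv))
			continue;

		if (!ttm_bo_get_unless_zero(bo)) {
			dma_resv_unlock(bo->base.resv);
			continue;
		}

		/* ttm_bo_cleanup_refs() drops the lru_lock and, with the
		 * last argument true, the reservation as well */
		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
					  ctx->no_wait_gpu, true);
		ttm_bo_put(bo);
		return ret == 0;
	}
	spin_unlock(&ttm_bo_glob.lru_lock);

	return false;	/* nothing dying was reclaimable, fall back to the LRU */
}

ttm_mem_evict_first() could call something like this before walking the LRU,
so memory that is already scheduled for destruction gets reclaimed before any
still-used BO is evicted.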