ttm: add waiter list to prevent allocation not in order

Chunming Zhou Wed, 31 Jan 2018 02:31:39 -0800


On 2018年01月26日 22:35, Christian König wrote:

I just realized that a change I'm thinking about for a while wouldsolve your problem as well, but keep concurrent allocation possible.
See ttm_mem_evict_first() unlocks the BO after evicting it:
        ttm_bo_del_from_lru(bo);
        spin_unlock(&glob->lru_lock);

        ret = ttm_bo_evict(bo, ctx);
        if (locked) {
                ttm_bo_unreserve(bo); <-------- here
        } else {
                spin_lock(&glob->lru_lock);
                ttm_bo_add_to_lru(bo);
                spin_unlock(&glob->lru_lock);
        }

        kref_put(&bo->list_kref, ttm_bo_release_list);
The effect is that in your example process C can not only beat processB once, but many many times because we run into a ping/pong situationwhere B evicts resources while C moves them back in.

For ping/pong case, I want to disable busy placement for allocationperiod, only enable it for cs bo validation.

For a while now I'm thinking about dropping those reservations onlyafter the original allocation succeeded.
The effect would be that process C can still beat process B initially,but sooner or process B would evict some resources from process C aswell and then it can succeed with its allocation.

If it is from process C cs validation, process B still need evict theresource only after process C command submission completion.

The problem is for this approach to work we need to core change to theww_mutexes to be able to handle this efficiently.

Yes, ww_mutex doesn't support this net lock, which easily deadlockwithout ticket and class.


So I think preventing validation on same place is a simpler way:

process B bo's place is fpfn~lpfn, it will only try to evict LRU BOs inthat range, while eviction, we just prevent those validation to thisrange(fpfn~lpfn), if out of this range, the allocation/validation stillcan be go on.


Any negative?

Regards,
David Zhou

Regards,
Christian.

Am 26.01.2018 um 14:59 schrieb Christian König:
I know, but this has the same effect. You prevent concurrentallocation from happening.
What we could do is to pipeline reusing of deleted memory as well,this makes it less likely to cause the problem you are seeing becausethe evicting processes doesn't need to block for deleted BOs any more.
But that other processes can grab memory during eviction isintentional. Otherwise greedy processes would completely dominatecommand submission.
Regards,
Christian.

Am 26.01.2018 um 14:50 schrieb Zhou, David(ChunMing):
I don't want to prevent all, my new approach is to prevent the laterallocation is trying and ahead of front to get the memory space thatthe front made from eviction.
发自坚果 Pro
Christian K鰊ig <ckoenig.leichtzumer...@gmail.com> 于 2018年1月26日下午9:24写道：
Yes, exactly that's the problem.
See when you want to prevent a process B from allocating the memoryprocess A has evicted, you need to prevent all concurrent allocation.
And we don't do that because it causes a major performance drop.

Regards,
Christian.

Am 26.01.2018 um 14:21 schrieb Zhou, David(ChunMing):
You patch will prevent concurrent allocation, and will result inallocation performance drop much.
发自坚果 Pro
Christian K鰊ig <ckoenig.leichtzumer...@gmail.com> 于 2018年1月26日下午9:04写道：
Attached is what you actually want to do cleanly implemented. Butas I said this is a NO-GO.
Regards,
Christian.

Am 26.01.2018 um 13:43 schrieb Christian König:
After my investigation, this issue should be detect of TTM designself, which breaks scheduling balance.
Yeah, but again. This is indented design we can't change easily.

Regards,
Christian.

Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing):
I am off work, so reply mail by phone, the format could not be text.

back to topic itself:
the problem indeed happen on amdgpu driver, someone reports methat application runs with two instances, the performance aredifferent.I also reproduced the issue with unit test(bo_eviction_test).They always think our scheduler isn't working as expected.
After my investigation, this issue should be detect of TTM designself, which breaks scheduling balance.
Further, if we run containers for our gpu, container A could runhigh score, container B runs low score with same benchmark.
So this is bug that we need fix.

Regards,
David Zhou

发自坚果 Pro
Christian K鰊ig <ckoenig.leichtzumer...@gmail.com> 于 2018年1月26日下午6:31写道：
Am 26.01.2018 um 11:22 schrieb Chunming Zhou:
> there is a scheduling balance issue about get node like:
> a. process A allocates full memory and use it for submission.
> b. process B tries to allocates memory, will wait for process ABO idle in eviction.> c. process A completes the job, process B eviction will putprocess A BO node,> but in the meantime, process C is comming to allocate BO, whilldirectly get node successfully, and do submission,
> process B will again wait for process C BO idle.
> d. repeat the above setps, process B could be delayed much more.
>
> later allocation must not be ahead of front in same place.

Again NAK to the whole approach.

At least with amdgpu the problem you described above never occurs
because evictions are pipelined operations. We could only block for
deleted regions to become free.
But independent of that incoming memory requests while we makeroom for
eviction are intended to be served first.
Changing that is certainly a no-go cause that would favor memoryhungry
applications over small clients.

Regards,
Christian.

>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.z...@amd.com>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 69+++++++++++++++++++++++++++++++++++++++--
>   include/drm/ttm/ttm_bo_api.h    |  7 +++++
>   include/drm/ttm/ttm_bo_driver.h |  7 +++++
>   3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.cb/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(structttm_buffer_object *bo,
>        return 0;
>   }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> +                             struct ttm_buffer_object *bo,
> +                             const struct ttm_place *place)
> +{
> +     waiter->tbo = bo;
> + memcpy((void *)&waiter->place, (void *)place,sizeof(*place));
> + INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +                            struct ttm_bo_waiter *waiter)
> +{
> +     if (!waiter)
> +             return;
> +     spin_lock(&man->wait_lock);
> +     list_add_tail(&waiter->list, &man->waiter_list);
> + spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +                            struct ttm_bo_waiter *waiter)
> +{
> +     if (!waiter)
> +             return;
> +     spin_lock(&man->wait_lock);
> +     if (!list_empty(&waiter->list))
> + list_del(&waiter->list);
> + spin_unlock(&man->wait_lock);
> +     kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +                  struct ttm_buffer_object *bo,
> +                  const struct ttm_place *place)
> +{
> +     struct ttm_bo_waiter *waiter, *tmp;
> +
> +     spin_lock(&man->wait_lock);
> + list_for_each_entry_safe(waiter, tmp, &man->waiter_list,list) {
> +             if ((bo != waiter->tbo) &&
> +                 ((place->fpfn >= waiter->place.fpfn &&
> +                   place->fpfn <= waiter->place.lpfn) ||
> + (place->lpfn <= waiter->place.lpfn &&place->lpfn >=
> + waiter->place.fpfn)))
> +                 goto later_bo;
> +     }
> + spin_unlock(&man->wait_lock);
> +     return true;
> +later_bo:
> + spin_unlock(&man->wait_lock);
> +     return false;
> +}
>   /**
> * Repeatedly evict memory from the LRU for @mem_type untilwe create enough> * space, or we've evicted everything and there isn't enoughspace.> @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(structttm_buffer_object *bo,
>   {
>        struct ttm_bo_device *bdev = bo->bdev;
>        struct ttm_mem_type_manager *man = &bdev->man[mem_type];
> +     struct ttm_bo_waiter waiter;
>        int ret;
>
> +     ttm_man_init_waiter(&waiter, bo, place);
> +     ttm_man_add_waiter(man, &waiter);
>        do {
>                ret = (*man->func->get_node)(man, bo, place, mem);
> -             if (unlikely(ret != 0))
> +             if (unlikely(ret != 0)) {
> + ttm_man_del_waiter(man, &waiter);
>                        return ret;
> -             if (mem->mm_node)
> +             }
> +             if (mem->mm_node) {
> + ttm_man_del_waiter(man, &waiter);
>                        break;
> +             }
> ret = ttm_mem_evict_first(bdev, mem_type, place,ctx);
> -             if (unlikely(ret != 0))
> +             if (unlikely(ret != 0)) {
> + ttm_man_del_waiter(man, &waiter);
>                        return ret;
> +             }
>        } while (1);
>        mem->mem_type = mem_type;
>        return ttm_bo_add_move_fence(bo, man, mem);
> @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device*bdev, unsigned type,
>        man->use_io_reserve_lru = false;
> mutex_init(&man->io_reserve_mutex);
> spin_lock_init(&man->move_lock);
> + spin_lock_init(&man->wait_lock);
> + INIT_LIST_HEAD(&man->waiter_list);
> INIT_LIST_HEAD(&man->io_reserve_lru);
>
>        ret = bdev->driver->init_mem_type(bdev, type, man);
> diff --git a/include/drm/ttm/ttm_bo_api.hb/include/drm/ttm/ttm_bo_api.h
> index 2cd025c2abe7..0fce4dbd02e7 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -40,6 +40,7 @@
>   #include <linux/mm.h>
>   #include <linux/bitmap.h>
>   #include <linux/reservation.h>
> +#include <drm/ttm/ttm_placement.h>
>
>   struct ttm_bo_device;
>
> @@ -232,6 +233,12 @@ struct ttm_buffer_object {
>        struct mutex wu_mutex;
>   };
>
> +struct ttm_bo_waiter {
> +     struct ttm_buffer_object *tbo;
> +     struct ttm_place place;
> +     struct list_head list;
> +};
> +
>   /**
>    * struct ttm_bo_kmap_obj
>    *
> diff --git a/include/drm/ttm/ttm_bo_driver.hb/include/drm/ttm/ttm_bo_driver.h
> index 9b417eb2df20..dc6b8b4c9e06 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -293,6 +293,10 @@ struct ttm_mem_type_manager {
>        bool io_reserve_fastpath;
>        spinlock_t move_lock;
>
> +     /* waiters in list */
> +     spinlock_t wait_lock;
> +     struct list_head waiter_list;
> +
>        /*
>         * Protected by @io_reserve_mutex:
>         */
> @@ -748,6 +752,9 @@ int ttm_bo_mem_space(structttm_buffer_object *bo,
>                     struct ttm_mem_reg *mem,
>                     struct ttm_operation_ctx *ctx);
>
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +                  struct ttm_buffer_object *bo,
> +                  const struct ttm_place *place);
> void ttm_bo_mem_put(struct ttm_buffer_object *bo, structttm_mem_reg *mem);
>   void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo,
>                           struct ttm_mem_reg *mem);



_______________________________________________
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order

Reply via email to