Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6
On 08.05.19 at 10:34, Thomas Hellstrom wrote:
> [SNIP]
>>>> No, what I mean is to add the acquire_ctx as a separate parameter to
>>>> ttm_mem_evict_first().
>>>>
>>>> E.g. we only need it in this function and it is actually not related
>>>> to the ttm operation context filled in by the driver.
>>>
>>> FWIW, I think it would be nice at some point to have a reservation
>>> context being part of the ttm operation context, so that validate and
>>> evict could do sleeping reservations, and have bos remain on the lru
>>> even when reserved...
>> Yeah, well that's exactly what the ctx->resv parameter is good for :)
>
> Hmm. I don't quite follow? It looks to me like ctx->resv is there to
> work around recursive reservations?

Well yes and no, this is to allow eviction of BOs which share the same
reservation object.

>
> What I'm after is being able to do sleeping reservations within validate
> and evict and open up for returning -EDEADLK. One benefit would be to
> scan over the LRU lists, reserving exactly those bos we want to evict,
> and when all are reserved, we evict them. If we hit an -EDEADLK while
> evicting we need to restart. Then we need an acquire_ctx in the
> ttm_operation_ctx.

The acquire_ctx is available from the BO you try to find space for.

But we already tried this approach and it doesn't work. We have a lot of
BOs which now share the same reservation object and so would cause an
-EDEADLK.

>> And yes, we do keep the BOs on the LRU even when they are reserved.
>
> static inline int ttm_bo_reserve(struct ttm_buffer_object *bo,
>                                  bool interruptible, bool no_wait,
>                                  struct ww_acquire_ctx *ticket)

ttm_bo_reserve() is no longer always used outside of TTM. The DMA-buf code
as well as the amdgpu VM code now lock the reservation object without
calling ttm_bo_reserve().

Regards,
Christian.

>
> /Thomas
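To make the ctx->resv point above concrete, here is a minimal userspace sketch of the check that the quoted patch performs in ttm_bo_evict_swapout_allowable(). It is not kernel code: every toy_* name is hypothetical, the reservation lock is reduced to a boolean, and the shared-reservation branch only models the TTM_OPT_FLAG_ALLOW_RES_EVICT flag (the rest of that condition is not visible in the quoted hunk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a reservation object: a lock that is held or free. */
struct toy_resv {
	bool held;
};

/* Toy stand-in for a buffer object; several BOs may share one toy_resv. */
struct toy_bo {
	struct toy_resv *resv;
};

/* Toy stand-in for the operation context passed down by the caller. */
struct toy_ctx {
	struct toy_resv *resv;	/* reservation object the caller already holds */
	bool allow_res_evict;	/* models TTM_OPT_FLAG_ALLOW_RES_EVICT */
};

/* Non-blocking lock attempt, like a trylock on the reservation object. */
static bool toy_trylock(struct toy_resv *resv)
{
	if (resv->held)
		return false;
	resv->held = true;
	return true;
}

/*
 * Returns true if the BO may be evicted. *locked tells the caller whether
 * it took bo->resv here and must unlock it later. *busy (optional) reports
 * that the trylock failed because somebody else holds the lock.
 */
static bool toy_evict_allowable(struct toy_bo *bo, const struct toy_ctx *ctx,
				bool *locked, bool *busy)
{
	bool ret = false;

	*locked = false;
	if (busy)
		*busy = false;

	if (bo->resv == ctx->resv) {
		/* Shared reservation object: the caller must already hold it. */
		assert(bo->resv->held);
		ret = ctx->allow_res_evict;
	} else {
		*locked = toy_trylock(bo->resv);
		ret = *locked;
		if (!ret && busy)
			*busy = true;
	}

	return ret;
}
```

In this model a BO that shares the caller's reservation object is evictable without taking any lock, while a BO with a foreign, contended lock is reported as busy instead of being waited on.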
Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6
On 2019-05-07 19:13, Koenig, Christian wrote:
> On 07.05.19 at 13:08, zhoucm1 wrote:
>> On 2019-05-07 18:53, Koenig, Christian wrote:
>>> On 07.05.19 at 11:36, Chunming Zhou wrote:
>>>> heavy gpu job could occupy memory long time, which lead other user fail to
>>>> get memory.
>>>>
>>>> basically pick up Christian idea:
>>>>
>>>> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>>>> 2. If we then run into this EBUSY condition in TTM check if the BO we need
>>>>    memory for (or rather the ww_mutex of its reservation object) has a
>>>>    ticket assigned.
>>>> 3. If we have a ticket we grab a reference to the first BO on the LRU, drop
>>>>    the LRU lock and try to grab the reservation lock with the ticket.
>>>> 4. If getting the reservation lock with the ticket succeeded we check if
>>>>    the BO is still the first one on the LRU in question (the BO could have
>>>>    moved).
>>>> 5. If the BO is still the first one on the LRU in question we try to evict
>>>>    it as we would evict any other BO.
>>>> 6. If any of the "If's" above fail we just back off and return -EBUSY.
>>>>
>>>> v2: fix some minor check
>>>> v3: address Christian v2 comments.
>>>> v4: fix some missing
>>>> v5: handle first_bo unlock and bo_get/put
>>>> v6: abstract unified iterate function, and handle all possible usecase not
>>>>     only pinned bo.
>>>>
>>>> Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
>>>> Signed-off-by: Chunming Zhou
>>>> ---
>>>>  drivers/gpu/drm/ttm/ttm_bo.c | 113 ++-
>>>>  1 file changed, 97 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 8502b3ed2d88..bbf1d14d00a7 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
>>>>   * b. Otherwise, trylock it.
>>>>   */
>>>>  static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>>>> -			struct ttm_operation_ctx *ctx, bool *locked)
>>>> +			struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
>>>>  {
>>>>  	bool ret = false;
>>>>
>>>>  	*locked = false;
>>>> +	if (busy)
>>>> +		*busy = false;
>>>>  	if (bo->resv == ctx->resv) {
>>>>  		reservation_object_assert_held(bo->resv);
>>>>  		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
>>>> @@ -779,35 +781,45 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>>>>  	} else {
>>>>  		*locked = reservation_object_trylock(bo->resv);
>>>>  		ret = *locked;
>>>> +		if (!ret && busy)
>>>> +			*busy = true;
>>>>  	}
>>>>
>>>>  	return ret;
>>>>  }
>>>>
>>>> -static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>>> -			       uint32_t mem_type,
>>>> -			       const struct ttm_place *place,
>>>> -			       struct ttm_operation_ctx *ctx)
>>>> +static struct ttm_buffer_object*
>>>> +ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
>>>> +			 struct ttm_mem_type_manager *man,
>>>> +			 const struct ttm_place *place,
>>>> +			 struct ttm_operation_ctx *ctx,
>>>> +			 struct ttm_buffer_object **first_bo,
>>>> +			 bool *locked)
>>>>  {
>>>> -	struct ttm_bo_global *glob = bdev->glob;
>>>> -	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>>>>  	struct ttm_buffer_object *bo = NULL;
>>>> -	bool locked = false;
>>>> -	unsigned i;
>>>> -	int ret;
>>>> +	int i;
>>>>
>>>> -	spin_lock(&glob->lru_lock);
>>>> +	if (first_bo)
>>>> +		*first_bo = NULL;
>>>>  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>>>>  		list_for_each_entry(bo, &man->lru[i], lru) {
>>>> -			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
>>>> +			bool busy = false;
>>>> +			if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
>>>> +							    &busy)) {
>>> A newline between declaration and code please.
>>>
>>>> +				if (first_bo && !(*first_bo) && busy) {
>>>> +					ttm_bo_get(bo);
>>>> +					*first_bo = bo;
>>>> +				}
>>>>  				continue;
>>>> +			}
>>>>
>>>>  			if (place && !bdev->driver->eviction_valuable(bo,
>>>>  								      place)) {
>>>> -				if (locked)
>>>> +				if (*locked)
>>>>  					reservation_object_unlock(bo->resv);
>>>>  				continue;
>>>>  			}
>>>> +
>>>>  			break;
>>>>  		}
>>>>
>>>> @@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>>>  		bo = NULL;
>>>>  	}
>>>>
>>>> +	return bo;
>>>> +}
>>>> +
>>>> +static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>>> +			       uint32_t mem_type,
>>>> +			       const struct ttm_place *place,
>>>> +			       struct ttm_operation_ctx *ctx)
>>>> +{
>>>> +	struct ttm_bo_global *glob = bdev->glob;
>>>> +	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>>>> +	struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
>>>> +	bool locked = false;
>>>> +	int ret;
>>>> +
>>>> +	spin_lock(&glob->lru_lock);
>>>> +	bo = ttm_mem_find_evitable_bo(bdev, man, place, ctx, &first_bo,
>>>> +				      &locked);
>>>>  	if (!bo) {
>>>> +		struct ttm_operation_ctx busy_ctx;
>>>> +
>>>>  		spin_unlock(&glob->lru_lock);
>>>> -		return -EBUS
Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6
On 07.05.19 at 11:36, Chunming Zhou wrote:
> A heavy GPU job can occupy memory for a long time, which causes other
> users to fail to get memory.
>
> Basically picks up Christian's idea:
>
> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
> 2. If we then run into this EBUSY condition in TTM check if the BO we need
>    memory for (or rather the ww_mutex of its reservation object) has a
>    ticket assigned.
> 3. If we have a ticket we grab a reference to the first BO on the LRU, drop
>    the LRU lock and try to grab the reservation lock with the ticket.
> 4. If getting the reservation lock with the ticket succeeded we check if the
>    BO is still the first one on the LRU in question (the BO could have
>    moved).
> 5. If the BO is still the first one on the LRU in question we try to evict
>    it as we would evict any other BO.
> 6. If any of the "If's" above fail we just back off and return -EBUSY.
>
> v2: fix some minor check
> v3: address Christian's v2 comments.
> v4: fix some missing
> v5: handle first_bo unlock and bo_get/put
> v6: abstract a unified iterate function and handle all possible use cases,
>     not only pinned BOs.
>
> Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
> Signed-off-by: Chunming Zhou
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 113 ++-
>  1 file changed, 97 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 8502b3ed2d88..bbf1d14d00a7 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
>    * b. Otherwise, trylock it.
>    */
>  static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
> -			struct ttm_operation_ctx *ctx, bool *locked)
> +			struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
>  {
>  	bool ret = false;
>
>  	*locked = false;
> +	if (busy)
> +		*busy = false;
>  	if (bo->resv == ctx->resv) {
>  		reservation_object_assert_held(bo->resv);
>  		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
> @@ -779,35 +781,45 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>  	} else {
>  		*locked = reservation_object_trylock(bo->resv);
>  		ret = *locked;
> +		if (!ret && busy)
> +			*busy = true;
>  	}
>
>  	return ret;
>  }
>
> -static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
> -			       uint32_t mem_type,
> -			       const struct ttm_place *place,
> -			       struct ttm_operation_ctx *ctx)
> +static struct ttm_buffer_object*
> +ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
> +			 struct ttm_mem_type_manager *man,
> +			 const struct ttm_place *place,
> +			 struct ttm_operation_ctx *ctx,
> +			 struct ttm_buffer_object **first_bo,
> +			 bool *locked)
>  {
> -	struct ttm_bo_global *glob = bdev->glob;
> -	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
>  	struct ttm_buffer_object *bo = NULL;
> -	bool locked = false;
> -	unsigned i;
> -	int ret;
> +	int i;
>
> -	spin_lock(&glob->lru_lock);
> +	if (first_bo)
> +		*first_bo = NULL;
>  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>  		list_for_each_entry(bo, &man->lru[i], lru) {
> -			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
> +			bool busy = false;
> +			if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
> +							    &busy)) {

A newline between declaration and code please.

> +				if (first_bo && !(*first_bo) && busy) {
> +					ttm_bo_get(bo);
> +					*first_bo = bo;
> +				}
>  				continue;
> +			}
>
>  			if (place && !bdev->driver->eviction_valuable(bo,
>  								      place)) {
> -				if (locked)
> +				if (*locked)
>  					reservation_object_unlock(bo->resv);
>  				continue;
>  			}
> +
>  			break;
>  		}
>
> @@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>  			bo = NULL;
>  	}
>
> +	return bo;
> +}
> +
> +static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
> +			       uint32_t mem_type,
> +			       const struct ttm_place *place,
> +			       struct ttm_operation_ctx *ctx)
> +{
> +	struct t
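For readers following the review, the LRU scan in the quoted hunk can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: `toy_bo`, `toy_trylock()`, `toy_unlock()` and `toy_find_evictable()` are hypothetical names, a plain flag stands in for the reservation object, and the ctx->resv special case is left out. It is not the TTM code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the TTM struct. */
struct toy_bo {
	bool resv_locked;	/* models bo->resv being held by someone */
	bool valuable;		/* models the driver's eviction_valuable() answer */
	int refcount;		/* models ttm_bo_get()/ttm_bo_put() */
};

/* Models a trylock on the reservation: succeeds only if nobody holds it. */
static bool toy_trylock(struct toy_bo *bo)
{
	if (bo->resv_locked)
		return false;
	bo->resv_locked = true;
	return true;
}

static void toy_unlock(struct toy_bo *bo)
{
	bo->resv_locked = false;
}

/*
 * Walk the LRU front to back. Returns the first BO that could be locked and
 * is worth evicting (still locked on return), or NULL if there is none.
 * If first_busy is non-NULL it receives a referenced pointer to the first BO
 * whose trylock failed, similar to the first_bo out-parameter in the patch.
 */
static struct toy_bo *toy_find_evictable(struct toy_bo *lru, size_t n,
					 struct toy_bo **first_busy)
{
	size_t i;

	if (first_busy)
		*first_busy = NULL;

	for (i = 0; i < n; i++) {
		struct toy_bo *bo = &lru[i];

		if (!toy_trylock(bo)) {
			/* Remember only the first busy BO, and pin it. */
			if (first_busy && !*first_busy) {
				bo->refcount++;
				*first_busy = bo;
			}
			continue;
		}

		if (!bo->valuable) {
			/* We took the lock ourselves, so drop it again. */
			toy_unlock(bo);
			continue;
		}

		return bo;
	}

	return NULL;
}
```

The caller owns both results: the returned BO stays locked, and the busy BO carries an extra reference that has to be dropped once the caller is done with it.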
[PATCH 1/2] drm/ttm: fix busy memory to fail other user v6
A heavy GPU job can occupy memory for a long time, which causes other
users to fail to get memory.

Basically picks up Christian's idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO we need
   memory for (or rather the ww_mutex of its reservation object) has a
   ticket assigned.
3. If we have a ticket we grab a reference to the first BO on the LRU, drop
   the LRU lock and try to grab the reservation lock with the ticket.
4. If getting the reservation lock with the ticket succeeded we check if the
   BO is still the first one on the LRU in question (the BO could have
   moved).
5. If the BO is still the first one on the LRU in question we try to evict
   it as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return -EBUSY.

v2: fix some minor check
v3: address Christian's v2 comments.
v4: fix some missing
v5: handle first_bo unlock and bo_get/put
v6: abstract a unified iterate function and handle all possible use cases,
    not only pinned BOs.

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou
---
 drivers/gpu/drm/ttm/ttm_bo.c | 113 ++-
 1 file changed, 97 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..bbf1d14d00a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
   * b. Otherwise, trylock it.
   */
 static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-			struct ttm_operation_ctx *ctx, bool *locked)
+			struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
 {
 	bool ret = false;

 	*locked = false;
+	if (busy)
+		*busy = false;
 	if (bo->resv == ctx->resv) {
 		reservation_object_assert_held(bo->resv);
 		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,45 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 	} else {
 		*locked = reservation_object_trylock(bo->resv);
 		ret = *locked;
+		if (!ret && busy)
+			*busy = true;
 	}

 	return ret;
 }

-static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-			       uint32_t mem_type,
-			       const struct ttm_place *place,
-			       struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+			 struct ttm_mem_type_manager *man,
+			 const struct ttm_place *place,
+			 struct ttm_operation_ctx *ctx,
+			 struct ttm_buffer_object **first_bo,
+			 bool *locked)
 {
-	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
 	struct ttm_buffer_object *bo = NULL;
-	bool locked = false;
-	unsigned i;
-	int ret;
+	int i;

-	spin_lock(&glob->lru_lock);
+	if (first_bo)
+		*first_bo = NULL;
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		list_for_each_entry(bo, &man->lru[i], lru) {
-			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
+			bool busy = false;
+			if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+							    &busy)) {
+				if (first_bo && !(*first_bo) && busy) {
+					ttm_bo_get(bo);
+					*first_bo = bo;
+				}
 				continue;
+			}

 			if (place && !bdev->driver->eviction_valuable(bo,
 								      place)) {
-				if (locked)
+				if (*locked)
 					reservation_object_unlock(bo->resv);
 				continue;
 			}
+
 			break;
 		}

@@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 			bo = NULL;
 	}

+	return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+			       uint32_t mem_type,
+			       const struct ttm_place *place,
+			       struct ttm_operation_ctx *ctx)
+{
+	struct ttm_bo_global *glob = bdev->glob;
+	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+	struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
+	bool locked = false;
+	int ret;
+
+	spin_lo
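Steps 2 to 6 of the commit message could be sketched in userspace along the following lines. Everything here is an assumption made for illustration: `toy_bo`, `toy_lru`, `toy_lock_fn` and `toy_evict_busy_first()` are hypothetical names, and the sleeping ticketed lock is passed in as a callback so that a single-threaded example can simulate what another context does while the caller sleeps. The real patch uses the reservation object's ww_mutex and the TTM LRU lock instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the TTM structures. */
struct toy_bo {
	bool locked;	/* reservation held by us */
	bool evicted;
	int refcount;
};

struct toy_lru {
	struct toy_bo *first;	/* head of the LRU; may change while we sleep */
};

/* Models a sleeping lock taken with a ticket: returns 0 or -EDEADLK. */
typedef int (*toy_lock_fn)(struct toy_lru *lru, struct toy_bo *bo);

static int toy_evict_busy_first(struct toy_lru *lru, bool have_ticket,
				toy_lock_fn lock_slow)
{
	struct toy_bo *bo = lru->first;
	int ret = -EBUSY;

	/* Step 2: without a ticket we must not sleep on the lock. */
	if (!bo || !have_ticket)
		return -EBUSY;

	/* Step 3: pin the first BO, then (in the kernel) drop the LRU lock. */
	bo->refcount++;
	if (lock_slow(lru, bo) != 0)
		goto out_put;		/* step 6: back off */

	/* Step 4: the BO could have moved while we were sleeping. */
	if (lru->first != bo)
		goto out_unlock;	/* step 6: back off */

	/* Step 5: evict it like any other BO. */
	bo->evicted = true;
	lru->first = NULL;
	ret = 0;

out_unlock:
	bo->locked = false;
out_put:
	bo->refcount--;
	return ret;
}

/* Example callbacks simulating the three outcomes of the sleeping lock. */
static int lock_ok(struct toy_lru *lru, struct toy_bo *bo)
{
	(void)lru;
	bo->locked = true;
	return 0;
}

static int lock_deadlock(struct toy_lru *lru, struct toy_bo *bo)
{
	(void)lru;
	(void)bo;
	return -EDEADLK;
}

static struct toy_bo toy_other;

static int lock_but_moved(struct toy_lru *lru, struct toy_bo *bo)
{
	bo->locked = true;
	lru->first = &toy_other;	/* someone reordered the LRU meanwhile */
	return 0;
}
```

Every failure path collapses to -EBUSY, as step 6 asks, and each path drops exactly the lock and reference it took.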