Re: [Linaro-mm-sig] [PATCH] dma-buf: avoid scheduling on fence status query v2
On Wed, May 24, 2017 at 09:47:49AM +1000, Dave Airlie wrote:
> On 28 April 2017 at 07:27, Gustavo Padovan wrote:
> > 2017-04-26 Christian König:
> >
> >> On 26.04.2017 at 16:46, Andres Rodriguez wrote:
> >> > When a timeout of zero is specified, the caller is only interested in
> >> > the fence status.
> >> >
> >> > In the current implementation, dma_fence_default_wait will always call
> >> > schedule_timeout() at least once for an unsignaled fence. This adds a
> >> > significant overhead to a fence status query.
> >> >
> >> > Avoid this overhead by returning early if a zero timeout is specified.
> >> >
> >> > v2: move early return after enable_signaling
> >> >
> >> > Signed-off-by: Andres Rodriguez
> >>
> >> Reviewed-by: Christian König
> >
> > pushed to drm-misc-next. Thanks all.
>
> I don't see this patch in -rc2, where did it end up going?

Queued for 4.13. Makes imo sense since it's just a performance
improvement, not a clear bugfix. But it's in your drm-next, so if you want
to fast-track you can cherry-pick it over:

commit 03c0c5f6641533f5fc14bf4e76d2304197402552
Author: Andres Rodriguez
Date:   Wed Apr 26 10:46:20 2017 -0400

    dma-buf: avoid scheduling on fence status query v2

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Linaro-mm-sig] [PATCH] dma-buf: avoid scheduling on fence status query v2
On 28 April 2017 at 07:27, Gustavo Padovan wrote:
> 2017-04-26 Christian König:
>
>> On 26.04.2017 at 16:46, Andres Rodriguez wrote:
>> > When a timeout of zero is specified, the caller is only interested in
>> > the fence status.
>> >
>> > In the current implementation, dma_fence_default_wait will always call
>> > schedule_timeout() at least once for an unsignaled fence. This adds a
>> > significant overhead to a fence status query.
>> >
>> > Avoid this overhead by returning early if a zero timeout is specified.
>> >
>> > v2: move early return after enable_signaling
>> >
>> > Signed-off-by: Andres Rodriguez
>>
>> Reviewed-by: Christian König
>
> pushed to drm-misc-next. Thanks all.

I don't see this patch in -rc2, where did it end up going?

Dave.
Re: [PATCH] dma-buf: avoid scheduling on fence status query v2
2017-04-26 Christian König:

> On 26.04.2017 at 16:46, Andres Rodriguez wrote:
> > When a timeout of zero is specified, the caller is only interested in
> > the fence status.
> >
> > In the current implementation, dma_fence_default_wait will always call
> > schedule_timeout() at least once for an unsignaled fence. This adds a
> > significant overhead to a fence status query.
> >
> > Avoid this overhead by returning early if a zero timeout is specified.
> >
> > v2: move early return after enable_signaling
> >
> > Signed-off-by: Andres Rodriguez
>
> Reviewed-by: Christian König

pushed to drm-misc-next. Thanks all.

Gustavo
Re: [PATCH] dma-buf: avoid scheduling on fence status query v2
On 26.04.2017 at 16:46, Andres Rodriguez wrote:
> When a timeout of zero is specified, the caller is only interested in
> the fence status.
>
> In the current implementation, dma_fence_default_wait will always call
> schedule_timeout() at least once for an unsignaled fence. This adds a
> significant overhead to a fence status query.
>
> Avoid this overhead by returning early if a zero timeout is specified.
>
> v2: move early return after enable_signaling
>
> Signed-off-by: Andres Rodriguez

Reviewed-by: Christian König

> ---
>
> If I'm understanding correctly, I don't think we need to register the
> default wait callback. But if that isn't the case please let me know.
>
> This patch has the same perf improvements as v1.
>
>  drivers/dma-buf/dma-fence.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0918d3f..57da14c 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -402,6 +402,11 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
>  		}
>  	}
>
> +	if (!timeout) {
> +		ret = 0;
> +		goto out;
> +	}
> +
>  	cb.base.func = dma_fence_default_wait_cb;
>  	cb.task = current;
>  	list_add(&cb.base.node, &fence->cb_list);
[PATCH] dma-buf: avoid scheduling on fence status query v2
When a timeout of zero is specified, the caller is only interested in
the fence status.

In the current implementation, dma_fence_default_wait will always call
schedule_timeout() at least once for an unsignaled fence. This adds a
significant overhead to a fence status query.

Avoid this overhead by returning early if a zero timeout is specified.

v2: move early return after enable_signaling

Signed-off-by: Andres Rodriguez
---

If I'm understanding correctly, I don't think we need to register the
default wait callback. But if that isn't the case please let me know.

This patch has the same perf improvements as v1.

 drivers/dma-buf/dma-fence.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0918d3f..57da14c 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -402,6 +402,11 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 		}
 	}
 
+	if (!timeout) {
+		ret = 0;
+		goto out;
+	}
+
 	cb.base.func = dma_fence_default_wait_cb;
 	cb.task = current;
 	list_add(&cb.base.node, &fence->cb_list);
--
2.9.3
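The ordering that v2 settles on can be modeled outside the kernel. What
follows is a minimal userspace sketch, not the kernel implementation:
struct toy_fence, toy_schedule_timeout() and toy_default_wait() are
hypothetical names invented for illustration, and locking, signal handling
and the wait callback are left out. It only aims to show that a zero
timeout still enables signaling but never sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct dma_fence; not the kernel type. */
struct toy_fence {
	bool signaled;           /* models DMA_FENCE_FLAG_SIGNALED_BIT */
	bool signaling_enabled;  /* models DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT */
	int sleeps;              /* counts simulated schedule_timeout() calls */
	int ticks_until_signal;  /* sleeps needed before the fence signals */
};

/* Models one schedule_timeout(): records the sleep and lets one tick pass. */
static long toy_schedule_timeout(struct toy_fence *f, long timeout)
{
	f->sleeps++;
	if (f->ticks_until_signal > 0 && --f->ticks_until_signal == 0)
		f->signaled = true;
	return timeout - 1;
}

/*
 * Follows the v2 ordering: fast path for an already signaled fence,
 * then enable signaling, then the zero-timeout exit, then the sleep loop.
 */
static long toy_default_wait(struct toy_fence *f, long timeout)
{
	long ret = timeout ? timeout : 1;

	if (f->signaled)
		return ret;

	f->signaling_enabled = true;

	/* Status query only: report "not signaled" without sleeping. */
	if (!timeout)
		return 0;

	while (!f->signaled && ret > 0)
		ret = toy_schedule_timeout(f, ret);

	return ret;
}
```

For a fence that needs two sleeps to signal, toy_default_wait(&f, 0)
returns 0 with f.sleeps still at 0 and f.signaling_enabled set, while a
following toy_default_wait(&f, 5) sleeps twice and returns the remaining
3 ticks.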
Re: [PATCH] dma-buf: avoid scheduling on fence status query
On 2017-04-26 06:13 AM, Christian König wrote:
> On 26.04.2017 at 11:59, Dave Airlie wrote:
>> On 26 April 2017 at 17:20, Christian König wrote:
>>> NAK, I'm wondering how often I have to reject that change. We should
>>> probably add a comment here.
>>>
>>> Even with a zero timeout we still need to enable signaling, otherwise some
>>> fence will never signal if userspace just polls on them.
>>>
>>> If a caller is only interested in the fence status without enabling the
>>> signaling it should call dma_fence_is_signaled() instead.
>>
>> Can we not move the return 0 (with spin unlock) down after we enabling
>> signalling, but before we enter the schedule_timeout(1)?
>
> Yes, that would be an option.

I was actually arguing with Dave about this on IRC yesterday. Seems like I
owe him a beer now.

-Andres

> Christian.
>
>> Dave.
Re: [PATCH] dma-buf: avoid scheduling on fence status query
On 26.04.2017 at 11:59, Dave Airlie wrote:
> On 26 April 2017 at 17:20, Christian König wrote:
>> NAK, I'm wondering how often I have to reject that change. We should
>> probably add a comment here.
>>
>> Even with a zero timeout we still need to enable signaling, otherwise some
>> fence will never signal if userspace just polls on them.
>>
>> If a caller is only interested in the fence status without enabling the
>> signaling it should call dma_fence_is_signaled() instead.
>
> Can we not move the return 0 (with spin unlock) down after we enabling
> signalling, but before we enter the schedule_timeout(1)?

Yes, that would be an option.

Christian.

> Dave.
Re: [PATCH] dma-buf: avoid scheduling on fence status query
On 26 April 2017 at 17:20, Christian König wrote:
> NAK, I'm wondering how often I have to reject that change. We should
> probably add a comment here.
>
> Even with a zero timeout we still need to enable signaling, otherwise some
> fence will never signal if userspace just polls on them.
>
> If a caller is only interested in the fence status without enabling the
> signaling it should call dma_fence_is_signaled() instead.

Can we not move the return 0 (with spin unlock) down after we enabling
signalling, but before we enter the schedule_timeout(1)?

Dave.
Re: [PATCH] dma-buf: avoid scheduling on fence status query
NAK, I'm wondering how often I have to reject that change. We should
probably add a comment here.

Even with a zero timeout we still need to enable signaling, otherwise some
fence will never signal if userspace just polls on them.

If a caller is only interested in the fence status without enabling the
signaling it should call dma_fence_is_signaled() instead.

Regards,
Christian.

On 26.04.2017 at 04:50, Andres Rodriguez wrote:
> CC a few extra lists I missed.
>
> Regards,
> Andres
>
> On 2017-04-25 09:36 PM, Andres Rodriguez wrote:
>> When a timeout of zero is specified, the caller is only interested in
>> the fence status.
>>
>> In the current implementation, dma_fence_default_wait will always call
>> schedule_timeout() at least once for an unsignaled fence. This adds a
>> significant overhead to a fence status query.
>>
>> Avoid this overhead by returning early if a zero timeout is specified.
>>
>> Signed-off-by: Andres Rodriguez
>> ---
>>
>> This heavily affects the performance of the Source2 engine running on
>> radv.
>>
>> This patch improves dota2(radv) perf on a i7-6700k+RX480 system from
>> 72fps->81fps.
>>
>>  drivers/dma-buf/dma-fence.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>> index 0918d3f..348e9e2 100644
>> --- a/drivers/dma-buf/dma-fence.c
>> +++ b/drivers/dma-buf/dma-fence.c
>> @@ -380,6 +380,9 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
>>  	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>  		return ret;
>>
>> +	if (!timeout)
>> +		return 0;
>> +
>>  	spin_lock_irqsave(fence->lock, flags);
>>
>>  	if (intr && signal_pending(current)) {
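Why the position of the early return matters can be illustrated with a toy
model, assuming a driver whose fence only signals once its interrupt has
been enabled. All names here (struct irq_fence, hw_complete(),
enable_signaling(), poll_v1(), poll_v2()) are hypothetical and none of this
is kernel code; it is only a sketch of the argument made in the NAK.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the fence only signals if its interrupt was enabled. */
struct irq_fence {
	bool signaled;     /* what a status query can observe */
	bool irq_enabled;  /* set once signaling has been enabled */
	bool hw_done;      /* the hardware has finished the work */
};

/* Models the hardware finishing: the signal is only delivered via the irq. */
static void hw_complete(struct irq_fence *f)
{
	f->hw_done = true;
	if (f->irq_enabled)
		f->signaled = true;
}

/* Models an enable_signaling hook: false means the work already finished. */
static bool enable_signaling(struct irq_fence *f)
{
	f->irq_enabled = true;
	return !f->hw_done;
}

/* v1-style zero-timeout query: returns before signaling is enabled. */
static long poll_v1(struct irq_fence *f)
{
	if (f->signaled)
		return 1;
	return 0;
}

/* v2-style zero-timeout query: enables signaling, then returns unslept. */
static long poll_v2(struct irq_fence *f)
{
	if (f->signaled)
		return 1;
	if (!f->irq_enabled && !enable_signaling(f)) {
		f->signaled = true;  /* work already done: signal right away */
		return 1;
	}
	return 0;
}
```

In this model a caller that only ever uses poll_v1() keeps getting 0 even
after hw_complete(), because nothing enabled the interrupt. A caller using
poll_v2() gets 0 on the first query and 1 once the hardware has completed.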
Re: [PATCH] dma-buf: avoid scheduling on fence status query
CC a few extra lists I missed.

Regards,
Andres

On 2017-04-25 09:36 PM, Andres Rodriguez wrote:
> When a timeout of zero is specified, the caller is only interested in
> the fence status.
>
> In the current implementation, dma_fence_default_wait will always call
> schedule_timeout() at least once for an unsignaled fence. This adds a
> significant overhead to a fence status query.
>
> Avoid this overhead by returning early if a zero timeout is specified.
>
> Signed-off-by: Andres Rodriguez
> ---
>
> This heavily affects the performance of the Source2 engine running on
> radv.
>
> This patch improves dota2(radv) perf on a i7-6700k+RX480 system from
> 72fps->81fps.
>
>  drivers/dma-buf/dma-fence.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0918d3f..348e9e2 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -380,6 +380,9 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
>  	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>  		return ret;
>
> +	if (!timeout)
> +		return 0;
> +
>  	spin_lock_irqsave(fence->lock, flags);
>
>  	if (intr && signal_pending(current)) {