Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 19.09.2018 um 05:20 schrieb zhoucm1: On 2018年09月18日 16:32, Christian König wrote: + for (i = 0; i < args->count_handles; i++) { + if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) { + DRM_ERROR("timeline syncobj cannot reset!\n"); Why not? I mean that should still work or do I miss anything? timeline semaphore spec doesn't require reset interface, it says the timeline value only can be changed by signal operations. Yeah, but we don't care about the timeline spec in the kernel. Question is rather if that still makes sense to support that and as far as I can see it should be trivial to reinitialize the object. Hi Daniel Rakos, Could you give a comment on this question? Is it necessary to support timeline reset interface? I only see the timeline value can be changed by signal operations in Spec. I think the Vulkan spec is rather irrelevant here. We added the reset operation because of drivers which wanted to re-use an existing syncobj. That is still a valid use case even when Vulkan doesn't have an interface for it. Christian. Thanks, David Zhou ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
On 2018年09月18日 16:32, Christian König wrote: + for (i = 0; i < args->count_handles; i++) { + if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) { + DRM_ERROR("timeline syncobj cannot reset!\n"); Why not? I mean that should still work or do I miss anything? timeline semaphore spec doesn't require reset interface, it says the timeline value only can be changed by signal operations. Yeah, but we don't care about the timeline spec in the kernel. Question is rather if that still makes sense to support that and as far as I can see it should be trivial to reinitialize the object. Hi Daniel Rakos, Could you give a comment on this question? Is it necessary to support timeline reset interface? I only see the timeline value can be changed by signal operations in Spec. Thanks, David Zhou ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 17.09.2018 um 11:33 schrieb zhoucm1: [SNIP] +static struct dma_fence * +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags) +{ + struct dma_fence *fence; + int ret = 0; + + if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) { + ret = wait_event_timeout(syncobj->syncobj_timeline.wq, + point <= syncobj->syncobj_timeline.signal_point, + msecs_to_jiffies(1)); /* wait 10s */ Either don't use a timeout or provide the timeout explicitely, but don't hard code it here. Additional to that we probably need an incorruptible wait. Initially, I used interruptible wait, I found it sometimes often return -ERESTARTSYS without waiting any second when I wrote unit test. Any ideas for that? Yeah, that is perfectly normal. The application just received a signal and we need to abort the IOCTL ASAP. still need to interruptible wait? Yes, that should certainly be an interruptible wait. Otherwise an application could block on this forever when somebody forgets to supply a signal point. - for (i = 0; i < args->count_handles; i++) - drm_syncobj_replace_fence(syncobjs[i], 0, NULL); - + for (i = 0; i < args->count_handles; i++) { + if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) { + DRM_ERROR("timeline syncobj cannot reset!\n"); Why not? I mean that should still work or do I miss anything? timeline semaphore spec doesn't require reset interface, it says the timeline value only can be changed by signal operations. Yeah, but we don't care about the timeline spec in the kernel. Question is rather if that still makes sense to support that and as far as I can see it should be trivial to reinitialize the object. + ret = -EINVAL; + goto out; + } + drm_syncobj_timeline_fini(syncobjs[i], + &syncobjs[i]->syncobj_timeline); + drm_syncobj_timeline_init(syncobjs[i], + &syncobjs[i]->syncobj_timeline); + } +out: drm_syncobj_array_free(syncobjs, args->count_handles); - return 0; + return ret; } int diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 0a8d2d64f380..579e91a5858b 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb, if (!(flags & I915_EXEC_FENCE_WAIT)) continue; - fence = drm_syncobj_fence_get(syncobj); + drm_syncobj_search_fence(syncobj, 0, + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, Not sure if we can make DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT the default here. Maybe better if drivers explicitly need to ask for it? if you don't like the flag used in specific driver, I can wrap it to a new function. Yeah, that is probably a good idea. + &fence); if (!fence) return -EINVAL; diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 425432b85a87..9535521e6623 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -30,6 +30,25 @@ struct drm_syncobj_cb; +enum drm_syncobj_type { + DRM_SYNCOBJ_TYPE_NORMAL, Does anybody have a better suggestion for _TYPE_NORMAL? That sounds like timeline would be special. How about _TYPE_INDIVIDUAL? Oh, really good idea. btw: I encounter a problem after removing advanced wait pt, I debug it two days, but still don't find reason, from log, it's timeount when wait_event_timeout, that means it don't receive signal operation. Good question. Maybe a race condition? Have you tried to trace both threads to the trace buffer, e.g. use trace_printk? "./deqp-vk -n dEQP-VK*semaphore*" will fail on 'dEQP-VK.synchronization.basic.semaphore.chain' case, but can pass if only run that one case like "./deqp-vk -n dEQP-VK.synchronization.basic.semaphore.chain". Both old and my previous implementation can pass and have no this problem. Can you reproduce that as well with the unit tests in libdrm? Christian. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
On 2018年09月17日 16:37, Christian König wrote: Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -56,6 +56,9 @@ #include "drm_internal.h" #include +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ +#define DRM_SYNCOBJ_NORMAL_POINT 1 + struct drm_syncobj_stub_fence { struct dma_fence base; spinlock_t lock; @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { .release = drm_syncobj_stub_fence_release, }; +struct drm_syncobj_signal_pt { + struct dma_fence_array *base; + u64 value; + struct list_head list; +}; /** * drm_syncobj_find - lookup and reference a sync object. @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, { int ret; - *fence = drm_syncobj_fence_get(syncobj); + ret = drm_syncobj_search_fence(syncobj, 0, 0, fence); if (*fence) Don't we need to check ret here instead? both ok, if you like check ret, will update it in v6. return 1; @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ - if (syncobj->fence) { - *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, - lockdep_is_held(&syncobj->lock))); - ret = 1; + if (fence) { + drm_syncobj_search_fence(syncobj, 0, 0, fence); + if (*fence) + ret = 1; That doesn't looks correct to me, drm_syncobj_search_fence() would try to grab the lock once more.
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -56,6 +56,9 @@ #include "drm_internal.h" #include +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ +#define DRM_SYNCOBJ_NORMAL_POINT 1 + struct drm_syncobj_stub_fence { struct dma_fence base; spinlock_t lock; @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { .release = drm_syncobj_stub_fence_release, }; +struct drm_syncobj_signal_pt { + struct dma_fence_array *base; + u64value; + struct list_head list; +}; /** * drm_syncobj_find - lookup and reference a sync object. @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, { int ret; - *fence = drm_syncobj_fence_get(syncobj); + ret = drm_syncobj_search_fence(syncobj, 0, 0, fence); if (*fence) Don't we need to check ret here instead? return 1; @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ - if (syncobj->fence) { - *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, - lockdep_is_held(&syncobj->lock))); - ret = 1; + if (fence) { + drm_syncobj_search_fence(syncobj, 0, 0, fence); + if (*fence) + ret = 1; That doesn't looks correct to me, drm_syncobj_search_fence() would try to gr
RE: [PATCH] [RFC]drm: add syncobj timeline support v5
> -Original Message- > From: Daniel Vetter On Behalf Of Daniel Vetter > Sent: Saturday, September 15, 2018 12:11 AM > To: Koenig, Christian > Cc: Zhou, David(ChunMing) ; dri- > de...@lists.freedesktop.org; amd-...@lists.freedesktop.org; Dave Airlie > ; Rakos, Daniel ; Daniel > Vetter > Subject: Re: [PATCH] [RFC]drm: add syncobj timeline support v5 > > On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: > > Am 14.09.2018 um 12:37 schrieb Chunming Zhou: > > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is > called syncobj in kernel side: > > > This extension introduces a new type of syncobj that has an integer > > > payload identifying a point in a timeline. Such timeline syncobjs > > > support the following operations: > > > * CPU query - A host operation that allows querying the payload of the > > > timeline syncobj. > > > * CPU wait - A host operation that allows a blocking wait for a > > > timeline syncobj to reach a specified value. > > > * Device wait - A device operation that allows waiting for a > > > timeline syncobj to reach a specified value. > > > * Device signal - A device operation that allows advancing the > > > timeline syncobj to a specified value. > > > > > > Since it's a timeline, that means the front time point(PT) always is > signaled before the late PT. > > > a. signal PT design: > > > Signal PT fence N depends on PT[N-1] fence and signal opertion > > > fence, when PT[N] fence is signaled, the timeline will increase to value > > > of > PT[N]. > > > b. wait PT design: > > > Wait PT fence is signaled by reaching timeline point value, when > > > timeline is increasing, will compare wait PTs value with new > > > timeline value, if PT value is lower than timeline value, then wait > > > PT will be signaled, otherwise keep in list. syncobj wait operation > > > can wait on any point of timeline, so need a RB tree to order them. And > wait PT could ahead of signal PT, we need a sumission fence to perform that. > > > > > > v2: > > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. > move > > > unexposed denitions to .c file. (Daniel Vetter) 3. split up the > > > change to drm_syncobj_find_fence() in a separate patch. (Christian) > > > 4. split up the change to drm_syncobj_replace_fence() in a separate > patch. > > > 5. drop the submission_fence implementation and instead use > > > wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL > > > type syncobj case. (Daniel Vetter) > > > > > > v3: > > > 1. replace normal syncobj with timeline implemenation. (Vetter and > Christian) > > > a. normal syncobj signal op will create a signal PT to tail of > > > signal pt list. > > > b. normal syncobj wait op will create a wait pt with last signal > > > point, and > this wait PT is only signaled by related signal point PT. > > > 2. many bug fix and clean up > > > 3. stub fence moving is moved to other patch. > > > > > > v4: > > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. > > > fix syncobj lifecycle. (Christian) 3. only enable_signaling when > > > there is wait_pt. (Christian) 4. fix timeline path issues. > > > 5. write a timeline test in libdrm > > > > > > v5: (Christian) > > > 1. semaphore is called syncobj in kernel side. > > > 2. don't need 'timeline' characters in some function name. > > > 3. keep syncobj cb > > > > > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline > > > syncobj is tested by ./amdgpu_test -s 9 > > > > > > Signed-off-by: Chunming Zhou > > > Cc: Christian Konig > > > Cc: Dave Airlie > > > Cc: Daniel Rakos > > > Cc: Daniel Vetter > > > > At least on first glance that looks like it should work, going to do a > > detailed review on Monday. > > Just for my understanding, it's all condensed down to 1 patch now? Yes, Christian suggest that. >I kinda > didn't follow the detailed discussion last few days at all :-/ > > Also, is there a testcase, igt highly preferred (because then we'll run it in > our > intel-gfx CI, and a bunch of people outside of intel have already discovered > that and are using it). I already wrote the test in libdrm unit test, since I'm not familiar with IGT stuff. Thanks, David Zhou > >
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 14.09.2018 um 20:24 schrieb Daniel Vetter: On Fri, Sep 14, 2018 at 6:43 PM, Christian König wrote: Am 14.09.2018 um 18:10 schrieb Daniel Vetter: On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter At least on first glance that looks like it should work, going to do a detailed review on Monday. Just for my understanding, it's all condensed down to 1 patch now? I kinda didn't follow the detailed discussion last few days at all :-/ I've already committed all the cleanup/fix prerequisites to drm-misc-next. The driver specific implementation needs to come on top and maybe a new CPU wait IOCTL. But essentially this patch is just the core of the kernel implementation. Ah cool, missed that. Also, is there a testcase, igt highly preferred (because then we'll run it in our intel-gfx CI, and a bunch of people outside of intel have already discovered that and are using it). libdrm patches and I think amdgpu based test cases where already published as well. Not sure about igt testcases. I guess we can write them when the intel implementation shows up. Just kinda still hoping that we'd have a more unfified test suite. And not really well-kept secret: We do have an amdgpu in our CI, in the form of kbl-g :-) But unfortunately it's not running the full test set for patches (only for drm-tip). But we could perhaps run more of the amdgpu tests somehow, if there's serious interest. Well I wouldn't mind if we sooner or later get rid of the amdgpu unit tests in libdrm. They are more or less just a really bloody mess. Christian. Cheers, Daniel Christian. Thanks, Daniel Christian. --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
On Fri, Sep 14, 2018 at 6:43 PM, Christian König wrote: > Am 14.09.2018 um 18:10 schrieb Daniel Vetter: >> >> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: >>> >>> Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter >>> >>> At least on first glance that looks like it should work, going to do a >>> detailed review on Monday. >> >> Just for my understanding, it's all condensed down to 1 patch now? I kinda >> didn't follow the detailed discussion last few days at all :-/ > > > I've already committed all the cleanup/fix prerequisites to drm-misc-next. > > The driver specific implementation needs to come on top and maybe a new CPU > wait IOCTL. > > But essentially this patch is just the core of the kernel implementation. Ah cool, missed that. >> Also, is there a testcase, igt highly preferred (because then we'll run it >> in our intel-gfx CI, and a bunch of people outside of intel have already >> discovered that and are using it). > > > libdrm patches and I think amdgpu based test cases where already published > as well. > > Not sure about igt testcases. I guess we can write them when the intel implementation shows up. Just kinda still hoping that we'd have a more unfified test suite. And not really well-kept secret: We do have an amdgpu in our CI, in the form of kbl-g :-) But unfortunately it's not running the full test set for patches (only for drm-tip). But we could perhaps run more of the amdgpu tests somehow, if there's serious interest. Cheers, Daniel > Christian. > > >> >> Thanks, Daniel >> >>> Christian. >>> --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 14.09.2018 um 18:10 schrieb Daniel Vetter: On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter At least on first glance that looks like it should work, going to do a detailed review on Monday. Just for my understanding, it's all condensed down to 1 patch now? I kinda didn't follow the detailed discussion last few days at all :-/ I've already committed all the cleanup/fix prerequisites to drm-misc-next. The driver specific implementation needs to come on top and maybe a new CPU wait IOCTL. But essentially this patch is just the core of the kernel implementation. Also, is there a testcase, igt highly preferred (because then we'll run it in our intel-gfx CI, and a bunch of people outside of intel have already discovered that and are using it). libdrm patches and I think amdgpu based test cases where already published as well. Not sure about igt testcases. Christian. Thanks, Daniel Christian. --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -56,6 +56,9 @@either #include "drm_internal.h" #include +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ +#define DRM_SYNCOBJ_NORMAL_POINT 1 + struct drm_syncobj_stub_fence { struct dma_fence base; spinlock_t lock; @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { .release = drm_syncobj_stub_fence_release, }; +struct drm_syncobj_signal_pt { + struct dma_fence_array *base; + u64value; + struct list_head list; +}; /** * drm_syncobj_find - lookup and reference a sync object. @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(s
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote: > Am 14.09.2018 um 12:37 schrieb Chunming Zhou: > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is called > > syncobj in kernel side: > > This extension introduces a new type of syncobj that has an integer payload > > identifying a point in a timeline. Such timeline syncobjs support the > > following operations: > > * CPU query - A host operation that allows querying the payload of the > > timeline syncobj. > > * CPU wait - A host operation that allows a blocking wait for a > > timeline syncobj to reach a specified value. > > * Device wait - A device operation that allows waiting for a > > timeline syncobj to reach a specified value. > > * Device signal - A device operation that allows advancing the > > timeline syncobj to a specified value. > > > > Since it's a timeline, that means the front time point(PT) always is > > signaled before the late PT. > > a. signal PT design: > > Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when > > PT[N] fence is signaled, > > the timeline will increase to value of PT[N]. > > b. wait PT design: > > Wait PT fence is signaled by reaching timeline point value, when timeline > > is increasing, will compare > > wait PTs value with new timeline value, if PT value is lower than timeline > > value, then wait PT will be > > signaled, otherwise keep in list. syncobj wait operation can wait on any > > point of timeline, > > so need a RB tree to order them. And wait PT could ahead of signal PT, we > > need a sumission fence to > > perform that. > > > > v2: > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) > > 2. move unexposed denitions to .c file. (Daniel Vetter) > > 3. split up the change to drm_syncobj_find_fence() in a separate patch. > > (Christian) > > 4. split up the change to drm_syncobj_replace_fence() in a separate patch. > > 5. drop the submission_fence implementation and instead use wait_event() > > for that. (Christian) > > 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) > > > > v3: > > 1. replace normal syncobj with timeline implemenation. (Vetter and > > Christian) > > a. normal syncobj signal op will create a signal PT to tail of signal > > pt list. > > b. normal syncobj wait op will create a wait pt with last signal > > point, and this wait PT is only signaled by related signal point PT. > > 2. many bug fix and clean up > > 3. stub fence moving is moved to other patch. > > > > v4: > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) > > 2. fix syncobj lifecycle. (Christian) > > 3. only enable_signaling when there is wait_pt. (Christian) > > 4. fix timeline path issues. > > 5. write a timeline test in libdrm > > > > v5: (Christian) > > 1. semaphore is called syncobj in kernel side. > > 2. don't need 'timeline' characters in some function name. > > 3. keep syncobj cb > > > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* > > timeline syncobj is tested by ./amdgpu_test -s 9 > > > > Signed-off-by: Chunming Zhou > > Cc: Christian Konig > > Cc: Dave Airlie > > Cc: Daniel Rakos > > Cc: Daniel Vetter > > At least on first glance that looks like it should work, going to do a > detailed review on Monday. Just for my understanding, it's all condensed down to 1 patch now? I kinda didn't follow the detailed discussion last few days at all :-/ Also, is there a testcase, igt highly preferred (because then we'll run it in our intel-gfx CI, and a bunch of people outside of intel have already discovered that and are using it). Thanks, Daniel > > Christian. > > > --- > > drivers/gpu/drm/drm_syncobj.c | 294 ++--- > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- > > include/drm/drm_syncobj.h | 62 +++-- > > include/uapi/drm/drm.h | 1 + > > 4 files changed, 292 insertions(+), 69 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c > > index e9ce623d049e..e78d076f2703 100644 > > --- a/drivers/gpu/drm/drm_syncobj.c > > +++ b/drivers/gpu/drm/drm_syncobj.c > > @@ -56,6 +56,9 @@either > > #include "drm_internal.h" > > #include > > +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ > > +#define DRM_SYNCOBJ_NORMAL_POINT 1 > > + > > struct drm_syncobj_stub_fence { > > struct dma_fence base; > > spinlock_t lock; > > @@ -82,6 +85,11 @@ static const struct dma_fence_ops > > drm_syncobj_stub_fence_ops = { > > .release = drm_syncobj_stub_fence_release, > > }; > > +struct drm_syncobj_signal_pt { > > + struct dma_fence_array *base; > > + u64value; > > + struct list_head list; > > +}; > > /** > >* drm_syncobj_find - lookup and reference a sync object. > > @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct > > drm_syncobj *syncobj, > > { > >
Re: [PATCH] [RFC]drm: add syncobj timeline support v5
Am 14.09.2018 um 12:37 schrieb Chunming Zhou: This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter At least on first glance that looks like it should work, going to do a detailed review on Monday. Christian. --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -56,6 +56,9 @@either #include "drm_internal.h" #include +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ +#define DRM_SYNCOBJ_NORMAL_POINT 1 + struct drm_syncobj_stub_fence { struct dma_fence base; spinlock_t lock; @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { .release = drm_syncobj_stub_fence_release, }; +struct drm_syncobj_signal_pt { + struct dma_fence_array *base; + u64value; + struct list_head list; +}; /** * drm_syncobj_find - lookup and reference a sync object. @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, { int ret; - *fence = drm_syncobj_fence_get(syncobj); + ret = drm_syncobj_search_fence(syncobj, 0, 0, fence); if (*fence) return 1; @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ - if (syncobj->fence) { - *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, - lockdep_is_held(&syncobj->lock))); - ret = 1; + if (fence) { + drm_syncobj_search_fence(syncobj, 0, 0, fence); + if (*fence) + ret = 1;
[PATCH] [RFC]drm: add syncobj timeline support v5
This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side: This extension introduces a new type of syncobj that has an integer payload identifying a point in a timeline. Such timeline syncobjs support the following operations: * CPU query - A host operation that allows querying the payload of the timeline syncobj. * CPU wait - A host operation that allows a blocking wait for a timeline syncobj to reach a specified value. * Device wait - A device operation that allows waiting for a timeline syncobj to reach a specified value. * Device signal - A device operation that allows advancing the timeline syncobj to a specified value. Since it's a timeline, that means the front time point(PT) always is signaled before the late PT. a. signal PT design: Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled, the timeline will increase to value of PT[N]. b. wait PT design: Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline, so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to perform that. v2: 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2. move unexposed denitions to .c file. (Daniel Vetter) 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian) 4. split up the change to drm_syncobj_replace_fence() in a separate patch. 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter) v3: 1. replace normal syncobj with timeline implemenation. (Vetter and Christian) a. normal syncobj signal op will create a signal PT to tail of signal pt list. b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT. 2. many bug fix and clean up 3. stub fence moving is moved to other patch. v4: 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2. fix syncobj lifecycle. (Christian) 3. only enable_signaling when there is wait_pt. (Christian) 4. fix timeline path issues. 5. write a timeline test in libdrm v5: (Christian) 1. semaphore is called syncobj in kernel side. 2. don't need 'timeline' characters in some function name. 3. keep syncobj cb normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline syncobj is tested by ./amdgpu_test -s 9 Signed-off-by: Chunming Zhou Cc: Christian Konig Cc: Dave Airlie Cc: Daniel Rakos Cc: Daniel Vetter --- drivers/gpu/drm/drm_syncobj.c | 294 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- include/drm/drm_syncobj.h | 62 +++-- include/uapi/drm/drm.h | 1 + 4 files changed, 292 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -56,6 +56,9 @@ #include "drm_internal.h" #include +/* merge normal syncobj to timeline syncobj, the point interval is 1 */ +#define DRM_SYNCOBJ_NORMAL_POINT 1 + struct drm_syncobj_stub_fence { struct dma_fence base; spinlock_t lock; @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = { .release = drm_syncobj_stub_fence_release, }; +struct drm_syncobj_signal_pt { + struct dma_fence_array *base; + u64value; + struct list_head list; +}; /** * drm_syncobj_find - lookup and reference a sync object. @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, { int ret; - *fence = drm_syncobj_fence_get(syncobj); + ret = drm_syncobj_search_fence(syncobj, 0, 0, fence); if (*fence) return 1; @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ - if (syncobj->fence) { - *fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, - lockdep_is_held(&syncobj->lock))); - ret = 1; + if (fence) { + drm_syncobj_search_fence(syncobj, 0, 0, fence); + if (*fence) + ret = 1; } else { *fence = NULL; drm_syncobj_add_callback_locked(syncobj, cb, func); @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *synco