Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On 28/02/2020 13:46, Michel Dänzer wrote: On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote: On Fri, 2020-02-28 at 10:43 +, Daniel Stone wrote: On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund wrote: On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote: Yeah, changes on vulkan drivers or backend compilers should be fairly sandboxed. We also have tools that only work for intel stuff, that should never trigger anything on other people's HW. Could something be worked out using the tags? I think so! We have the pre-defined environment variable CI_MERGE_REQUEST_LABELS, and we can do variable conditions: https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables That sounds like a pretty neat middle-ground to me. I just hope that new pipelines are triggered if new labels are added, because not everyone is allowed to set labels, and sometimes people forget... There's also this which is somewhat more robust: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569 I'm not sure it's more robust, but yeah that a useful tool too. The reason I'm skeptical about the robustness is that we'll miss testing if this misses a path. Surely missing a path will be less likely / often to happen compared to an MR missing a label. (Users which aren't members of the project can't even set labels for an MR) Sounds like a good alternative to tags. -Lionel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On 28/02/2020 11:28, Erik Faye-Lund wrote: On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote: On Fri, 28 Feb 2020 at 07:27, Daniel Vetter wrote: Hi all, You might have read the short take in the X.org board meeting minutes already, here's the long version. The good news: gitlab.fd.o has become very popular with our communities, and is used extensively. This especially includes all the CI integration. Modern development process and tooling, yay! The bad news: The cost in growth has also been tremendous, and it's breaking our bank account. With reasonable estimates for continued growth we're expecting hosting expenses totalling 75k USD this year, and 90k USD next year. With the current sponsors we've set up we can't sustain that. We estimate that hosting expenses for gitlab.fd.o without any of the CI features enabled would total 30k USD, which is within X.org's ability to support through various sponsorships, mostly through XDC. Note that X.org does no longer sponsor any CI runners themselves, we've stopped that. The huge additional expenses are all just in storing and serving build artifacts and images to outside CI runners sponsored by various companies. A related topic is that with the growth in fd.o it's becoming infeasible to maintain it all on volunteer admin time. X.org is therefore also looking for admin sponsorship, at least medium term. Assuming that we want cash flow reserves for one year of gitlab.fd.o (without CI support) and a trimmed XDC and assuming no sponsor payment meanwhile, we'd have to cut CI services somewhere between May and June this year. The board is of course working on acquiring sponsors, but filling a shortfall of this magnitude is neither easy nor quick work, and we therefore decided to give an early warning as soon as possible. Any help in finding sponsors for fd.o is very much appreciated. a) Ouch. b) we probably need to take a large step back here. I kinda agree, but maybe the step doesn't have to be *too* large? I wonder if we could solve this by restructuring the project a bit. I'm talking purely from a Mesa point of view here, so it might not solve the full problem, but: 1. It feels silly that we need to test changes to e.g the i965 driver on dragonboards. We only have a big "do not run CI at all" escape- hatch. Yeah, changes on vulkan drivers or backend compilers should be fairly sandboxed. We also have tools that only work for intel stuff, that should never trigger anything on other people's HW. Could something be worked out using the tags? -Lionel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/syncobj: fix leaking dma_fence in drm_syncobj_query_ioctl
On 22/07/2019 16:21, Christian König wrote: Am 22.07.19 um 15:16 schrieb Lionel Landwerlin: On 22/07/2019 15:59, Christian König wrote: We need to check the context number instead if the previous sequence to detect an error and if an error is detected we need to drop the reference to the current fence or otherwise would leak it. Signed-off-by: Christian König Fixes: 27b575a9aa2f ("drm/syncobj: add timeline payload query ioctl v6") Reviewed-by: Lionel Landwerlin CC stable? I'm not sure when this got upstream. Christian. I thought it would get picked up automatically for the relevant stable version (and none if it's not upstream yet). We also have to be nice to people cherry picking stuff like ChromeOS. -Lionel --- drivers/gpu/drm/drm_syncobj.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 75cb4bb7619e..1438dcb3ebb1 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1298,14 +1298,14 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct dma_fence *iter, *last_signaled = NULL; dma_fence_chain_for_each(iter, fence) { - if (!iter) - break; - dma_fence_put(last_signaled); - last_signaled = dma_fence_get(iter); - if (!to_dma_fence_chain(last_signaled)->prev_seqno) + if (iter->context != fence->context) { + dma_fence_put(iter); /* It is most likely that timeline has * unorder points. */ break; + } + dma_fence_put(last_signaled); + last_signaled = dma_fence_get(iter); } point = dma_fence_is_signaled(last_signaled) ? last_signaled->seqno : ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/syncobj: fix leaking dma_fence in drm_syncobj_query_ioctl
On 22/07/2019 15:59, Christian König wrote: We need to check the context number instead if the previous sequence to detect an error and if an error is detected we need to drop the reference to the current fence or otherwise would leak it. Signed-off-by: Christian König Fixes: 27b575a9aa2f ("drm/syncobj: add timeline payload query ioctl v6") Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/drm_syncobj.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 75cb4bb7619e..1438dcb3ebb1 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1298,14 +1298,14 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct dma_fence *iter, *last_signaled = NULL; dma_fence_chain_for_each(iter, fence) { - if (!iter) - break; - dma_fence_put(last_signaled); - last_signaled = dma_fence_get(iter); - if (!to_dma_fence_chain(last_signaled)->prev_seqno) + if (iter->context != fence->context) { + dma_fence_put(iter); /* It is most likely that timeline has * unorder points. */ break; + } + dma_fence_put(last_signaled); + last_signaled = dma_fence_get(iter); } point = dma_fence_is_signaled(last_signaled) ? last_signaled->seqno : ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline
With the small nits, patches 2 & 4 are : Reviewed-by: Lionel Landwerlin The other patches are a bit amdgpu specific so maybe you might want someone more familiar with amdgpu to review them. Still I didn't see anything wrong with them so remaining patches are : Acked-by: Lionel Landwerlin I'll send the IGT stuff shortly. Thanks, -Lionel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 4/7] add timeline signal/transfer ioctls v2
On 13/05/2019 10:53, Chunming Zhou wrote: v2: use one transfer ioctl Signed-off-by: Chunming Zhou --- xf86drm.c | 33 + xf86drm.h | 6 ++ 2 files changed, 39 insertions(+) diff --git a/xf86drm.c b/xf86drm.c index 17e3d880..acd16fab 100644 --- a/xf86drm.c +++ b/xf86drm.c @@ -4257,6 +4257,21 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t *handles, return ret; } +drm_public int drmSyncobjTimelineSignal(int fd, const uint32_t *handles, + uint64_t *points, uint32_t handle_count) +{ +struct drm_syncobj_timeline_array args; +int ret; + +memclear(args); +args.handles = (uintptr_t)handles; +args.points = (uint64_t)(uintptr_t)points; I think you can drop the uint64_t cast. +args.count_handles = handle_count; + +ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, ); +return ret; +} + drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points, unsigned num_handles, int64_t timeout_nsec, unsigned flags, @@ -4299,4 +4314,22 @@ drm_public int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points, return 0; } +drm_public int drmSyncobjTransfer(int fd, + uint32_t dst_handle, uint64_t dst_point, + uint32_t src_handle, uint64_t src_point, + uint32_t flags) +{ +struct drm_syncobj_transfer args; +int ret; + +memclear(args); +args.src_handle = src_handle; +args.dst_handle = dst_handle; +args.src_point = src_point; +args.dst_point = dst_point; +args.flags = flags; + +ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TRANSFER, ); +return ret; +} diff --git a/xf86drm.h b/xf86drm.h index 60c7a84f..3fb1d1ca 100644 --- a/xf86drm.h +++ b/xf86drm.h @@ -876,12 +876,18 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, unsigned num_handles, uint32_t *first_signaled); extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t handle_count); extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t handle_count); +extern int drmSyncobjTimelineSignal(int fd, const uint32_t *handles, + uint64_t *points, uint32_t handle_count); extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points, unsigned num_handles, int64_t timeout_nsec, unsigned flags, uint32_t *first_signaled); extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points, uint32_t handle_count); +extern int drmSyncobjTransfer(int fd, + uint32_t dst_handle, uint64_t dst_point, + uint32_t src_handle, uint64_t src_point, + uint32_t flags); #if defined(__cplusplus) } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 2/7] add timeline wait/query ioctl v2
On 13/05/2019 10:53, Chunming Zhou wrote: v2: drop export/import Signed-off-by: Chunming Zhou --- xf86drm.c | 44 xf86drm.h | 6 ++ 2 files changed, 50 insertions(+) diff --git a/xf86drm.c b/xf86drm.c index 2c19376b..17e3d880 100644 --- a/xf86drm.c +++ b/xf86drm.c @@ -4256,3 +4256,47 @@ drm_public int drmSyncobjSignal(int fd, const uint32_t *handles, ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_SIGNAL, ); return ret; } + +drm_public int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points, + unsigned num_handles, + int64_t timeout_nsec, unsigned flags, + uint32_t *first_signaled) +{ +struct drm_syncobj_timeline_wait args; +int ret; + +memclear(args); +args.handles = (uintptr_t)handles; +args.points = (uint64_t)(uintptr_t)points; I don't think you need those uintptr_t -> uint64_t casts. +args.timeout_nsec = timeout_nsec; +args.count_handles = num_handles; +args.flags = flags; + +ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, ); +if (ret < 0) +return -errno; + +if (first_signaled) +*first_signaled = args.first_signaled; +return ret; +} + + +drm_public int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points, + uint32_t handle_count) +{ +struct drm_syncobj_timeline_array args; +int ret; + +memclear(args); +args.handles = (uintptr_t)handles; +args.points = (uint64_t)(uintptr_t)points; Same here. +args.count_handles = handle_count; + +ret = drmIoctl(fd, DRM_IOCTL_SYNCOBJ_QUERY, ); +if (ret) +return ret; +return 0; +} + + diff --git a/xf86drm.h b/xf86drm.h index 887ecc76..60c7a84f 100644 --- a/xf86drm.h +++ b/xf86drm.h @@ -876,6 +876,12 @@ extern int drmSyncobjWait(int fd, uint32_t *handles, unsigned num_handles, uint32_t *first_signaled); extern int drmSyncobjReset(int fd, const uint32_t *handles, uint32_t handle_count); extern int drmSyncobjSignal(int fd, const uint32_t *handles, uint32_t handle_count); +extern int drmSyncobjTimelineWait(int fd, uint32_t *handles, uint64_t *points, + unsigned num_handles, + int64_t timeout_nsec, unsigned flags, + uint32_t *first_signaled); +extern int drmSyncobjQuery(int fd, uint32_t *handles, uint64_t *points, + uint32_t handle_count); #if defined(__cplusplus) } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 1/7] addr cs chunk for syncobj timeline
Sorry for the delay, I'll try to review this tomorrow. -Lionel On 13/05/2019 11:15, zhoucm1 wrote: ping... for patch set. On 2019年05月13日 17:52, Chunming Zhou wrote: [CAUTION: External Email] Signed-off-by: Chunming Zhou --- include/drm/amdgpu_drm.h | 9 + 1 file changed, 9 insertions(+) diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h index d0701ffc..3d0318e6 100644 --- a/include/drm/amdgpu_drm.h +++ b/include/drm/amdgpu_drm.h @@ -528,6 +528,8 @@ struct drm_amdgpu_gem_va { #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05 #define AMDGPU_CHUNK_ID_BO_HANDLES 0x06 #define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES 0x07 +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT 0x08 +#define AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL 0x09 struct drm_amdgpu_cs_chunk { __u32 chunk_id; @@ -608,6 +610,13 @@ struct drm_amdgpu_cs_chunk_sem { __u32 handle; }; +struct drm_amdgpu_cs_chunk_syncobj { + __u32 handle; + __u32 flags; + __u64 point; +}; + + #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ 0 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNCOBJ_FD 1 #define AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD 2 -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4
On 01/04/2019 11:50, zhoucm1 wrote: On 2019年04月01日 16:19, Lionel Landwerlin wrote: On 01/04/2019 06:54, Zhou, David(ChunMing) wrote: -Original Message- From: Lionel Landwerlin Sent: Saturday, March 30, 2019 10:09 PM To: Koenig, Christian ; Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4 On 28/03/2019 15:18, Christian König wrote: Am 28.03.19 um 14:50 schrieb Lionel Landwerlin: On 25/03/2019 08:32, Chunming Zhou wrote: From: Christian König Use the dma_fence_chain object to create a timeline of fence objects instead of just replacing the existing fence. v2: rebase and cleanup v3: fix garbage collection parameters v4: add unorder point check, print a warn calltrace Signed-off-by: Christian König Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_syncobj.c | 39 +++ include/drm/drm_syncobj.h | 5 + 2 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj, spin_unlock(>lock); } +/** + * drm_syncobj_add_point - add new timeline point to the syncobj + * @syncobj: sync object to add timeline point do + * @chain: chain node to use to add the point + * @fence: fence to encapsulate in the chain node + * @point: sequence number to use for the point + * + * Add the chain node as new timeline point to the syncobj. + */ +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point) +{ + struct syncobj_wait_entry *cur, *tmp; + struct dma_fence *prev; + + dma_fence_get(fence); + + spin_lock(>lock); + + prev = drm_syncobj_fence_get(syncobj); + /* You are adding an unorder point to timeline, which could cause payload returned from query_ioctl is 0! */ + WARN_ON_ONCE(prev && prev->seqno >= point); I think the WARN/BUG macros should only fire when there is an issue with programming from within the kernel. But this particular warning can be triggered by an application. Probably best to just remove it? Yeah, that was also my argument against it. Key point here is that we still want to note somehow that userspace did something wrong and returning an error is not an option. Maybe just use DRM_ERROR with a static variable to print the message only once. Christian. I don't really see any point in printing an error once. If you run your application twice you end up thinking there was an issue just on the first run but it's actually always wrong. Except this nitpick, is there any other concern to push whole patch set? Is that time to push whole patch set? -David Looks good to me. Does that mean we can add your RB on patch set so that we can submit the patch set to drm-misc-next branch? Yes : Reviewed-by: Lionel Landwerlin Not too sure about the process for drm-misc-next. Sounds like Sumit Semwal is the go to person according to MAINTAINERS. I have an additional change to make drm_syncobj_find_fence() also return the drm_syncobj : https://github.com/djdeath/linux/commit/0b7732b267b931339d71fe6f493ea6fa4eab453e This is needed in i915 to avoid looking up the drm_syncobj handle twice. Our driver allows to wait on the syncobj's dma_fence that we're then going to replace so we need to get bot the fence & syncobj at the same time. I guess it can go in a follow up series. Yes, agree. Thanks for your effort as well, -David Thanks to you! -Lionel -Lionel Unless we're willing to take the syncobj lock for longer periods of time when adding points, I guess we'll have to defer validation to validation layers. -Lionel -Lionel + dma_fence_chain_init(chain, prev, fence, point); + rcu_assign_pointer(syncobj->fence, >base); + + list_for_each_entry_safe(cur, tmp, >cb_list, node) { + list_del_init(>node); + syncobj_wait_syncobj_func(syncobj, cur); + } + spin_unlock(>lock); + + /* Walk the chain once to trigger garbage collection */ + dma_fence_chain_for_each(fence, prev); + dma_fence_put(prev); +} +EXPORT_SYMBOL(drm_syncobj_add_point); + /** * drm_syncobj_replace_fence - replace fence in a sync object. * @syncobj: Sync object to replace fence in diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 0311c9fdbd2f..6cf7243a1dc5 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -27,6 +27,7 @@ #define __DRM_SYNCOBJ_H__ #include +#include struct drm_file; @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_sy
Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4
On 01/04/2019 06:54, Zhou, David(ChunMing) wrote: -Original Message- From: Lionel Landwerlin Sent: Saturday, March 30, 2019 10:09 PM To: Koenig, Christian ; Zhou, David(ChunMing) ; dri-de...@lists.freedesktop.org; amd- g...@lists.freedesktop.org; ja...@jlekstrand.net; Hector, Tobias Subject: Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4 On 28/03/2019 15:18, Christian König wrote: Am 28.03.19 um 14:50 schrieb Lionel Landwerlin: On 25/03/2019 08:32, Chunming Zhou wrote: From: Christian König Use the dma_fence_chain object to create a timeline of fence objects instead of just replacing the existing fence. v2: rebase and cleanup v3: fix garbage collection parameters v4: add unorder point check, print a warn calltrace Signed-off-by: Christian König Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_syncobj.c | 39 +++ include/drm/drm_syncobj.h | 5 + 2 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj, spin_unlock(>lock); } +/** + * drm_syncobj_add_point - add new timeline point to the syncobj + * @syncobj: sync object to add timeline point do + * @chain: chain node to use to add the point + * @fence: fence to encapsulate in the chain node + * @point: sequence number to use for the point + * + * Add the chain node as new timeline point to the syncobj. + */ +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point) +{ + struct syncobj_wait_entry *cur, *tmp; + struct dma_fence *prev; + + dma_fence_get(fence); + + spin_lock(>lock); + + prev = drm_syncobj_fence_get(syncobj); + /* You are adding an unorder point to timeline, which could cause payload returned from query_ioctl is 0! */ + WARN_ON_ONCE(prev && prev->seqno >= point); I think the WARN/BUG macros should only fire when there is an issue with programming from within the kernel. But this particular warning can be triggered by an application. Probably best to just remove it? Yeah, that was also my argument against it. Key point here is that we still want to note somehow that userspace did something wrong and returning an error is not an option. Maybe just use DRM_ERROR with a static variable to print the message only once. Christian. I don't really see any point in printing an error once. If you run your application twice you end up thinking there was an issue just on the first run but it's actually always wrong. Except this nitpick, is there any other concern to push whole patch set? Is that time to push whole patch set? -David Looks good to me. I have an additional change to make drm_syncobj_find_fence() also return the drm_syncobj : https://github.com/djdeath/linux/commit/0b7732b267b931339d71fe6f493ea6fa4eab453e This is needed in i915 to avoid looking up the drm_syncobj handle twice. Our driver allows to wait on the syncobj's dma_fence that we're then going to replace so we need to get bot the fence & syncobj at the same time. I guess it can go in a follow up series. -Lionel Unless we're willing to take the syncobj lock for longer periods of time when adding points, I guess we'll have to defer validation to validation layers. -Lionel -Lionel + dma_fence_chain_init(chain, prev, fence, point); + rcu_assign_pointer(syncobj->fence, >base); + + list_for_each_entry_safe(cur, tmp, >cb_list, node) { + list_del_init(>node); + syncobj_wait_syncobj_func(syncobj, cur); + } + spin_unlock(>lock); + + /* Walk the chain once to trigger garbage collection */ + dma_fence_chain_for_each(fence, prev); + dma_fence_put(prev); +} +EXPORT_SYMBOL(drm_syncobj_add_point); + /** * drm_syncobj_replace_fence - replace fence in a sync object. * @syncobj: Sync object to replace fence in diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 0311c9fdbd2f..6cf7243a1dc5 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -27,6 +27,7 @@ #define __DRM_SYNCOBJ_H__ #include +#include struct drm_file; @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj) struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle); +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point); void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence); int drm_syncobj_f
Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4
On 28/03/2019 15:18, Christian König wrote: Am 28.03.19 um 14:50 schrieb Lionel Landwerlin: On 25/03/2019 08:32, Chunming Zhou wrote: From: Christian König Use the dma_fence_chain object to create a timeline of fence objects instead of just replacing the existing fence. v2: rebase and cleanup v3: fix garbage collection parameters v4: add unorder point check, print a warn calltrace Signed-off-by: Christian König Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_syncobj.c | 39 +++ include/drm/drm_syncobj.h | 5 + 2 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj, spin_unlock(>lock); } +/** + * drm_syncobj_add_point - add new timeline point to the syncobj + * @syncobj: sync object to add timeline point do + * @chain: chain node to use to add the point + * @fence: fence to encapsulate in the chain node + * @point: sequence number to use for the point + * + * Add the chain node as new timeline point to the syncobj. + */ +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point) +{ + struct syncobj_wait_entry *cur, *tmp; + struct dma_fence *prev; + + dma_fence_get(fence); + + spin_lock(>lock); + + prev = drm_syncobj_fence_get(syncobj); + /* You are adding an unorder point to timeline, which could cause payload returned from query_ioctl is 0! */ + WARN_ON_ONCE(prev && prev->seqno >= point); I think the WARN/BUG macros should only fire when there is an issue with programming from within the kernel. But this particular warning can be triggered by an application. Probably best to just remove it? Yeah, that was also my argument against it. Key point here is that we still want to note somehow that userspace did something wrong and returning an error is not an option. Maybe just use DRM_ERROR with a static variable to print the message only once. Christian. I don't really see any point in printing an error once. If you run your application twice you end up thinking there was an issue just on the first run but it's actually always wrong. Unless we're willing to take the syncobj lock for longer periods of time when adding points, I guess we'll have to defer validation to validation layers. -Lionel -Lionel + dma_fence_chain_init(chain, prev, fence, point); + rcu_assign_pointer(syncobj->fence, >base); + + list_for_each_entry_safe(cur, tmp, >cb_list, node) { + list_del_init(>node); + syncobj_wait_syncobj_func(syncobj, cur); + } + spin_unlock(>lock); + + /* Walk the chain once to trigger garbage collection */ + dma_fence_chain_for_each(fence, prev); + dma_fence_put(prev); +} +EXPORT_SYMBOL(drm_syncobj_add_point); + /** * drm_syncobj_replace_fence - replace fence in a sync object. * @syncobj: Sync object to replace fence in diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 0311c9fdbd2f..6cf7243a1dc5 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -27,6 +27,7 @@ #define __DRM_SYNCOBJ_H__ #include +#include struct drm_file; @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj) struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle); +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point); void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence); int drm_syncobj_find_fence(struct drm_file *file_private, ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v5
On 28/03/2019 13:33, Lionel Landwerlin wrote: On 28/03/2019 13:08, Chunming Zhou wrote: 在 2019/3/28 20:53, Lionel Landwerlin 写道: On 25/03/2019 08:32, Chunming Zhou wrote: v2: individually allocate chain array, since chain node is free independently. v3: all existing points must be already signaled before cpu perform signal operation, so add check condition for that. v4: remove v3 change and add checking to prevent out-of-order v5: unify binary and timeline Signed-off-by: Chunming Zhou Cc: Tobias Hector Cc: Jason Ekstrand Cc: Dave Airlie Cc: Chris Wilson Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c | 2 + drivers/gpu/drm/drm_syncobj.c | 73 ++ include/uapi/drm/drm.h | 1 + 4 files changed, 78 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index dd11ae5f1eef..d9a483a5fce0 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 92b3b7b2fd81..d337f161909c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, drm_syncobj_timeline_signal_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index ee2d66e047e7..099596190845 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1183,6 +1183,79 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } +int +drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + struct dma_fence_chain **chains; + uint64_t *points; + uint32_t i, j; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -EOPNOTSUPP; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + ); + if (ret < 0) + return ret; + + points = kmalloc_array(args->count_handles, sizeof(*points), + GFP_KERNEL); + if (!points) { + ret = -ENOMEM; + goto out; + } + if (!u64_to_user_ptr(args->points)) { + memset(points, 0, args->count_handles * sizeof(uint64_t)); + } else if (copy_from_user(points, u64_to_user_ptr(args->points), + sizeof(uint64_t) * args->count_handles)) { + ret = -EFAULT; + goto err_points; + } + + chains = kmalloc_array(args->count_handles, sizeof(void *), GFP_KERNEL); + if (!chains) { + ret = -ENOMEM; + goto err_points; + } + for (i = 0; i < args->count_handles; i++) { + chains[i] = kzalloc(sizeof(struct dma_fence_chain), GFP_KERNEL); + if (!chains[i]) { + for (j = 0; j < i; j++) + kfree(chains[j]); + ret = -ENOMEM; + goto err_chains; + } + } + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence *fence = dma_fence_get_stub(); + + drm_syncobj_add_point(syncobjs[i], chains[i], + fence, points[i]); + dma_fence_put(fence); + } I think I found an issue where the implementation doesn't match what the spec requires on host signaling. Consider the following : Timeline semaphore has a value of 4. A device submits a commands to signal point 5. The host signals point 6. The value of the timeline will remain 4 until the device commands complete, at which point it will change to 6. But the specification seems to say that the timeline value should be 6 once
Re: [PATCH] drm/gamma: Clarify gamma lut uapi
On 29/03/2019 09:20, Daniel Vetter wrote: Interpreting it as a 0.16 fixed point means we can't accurately represent 1.0. Which is one of the values we really should be able to represent. Since most (all?) luts have lower precision this will only affect rounding of 0x. Cc: Uma Shankar Cc: Ville Syrjälä Cc: Shashank Sharma Cc: "Kumar, Kiran S" Cc: Kausal Malladi Cc: Lionel Landwerlin Cc: Matt Roper Cc: Rob Bradford Cc: Daniel Stone Cc: Stefan Schake Cc: Eric Anholt Cc: Maarten Lankhorst Cc: Harry Wentland Cc: Leo Li Cc: amd-gfx@lists.freedesktop.org Cc: James (Qian) Wang Cc: Liviu Dudau Cc: Mali DP Maintainers Cc: CK Hu Cc: Philipp Zabel Cc: Yannick Fertre Cc: Philippe Cornu Cc: Benjamin Gaignard Cc: Vincent Abriou Cc: Tomi Valkeinen Cc: Boris Brezillon Signed-off-by: Daniel Vetter Signed-off-by: Daniel Vetter --- include/uapi/drm/drm_mode.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h index 09d72966899a..83cd1636b9be 100644 --- a/include/uapi/drm/drm_mode.h +++ b/include/uapi/drm/drm_mode.h @@ -621,7 +621,8 @@ struct drm_color_ctm { struct drm_color_lut { /* -* Data is U0.16 fixed point format. +* Values are mapped linearly to 0.0 - 1.0 range, with 0x0 == 0.0 and +* 0x == 1.0. */ __u16 red; __u16 green; Thanks, that was the intention when it was introduced. Acked-by: Lionel Landwerlin ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/9] drm/syncobj: add new drm_syncobj_add_point interface v4
On 25/03/2019 08:32, Chunming Zhou wrote: From: Christian König Use the dma_fence_chain object to create a timeline of fence objects instead of just replacing the existing fence. v2: rebase and cleanup v3: fix garbage collection parameters v4: add unorder point check, print a warn calltrace Signed-off-by: Christian König Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_syncobj.c | 39 +++ include/drm/drm_syncobj.h | 5 + 2 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 5329e66598c6..19a9ce638119 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -122,6 +122,45 @@ static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj, spin_unlock(>lock); } +/** + * drm_syncobj_add_point - add new timeline point to the syncobj + * @syncobj: sync object to add timeline point do + * @chain: chain node to use to add the point + * @fence: fence to encapsulate in the chain node + * @point: sequence number to use for the point + * + * Add the chain node as new timeline point to the syncobj. + */ +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point) +{ + struct syncobj_wait_entry *cur, *tmp; + struct dma_fence *prev; + + dma_fence_get(fence); + + spin_lock(>lock); + + prev = drm_syncobj_fence_get(syncobj); + /* You are adding an unorder point to timeline, which could cause payload returned from query_ioctl is 0! */ + WARN_ON_ONCE(prev && prev->seqno >= point); I think the WARN/BUG macros should only fire when there is an issue with programming from within the kernel. But this particular warning can be triggered by an application. Probably best to just remove it? -Lionel + dma_fence_chain_init(chain, prev, fence, point); + rcu_assign_pointer(syncobj->fence, >base); + + list_for_each_entry_safe(cur, tmp, >cb_list, node) { + list_del_init(>node); + syncobj_wait_syncobj_func(syncobj, cur); + } + spin_unlock(>lock); + + /* Walk the chain once to trigger garbage collection */ + dma_fence_chain_for_each(fence, prev); + dma_fence_put(prev); +} +EXPORT_SYMBOL(drm_syncobj_add_point); + /** * drm_syncobj_replace_fence - replace fence in a sync object. * @syncobj: Sync object to replace fence in diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index 0311c9fdbd2f..6cf7243a1dc5 100644 --- a/include/drm/drm_syncobj.h +++ b/include/drm/drm_syncobj.h @@ -27,6 +27,7 @@ #define __DRM_SYNCOBJ_H__ #include +#include struct drm_file; @@ -112,6 +113,10 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj) struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle); +void drm_syncobj_add_point(struct drm_syncobj *syncobj, + struct dma_fence_chain *chain, + struct dma_fence *fence, + uint64_t point); void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence); int drm_syncobj_find_fence(struct drm_file *file_private, ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v5
On 28/03/2019 13:08, Chunming Zhou wrote: 在 2019/3/28 20:53, Lionel Landwerlin 写道: On 25/03/2019 08:32, Chunming Zhou wrote: v2: individually allocate chain array, since chain node is free independently. v3: all existing points must be already signaled before cpu perform signal operation, so add check condition for that. v4: remove v3 change and add checking to prevent out-of-order v5: unify binary and timeline Signed-off-by: Chunming Zhou Cc: Tobias Hector Cc: Jason Ekstrand Cc: Dave Airlie Cc: Chris Wilson Cc: Lionel Landwerlin --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c | 2 + drivers/gpu/drm/drm_syncobj.c | 73 ++ include/uapi/drm/drm.h | 1 + 4 files changed, 78 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index dd11ae5f1eef..d9a483a5fce0 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 92b3b7b2fd81..d337f161909c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, drm_syncobj_timeline_signal_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index ee2d66e047e7..099596190845 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1183,6 +1183,79 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } +int +drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + struct dma_fence_chain **chains; + uint64_t *points; + uint32_t i, j; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -EOPNOTSUPP; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + ); + if (ret < 0) + return ret; + + points = kmalloc_array(args->count_handles, sizeof(*points), + GFP_KERNEL); + if (!points) { + ret = -ENOMEM; + goto out; + } + if (!u64_to_user_ptr(args->points)) { + memset(points, 0, args->count_handles * sizeof(uint64_t)); + } else if (copy_from_user(points, u64_to_user_ptr(args->points), + sizeof(uint64_t) * args->count_handles)) { + ret = -EFAULT; + goto err_points; + } + + chains = kmalloc_array(args->count_handles, sizeof(void *), GFP_KERNEL); + if (!chains) { + ret = -ENOMEM; + goto err_points; + } + for (i = 0; i < args->count_handles; i++) { + chains[i] = kzalloc(sizeof(struct dma_fence_chain), GFP_KERNEL); + if (!chains[i]) { + for (j = 0; j < i; j++) + kfree(chains[j]); + ret = -ENOMEM; + goto err_chains; + } + } + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence *fence = dma_fence_get_stub(); + + drm_syncobj_add_point(syncobjs[i], chains[i], + fence, points[i]); + dma_fence_put(fence); + } I think I found an issue where the implementation doesn't match what the spec requires on host signaling. Consider the following : Timeline semaphore has a value of 4. A device submits a commands to signal point 5. The host signals point 6. The value of the timeline will remain 4 until the device commands complete, at which point it will change to 6. But the specification seems to say that the timeline value should be 6 once the host signaling completes. The only o
Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3
On 20/03/2019 03:53, zhoucm1 wrote: On 2019年03月19日 19:54, Lionel Landwerlin wrote: On 15/03/2019 12:09, Chunming Zhou wrote: v2: individually allocate chain array, since chain node is free independently. v3: all existing points must be already signaled before cpu perform signal operation, so add check condition for that. Signed-off-by: Chunming Zhou --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c | 2 + drivers/gpu/drm/drm_syncobj.c | 103 + include/uapi/drm/drm.h | 1 + 4 files changed, 108 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index dd11ae5f1eef..d9a483a5fce0 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 92b3b7b2fd81..d337f161909c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, drm_syncobj_timeline_signal_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 306c7b7e2770..eaeb038f97d7 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } +int +drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + struct dma_fence_chain **chains; + uint64_t *points; + uint32_t i, j, timeline_count = 0; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -EOPNOTSUPP; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + ); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; + struct dma_fence *fence; + + fence = drm_syncobj_fence_get(syncobjs[i]); + chain = to_dma_fence_chain(fence); + if (chain) { + struct dma_fence *iter; + + dma_fence_chain_for_each(iter, fence) { + if (!iter) + break; + if (!dma_fence_is_signaled(iter)) { + dma_fence_put(iter); + DRM_ERROR("Client must guarantee all existing timeline points signaled before performing host signal operation!"); + ret = -EPERM; + goto out; Sorry if I'm failing to remember whether we discussed this before. Signaling a point from the host should be fine even if the previous points in the timeline are not signaled. ok, will remove that checking. After all this can happen on the device side as well (out of order signaling). I thought the thing we didn't want is out of order submission. Just checking the last chain node seqno against the host signal point should be enough. What about simply returning -EPERM, we can warn the application from userspace? OK, will add that. Sorry for coming back to this again, but we probably ought to drop this check completely. This is allowed on the device signaling side, so I'm not quite sure what that protects us from. Also we do the check at this point in the signal_timeline_ioctl() function but there is nothing that says that when we later call the add_point() function this condition still holds. -Lionel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https:
Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3
On 20/03/2019 03:53, zhoucm1 wrote: On 2019年03月19日 19:54, Lionel Landwerlin wrote: On 15/03/2019 12:09, Chunming Zhou wrote: v2: individually allocate chain array, since chain node is free independently. v3: all existing points must be already signaled before cpu perform signal operation, so add check condition for that. Signed-off-by: Chunming Zhou --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c | 2 + drivers/gpu/drm/drm_syncobj.c | 103 + include/uapi/drm/drm.h | 1 + 4 files changed, 108 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index dd11ae5f1eef..d9a483a5fce0 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 92b3b7b2fd81..d337f161909c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, drm_syncobj_timeline_signal_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 306c7b7e2770..eaeb038f97d7 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } +int +drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + struct dma_fence_chain **chains; + uint64_t *points; + uint32_t i, j, timeline_count = 0; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -EOPNOTSUPP; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + ); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; + struct dma_fence *fence; + + fence = drm_syncobj_fence_get(syncobjs[i]); + chain = to_dma_fence_chain(fence); + if (chain) { + struct dma_fence *iter; + + dma_fence_chain_for_each(iter, fence) { + if (!iter) + break; + if (!dma_fence_is_signaled(iter)) { + dma_fence_put(iter); + DRM_ERROR("Client must guarantee all existing timeline points signaled before performing host signal operation!"); + ret = -EPERM; + goto out; Sorry if I'm failing to remember whether we discussed this before. Signaling a point from the host should be fine even if the previous points in the timeline are not signaled. ok, will remove that checking. After all this can happen on the device side as well (out of order signaling). I thought the thing we didn't want is out of order submission. Just checking the last chain node seqno against the host signal point should be enough. What about simply returning -EPERM, we can warn the application from userspace? OK, will add that. + } + } + } + } + + points = kmalloc_array(args->count_handles, sizeof(*points), + GFP_KERNEL); + if (!points) { + ret = -ENOMEM; + goto out; + } + if (!u64_to_user_ptr(args->points)) { + memset(points, 0, args->count_handles * sizeof(uint64_t)); + } else if (copy_from_user(points, u64_to_user_ptr(args->points), + sizeof(uint64_t) * args->count_han
Re: [PATCH 1/9] dma-buf: add new dma_fence_chain container v5
On 15/03/2019 12:09, Chunming Zhou wrote: From: Christian König Lockless container implementation similar to a dma_fence_array, but with only two elements per node and automatic garbage collection. v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno, drop prev reference during garbage collection if it's not a chain fence. v3: use head and iterator for dma_fence_chain_for_each v4: fix reference count in dma_fence_chain_enable_signaling v5: fix iteration when walking each chain node Signed-off-by: Christian König --- drivers/dma-buf/Makefile | 3 +- drivers/dma-buf/dma-fence-chain.c | 241 ++ include/linux/dma-fence-chain.h | 81 ++ 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma-fence-chain.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 0913a6ccab5a..1f006e083eb9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,4 +1,5 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ +reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c new file mode 100644 index ..0c5e3c902fa0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-chain.c @@ -0,0 +1,241 @@ +/* + * fence-chain: chain fences together in a timeline + * + * Copyright (C) 2018 Advanced Micro Devices, Inc. + * Authors: + * Christian König + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); + +/** + * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence + * @chain: chain node to get the previous node from + * + * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the + * chain node. + */ +static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain) +{ + struct dma_fence *prev; + + rcu_read_lock(); + prev = dma_fence_get_rcu_safe(>prev); + rcu_read_unlock(); + return prev; +} + +/** + * dma_fence_chain_walk - chain walking function + * @fence: current chain node + * + * Walk the chain to the next node. Returns the next fence or NULL if we are at + * the end of the chain. Garbage collects chain nodes which are already + * signaled. + */ +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) +{ + struct dma_fence_chain *chain, *prev_chain; + struct dma_fence *prev, *replacement, *tmp; + + chain = to_dma_fence_chain(fence); + if (!chain) { + dma_fence_put(fence); + return NULL; + } + + while ((prev = dma_fence_chain_get_prev(chain))) { + + prev_chain = to_dma_fence_chain(prev); + if (prev_chain) { + if (!dma_fence_is_signaled(prev_chain->fence)) + break; + + replacement = dma_fence_chain_get_prev(prev_chain); + } else { + if (!dma_fence_is_signaled(prev)) + break; + + replacement = NULL; + } + + tmp = cmpxchg(>prev, prev, replacement); + if (tmp == prev) + dma_fence_put(tmp); + else + dma_fence_put(replacement); + dma_fence_put(prev); + } + + dma_fence_put(fence); + return prev; +} +EXPORT_SYMBOL(dma_fence_chain_walk); + +/** + * dma_fence_chain_find_seqno - find fence chain node by seqno + * @pfence: pointer to the chain node where to start + * @seqno: the sequence number to search for + * + * Advance the fence pointer to the chain node which will signal this sequence + * number. If no sequence number is provided then this is a no-op. + * + * Returns EINVAL if the fence is not a chain node or the sequence number has + * not yet advanced far enough. + */ +int dma_fence_chain_find_seqno(struct dma_fence **pfence, uint64_t seqno) +{ + struct dma_fence_chain *chain; + + if (!seqno) + return 0; + + chain = to_dma_fence_chain(*pfence); + if (!chain || chain->base.seqno < seqno) + return -EINVAL;
Re: [PATCH 8/9] drm/syncobj: add timeline signal ioctl for syncobj v3
On 15/03/2019 12:09, Chunming Zhou wrote: v2: individually allocate chain array, since chain node is free independently. v3: all existing points must be already signaled before cpu perform signal operation, so add check condition for that. Signed-off-by: Chunming Zhou --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c| 2 + drivers/gpu/drm/drm_syncobj.c | 103 + include/uapi/drm/drm.h | 1 + 4 files changed, 108 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index dd11ae5f1eef..d9a483a5fce0 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -190,6 +190,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 92b3b7b2fd81..d337f161909c 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -696,6 +696,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, drm_syncobj_timeline_signal_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 306c7b7e2770..eaeb038f97d7 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1183,6 +1183,109 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } +int +drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + struct dma_fence_chain **chains; + uint64_t *points; + uint32_t i, j, timeline_count = 0; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -EOPNOTSUPP; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, +u64_to_user_ptr(args->handles), +args->count_handles, +); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; +struct dma_fence *fence; + +fence = drm_syncobj_fence_get(syncobjs[i]); +chain = to_dma_fence_chain(fence); +if (chain) { +struct dma_fence *iter; + +dma_fence_chain_for_each(iter, fence) { +if (!iter) +break; + if (!dma_fence_is_signaled(iter)) { + dma_fence_put(iter); + DRM_ERROR("Client must guarantee all existing timeline points signaled before performing host signal operation!"); + ret = -EPERM; + goto out; Sorry if I'm failing to remember whether we discussed this before. Signaling a point from the host should be fine even if the previous points in the timeline are not signaled. After all this can happen on the device side as well (out of order signaling). I thought the thing we didn't want is out of order submission. Just checking the last chain node seqno against the host signal point should be enough. What about simply returning -EPERM, we can warn the application from userspace? + } +} +} +} + + points = kmalloc_array(args->count_handles, sizeof(*points), + GFP_KERNEL); + if (!points) { + ret = -ENOMEM; + goto out; + } + if (!u64_to_user_ptr(args->points)) { + memset(points, 0,
Re: [PATCH 3/9] drm/syncobj: add support for timeline point wait v8
On 18/03/2019 17:20, Koenig, Christian wrote: - if (dma_fence_is_signaled(entries[i].fence)) { + if (fence) + entries[i].fence = fence; + else + entries[i].fence = dma_fence_get_stub(); + + if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE) || Hey David, Could you remind me what DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE is used for? I didn't have a use for it in userspace, The flag is used to wait for fences for a sequence number to show up. Christian. Thanks Christian. I guess I missed the additive nature of WAIT_FOR_SUBMIT. It feels like WAIT_AVAILABLE almost deserves its own commit as it affects both timelines & binary syncobjs. -Lionel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 3/9] drm/syncobj: add support for timeline point wait v8
On 15/03/2019 12:09, Chunming Zhou wrote: points array is one-to-one match with syncobjs array. v2: add seperate ioctl for timeline point wait, otherwise break uapi. v3: userspace can specify two kinds waits:: a. Wait for time point to be completed. b. and wait for time point to become available v4: rebase v5: add comment for xxx_WAIT_AVAILABLE v6: rebase and rework on new container v7: drop _WAIT_COMPLETED, it is the default anyway v8: correctly handle garbage collected fences Signed-off-by: Chunming Zhou Signed-off-by: Christian König Cc: Daniel Rakos Cc: Jason Ekstrand Cc: Bas Nieuwenhuizen Cc: Dave Airlie Cc: Chris Wilson --- drivers/gpu/drm/drm_internal.h | 2 + drivers/gpu/drm/drm_ioctl.c| 2 + drivers/gpu/drm/drm_syncobj.c | 153 ++--- include/uapi/drm/drm.h | 15 4 files changed, 143 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 251d67e04c2d..331ac6225b58 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -182,6 +182,8 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_timeline_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 687943df58e1..c984654646fa 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -688,6 +688,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, drm_syncobj_timeline_wait_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 19a9ce638119..eae51978cda4 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -61,6 +61,7 @@ struct syncobj_wait_entry { struct task_struct *task; struct dma_fence *fence; struct dma_fence_cb fence_cb; + u64point; }; static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj, @@ -95,6 +96,8 @@ EXPORT_SYMBOL(drm_syncobj_find); static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj, struct syncobj_wait_entry *wait) { + struct dma_fence *fence; + if (wait->fence) return; @@ -103,11 +106,15 @@ static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj, * have the lock, try one more time just to be sure we don't add a * callback when a fence has already been set. */ - if (syncobj->fence) - wait->fence = dma_fence_get( - rcu_dereference_protected(syncobj->fence, 1)); - else + fence = dma_fence_get(rcu_dereference_protected(syncobj->fence, 1)); + if (!fence || dma_fence_chain_find_seqno(, wait->point)) { + dma_fence_put(fence); list_add_tail(>node, >cb_list); + } else if (!fence) { + wait->fence = dma_fence_get_stub(); + } else { + wait->fence = fence; + } spin_unlock(>lock); } @@ -149,10 +156,8 @@ void drm_syncobj_add_point(struct drm_syncobj *syncobj, dma_fence_chain_init(chain, prev, fence, point); rcu_assign_pointer(syncobj->fence, >base); - list_for_each_entry_safe(cur, tmp, >cb_list, node) { - list_del_init(>node); + list_for_each_entry_safe(cur, tmp, >cb_list, node) syncobj_wait_syncobj_func(syncobj, cur); - } spin_unlock(>lock); /* Walk the chain once to trigger garbage collection */ @@ -184,10 +189,8 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, rcu_assign_pointer(syncobj->fence, fence); if (fence != old_fence) { - list_for_each_entry_safe(cur, tmp, >cb_list, node) { - list_del_init(>node); + list_for_each_entry_safe(cur, tmp, >cb_list, node) syncobj_wait_syncobj_func(syncobj, cur); - } } spin_unlock(>lock); @@ -644,13 +647,27 @@
Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline
Thanks David, Will give this a go! -Lionel On 19/02/2019 10:46, zhoucm1 wrote: Hi Lionel, the attached should fix your problem and also messed signal order. Hi Christian, Could you have a look if it's reasonable? btw: I pushed to change to https://github.com/amingriyue/timeline-syncobj-kernel, which is already rebased to latest drm-misc(kernel 5.0). You can directly use that branch. -David On 2019年02月19日 01:01, Koenig, Christian wrote: Am 18.02.19 um 13:07 schrieb Lionel Landwerlin: Thanks guys :) You mentioned that signaling out of order is illegal. Is this illegal with regard to the vulkan spec or to the syncobj implementation? David is the expert on that, but as far as I know that is forbidden by the vulkan spec. I'm not finding anything in the vulkan spec that makes out of order signaling illegal. That's why I came up with this test, just verifying that the timeline does not go backward in term of its payload. Well we need to handle this case gracefully in the kernel, so it is still a good testcase. Christian. -Lionel On 18/02/2019 11:01, Koenig, Christian wrote: Hi David, well I think Lionel is testing the invalid signal order on purpose :) Anyway we really need to handle invalid order graceful here. E.g. either the same way as during CS or we abort and return an error message. I think just using the same approach as during CS ist the best we can do. Regards, Christian Am 18.02.2019 11:35 schrieb "Zhou, David(ChunMing)" : Hi Lionel, I checked your igt test case, uint64_t points[5] = { 1, 5, 3, 7, 6 }; which is illegal signal order. I must admit we should handle it gracefully if signal isn't in-order, and we shouldn't lead to deadlock. Hi Christian, Can we just ignore when signal point X <= timeline Y? Or just give a warning? Otherwise like Lionel's unexpected use cases, which easily leads to deadlock. -David On 2019年02月15日 22:28, Lionel Landwerlin wrote: Hi David, Thanks a lot for point me to the tests you've added in IGT. While adding a test with that signals fences imported into a timeline syncobj out of order, I ran into a deadlock. Here is the test : https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9 Trying to kill the deadlocked process I got this backtrace : [ 33.969136] [IGT] syncobj_timeline: starting subtest signal-order [ 60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [syncobj_timelin:2021] [ 60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me mei intel_pch_thermal mac_hid acpi_pad parp ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel prime_numbers drm_kms_helper aesni_intel syscopyarea sysfillrect [ 60.452876] sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci cryptd drm e1000e glue_helper cqhci sdhci wmi video [ 60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G U 5.0.0-rc5+ #337 [ 60.452882] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0042.2016.0929.1933 09/29/2016 [ 60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260 [ 60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0 74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f 00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41 [ 60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX: ff13 [ 60.452889] RAX: RBX: 8f5690fb2480 RCX: 8f5690fb2f00 [ 60.452890] RDX: 003e3730 RSI: RDI: 8f5690fb2180 [ 60.452891] RBP: 8f5690fb2180 R08: R09: 8f5690fb2eb0 [ 60.452891] R10: R11: 8f5660469860 R12: 8f5690fb2f68 [ 60.452892] R13: 8f5690fb2f00 R14: 0003 R15: 8f5655a45fc0 [ 60.452913] FS: 7fdc5c459980() GS:8f569eb8() knlGS: [ 60.452913] CS: 0010 DS: ES: CR0: 80050033 [ 60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4: 003606e0 [ 60.452915] DR0: DR1: DR2: [ 60.452915] DR3: DR6: fffe0ff0 DR7: 0400 [ 60
Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline
Thanks guys :) You mentioned that signaling out of order is illegal. Is this illegal with regard to the vulkan spec or to the syncobj implementation? I'm not finding anything in the vulkan spec that makes out of order signaling illegal. That's why I came up with this test, just verifying that the timeline does not go backward in term of its payload. -Lionel On 18/02/2019 11:01, Koenig, Christian wrote: Hi David, well I think Lionel is testing the invalid signal order on purpose :) Anyway we really need to handle invalid order graceful here. E.g. either the same way as during CS or we abort and return an error message. I think just using the same approach as during CS ist the best we can do. Regards, Christian Am 18.02.2019 11:35 schrieb "Zhou, David(ChunMing)" : Hi Lionel, I checked your igt test case, uint64_t points[5] = { 1, 5, 3, 7, 6 }; which is illegal signal order. I must admit we should handle it gracefully if signal isn't in-order, and we shouldn't lead to deadlock. Hi Christian, Can we just ignore when signal point X <= timeline Y? Or just give a warning? Otherwise like Lionel's unexpected use cases, which easily leads to deadlock. -David On 2019年02月15日 22:28, Lionel Landwerlin wrote: Hi David, Thanks a lot for point me to the tests you've added in IGT. While adding a test with that signals fences imported into a timeline syncobj out of order, I ran into a deadlock. Here is the test : https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9 Trying to kill the deadlocked process I got this backtrace : [ 33.969136] [IGT] syncobj_timeline: starting subtest signal-order [ 60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [syncobj_timelin:2021] [ 60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me mei intel_pch_thermal mac_hid acpi_pad parp ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel prime_numbers drm_kms_helper aesni_intel syscopyarea sysfillrect [ 60.452876] sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci cryptd drm e1000e glue_helper cqhci sdhci wmi video [ 60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G U 5.0.0-rc5+ #337 [ 60.452882] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0042.2016.0929.1933 09/29/2016 [ 60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260 [ 60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0 74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f 00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41 [ 60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX: ff13 [ 60.452889] RAX: RBX: 8f5690fb2480 RCX: 8f5690fb2f00 [ 60.452890] RDX: 003e3730 RSI: RDI: 8f5690fb2180 [ 60.452891] RBP: 8f5690fb2180 R08: R09: 8f5690fb2eb0 [ 60.452891] R10: R11: 8f5660469860 R12: 8f5690fb2f68 [ 60.452892] R13: 8f5690fb2f00 R14: 0003 R15: 8f5655a45fc0 [ 60.452913] FS: 7fdc5c459980() GS:8f569eb8() knlGS: [ 60.452913] CS: 0010 DS: ES: CR0: 80050033 [ 60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4: 003606e0 [ 60.452915] DR0: DR1: DR2: [ 60.452915] DR3: DR6: fffe0ff0 DR7: 0400 [ 60.452916] Call Trace: [ 60.452958] drm_syncobj_add_point+0x102/0x160 [drm] [ 60.452965] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452971] drm_syncobj_transfer_ioctl+0x10f/0x180 [drm] [ 60.452978] drm_ioctl_kernel+0xac/0xf0 [drm] [ 60.452984] drm_ioctl+0x2eb/0x3b0 [drm] [ 60.452990] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452992] ? sw_sync_ioctl+0x347/0x370 [ 60.452994] do_vfs_ioctl+0xa4/0x640 [ 60.452995] ? __fput+0x134/0x220 [ 60.452997] ? do_fcntl+0x1a5/0x650 [ 60.452998] ksys_ioctl+0x70/0x80 [ 60.452999] __x64_sys_ioctl+0x16/0x20 [ 60.453002] do_syscall_64+0x55/0x110 [ 60.453004] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 60.453005] RIP:
Re: [PATCH 06/11] drm/syncobj: add timeline payload query ioctl v4
On 18/02/2019 07:28, Koenig, Christian wrote: Am 18.02.19 um 04:10 schrieb zhoucm1: On 2019年02月17日 03:22, Christian König wrote: Am 15.02.19 um 20:31 schrieb Lionel Landwerlin via amd-gfx: On 07/12/2018 09:55, Chunming Zhou wrote: user mode can query timeline payload. v2: check return value of copy_to_user v3: handle querying entry by entry v4: rebase on new chain container, simplify interface Signed-off-by: Chunming Zhou Cc: Daniel Rakos Cc: Jason Ekstrand Cc: Bas Nieuwenhuizen Cc: Dave Airlie Cc: Christian König Cc: Chris Wilson --- drivers/gpu/drm/drm_internal.h | 2 ++ drivers/gpu/drm/drm_ioctl.c | 2 ++ drivers/gpu/drm/drm_syncobj.c | 43 ++ include/uapi/drm/drm.h | 10 4 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 18b41e10195c..dab4d5936441 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -184,6 +184,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); /* drm_framebuffer.c */ void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index a9a17ed35cc4..7578ef6dc1d1 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -681,6 +681,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 348079bb0965..f97fa00ca1d0 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1061,3 +1061,46 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } + +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + uint64_t __user *points = u64_to_user_ptr(args->points); + uint32_t i; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -ENODEV; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, + u64_to_user_ptr(args->handles), + args->count_handles, + ); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; + struct dma_fence *fence; + uint64_t point; + + fence = drm_syncobj_fence_get(syncobjs[i]); + chain = to_dma_fence_chain(fence); + point = chain ? fence->seqno : 0; Sorry, I don' t want to sound annoying, but this looks like this could report values going backward. Well please be annoying as much as you can :) But yeah all that stuff has been discussed before as well. Anything add a point X to a timeline that has reached value Y with X < Y would trigger that. Yes, that can indeed happen. trigger what? when adding x (x < y), then return 0 when query? Why would this happen? No, syncobj->fence should always be there and the last chain node, if it is ever added. Well maybe Lionel should clarify a bit what he means? I thought he is concerned that the call could return values where X < Y, but that doesn't seem to be the case. Christian. I meant something like this : t = create_timeline_syncobj() signal(t, 2) query(t) => 2 signal(t, 1) query(t) => 1 -Lionel -David But adding a timeline point X which is before the already added point Y is illegal in the first place :) So when the application does something stupid and breaks it can just keep the pieces. In the kernel we still do the most defensive thing and sync to everything in this case. I'm just not sure if we should print an error into syslog or just continue silently. Regards, Christian. Either through the submission or userspace signaling or importing another syncpoint's fence. -Lionel + ret = copy_t
Re: [PATCH 06/11] drm/syncobj: add timeline payload query ioctl v4
On 07/12/2018 09:55, Chunming Zhou wrote: user mode can query timeline payload. v2: check return value of copy_to_user v3: handle querying entry by entry v4: rebase on new chain container, simplify interface Signed-off-by: Chunming Zhou Cc: Daniel Rakos Cc: Jason Ekstrand Cc: Bas Nieuwenhuizen Cc: Dave Airlie Cc: Christian König Cc: Chris Wilson --- drivers/gpu/drm/drm_internal.h | 2 ++ drivers/gpu/drm/drm_ioctl.c| 2 ++ drivers/gpu/drm/drm_syncobj.c | 43 ++ include/uapi/drm/drm.h | 10 4 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 18b41e10195c..dab4d5936441 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -184,6 +184,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); /* drm_framebuffer.c */ void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index a9a17ed35cc4..7578ef6dc1d1 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -681,6 +681,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 348079bb0965..f97fa00ca1d0 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1061,3 +1061,46 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } + +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + uint64_t __user *points = u64_to_user_ptr(args->points); + uint32_t i; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -ENODEV; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, +u64_to_user_ptr(args->handles), +args->count_handles, +); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; + struct dma_fence *fence; + uint64_t point; + + fence = drm_syncobj_fence_get(syncobjs[i]); + chain = to_dma_fence_chain(fence); + point = chain ? fence->seqno : 0; Sorry, I don' t want to sound annoying, but this looks like this could report values going backward. Anything add a point X to a timeline that has reached value Y with X < Y would trigger that. Either through the submission or userspace signaling or importing another syncpoint's fence. -Lionel + ret = copy_to_user([i], , sizeof(uint64_t)); + ret = ret ? -EFAULT : 0; + if (ret) + break; + } + drm_syncobj_array_free(syncobjs, args->count_handles); + + return ret; +} diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 0092111d002c..b2c36f2b2599 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,14 @@ struct drm_syncobj_array { __u32 pad; }; +struct drm_syncobj_timeline_array { + __u64 handles; + __u64 points; + __u32 count_handles; + __u32 pad; +}; + + /* Query current scanout sequence number */ struct drm_crtc_get_sequence { __u32 crtc_id; /* requested crtc_id */ @@ -924,6 +932,8 @@ extern "C" { #define DRM_IOCTL_MODE_REVOKE_LEASE DRM_IOWR(0xC9, struct drm_mode_revoke_lease) #define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait) +#define DRM_IOCTL_SYNCOBJ_QUERYDRM_IOWR(0xCB, struct
Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4
On 15/02/2019 14:32, Koenig, Christian wrote: Am 15.02.19 um 15:23 schrieb Lionel Landwerlin: Hi Christian, David, For timeline semaphore we need points to signaled in order. I'm struggling to understand how this fence-chain implementation preserves ordering of the seqnos. One of the scenario I can see an issue happening is when you have a timeline with points 1 & 2 and userspace submits for 2 different engines : - first with let's say a blitter style engine on point 2 - then a 3d style engine on point 1 Yeah, and where exactly is the problem? Seqno 1 will signal when the 3d style engine finishes work. And seqno 2 will signal when both seqno 1 is signaled and the blitter style engine has finished its work. That's not really how I understood the spec, but I might be wrong. What makes me thing 1 should be signaled as soon as 2 is signaled (regardless of whether the fence attached on point 1 is been signaled), is that the spec defines wait & signal operations in term of the value of the timeline. -Lionel Another scenario would be signaling a timeline with points 1 & 2 with those points in reverse order in the submission array. That is actually illegal in the spec, but actually handled gracefully as well. E.g. when you add seqno 1 to the syncobj container it will only signal when 2 is signaled as well. Regards, Christian. -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: From: Christian König Lockless container implementation similar to a dma_fence_array, but with only two elements per node and automatic garbage collection. v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno, drop prev reference during garbage collection if it's not a chain fence. v3: use head and iterator for dma_fence_chain_for_each v4: fix reference count in dma_fence_chain_enable_signaling Signed-off-by: Christian König --- drivers/dma-buf/Makefile | 3 +- drivers/dma-buf/dma-fence-chain.c | 241 ++ include/linux/dma-fence-chain.h | 81 ++ 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma-fence-chain.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 0913a6ccab5a..1f006e083eb9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,4 +1,5 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ + reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c new file mode 100644 index ..0c5e3c902fa0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-chain.c @@ -0,0 +1,241 @@ +/* + * fence-chain: chain fences together in a timeline + * + * Copyright (C) 2018 Advanced Micro Devices, Inc. + * Authors: + * Christian König + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); + +/** + * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence + * @chain: chain node to get the previous node from + * + * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the + * chain node. + */ +static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain) +{ + struct dma_fence *prev; + + rcu_read_lock(); + prev = dma_fence_get_rcu_safe(>prev); + rcu_read_unlock(); + return prev; +} + +/** + * dma_fence_chain_walk - chain walking function + * @fence: current chain node + * + * Walk the chain to the next node. Returns the next fence or NULL if we are at + * the end of the chain. Garbage collects chain nodes which are already + * signaled. + */ +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) +{ + struct dma_fence_chain *chain, *prev_chain; + struct dma_fence *prev, *replacement, *tmp; + + chain = to_dma_fence_chain(fence); + if (!chain) { + dma_fence_put(fence); + return NULL; + } + + while ((prev = dma_fence_chain_get_prev(chain))) { + + prev_chain = to_dma_fence_chain(prev); + if (prev_chain) { + if (!dma_fence_is_signaled(prev_chain->fence)) + break; + + replacement = dma_fence_chain_get_prev(prev_chain
Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4
Hi Christian, David, For timeline semaphore we need points to signaled in order. I'm struggling to understand how this fence-chain implementation preserves ordering of the seqnos. One of the scenario I can see an issue happening is when you have a timeline with points 1 & 2 and userspace submits for 2 different engines : - first with let's say a blitter style engine on point 2 - then a 3d style engine on point 1 Another scenario would be signaling a timeline with points 1 & 2 with those points in reverse order in the submission array. -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: From: Christian König Lockless container implementation similar to a dma_fence_array, but with only two elements per node and automatic garbage collection. v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno, drop prev reference during garbage collection if it's not a chain fence. v3: use head and iterator for dma_fence_chain_for_each v4: fix reference count in dma_fence_chain_enable_signaling Signed-off-by: Christian König --- drivers/dma-buf/Makefile | 3 +- drivers/dma-buf/dma-fence-chain.c | 241 ++ include/linux/dma-fence-chain.h | 81 ++ 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma-fence-chain.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 0913a6ccab5a..1f006e083eb9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,4 +1,5 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ +reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c new file mode 100644 index ..0c5e3c902fa0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-chain.c @@ -0,0 +1,241 @@ +/* + * fence-chain: chain fences together in a timeline + * + * Copyright (C) 2018 Advanced Micro Devices, Inc. + * Authors: + * Christian König + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); + +/** + * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence + * @chain: chain node to get the previous node from + * + * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the + * chain node. + */ +static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain) +{ + struct dma_fence *prev; + + rcu_read_lock(); + prev = dma_fence_get_rcu_safe(>prev); + rcu_read_unlock(); + return prev; +} + +/** + * dma_fence_chain_walk - chain walking function + * @fence: current chain node + * + * Walk the chain to the next node. Returns the next fence or NULL if we are at + * the end of the chain. Garbage collects chain nodes which are already + * signaled. + */ +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) +{ + struct dma_fence_chain *chain, *prev_chain; + struct dma_fence *prev, *replacement, *tmp; + + chain = to_dma_fence_chain(fence); + if (!chain) { + dma_fence_put(fence); + return NULL; + } + + while ((prev = dma_fence_chain_get_prev(chain))) { + + prev_chain = to_dma_fence_chain(prev); + if (prev_chain) { + if (!dma_fence_is_signaled(prev_chain->fence)) + break; + + replacement = dma_fence_chain_get_prev(prev_chain); + } else { + if (!dma_fence_is_signaled(prev)) + break; + + replacement = NULL; + } + + tmp = cmpxchg(>prev, prev, replacement); + if (tmp == prev) + dma_fence_put(tmp); + else + dma_fence_put(replacement); + dma_fence_put(prev); + } + + dma_fence_put(fence); + return prev; +} +EXPORT_SYMBOL(dma_fence_chain_walk); + +/** + * dma_fence_chain_find_seqno - find fence chain node by seqno + * @pfence: pointer to the chain node where to start + * @seqno: the sequence number to search for + * + * Advance the fence pointer to the
Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline
Hi David, Thanks a lot for point me to the tests you've added in IGT. While adding a test with that signals fences imported into a timeline syncobj out of order, I ran into a deadlock. Here is the test : https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9 Trying to kill the deadlocked process I got this backtrace : [ 33.969136] [IGT] syncobj_timeline: starting subtest signal-order [ 60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [syncobj_timelin:2021] [ 60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me mei intel_pch_thermal mac_hid acpi_pad parp ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel prime_numbers drm_kms_helper aesni_intel syscopyarea sysfillrect [ 60.452876] sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci cryptd drm e1000e glue_helper cqhci sdhci wmi video [ 60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G U 5.0.0-rc5+ #337 [ 60.452882] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0042.2016.0929.1933 09/29/2016 [ 60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260 [ 60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0 74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f 00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41 [ 60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX: ff13 [ 60.452889] RAX: RBX: 8f5690fb2480 RCX: 8f5690fb2f00 [ 60.452890] RDX: 003e3730 RSI: RDI: 8f5690fb2180 [ 60.452891] RBP: 8f5690fb2180 R08: R09: 8f5690fb2eb0 [ 60.452891] R10: R11: 8f5660469860 R12: 8f5690fb2f68 [ 60.452892] R13: 8f5690fb2f00 R14: 0003 R15: 8f5655a45fc0 [ 60.452913] FS: 7fdc5c459980() GS:8f569eb8() knlGS: [ 60.452913] CS: 0010 DS: ES: CR0: 80050033 [ 60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4: 003606e0 [ 60.452915] DR0: DR1: DR2: [ 60.452915] DR3: DR6: fffe0ff0 DR7: 0400 [ 60.452916] Call Trace: [ 60.452958] drm_syncobj_add_point+0x102/0x160 [drm] [ 60.452965] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452971] drm_syncobj_transfer_ioctl+0x10f/0x180 [drm] [ 60.452978] drm_ioctl_kernel+0xac/0xf0 [drm] [ 60.452984] drm_ioctl+0x2eb/0x3b0 [drm] [ 60.452990] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452992] ? sw_sync_ioctl+0x347/0x370 [ 60.452994] do_vfs_ioctl+0xa4/0x640 [ 60.452995] ? __fput+0x134/0x220 [ 60.452997] ? do_fcntl+0x1a5/0x650 [ 60.452998] ksys_ioctl+0x70/0x80 [ 60.452999] __x64_sys_ioctl+0x16/0x20 [ 60.453002] do_syscall_64+0x55/0x110 [ 60.453004] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 60.453005] RIP: 0033:0x7fdc5b6e45d7 [ 60.453006] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48 [ 60.453007] RSP: 002b:7fff25c4d198 EFLAGS: 0206 ORIG_RAX: 0010 [ 60.453008] RAX: ffda RBX: RCX: 7fdc5b6e45d7 [ 60.453008] RDX: 7fff25c4d200 RSI: c02064cc RDI: 0003 [ 60.453009] RBP: 7fff25c4d1d0 R08: R09: 001e [ 60.453010] R10: R11: 0206 R12: 563d3959e4d0 [ 60.453010] R13: 7fff25c4d620 R14: R15: [ 88.447359] watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [syncobj_timelin:2021] -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: we need to import/export timeline point Signed-off-by: Chunming Zhou --- drivers/gpu/drm/drm_internal.h | 4 +++ drivers/gpu/drm/drm_ioctl.c| 6 drivers/gpu/drm/drm_syncobj.c | 66 ++ include/uapi/drm/drm.h | 10 ++ 4 files changed, 86 insertions(+) diff --git