Re: [PATCH 06/11] drm/syncobj: add timeline payload query ioctl v4
On 07/12/2018 09:55, Chunming Zhou wrote: user mode can query timeline payload. v2: check return value of copy_to_user v3: handle querying entry by entry v4: rebase on new chain container, simplify interface Signed-off-by: Chunming Zhou Cc: Daniel Rakos Cc: Jason Ekstrand Cc: Bas Nieuwenhuizen Cc: Dave Airlie Cc: Christian König Cc: Chris Wilson --- drivers/gpu/drm/drm_internal.h | 2 ++ drivers/gpu/drm/drm_ioctl.c| 2 ++ drivers/gpu/drm/drm_syncobj.c | 43 ++ include/uapi/drm/drm.h | 10 4 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 18b41e10195c..dab4d5936441 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -184,6 +184,8 @@ int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, struct drm_file *file_private); +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private); /* drm_framebuffer.c */ void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent, diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index a9a17ed35cc4..7578ef6dc1d1 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -681,6 +681,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl, + DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER|DRM_UNLOCKED), diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 348079bb0965..f97fa00ca1d0 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -1061,3 +1061,46 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data, return ret; } + +int drm_syncobj_query_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_private) +{ + struct drm_syncobj_timeline_array *args = data; + struct drm_syncobj **syncobjs; + uint64_t __user *points = u64_to_user_ptr(args->points); + uint32_t i; + int ret; + + if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) + return -ENODEV; + + if (args->pad != 0) + return -EINVAL; + + if (args->count_handles == 0) + return -EINVAL; + + ret = drm_syncobj_array_find(file_private, +u64_to_user_ptr(args->handles), +args->count_handles, +&syncobjs); + if (ret < 0) + return ret; + + for (i = 0; i < args->count_handles; i++) { + struct dma_fence_chain *chain; + struct dma_fence *fence; + uint64_t point; + + fence = drm_syncobj_fence_get(syncobjs[i]); + chain = to_dma_fence_chain(fence); + point = chain ? fence->seqno : 0; Sorry, I don' t want to sound annoying, but this looks like this could report values going backward. Anything add a point X to a timeline that has reached value Y with X < Y would trigger that. Either through the submission or userspace signaling or importing another syncpoint's fence. -Lionel + ret = copy_to_user(&points[i], &point, sizeof(uint64_t)); + ret = ret ? -EFAULT : 0; + if (ret) + break; + } + drm_syncobj_array_free(syncobjs, args->count_handles); + + return ret; +} diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 0092111d002c..b2c36f2b2599 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,14 @@ struct drm_syncobj_array { __u32 pad; }; +struct drm_syncobj_timeline_array { + __u64 handles; + __u64 points; + __u32 count_handles; + __u32 pad; +}; + + /* Query current scanout sequence number */ struct drm_crtc_get_sequence { __u32 crtc_id; /* requested crtc_id */ @@ -924,6 +932,8 @@ extern "C" { #define DRM_IOCTL_MODE_REVOKE_LEASE DRM_IOWR(0xC9, struct drm_mode_revoke_lease) #define DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT DRM_IOWR(0xCA, struct drm_syncobj_timeline_wait) +#define DRM_IOCTL_SYNCOBJ_QUERYDRM_IOWR(0xCB, struct drm_
Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4
On 15/02/2019 14:32, Koenig, Christian wrote: Am 15.02.19 um 15:23 schrieb Lionel Landwerlin: Hi Christian, David, For timeline semaphore we need points to signaled in order. I'm struggling to understand how this fence-chain implementation preserves ordering of the seqnos. One of the scenario I can see an issue happening is when you have a timeline with points 1 & 2 and userspace submits for 2 different engines : - first with let's say a blitter style engine on point 2 - then a 3d style engine on point 1 Yeah, and where exactly is the problem? Seqno 1 will signal when the 3d style engine finishes work. And seqno 2 will signal when both seqno 1 is signaled and the blitter style engine has finished its work. That's not really how I understood the spec, but I might be wrong. What makes me thing 1 should be signaled as soon as 2 is signaled (regardless of whether the fence attached on point 1 is been signaled), is that the spec defines wait & signal operations in term of the value of the timeline. -Lionel Another scenario would be signaling a timeline with points 1 & 2 with those points in reverse order in the submission array. That is actually illegal in the spec, but actually handled gracefully as well. E.g. when you add seqno 1 to the syncobj container it will only signal when 2 is signaled as well. Regards, Christian. -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: From: Christian König Lockless container implementation similar to a dma_fence_array, but with only two elements per node and automatic garbage collection. v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno, drop prev reference during garbage collection if it's not a chain fence. v3: use head and iterator for dma_fence_chain_for_each v4: fix reference count in dma_fence_chain_enable_signaling Signed-off-by: Christian König --- drivers/dma-buf/Makefile | 3 +- drivers/dma-buf/dma-fence-chain.c | 241 ++ include/linux/dma-fence-chain.h | 81 ++ 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma-fence-chain.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 0913a6ccab5a..1f006e083eb9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,4 +1,5 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ + reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c new file mode 100644 index ..0c5e3c902fa0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-chain.c @@ -0,0 +1,241 @@ +/* + * fence-chain: chain fences together in a timeline + * + * Copyright (C) 2018 Advanced Micro Devices, Inc. + * Authors: + * Christian König + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); + +/** + * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence + * @chain: chain node to get the previous node from + * + * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the + * chain node. + */ +static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain) +{ + struct dma_fence *prev; + + rcu_read_lock(); + prev = dma_fence_get_rcu_safe(&chain->prev); + rcu_read_unlock(); + return prev; +} + +/** + * dma_fence_chain_walk - chain walking function + * @fence: current chain node + * + * Walk the chain to the next node. Returns the next fence or NULL if we are at + * the end of the chain. Garbage collects chain nodes which are already + * signaled. + */ +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) +{ + struct dma_fence_chain *chain, *prev_chain; + struct dma_fence *prev, *replacement, *tmp; + + chain = to_dma_fence_chain(fence); + if (!chain) { + dma_fence_put(fence); + return NULL; + } + + while ((prev = dma_fence_chain_get_prev(chain))) { + + prev_chain = to_dma_fence_chain(prev); + if (prev_chain) { + if (!dma_fence_is_signaled(prev_chain->fence)) + break; + + replacement = dma_fence_chain_get_prev(prev_chain); + } else { +
Re: [PATCH 02/11] dma-buf: add new dma_fence_chain container v4
Hi Christian, David, For timeline semaphore we need points to signaled in order. I'm struggling to understand how this fence-chain implementation preserves ordering of the seqnos. One of the scenario I can see an issue happening is when you have a timeline with points 1 & 2 and userspace submits for 2 different engines : - first with let's say a blitter style engine on point 2 - then a 3d style engine on point 1 Another scenario would be signaling a timeline with points 1 & 2 with those points in reverse order in the submission array. -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: From: Christian König Lockless container implementation similar to a dma_fence_array, but with only two elements per node and automatic garbage collection. v2: properly document dma_fence_chain_for_each, add dma_fence_chain_find_seqno, drop prev reference during garbage collection if it's not a chain fence. v3: use head and iterator for dma_fence_chain_for_each v4: fix reference count in dma_fence_chain_enable_signaling Signed-off-by: Christian König --- drivers/dma-buf/Makefile | 3 +- drivers/dma-buf/dma-fence-chain.c | 241 ++ include/linux/dma-fence-chain.h | 81 ++ 3 files changed, 324 insertions(+), 1 deletion(-) create mode 100644 drivers/dma-buf/dma-fence-chain.c create mode 100644 include/linux/dma-fence-chain.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 0913a6ccab5a..1f006e083eb9 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,4 +1,5 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ +reservation.o seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c new file mode 100644 index ..0c5e3c902fa0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-chain.c @@ -0,0 +1,241 @@ +/* + * fence-chain: chain fences together in a timeline + * + * Copyright (C) 2018 Advanced Micro Devices, Inc. + * Authors: + * Christian König + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence); + +/** + * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence + * @chain: chain node to get the previous node from + * + * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the + * chain node. + */ +static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain) +{ + struct dma_fence *prev; + + rcu_read_lock(); + prev = dma_fence_get_rcu_safe(&chain->prev); + rcu_read_unlock(); + return prev; +} + +/** + * dma_fence_chain_walk - chain walking function + * @fence: current chain node + * + * Walk the chain to the next node. Returns the next fence or NULL if we are at + * the end of the chain. Garbage collects chain nodes which are already + * signaled. + */ +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) +{ + struct dma_fence_chain *chain, *prev_chain; + struct dma_fence *prev, *replacement, *tmp; + + chain = to_dma_fence_chain(fence); + if (!chain) { + dma_fence_put(fence); + return NULL; + } + + while ((prev = dma_fence_chain_get_prev(chain))) { + + prev_chain = to_dma_fence_chain(prev); + if (prev_chain) { + if (!dma_fence_is_signaled(prev_chain->fence)) + break; + + replacement = dma_fence_chain_get_prev(prev_chain); + } else { + if (!dma_fence_is_signaled(prev)) + break; + + replacement = NULL; + } + + tmp = cmpxchg(&chain->prev, prev, replacement); + if (tmp == prev) + dma_fence_put(tmp); + else + dma_fence_put(replacement); + dma_fence_put(prev); + } + + dma_fence_put(fence); + return prev; +} +EXPORT_SYMBOL(dma_fence_chain_walk); + +/** + * dma_fence_chain_find_seqno - find fence chain node by seqno + * @pfence: pointer to the chain node where to start + * @seqno: the sequence number to search for + * + * Advance the fence point
Re: [PATCH 09/11] drm/syncobj: add transition iotcls between binary and timeline
Hi David, Thanks a lot for point me to the tests you've added in IGT. While adding a test with that signals fences imported into a timeline syncobj out of order, I ran into a deadlock. Here is the test : https://github.com/djdeath/intel-gpu-tools/commit/1e46cf7e7bff09b78a24367ddc2314f97eb0a1b9 Trying to kill the deadlocked process I got this backtrace : [ 33.969136] [IGT] syncobj_timeline: starting subtest signal-order [ 60.452823] watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [syncobj_timelin:2021] [ 60.452826] Modules linked in: rfcomm cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio sch_fq_codel ib_iser snd_hda_intel rdma_cm iw_cm snd_hda_codec ib_cm snd_hda_core snd_hwdep intel_rapl snd_pcm ib_core x86_pkg_temp_thermal intel_powerclamp configf s coretemp iscsi_tcp snd_seq_midi libiscsi_tcp snd_seq_midi_event libiscsi kvm_intel scsi_transport_iscsi kvm btusb snd_rawmidi irqbypass btrtl intel_cstate intel_rapl_perf btbcm btintel bluetooth snd_seq snd_seq_device snd_timer input_leds ecdh_generic snd soundcore mei_me mei intel_pch_thermal mac_hid acpi_pad parp ort_pc ppdev lp parport ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel prime_numbers drm_kms_helper aesni_intel syscopyarea sysfillrect [ 60.452876] sysimgblt fb_sys_fops aes_x86_64 crypto_simd sdhci_pci cryptd drm e1000e glue_helper cqhci sdhci wmi video [ 60.452881] CPU: 6 PID: 2021 Comm: syncobj_timelin Tainted: G U 5.0.0-rc5+ #337 [ 60.452882] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0042.2016.0929.1933 09/29/2016 [ 60.452886] RIP: 0010:dma_fence_chain_walk+0x22c/0x260 [ 60.452888] Code: ff e9 93 fe ff ff 48 8b 45 08 48 8b 40 18 48 85 c0 74 0c 48 89 ef e8 33 0f 58 00 84 c0 75 23 f0 41 ff 4d 00 0f 88 99 87 2f 00 <0f> 85 05 fe ff ff 4c 89 ef e8 56 ea ff ff 48 89 d8 5b 5d 41 5c 41 [ 60.452888] RSP: 0018:9a5804653ca8 EFLAGS: 00010296 ORIG_RAX: ff13 [ 60.452889] RAX: RBX: 8f5690fb2480 RCX: 8f5690fb2f00 [ 60.452890] RDX: 003e3730 RSI: RDI: 8f5690fb2180 [ 60.452891] RBP: 8f5690fb2180 R08: R09: 8f5690fb2eb0 [ 60.452891] R10: R11: 8f5660469860 R12: 8f5690fb2f68 [ 60.452892] R13: 8f5690fb2f00 R14: 0003 R15: 8f5655a45fc0 [ 60.452913] FS: 7fdc5c459980() GS:8f569eb8() knlGS: [ 60.452913] CS: 0010 DS: ES: CR0: 80050033 [ 60.452914] CR2: 7f9d74336dd8 CR3: 00084a67e004 CR4: 003606e0 [ 60.452915] DR0: DR1: DR2: [ 60.452915] DR3: DR6: fffe0ff0 DR7: 0400 [ 60.452916] Call Trace: [ 60.452958] drm_syncobj_add_point+0x102/0x160 [drm] [ 60.452965] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452971] drm_syncobj_transfer_ioctl+0x10f/0x180 [drm] [ 60.452978] drm_ioctl_kernel+0xac/0xf0 [drm] [ 60.452984] drm_ioctl+0x2eb/0x3b0 [drm] [ 60.452990] ? drm_syncobj_fd_to_handle_ioctl+0x1b0/0x1b0 [drm] [ 60.452992] ? sw_sync_ioctl+0x347/0x370 [ 60.452994] do_vfs_ioctl+0xa4/0x640 [ 60.452995] ? __fput+0x134/0x220 [ 60.452997] ? do_fcntl+0x1a5/0x650 [ 60.452998] ksys_ioctl+0x70/0x80 [ 60.452999] __x64_sys_ioctl+0x16/0x20 [ 60.453002] do_syscall_64+0x55/0x110 [ 60.453004] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 60.453005] RIP: 0033:0x7fdc5b6e45d7 [ 60.453006] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48 [ 60.453007] RSP: 002b:7fff25c4d198 EFLAGS: 0206 ORIG_RAX: 0010 [ 60.453008] RAX: ffda RBX: RCX: 7fdc5b6e45d7 [ 60.453008] RDX: 7fff25c4d200 RSI: c02064cc RDI: 0003 [ 60.453009] RBP: 7fff25c4d1d0 R08: R09: 001e [ 60.453010] R10: R11: 0206 R12: 563d3959e4d0 [ 60.453010] R13: 7fff25c4d620 R14: R15: [ 88.447359] watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [syncobj_timelin:2021] -Lionel On 07/12/2018 09:55, Chunming Zhou wrote: we need to import/export timeline point Signed-off-by: Chunming Zhou --- drivers/gpu/drm/drm_internal.h | 4 +++ drivers/gpu/drm/drm_ioctl.c| 6 drivers/gpu/drm/drm_syncobj.c | 66 ++ include/uapi/drm/drm.h | 10 ++ 4 files changed, 86 insertions(+) diff --git a/drivers/gpu/drm/drm_interna