On Wed, May 25, 2022 at 10:43 AM Daniel Vetter <dan...@ffwll.ch> wrote:

> On Wed, May 25, 2022 at 10:35:47AM -0500, Jason Ekstrand wrote:
> > On Wed, May 25, 2022 at 8:20 AM Daniel Vetter <dan...@ffwll.ch> wrote:
> >
> > > On Mon, May 09, 2022 at 07:54:19AM +0200, Christian König wrote:
> > > > Reviewed-by: Christian König <christian.koe...@amd.com> for the series.
> > > >
> > > > I assume you have the userspace part ready as well? If yes let's push
> > > > this to drm-misc-next asap.
> > >
> > > Hopefully I'm not too late, but I think all my review has also been
> > > addressed. On the series:
> > >
> > > Reviewed-by: Daniel Vetter <daniel.vet...@ffwll.ch>
> > >
> >
> > Thanks!  If Christian hasn't already, can we get this in drm-misc-next
> > please?  I don't have access AFAIK.
>
> We need to fix this?
>

I don't do enough kernel dev to be worth giving access, I don't think.
It's infrequent enough that I'm going to have to ask someone else how to
use the tools to push stuff every time anyway.

--Jason



> -Daniel
> >
> > --Jason
> >
> >
> >
> > > >
> > > > Christian.
> > > >
> > > > Am 06.05.22 um 20:02 schrieb Jason Ekstrand:
> > > > > Modern userspace APIs like Vulkan are built on an explicit
> > > > > synchronization model.  This doesn't always play nicely with the
> > > > > implicit synchronization used in the kernel and assumed by X11 and
> > > > > Wayland.  The client -> compositor half of the synchronization isn't
> > > > > too bad, at least on Intel, because we can control whether or not i915
> > > > > synchronizes on the buffer and whether or not it's considered written.
> > > > >
> > > > > The harder part is the compositor -> client synchronization when we get
> > > > > the buffer back from the compositor.  We're required to be able to
> > > > > provide the client with a VkSemaphore and VkFence representing the point
> > > > > in time where the window system (compositor and/or display) finished
> > > > > using the buffer.  With current APIs, it's very hard to do this in such
> > > > > a way that we don't get confused by the Vulkan driver's access of the
> > > > > buffer.  In particular, once we tell the kernel that we're rendering to
> > > > > the buffer again, any CPU waits on the buffer or GPU dependencies will
> > > > > wait on some of the client rendering and not just the compositor.
> > > > >
> > > > > This new IOCTL solves this problem by allowing us to get a snapshot of
> > > > > the implicit synchronization state of a given dma-buf in the form of a
> > > > > sync file.  It's effectively the same as a poll() or I915_GEM_WAIT,
> > > > > except that instead of CPU waiting directly, it encapsulates the wait
> > > > > operation, at the current moment in time, in a sync_file so we can
> > > > > check/wait on it later.  As long as the Vulkan driver does the sync_file
> > > > > export from the dma-buf before we re-introduce it for rendering, it will
> > > > > only contain fences from the compositor or display.  This allows us to
> > > > > accurately turn it into a VkFence or VkSemaphore without any
> > > > > over-synchronization.
> > > > >
> > > > > By making this an ioctl on the dma-buf itself, it allows this new
> > > > > functionality to be used in an entirely driver-agnostic way without
> > > > > having access to a DRM fd. This makes it ideal for use in driver-generic
> > > > > code in Mesa or in a client such as a compositor where the DRM fd may be
> > > > > hard to reach.
> > > > >
> > > > > v2 (Jason Ekstrand):
> > > > >   - Use a wrapper dma_fence_array of all fences including the new one
> > > > >     when importing an exclusive fence.
> > > > >
> > > > > v3 (Jason Ekstrand):
> > > > >   - Lock around setting shared fences as well as exclusive
> > > > >   - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
> > > > >   - Initialize ret to 0 in dma_buf_wait_sync_file
> > > > >
> > > > > v4 (Jason Ekstrand):
> > > > >   - Use the new dma_resv_get_singleton helper
> > > > >
> > > > > v5 (Jason Ekstrand):
> > > > >   - Rename the IOCTLs to import/export rather than wait/signal
> > > > >   - Drop the WRITE flag and always get/set the exclusive fence
> > > > >
> > > > > v6 (Jason Ekstrand):
> > > > >   - Drop the sync_file import as it was all-around sketchy and not nearly
> > > > >     as useful as export.
> > > > >   - Re-introduce READ/WRITE flag support for export
> > > > >   - Rework the commit message
> > > > >
> > > > > v7 (Jason Ekstrand):
> > > > >   - Require at least one sync flag
> > > > >   - Fix a refcounting bug: dma_resv_get_excl() doesn't take a reference
> > > > >   - Use _rcu helpers since we're accessing the dma_resv read-only
> > > > >
> > > > > v8 (Jason Ekstrand):
> > > > >   - Return -ENOMEM if the sync_file_create fails
> > > > >   - Predicate support on IS_ENABLED(CONFIG_SYNC_FILE)
> > > > >
> > > > > v9 (Jason Ekstrand):
> > > > >   - Add documentation for the new ioctl
> > > > >
> > > > > v10 (Jason Ekstrand):
> > > > >   - Go back to dma_buf_sync_file as the ioctl struct name
> > > > >
> > > > > v11 (Daniel Vetter):
> > > > >   - Go back to dma_buf_export_sync_file as the ioctl struct name
> > > > >   - Better kerneldoc describing what the read/write flags do
> > > > >
> > > > > v12 (Christian König):
> > > > >   - Document why we chose to make it an ioctl on dma-buf
> > > > >
> > > > > v13 (Jason Ekstrand):
> > > > >   - Rebase on Christian König's fence rework
> > > > >
> > > > > v14 (Daniel Vetter & Christian König):
> > > > >   - Use dma_resv_usage_rw to get the properly inverted usage to pass to
> > > > >     dma_resv_get_singleton()
> > > > >   - Clean up the sync_file and fd if copy_to_user() fails
> > > > >
> > > > > Signed-off-by: Jason Ekstrand <ja...@jlekstrand.net>
> > > > > Signed-off-by: Jason Ekstrand <jason.ekstr...@intel.com>
> > > > > Signed-off-by: Jason Ekstrand <jason.ekstr...@collabora.com>
> > > > > Acked-by: Simon Ser <cont...@emersion.fr>
> > > > > Acked-by: Christian König <christian.koe...@amd.com>
> > > > > Reviewed-by: Daniel Vetter <daniel.vet...@ffwll.ch>
> > > > > Cc: Sumit Semwal <sumit.sem...@linaro.org>
> > > > > Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
> > > > > ---
> > > > >   drivers/dma-buf/dma-buf.c    | 67 ++++++++++++++++++++++++++++++++++++
> > > > >   include/uapi/linux/dma-buf.h | 35 +++++++++++++++++++
> > > > >   2 files changed, 102 insertions(+)
> > > > >
> > > > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > > > > index 79795857be3e..6ff54f7e6119 100644
> > > > > --- a/drivers/dma-buf/dma-buf.c
> > > > > +++ b/drivers/dma-buf/dma-buf.c
> > > > > @@ -20,6 +20,7 @@
> > > > >   #include <linux/debugfs.h>
> > > > >   #include <linux/module.h>
> > > > >   #include <linux/seq_file.h>
> > > > > +#include <linux/sync_file.h>
> > > > >   #include <linux/poll.h>
> > > > >   #include <linux/dma-resv.h>
> > > > >   #include <linux/mm.h>
> > > > > @@ -192,6 +193,9 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
> > > > >    * Note that this only signals the completion of the respective fences, i.e. the
> > > > >    * DMA transfers are complete. Cache flushing and any other necessary
> > > > >    * preparations before CPU access can begin still need to happen.
> > > > > + *
> > > > > + * As an alternative to poll(), the set of fences on a DMA buffer can be
> > > > > + * exported as a &sync_file using &dma_buf_export_sync_file.
> > > > >    */
> > > > >   static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
> > > > > @@ -326,6 +330,64 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
> > > > >     return 0;
> > > > >   }
> > > > > +#if IS_ENABLED(CONFIG_SYNC_FILE)
> > > > > +static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
> > > > > +                                void __user *user_data)
> > > > > +{
> > > > > +   struct dma_buf_export_sync_file arg;
> > > > > +   enum dma_resv_usage usage;
> > > > > +   struct dma_fence *fence = NULL;
> > > > > +   struct sync_file *sync_file;
> > > > > +   int fd, ret;
> > > > > +
> > > > > +   if (copy_from_user(&arg, user_data, sizeof(arg)))
> > > > > +           return -EFAULT;
> > > > > +
> > > > > +   if (arg.flags & ~DMA_BUF_SYNC_RW)
> > > > > +           return -EINVAL;
> > > > > +
> > > > > +   if ((arg.flags & DMA_BUF_SYNC_RW) == 0)
> > > > > +           return -EINVAL;
> > > > > +
> > > > > +   fd = get_unused_fd_flags(O_CLOEXEC);
> > > > > +   if (fd < 0)
> > > > > +           return fd;
> > > > > +
> > > > > +   usage = dma_resv_usage_rw(arg.flags & DMA_BUF_SYNC_WRITE);
> > > > > +   ret = dma_resv_get_singleton(dmabuf->resv, usage, &fence);
> > > > > +   if (ret)
> > > > > +           goto err_put_fd;
> > > > > +
> > > > > +   if (!fence)
> > > > > +           fence = dma_fence_get_stub();
> > > > > +
> > > > > +   sync_file = sync_file_create(fence);
> > > > > +
> > > > > +   dma_fence_put(fence);
> > > > > +
> > > > > +   if (!sync_file) {
> > > > > +           ret = -ENOMEM;
> > > > > +           goto err_put_fd;
> > > > > +   }
> > > > > +
> > > > > +   arg.fd = fd;
> > > > > +   if (copy_to_user(user_data, &arg, sizeof(arg))) {
> > > > > +           ret = -EFAULT;
> > > > > +           goto err_put_file;
> > > > > +   }
> > > > > +
> > > > > +   fd_install(fd, sync_file->file);
> > > > > +
> > > > > +   return 0;
> > > > > +
> > > > > +err_put_file:
> > > > > +   fput(sync_file->file);
> > > > > +err_put_fd:
> > > > > +   put_unused_fd(fd);
> > > > > +   return ret;
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > >   static long dma_buf_ioctl(struct file *file,
> > > > >                       unsigned int cmd, unsigned long arg)
> > > > >   {
> > > > > @@ -369,6 +431,11 @@ static long dma_buf_ioctl(struct file *file,
> > > > >     case DMA_BUF_SET_NAME_B:
> > > > >             return dma_buf_set_name(dmabuf, (const char __user *)arg);
> > > > > +#if IS_ENABLED(CONFIG_SYNC_FILE)
> > > > > +   case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
> > > > > +           return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
> > > > > +#endif
> > > > > +
> > > > >     default:
> > > > >             return -ENOTTY;
> > > > >     }
> > > > > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> > > > > index 8e4a2ca0bcbf..46f1e3e98b02 100644
> > > > > --- a/include/uapi/linux/dma-buf.h
> > > > > +++ b/include/uapi/linux/dma-buf.h
> > > > > @@ -85,6 +85,40 @@ struct dma_buf_sync {
> > > > >   #define DMA_BUF_NAME_LEN  32
> > > > > +/**
> > > > > + * struct dma_buf_export_sync_file - Get a sync_file from a dma-buf
> > > > > + *
> > > > > + * Userspace can perform a DMA_BUF_IOCTL_EXPORT_SYNC_FILE to retrieve the
> > > > > + * current set of fences on a dma-buf file descriptor as a sync_file.  CPU
> > > > > + * waits via poll() or other driver-specific mechanisms typically wait on
> > > > > + * whatever fences are on the dma-buf at the time the wait begins.  This
> > > > > + * is similar except that it takes a snapshot of the current fences on the
> > > > > + * dma-buf for waiting later instead of waiting immediately.  This is
> > > > > + * useful for modern graphics APIs such as Vulkan which assume an explicit
> > > > > + * synchronization model but still need to inter-operate with dma-buf.
> > > > > + */
> > > > > +struct dma_buf_export_sync_file {
> > > > > +   /**
> > > > > +    * @flags: Read/write flags
> > > > > +    *
> > > > > +    * Must be DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE, or both.
> > > > > +    *
> > > > > +    * If DMA_BUF_SYNC_READ is set and DMA_BUF_SYNC_WRITE is not set,
> > > > > +    * the returned sync file waits on any writers of the dma-buf to
> > > > > +    * complete.  Waiting on the returned sync file is equivalent to
> > > > > +    * poll() with POLLIN.
> > > > > +    *
> > > > > +    * If DMA_BUF_SYNC_WRITE is set, the returned sync file waits on
> > > > > +    * any users of the dma-buf (read or write) to complete.  Waiting
> > > > > +    * on the returned sync file is equivalent to poll() with POLLOUT.
> > > > > +    * If both DMA_BUF_SYNC_WRITE and DMA_BUF_SYNC_READ are set, this
> > > > > +    * is equivalent to just DMA_BUF_SYNC_WRITE.
> > > > > +    */
> > > > > +   __u32 flags;
> > > > > +   /** @fd: Returned sync file descriptor */
> > > > > +   __s32 fd;
> > > > > +};
> > > > > +
> > > > >   #define DMA_BUF_BASE              'b'
> > > > >   #define DMA_BUF_IOCTL_SYNC        _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
> > > > > @@ -94,5 +128,6 @@ struct dma_buf_sync {
> > > > >   #define DMA_BUF_SET_NAME  _IOW(DMA_BUF_BASE, 1, const char *)
> > > > >   #define DMA_BUF_SET_NAME_A        _IOW(DMA_BUF_BASE, 1, u32)
> > > > >   #define DMA_BUF_SET_NAME_B        _IOW(DMA_BUF_BASE, 1, u64)
> > > > > +#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE     _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file)
> > > > >   #endif
> > > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> > >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>
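
For anyone following the thread from the userspace side, here is a rough
sketch of how the export ioctl in the quoted patch might be called. It is
not taken from the Mesa implementation. The helper names sync_flags_valid(),
export_sync_file() and wait_sync_file() are made up for illustration, the
userspace flag check only mirrors what the kernel side already enforces, and
the #ifndef fallback assumes the installed uapi header may predate the patch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Fallback for uapi headers that predate the patch; mirrors its definitions. */
#ifndef DMA_BUF_IOCTL_EXPORT_SYNC_FILE
struct dma_buf_export_sync_file {
	__u32 flags;
	__s32 fd;
};
#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE \
	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file)
#endif

/* Hypothetical helper: READ, WRITE or both must be set, and nothing else. */
int sync_flags_valid(unsigned int flags)
{
	return (flags & ~DMA_BUF_SYNC_RW) == 0 &&
	       (flags & DMA_BUF_SYNC_RW) != 0;
}

/*
 * Hypothetical helper: snapshot the fences currently on dmabuf_fd into a
 * new sync_file fd. Returns 0 and stores the fd in *out_fd on success, or
 * a negative errno value on failure (*out_fd is left untouched).
 */
int export_sync_file(int dmabuf_fd, unsigned int flags, int *out_fd)
{
	struct dma_buf_export_sync_file arg = { .flags = flags, .fd = -1 };

	assert(out_fd != NULL);

	if (!sync_flags_valid(flags))
		return -EINVAL;

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg) < 0)
		return -errno;

	*out_fd = arg.fd;
	return 0;
}

/*
 * Hypothetical helper: CPU-wait on a sync_file fd with poll().
 * Returns 0 once signaled, -ETIME on timeout, or a negative errno value.
 */
int wait_sync_file(int sync_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = sync_fd, .events = POLLIN };
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -errno;

	return ret == 0 ? -ETIME : 0;
}
```

A driver would call export_sync_file() on the dma-buf before re-introducing
the buffer for rendering, so the snapshot only holds the compositor or
display fences, and then either wait on the returned fd or hand it on to be
turned into a VkSemaphore or VkFence. The caller owns the returned fd and
should close() it when it is no longer needed.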
