Thanks for reviewing, David. Please take a look in my comments in-line.
On 02/09/2016 07:26 AM, David Herrmann wrote: > > On Tue, Dec 22, 2015 at 10:36 PM, Tiago Vignatti > <tiago.vignatti at intel.com> wrote: >> From: Daniel Vetter <daniel.vetter at ffwll.ch> >> >> The userspace might need some sort of cache coherency management e.g. when >> CPU >> and GPU domains are being accessed through dma-buf at the same time. To >> circumvent this problem there are begin/end coherency markers, that forward >> directly to existing dma-buf device drivers vfunc hooks. Userspace can make >> use >> of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be >> used like following: >> - mmap dma-buf fd >> - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. >> read/write >> to mmap area 3. SYNC_END ioctl. This can be repeated as often as you >> want (with the new data being consumed by the GPU or say scanout >> device) >> - munmap once you don't need the buffer any more >> >> v2 (Tiago): Fix header file type names (u64 -> __u64) >> v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end >> dma-buf functions. Check for overflows in start/length. >> v4 (Tiago): use 2d regions for sync. >> v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and >> remove range information from struct dma_buf_sync. >> v6 (Tiago): use __u64 structured padded flags instead enum. Adjust >> documentation about the recommendation on using sync ioctls. >> v7 (Tiago): Alex' nit on flags definition and being even more wording in the >> doc about sync usage. >> >> Cc: Sumit Semwal <sumit.semwal at linaro.org> >> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com> >> Signed-off-by: Tiago Vignatti <tiago.vignatti at intel.com> >> --- >> Documentation/dma-buf-sharing.txt | 21 ++++++++++++++++++- >> drivers/dma-buf/dma-buf.c | 43 >> +++++++++++++++++++++++++++++++++++++++ >> include/uapi/linux/dma-buf.h | 38 ++++++++++++++++++++++++++++++++++ >> 3 files changed, 101 insertions(+), 1 deletion(-) >> create mode 100644 include/uapi/linux/dma-buf.h >> >> diff --git a/Documentation/dma-buf-sharing.txt >> b/Documentation/dma-buf-sharing.txt >> index 4f4a84b..32ac32e 100644 >> --- a/Documentation/dma-buf-sharing.txt >> +++ b/Documentation/dma-buf-sharing.txt >> @@ -350,7 +350,26 @@ Being able to mmap an export dma-buf buffer object has >> 2 main use-cases: >> handles, too). So it's beneficial to support this in a similar fashion >> on >> dma-buf to have a good transition path for existing Android userspace. >> >> - No special interfaces, userspace simply calls mmap on the dma-buf fd. >> + No special interfaces, userspace simply calls mmap on the dma-buf fd, >> making >> + sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is >> *always* >> + used when the access happens. This is discussed next paragraphs. >> + >> + Some systems might need some sort of cache coherency management e.g. when >> + CPU and GPU domains are being accessed through dma-buf at the same time. >> To >> + circumvent this problem there are begin/end coherency markers, that >> forward >> + directly to existing dma-buf device drivers vfunc hooks. Userspace can >> make >> + use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence >> + would be used like following: >> + - mmap dma-buf fd >> + - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. >> read/write >> + to mmap area 3. SYNC_END ioctl. This can be repeated as often as you >> + want (with the new data being consumed by the GPU or say scanout >> device) >> + - munmap once you don't need the buffer any more >> + >> + Therefore, for correctness and optimal performance, systems with the >> memory >> + cache shared by the GPU and CPU i.e. the "coherent" and also the >> + "incoherent" are always required to use SYNC_START and SYNC_END before >> and >> + after, respectively, when accessing the mapped address. >> >> 2. Supporting existing mmap interfaces in importers >> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index b2ac13b..9a298bd 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -34,6 +34,8 @@ >> #include <linux/poll.h> >> #include <linux/reservation.h> >> >> +#include <uapi/linux/dma-buf.h> >> + >> static inline int is_dma_buf_file(struct file *); >> >> struct dma_buf_list { >> @@ -251,11 +253,52 @@ out: >> return events; >> } >> >> +static long dma_buf_ioctl(struct file *file, >> + unsigned int cmd, unsigned long arg) >> +{ >> + struct dma_buf *dmabuf; >> + struct dma_buf_sync sync; >> + enum dma_data_direction direction; >> + >> + dmabuf = file->private_data; >> + >> + if (!is_dma_buf_file(file)) >> + return -EINVAL; > > Why? This can never happen, and you better not use dma_buf_ioctl() > outside of dma_buf_fops.. > I guess it's simply copied from the other fop callbacks, but I don't > see why. dma_buf_poll() doesn't do it, neither should this, or one of > the other 3 callbacks. yup. I fixed now. >> + >> + switch (cmd) { >> + case DMA_BUF_IOCTL_SYNC: >> + if (copy_from_user(&sync, (void __user *) arg, sizeof(sync))) >> + return -EFAULT; >> + >> + if (sync.flags & DMA_BUF_SYNC_RW) >> + direction = DMA_BIDIRECTIONAL; >> + else if (sync.flags & DMA_BUF_SYNC_READ) >> + direction = DMA_FROM_DEVICE; >> + else if (sync.flags & DMA_BUF_SYNC_WRITE) >> + direction = DMA_TO_DEVICE; >> + else >> + return -EINVAL; > > This looks bogus. It always ends up being "DMA_BIDIRECTIONAL" or > EINVAL. I recommend changing it to: > > switch (sync.flags & DMA_BUF_SYNC_RW) { > case DMA_BUF_SYNC_READ: > direction = DMA_FROM_DEVICE; > break; > case DMA_BUF_SYNC_WRITE: > direction = DMA_TO_DEVICE; > break; > case DMA_BUF_SYNC_READ: > direction = DMA_BIDIRECTIONAL; > break; > default: > return -EINVAL; > } hmm I can't really get what's wrong with my snip. Why bogus? Can you double-check actually your suggestion, cause that's wrong with _READ being repeated. >> + >> + if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK) >> + return -EINVAL; > > Why isn't this done immediately after copy_from_user()? done. >> + >> + if (sync.flags & DMA_BUF_SYNC_END) >> + dma_buf_end_cpu_access(dmabuf, direction); >> + else >> + dma_buf_begin_cpu_access(dmabuf, direction); > > Why are SYNC_START and SYNC_END exclusive? It feels very natural to me > to invoke both at the same time (especially if two objects are stored > in the same dma-buf). Can you open a bit and teach how two objects would be stored in the same dma-buf? I didn't care about this case and if we want that, we'd need also to change the sequence of accesses as described in the dma-buf-sharing.txt I'm proposing in this patch. >> + >> + return 0; >> + default: >> + return -ENOTTY; >> + } >> +} >> + >> static const struct file_operations dma_buf_fops = { >> .release = dma_buf_release, >> .mmap = dma_buf_mmap_internal, >> .llseek = dma_buf_llseek, >> .poll = dma_buf_poll, >> + .unlocked_ioctl = dma_buf_ioctl, >> }; >> >> /* >> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h >> new file mode 100644 >> index 0000000..4a9b36b >> --- /dev/null >> +++ b/include/uapi/linux/dma-buf.h >> @@ -0,0 +1,38 @@ >> +/* >> + * Framework for buffer objects that can be shared across >> devices/subsystems. >> + * >> + * Copyright(C) 2015 Intel Ltd >> + * >> + * This program is free software; you can redistribute it and/or modify it >> + * under the terms of the GNU General Public License version 2 as published >> by >> + * the Free Software Foundation. >> + * >> + * This program is distributed in the hope that it will be useful, but >> WITHOUT >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for >> + * more details. >> + * >> + * You should have received a copy of the GNU General Public License along >> with >> + * this program. If not, see <http://www.gnu.org/licenses/>. >> + */ >> + >> +#ifndef _DMA_BUF_UAPI_H_ >> +#define _DMA_BUF_UAPI_H_ >> + >> +/* begin/end dma-buf functions used for userspace mmap. */ >> +struct dma_buf_sync { >> + __u64 flags; >> +}; > > Please add '#include <linux/types.h>', otherwise this header cannot be > compiled on its own (missing __u64). done. >> + >> +#define DMA_BUF_SYNC_READ (1 << 0) >> +#define DMA_BUF_SYNC_WRITE (2 << 0) >> +#define DMA_BUF_SYNC_RW (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE) >> +#define DMA_BUF_SYNC_START (0 << 2) >> +#define DMA_BUF_SYNC_END (1 << 2) >> +#define DMA_BUF_SYNC_VALID_FLAGS_MASK \ >> + (DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END) >> + >> +#define DMA_BUF_BASE 'b' >> +#define DMA_BUF_IOCTL_SYNC _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync) > > Why _IOW? A read-only ioctl should be able to call DMA_BUF_SYNC_READ, right? yup. I've changed it to _IOWR now. Tiago