Changes since v5 [1]:
* Compile fixes from a much improved coccinelle semantic patch (thanks
  Julia!) that adds a 'flags' argument to all the ->mmap()
  implementations in the kernel. (0day-kbuild-robot)
* Make the deprecated MAP_DENYWRITE and MAP_EXECUTABLE flags return
  EOPNOTSUPP with the new mmap3() syscall. (Kirill)
* Minor changelog updates.
* Updated cover letter with a clarified summary and checklist of
  questions to answer before proceeding further.
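
The MAP_DENYWRITE / MAP_EXECUTABLE change above can be pictured with a
small userspace sketch. The helper name mmap3_check_flags() is
hypothetical and is not taken from the patches; it only assumes that
mmap3() rejects the two deprecated flags with EOPNOTSUPP instead of
silently ignoring them the way legacy mmap() does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* expose MAP_DENYWRITE / MAP_EXECUTABLE in <sys/mman.h> */
#endif
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only, not code from this series):
 * reject the deprecated flags up front so that callers of the new
 * syscall get an explicit error rather than having them ignored.
 */
static int mmap3_check_flags(unsigned long flags)
{
	if (flags & (MAP_DENYWRITE | MAP_EXECUTABLE))
		return -EOPNOTSUPP;
	return 0;
}
```

A caller would run a check like this before doing any other mmap3()
work; which additional flags the real syscall validates is not
specified here.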
---
MAP_DIRECT is a mechanism to ask the kernel to atomically manage the
file-offset to physical-address block relationship of a mapping relative
to any memory-mapped access. It is complementary to the proposed
MAP_SYNC mechanism, which makes the same guarantee relative to cpu
faults. MAP_DIRECT goes a step further and makes this guarantee for
agents that may not generate mmu faults, but at the cost of restricting
the kernel's ability to mutate the block-map at will.

MAP_SYNC is preferred for scenarios that want full filesystem feature
support while avoiding fsync/msync overhead, and that do not need to
contend with hypervisors or RDMA agents that do not give the kernel an
mmu fault. In other words, the need for MAP_DIRECT is driven by the
scarcity of SVM capable hardware (Shared Virtual Memory, where hardware
generates mmu faults), hypervisors like Xen that need to interrogate the
physical address layout of a file to maintain their own physical-address
mapping metadata outside the kernel, and peer-to-peer DMA use cases that
always bypass the mmu.

The MAP_DIRECT mechanism allows a filesystem to be used for capacity
provisioning and access control where the aforementioned applications
would otherwise be forced to roll a custom solution on top of a raw
device-file.

Questions:
1/ Is the definition of MAP_DIRECT constrained enough to allow us to
   make the restrictions it imposes on the kernel finer grained over time?
2/ Do the XFS changes look sane? They attempt to avoid adding any
   overhead to the non-MAP_DIRECT case at the expense of the new
   i_mapdcount atomic counter in the XFS inode.
3/ While the generic MAP_DIRECT description warns that the block-map may
   not actually be immutable for the lifetime of the mapping, it also
   does not preclude a filesystem from making that guarantee. In fact,
   Dave wants to be able to get a stable view of the physical mapping
   [2], and Xen has a need to do the same [3]. Do we want userspace to
   start making "XFS + MAP_DIRECT == Immutable" assumptions, or do we
   need a separate mechanism for that guarantee?
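
For question 2/, the trade-off can be sketched in userspace C11. This
is only an illustration under stated assumptions: it assumes
i_mapdcount counts live MAP_DIRECT mappings and that block-map changes
are allowed only while it is zero. The struct and the sketch_*()
helpers are hypothetical and are not the XFS implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the XFS inode; only the counter is modeled. */
struct sketch_inode {
	atomic_int i_mapdcount;	/* assumed: number of live MAP_DIRECT mappings */
};

/* Called when a MAP_DIRECT mapping is established (assumed semantics). */
static void sketch_map_direct_get(struct sketch_inode *ip)
{
	atomic_fetch_add(&ip->i_mapdcount, 1);
}

/* Called when a MAP_DIRECT mapping is torn down (assumed semantics). */
static void sketch_map_direct_put(struct sketch_inode *ip)
{
	atomic_fetch_sub(&ip->i_mapdcount, 1);
}

/*
 * The non-MAP_DIRECT path pays for a single atomic load: when the
 * counter is zero the block-map may be changed as usual.
 */
static bool sketch_blockmap_may_change(struct sketch_inode *ip)
{
	return atomic_load(&ip->i_mapdcount) == 0;
}
```

The point of the sketch is that files with no MAP_DIRECT mappings only
ever see the zero-check, which is the "no added overhead" property the
question asks reviewers to confirm.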
[1]: https://lkml.org/lkml/2017/8/16/114
[2]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1467677.html
[3]: https://lists.xen.org/archives/html/xen-devel/2017-04/msg00419.html
---
Dan Williams (5):
  vfs: add flags parameter to ->mmap() in 'struct file_operations'
  fs, xfs: introduce S_IOMAP_SEALED
  mm: introduce mmap3 for safely defining new mmap flags
  fs, xfs: introduce MAP_DIRECT for creating block-map-atomic file ranges
  fs, fcntl: add F_MAP_DIRECT

arch/arc/kernel/arc_hostlink.c            |  3 -
arch/mips/kernel/vdso.c                   |  2
arch/powerpc/kernel/proc_powerpc.c        |  3 -
arch/powerpc/kvm/book3s_64_vio.c          |  3 -
arch/powerpc/platforms/cell/spufs/file.c  | 21 ++--
arch/powerpc/platforms/powernv/opal-prd.c |  3 -
arch/um/drivers/mmapper_kern.c            |  3 -
arch/x86/entry/syscalls/syscall_32.tbl    |  1
arch/x86/entry/syscalls/syscall_64.tbl    |  1
drivers/android/binder.c                  |  3 -
drivers/char/agp/frontend.c               |  3 -
drivers/char/bsr.c                        |  3 -
drivers/char/hpet.c                       |  6 +
drivers/char/mbcs.c                       |  3 -
drivers/char/mbcs.h                       |  3 -
drivers/char/mem.c                        | 11 +-
drivers/char/mspec.c                      |  9 +-
drivers/char/uv_mmtimer.c                 |  6 +
drivers/dax/device.c                      |  3 -
drivers/dma-buf/dma-buf.c                 |  4 +
drivers/firewire/core-cdev.c              |  3 -
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  3 -
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |  3 -
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  5 +
drivers/gpu/drm/arc/arcpgu_drv.c          |  5 +
drivers/gpu/drm/ast/ast_drv.h             |  3 -
drivers/gpu/drm/ast/ast_ttm.c             |  3 -
drivers/gpu/drm/bochs/bochs.h             |  3 -
drivers/gpu/drm/bochs/bochs_mm.c          |  3 -
drivers/gpu/drm/cirrus/cirrus_drv.h       |  3 -
drivers/gpu/drm/cirrus/cirrus_ttm.c       |  3 -
drivers/gpu/drm/drm_gem.c