On 21.01.19 08:56, Peter Xu wrote:
> Hi,
> 
> This series implements initial write protection support for
> userfaultfd.  Currently both shmem and hugetlbfs are not supported
> yet, but only anonymous memory.
> 
> To be simple, either "userfaultfd-wp" or "uffd-wp" might be used in
> later paragraphs.
> 
> The whole series can also be found at:
> 
>   https://github.com/xzpeter/linux/tree/uffd-wp-merged
> 
> Any comment would be greatly welcomed.   Thanks.
> 
> Overview
> ====================
> 
> The uffd-wp work was initialized by Shaohua Li [1], and later
> continued by Andrea [2]. This series is based upon Andrea's latest
> userfaultfd tree, and it is a continuous works from both Shaohua and
> Andrea.  Many of the follow up ideas come from Andrea too.
> 
> Besides the old MISSING register mode of userfaultfd, the new uffd-wp
> support provides another alternative register mode called
> UFFDIO_REGISTER_MODE_WP that can be used to listen to not only missing
> page faults but also write protection page faults, or even they can be
> registered together.  At the same time, the new feature also provides
> a new userfaultfd ioctl called UFFDIO_WRITEPROTECT which allows the
> userspace to write protect a range or memory or fixup write permission
> of faulted pages.
> 
> Please refer to the document patch "userfaultfd: wp:
> UFFDIO_REGISTER_MODE_WP documentation update" for more information on
> the new interface and what it can do.
> 
> The major workflow of an uffd-wp program should be:
> 
>   1. Register a memory region with WP mode using UFFDIO_REGISTER_MODE_WP
> 
>   2. Write protect part of the whole registered region using
>      UFFDIO_WRITEPROTECT, passing in UFFDIO_WRITEPROTECT_MODE_WP to
>      show that we want to write protect the range.
> 
>   3. Start a working thread that modifies the protected pages,
>      meanwhile listening to UFFD messages.
> 
>   4. When a write is detected upon the protected range, page fault
>      happens, a UFFD message will be generated and reported to the
>      page fault handling thread
> 
>   5. The page fault handler thread resolves the page fault using the
>      new UFFDIO_WRITEPROTECT ioctl, but this time passing in
>      !UFFDIO_WRITEPROTECT_MODE_WP instead showing that we want to
>      recover the write permission.  Before this operation, the fault
>      handler thread can do anything it wants, e.g., dumps the page to
>      a persistent storage.
> 
>   6. The worker thread will continue running with the correctly
>      applied write permission from step 5.
> 
> Currently there are already two projects that are based on this new
> userfaultfd feature.
> 
> QEMU Live Snapshot: The project provides a way to allow the QEMU
>                     hypervisor to take snapshot of VMs without
>                     stopping the VM [3].
> 
> LLNL umap library:  The project provides a mmap-like interface and
>                     "allow to have an application specific buffer of
>                     pages cached from a large file, i.e. out-of-core
>                     execution using memory map" [4][5].
> 
> Before posting the patchset, this series was smoke tested against QEMU
> live snapshot and the LLNL umap library (by doing parallel quicksort
> using 128 sorting threads + 80 uffd servicing threads).  My sincere
> thanks to Marty Mcfadden and Denis Plotnikov for the help along the
> way.
> 
> Implementation
> ==============
> 
> Patch 1-4: The whole uffd-wp requires the kernel page fault path to
>            take more than one retries.  In the previous works starting
>            from Shaohua, a new fault flag FAULT_FLAG_ALLOW_UFFD_RETRY
>            was introduced for this [6]. However in this series we have
>            dropped that patch, instead the whole work is based on the
>            recent series "[PATCH RFC v3 0/4] mm: some enhancements to
>            the page fault mechanism" [7] which removes the assuption
>            that VM_FAULT_RETRY can only happen once.  This four
>            patches are identital patches but picked up here.  Please
>            refer to the cover letter [7] for more information.  More
>            discussion upstream shows that this work could even benefit
>            existing use case [8] so please help justify whether
>            patches 1-4 can be consider to be accepted even earlier
>            than the rest of the series.
> 
> Patch 5-21:   Implements the uffd-wp logic.  To avoid collision with
>               existing write protections (e.g., an private anonymous
>               page can be write protected if it was shared between
>               multiple processes), a new PTE bit (_PAGE_UFFD_WP) was
>               introduced to explicitly mark a PTE as userfault
>               write-protected.  A similar bit was also used in the
>               swap/migration entry (_PAGE_SWP_UFFD_WP) to make sure
>               even if the pages were swapped or migrated, the uffd-wp
>               tracking information won't be lost.  When resolving a
>               page fault, we'll do a page copy before hand if the page
>               was COWed to make sure we won't corrupt any shared
>               pages.  Etc.  Please see separated patches for more
>               details.
> 
> Patch 22:     Documentation update for uffd-wp
> 
> Patch 23,24:  Uffd-wp selftests
> 
> TODO
> =============
> 
> - hugetlbfs/shmem support
> - performance
> - more architectures
> - ...
> 
> References
> ==========
> 
> [1] https://lwn.net/Articles/666187/
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/log/?h=userfault
> [3] https://github.com/denis-plotnikov/qemu/commits/background-snapshot-kvm
> [4] https://github.com/LLNL/umap
> [5] https://llnl-umap.readthedocs.io/en/develop/
> [6] 
> https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=userfault&id=b245ecf6cf59156966f3da6e6b674f6695a5ffa5
> [7] https://lkml.org/lkml/2018/11/21/370
> [8] https://lkml.org/lkml/2018/12/30/64
> 
> Andrea Arcangeli (5):
>   userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
>   userfaultfd: wp: hook userfault handler to write protection fault
>   userfaultfd: wp: add WP pagetable tracking to x86
>   userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
>   userfaultfd: wp: add UFFDIO_COPY_MODE_WP
> 
> Martin Cracauer (1):
>   userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
> 
> Peter Xu (15):
>   mm: gup: rename "nonblocking" to "locked" where proper
>   mm: userfault: return VM_FAULT_RETRY on signals
>   mm: allow VM_FAULT_RETRY for multiple times
>   mm: gup: allow VM_FAULT_RETRY for multiple times
>   mm: merge parameters for change_protection()
>   userfaultfd: wp: apply _PAGE_UFFD_WP bit
>   mm: export wp_page_copy()
>   userfaultfd: wp: handle COW properly for uffd-wp
>   userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
>   userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
>   userfaultfd: wp: support swap and page migration
>   userfaultfd: wp: don't wake up when doing write protect
>   khugepaged: skip collapse if uffd-wp detected
>   userfaultfd: selftests: refactor statistics
>   userfaultfd: selftests: add write-protect test
> 
> Shaohua Li (3):
>   userfaultfd: wp: add helper for writeprotect check
>   userfaultfd: wp: support write protection for userfault vma range
>   userfaultfd: wp: enabled write protection in userfaultfd API
> 
>  Documentation/admin-guide/mm/userfaultfd.rst |  51 +++++
>  arch/alpha/mm/fault.c                        |   4 +-
>  arch/arc/mm/fault.c                          |  12 +-
>  arch/arm/mm/fault.c                          |  17 +-
>  arch/arm64/mm/fault.c                        |  11 +-
>  arch/hexagon/mm/vm_fault.c                   |   3 +-
>  arch/ia64/mm/fault.c                         |   3 +-
>  arch/m68k/mm/fault.c                         |   5 +-
>  arch/microblaze/mm/fault.c                   |   3 +-
>  arch/mips/mm/fault.c                         |   3 +-
>  arch/nds32/mm/fault.c                        |   7 +-
>  arch/nios2/mm/fault.c                        |   5 +-
>  arch/openrisc/mm/fault.c                     |   3 +-
>  arch/parisc/mm/fault.c                       |   4 +-
>  arch/powerpc/mm/fault.c                      |   9 +-
>  arch/riscv/mm/fault.c                        |   9 +-
>  arch/s390/mm/fault.c                         |  14 +-
>  arch/sh/mm/fault.c                           |   5 +-
>  arch/sparc/mm/fault_32.c                     |   4 +-
>  arch/sparc/mm/fault_64.c                     |   4 +-
>  arch/um/kernel/trap.c                        |   6 +-
>  arch/unicore32/mm/fault.c                    |  10 +-
>  arch/x86/Kconfig                             |   1 +
>  arch/x86/include/asm/pgtable.h               |  67 ++++++
>  arch/x86/include/asm/pgtable_64.h            |   8 +-
>  arch/x86/include/asm/pgtable_types.h         |  11 +-
>  arch/x86/mm/fault.c                          |  13 +-
>  arch/xtensa/mm/fault.c                       |   4 +-
>  fs/userfaultfd.c                             | 110 +++++----
>  include/asm-generic/pgtable.h                |   1 +
>  include/asm-generic/pgtable_uffd.h           |  66 ++++++
>  include/linux/huge_mm.h                      |   2 +-
>  include/linux/mm.h                           |  21 +-
>  include/linux/swapops.h                      |   2 +
>  include/linux/userfaultfd_k.h                |  41 +++-
>  include/trace/events/huge_memory.h           |   1 +
>  include/uapi/linux/userfaultfd.h             |  28 ++-
>  init/Kconfig                                 |   5 +
>  mm/gup.c                                     |  61 ++---
>  mm/huge_memory.c                             |  28 ++-
>  mm/hugetlb.c                                 |   8 +-
>  mm/khugepaged.c                              |  23 ++
>  mm/memory.c                                  |  28 ++-
>  mm/mempolicy.c                               |   2 +-
>  mm/migrate.c                                 |   7 +
>  mm/mprotect.c                                |  99 +++++++--
>  mm/rmap.c                                    |   6 +
>  mm/userfaultfd.c                             |  92 +++++++-
>  tools/testing/selftests/vm/userfaultfd.c     | 222 ++++++++++++++-----
>  49 files changed, 898 insertions(+), 251 deletions(-)
>  create mode 100644 include/asm-generic/pgtable_uffd.h
> 

Does this series fix the "false positives" case I experienced on early
prototypes of uffd-wp? (getting notified about a write access although
it was not a write access?)

-- 

Thanks,

David / dhildenb

Reply via email to