Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-11 Thread Peter Xu
On Thu, Apr 11, 2024 at 06:55:44PM +0200, Paolo Bonzini wrote: > On Mon, Apr 8, 2024 at 3:56 PM Peter Xu wrote: > > Paolo, > > > > I may miss a bunch of details here (as I still remember some change_pte > > patches previously on the list..), however not sure wheth

Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-08 Thread Peter Xu
ecause I remember Andrea used to have a custom tree maintaining that part: https://github.com/aagit/aa/commit/c761078df7a77d13ddfaeebe56a0f4bc128b1968 Maybe it can't be enabled for some reason that I overlooked in the current tree, or we just decided not to? Thanks, -- Peter Xu

Re: [PATCH v4 09/10] userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_pte()

2021-04-20 Thread Peter Xu
th shared and private VMAs. > */ > -static int mcopy_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd, > - struct vm_area_struct *dst_vma, > - unsigned long dst_addr, struct page *page, > - bool newly_allocated, bool wp_copy) > +int mcopy_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd, > + struct vm_area_struct *dst_vma, > + unsigned long dst_addr, struct page *page, > + bool newly_allocated, bool wp_copy) > { > int ret; > pte_t _dst_pte, *dst_pte; > -- > 2.31.1.368.gbe11c130af-goog > -- Peter Xu

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 06:24:50PM +0200, Paolo Bonzini wrote: > On 20/04/21 17:32, Peter Xu wrote: > > On Tue, Apr 20, 2021 at 10:37:39AM -0400, Peter Xu wrote: > > > On Tue, Apr 20, 2021 at 04:16:14AM -0400, Paolo Bonzini wrote: > > > > The main thread could sta

[PATCH v4 2/2] KVM: selftests: Wait for vcpu thread before signal setup

2021-04-20 Thread Peter Xu
on receiving a SIG_USR1 without a handler (when vcpu runs far slower than main). Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 8 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm

[PATCH v4 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-20 Thread Peter Xu
3641.23742-1-pet...@redhat.com/ [2] https://lore.kernel.org/lkml/20210417140956.GV4440@xz-x1/ Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Andrew Jones Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 62 1 file changed, 51 insertions(+), 11 deleti

[PATCH v4 0/2] KVM: selftests: fix races in dirty log test

2021-04-20 Thread Peter Xu
/20210420081614.684787-1-pbonz...@redhat.com/ Peter Xu (2): KVM: selftests: Sync data verify of dirty logging with guest sync KVM: selftests: Wait for vcpu thread before signal setup tools/testing/selftests/kvm/dirty_log_test.c | 70 +--- 1 file changed, 59 insertions(+), 11 deletions

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 10:37:39AM -0400, Peter Xu wrote: > On Tue, Apr 20, 2021 at 04:16:14AM -0400, Paolo Bonzini wrote: > > The main thread could start to send SIG_IPI at any time, even before signal > > blocked on vcpu thread. Therefore, start the vcpu thread with the sig

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
_log_test could fail directly > on receiving a SIGUSR1 without a handler (when vcpu runs far slower than > main). > > Reported-by: Peter Xu > Cc: sta...@vger.kernel.org > Signed-off-by: Paolo Bonzini Yes, indeed better! :) Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 10:07:16AM +0200, Paolo Bonzini wrote: > On 18/04/21 14:43, Peter Xu wrote: > > 8<- > > diff --git a/tools/testing/selftests/kvm/dirty_log_test.c > > b/tools/testing/selftests/kvm/dirty_log_test.c > > index 25230e799bc4..d3050d1c2cd0

Re: [PATCH] sched/isolation: don't do unbounded chomp on bootarg string

2021-04-19 Thread Peter Xu
it seems still the only place to set the new flag HK_FLAG_MANAGED_IRQ. If one day we'll finally obsolete isolcpus= we may need to think about where to put it? When I looked at it, I also noticed I see no caller to set HK_FLAG_SCHED at all. Is it really used anywhere? Regarding this patch...

Re: [PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-18 Thread Peter Xu
On Sat, Apr 17, 2021 at 10:36:01AM -0400, Peter Xu wrote: > This fixes a bug that can trigger with e.g. "taskset -c 0 ./dirty_log_test" or > when the testing host is very busy. > > A similar previous attempt is done [1] but that is not enough, the reason is >

[PATCH v3 2/2] KVM: selftests: Wait for vcpu thread before signal setup

2021-04-17 Thread Peter Xu
on receiving a SIG_USR1 without a handler (when vcpu runs far slower than main). Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 8 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm

[PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-17 Thread Peter Xu
3641.23742-1-pet...@redhat.com/ [2] https://lore.kernel.org/lkml/20210417140956.GV4440@xz-x1/ Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Andrew Jones Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 60 1 file changed, 50 insertions(+), 10 deleti

[PATCH v3 0/2] KVM: selftests: fix races in dirty log test

2021-04-17 Thread Peter Xu
this patch: (1) while :; do taskset -c 1 ./dirty_log_test; done (2) taskset -c 1 bash -c "while :; do :; done" Review comments are greatly welcomed. Thanks, [1] https://lore.kernel.org/lkml/20210413213641.23742-1-pet...@redhat.com/ Peter Xu (2): KVM: selftests: Sync data verif

Re: [PATCH v2] kvm/selftests: Fix race condition with dirty_log_test

2021-04-17 Thread Peter Xu
ay. I tested longer yesterday but haven't updated this patch yet. More below. On Sat, Apr 17, 2021 at 02:59:48PM +0200, Paolo Bonzini wrote: > On 13/04/21 23:36, Peter Xu wrote: > > This patch closes this race by allowing the main thread to give the vcpu > > thread > > chan

Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI

2021-04-16 Thread Peter Xu
hat any subsequently armed timers on > CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. > > Signed-off-by: Marcelo Tosatti > > --- > > v5: > - Add missing hrtimer_update_base (Peter Xu). > > v4: >- Drop unused code (Thomas). > >

Re: [PATCH v2 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-14 Thread Peter Xu
the pte (in 4/9) will do its > shmem_getpage_gfp(), and that will bring in the swap if user > did not already do so: so I was wrong to claim more robustness > the other way, this placement should be fine. I think. > > > if (xa_is_value(page)) { > > error = shmem_swapin_page(inode, index, &page, > > sgp, gfp, vma, fault_type); > > -- > > 2.31.1.295.g9ea45b61b8-goog > -- Peter Xu

[PATCH v2] kvm/selftests: Fix race condition with dirty_log_test

2021-04-13 Thread Peter Xu
d this specific race condition. Cc: Andrew Jones Cc: Paolo Bonzini Cc: Vitaly Kuznetsov Cc: Sean Christopherson Signed-off-by: Peter Xu --- v2: - drop one unnecessary check on "!matched" --- tools/testing/selftests/kvm/dirty_log_test.c | 53 +++- 1 file changed, 52

[PATCH] kvm/selftests: Fix race condition with dirty_log_test

2021-04-13 Thread Peter Xu
d this specific race condition. Cc: Andrew Jones Cc: Paolo Bonzini Cc: Vitaly Kuznetsov Cc: Sean Christopherson Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 54 +++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/tools/testing/sel

Re: [PATCH v2 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-13 Thread Peter Xu
UFFDIO_CONTINUE ioctl for shmem-backed > minor faults, though, so userspace doesn't yet have a way to resolve > such faults. > > Signed-off-by: Axel Rasmussen Everything looks right to me, but it'll be great if Andrea or Hugh will have a look too. Acked-by: Peter Xu -- Peter Xu

Re: [PATCH v2 6/9] userfaultfd/selftests: create alias mappings in the shmem test

2021-04-13 Thread Peter Xu
;); > + > + if (is_src) > + area_src_alias = area_alias; > + else > + area_dst_alias = area_alias; > +} It would be nice if shmem_allocate_area() could merge with hugetlb_allocate_area() somehow, but not that urgent. Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v2 5/9] userfaultfd/selftests: use memfd_create for shmem test type

2021-04-13 Thread Peter Xu
argv[] so we actually print out the > hugetlb file path. > > Signed-off-by: Axel Rasmussen Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v2 7/9] userfaultfd/selftests: reinitialize test context in each test

2021-04-13 Thread Peter Xu
init() at the entry of each test, and clear() after finish one test? > + > uffdio_register.range.start = (unsigned long) area_dst; > uffdio_register.range.len = nr_pages * page_size; > uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; The rest looks good to me. Thanks, -- Peter Xu

Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem

2021-04-13 Thread Peter Xu
On Mon, Apr 12, 2021 at 09:40:22PM -0700, Axel Rasmussen wrote: > On Mon, Apr 12, 2021 at 4:17 PM Peter Xu wrote: > > > > On Thu, Apr 08, 2021 at 04:43:22PM -0700, Axel Rasmussen wrote: > > > +/* > > > + * Install PTEs, to map dst_addr (within dst_vma) to page.

Re: [PATCH v4] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-12 Thread Peter Xu
On Mon, Apr 12, 2021 at 05:51:14PM -0700, Hugh Dickins wrote: > On Mon, 12 Apr 2021, Peter Xu wrote: > > On Tue, Apr 06, 2021 at 11:14:30PM -0700, Hugh Dickins wrote: > > > > +static int mcopy_atomic_install_ptes(struct mm_struct *dst_mm, pm

Re: [PATCH 2/9] userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte

2021-04-12 Thread Peter Xu
. Then it'll be further passed into shmem_mcopy_atomic_pte() now after this patch (as shmem_mfill_zeropage_pte() probably only did one thing good which is to clear src_addr). Not a big deal, though. All the rest looks sane to me. Reviewed-by: Peter Xu I'll wait to look at the selftests since

Re: [PATCH 1/9] userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h

2021-04-12 Thread Peter Xu
unsigned long address, unsigned int flags); > #ifdef CONFIG_USERFAULTFD > +enum mcopy_atomic_mode; (I'm not 100% sure, but.. maybe this can be moved even out of ifdef? Then you can define it once at the top rather than twice?) Reviewed-by: Peter Xu -- Peter Xu

[PATCH v2 4/5] userfaultfd/selftests: Only dump counts if mode enabled

2021-04-12 Thread Peter Xu
WP and MINOR modes are conditionally enabled on specific memory types. This patch avoids dumping tons of zeros for those cases when the modes are not supported at all. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 30

[PATCH v2 2/5] userfaultfd/selftests: Remove the time() check on delayed uffd

2021-04-12 Thread Peter Xu
latency of resolving thread. It may not mean an issue with uffd. Nor have I seen this error triggered in past runs. Even if it triggers, it'll be drowned in the rest of the test logs. Remove it. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm

[PATCH v2 5/5] userfaultfd/selftests: Unify error handling

2021-04-12 Thread Peter Xu
Introduce err()/_err() and replace all the different ways to fail the program, mostly "fprintf" and "perror" with tons of exit() calls. Always stop the test program at any failure. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm

[PATCH v2 1/5] userfaultfd/selftests: Use user mode only

2021-04-12 Thread Peter Xu
Userfaultfd selftest does not need to handle kernel initiated fault. Set user mode so it can be run even if unprivileged_userfaultfd=0 (which is the default). Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 2 +- 1 file changed, 1 insertion

[PATCH v2 3/5] userfaultfd/selftests: Dropping VERIFY check in locking_thread

2021-04-12 Thread Peter Xu
the fault flag - just do it unconditionally. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 55 +--- 1 file changed, 1 insertion(+), 54 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing

[PATCH v2 0/5] userfaultfd/selftests: A few cleanups

2021-04-12 Thread Peter Xu
selftest on fault handling, to use an err() macro instead of either fprintf() or perror() then another exit() call. The huge cleanup is done in the last patch. The first 4 patches are some other standalone cleanups for the same file, so I put them together. Please review, thanks. Peter Xu (5

Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem

2021-04-12 Thread Peter Xu
; put_page(page); page = NULL; hindex = index; } I think it won't happen for your case since the page should be uptodate already (the other thread should check and modify the page before CONTINUE), but still raise this up, since if the page was allocated it smells better to still install the fallocated page (do we need to clear the page and SetUptodate)? -- Peter Xu

Re: [PATCH v4] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-12 Thread Peter Xu
irty(pte)) pte = pte_mkdirty(pte); pte = clear_pte_bit(pte, __pgprot(PTE_WRITE)); pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); return pte; } So arm64 will explicitly set the dirty bit (from the HW dirty bit) when wr-protect. It seems to prove that at least for arm64 it's very valid to have !write && dirty pte. Thanks, -- Peter Xu
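
The point made above (a pte can legitimately be dirty while not writable) can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: the bit positions are illustrative values modeled on arm64, the definition of pte_hw_dirty() as "writable and not read-only" is an assumption about how hardware dirty state is encoded, and pte_t is reduced to a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Illustrative bit positions (assumptions, modeled on arm64). */
#define PTE_RDONLY (UINT64_C(1) << 7)   /* hardware read-only */
#define PTE_WRITE  (UINT64_C(1) << 51)  /* writable */
#define PTE_DIRTY  (UINT64_C(1) << 55)  /* software dirty */

bool pte_write(pte_t pte)    { return (pte & PTE_WRITE) != 0; }
bool pte_sw_dirty(pte_t pte) { return (pte & PTE_DIRTY) != 0; }

/* Assumed encoding: writable and not read-only means hardware-dirty. */
bool pte_hw_dirty(pte_t pte)
{
	return pte_write(pte) && !(pte & PTE_RDONLY);
}

bool pte_dirty(pte_t pte)
{
	return pte_sw_dirty(pte) || pte_hw_dirty(pte);
}

/*
 * Write-protect a pte. The hardware dirty state would be lost once
 * PTE_WRITE is cleared and PTE_RDONLY is set, so it is first transferred
 * into the software dirty bit.
 */
pte_t pte_wrprotect(pte_t pte)
{
	if (pte_hw_dirty(pte))
		pte |= PTE_DIRTY;
	pte &= ~PTE_WRITE;
	pte |= PTE_RDONLY;
	assert(!pte_write(pte));
	return pte;
}
```

In this model, write-protecting a hardware-dirty entry yields an entry for which pte_write() is false and pte_dirty() is true, which is the "!write && dirty" combination the message refers to.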

Re: [PATCH 0/9] userfaultfd: add minor fault handling for shmem

2021-04-09 Thread Peter Xu
icts in my tree? > > It's true that we haven't tested the hugetlbfs minor faults patch > extensively *with the shmem one also applied*, but it has had more > thorough review than the shmem one at this point (e.g. by Mike > Kravetz), and they're rather separate code paths (I'd be surprised if > one breaks the other). Yes I think the hugetlb part should have got more review done. IMHO it's a matter of whether Mike would still like to do a more thorough review, or seems okay to keep them. I can repost the selftest series later if needed, as long as I figured which is the suitable base commit. Those selftest patches are definitely not urgent for this release, so we can wait for the next release. Thanks, -- Peter Xu

Re: [PATCH 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-09 Thread Peter Xu
awkward to swapin here. Maybe move this chunk to right after pagecache_get_page() returns? Then no need to touch the rest. > + > + if (swapped) > + return 0; > + > if (page) > hindex = page->index; > if (page && sgp == SGP_WRITE) > -- > 2.31.1.295.g9ea45b61b8-goog > -- Peter Xu

Re: [PATCH 0/5] 4.14 backports of fixes for "CoW after fork() issue"

2021-04-07 Thread Peter Xu
/git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=mapcount_deshare=7c3a31caa34ac6ac4a4ec0559b1307b5edfc0821 [4] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=mapcount_deshare=599aa62474f51a470408b28fd4365320a5357aca -- Peter Xu

Re: [PATCH v5] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-06 Thread Peter Xu
> } else { > VM_WARN_ON_ONCE(wp_copy); > err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, > src_addr, mode, page); > } > > +out: > return err; > } > > diff --git a/tools/testing/selftests/vm/userfaultfd.c > b/tools/testing/selftests/vm/userfaultfd.c > index f6c86b036d0f..d8541a59dae5 100644 > --- a/tools/testing/selftests/vm/userfaultfd.c > +++ b/tools/testing/selftests/vm/userfaultfd.c > @@ -485,6 +485,7 @@ static void wp_range(int ufd, __u64 start, __u64 len, > bool wp) > static void continue_range(int ufd, __u64 start, __u64 len) > { > struct uffdio_continue req; > + int ret; > > req.range.start = start; > req.range.len = len; > @@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 > len) > if (ioctl(ufd, UFFDIO_CONTINUE, &req)) > err("UFFDIO_CONTINUE failed for address 0x%" PRIx64, > (uint64_t)start); > + > + /* > + * Error handling within the kernel for continue is subtly different > + * from copy or zeropage, so it may be a source of bugs. Trigger an > + * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG. > + */ > + req.mapped = 0; > + ret = ioctl(ufd, UFFDIO_CONTINUE, &req); > + if (ret >= 0 || req.mapped != -EEXIST) > + err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, > mapped=%" PRId64, > + ret, req.mapped); > } > > static void *locking_thread(void *arg) > -- > 2.31.0.208.g409f899ff0-goog > -- Peter Xu

Re: [PATCH 0/5] 4.14 backports of fixes for "CoW after fork() issue"

2021-04-01 Thread Peter Xu
and the map? */ > > if (page_mapcount(page) == 1 && page_count(page) > 2) > > goto keep_locked; > > > > in the pre-pinning days. > > > > But I really think that there are a number of other commits you're > > missing too, bec

Re: [PATCH v3] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-31 Thread Peter Xu
nt, since we _know_ the page cache is there.. So I'm thinking maybe you need to handle the continue request in mfill_atomic_pte() before the VM_SHARED check so as to cover both cases. -- Peter Xu

Re: [PATCH v3] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-30 Thread Peter Xu
set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); /* No need to invalidate - it was non-present before */ update_mmu_cache(dst_vma, dst_addr, dst_pte); pte_unmap_unlock(dst_pte, ptl); return 0; } Then at the entry of shmem_mcopy_atomic_pte(): if (is_continue) { page = find_lock_page(mapping, pgoff); if (!page) return -EFAULT; ret = shmem_install_uffd_pte(..., is_continue && !(dst_vma->vm_flags & VM_SHARED)); unlock_page(page); if (ret) put_page(page); return ret; } Do you think this would be cleaner? -- Peter Xu

Re: [PATCH v2] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-29 Thread Peter Xu
atic void continue_range(int ufd, __u64 start, __u64 > len) > if (ioctl(ufd, UFFDIO_CONTINUE, &req)) > err("UFFDIO_CONTINUE failed for address 0x%" PRIx64, > (uint64_t)start); > + > + /* > + * Error handling within the kernel for continue is subtly different > + * from copy or zeropage, so it may be a source of bugs. Trigger an > + * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG. > + */ > + req.mapped = 0; > + ret = ioctl(ufd, UFFDIO_CONTINUE, &req); > + if (ret >= 0 || req.mapped != -EEXIST) > + err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, > mapped=%" PRId64, > + ret, req.mapped); > } > > static void *locking_thread(void *arg) > -- > 2.31.0.291.g576ba9dcdaf-goog > -- Peter Xu

[PATCH v5 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-29 Thread Peter Xu
Userfaultfd write-protect mode is supported starting from Linux 5.7. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 84 ++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/man2/ioctl_userfaultfd.2 b/man2

[PATCH v5 3/4] ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-29 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported in Linux 4.14. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 5 + 1 file changed, 5 insertions(+) diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2 index 47ae5f473..d4a8375b8 100644 --- a/man2

[PATCH v5 0/4] man2: udpate mm/userfaultfd manpages to latest

2021-03-29 Thread Peter Xu
ly after the whole hugetlbfs/shmem minor mode reaches the linux master branch. Please review, thanks. Peter Xu (4): userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs userfaultfd.2: Add write-protect mode ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs ioctl_userfaultfd.2: Add write-protec

[PATCH v5 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-29 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported since Linux 4.14. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 13 + 1 file changed, 13 insertions(+) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index e7dc9f813..5c41e4816 100644 --- a/man2/userfaultfd.2

[PATCH v5 2/4] userfaultfd.2: Add write-protect mode

2021-03-29 Thread Peter Xu
Write-protect mode is supported starting from Linux 5.7. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 108 +++-- 1 file changed, 104 insertions(+), 4 deletions(-) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index

Re: [PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-29 Thread Peter Xu
On Thu, Mar 25, 2021 at 10:32:20PM +0100, Alejandro Colomar (man-pages) wrote: > Hi Peter, > > On 3/23/21 8:16 PM, Peter Xu wrote: > > On Tue, Mar 23, 2021 at 07:11:04PM +0100, Alejandro Colomar (man-pages) > > wrote: > > > > +.TP > > > > +.B UFFDIO_

Re: [PATCH] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE error handling + accounting

2021-03-27 Thread Peter Xu
only if both WRITE|SHARED are set in the vma flags. E.g., shmem_mcopy_atomic_pte() of a normal uffdio-copy will fill in the page cache into pte, however what if this mapping is privately mapped? IMHO we can't apply write bit otherwise the process will be writing to the page cache directly. However I think that question will be irrelevant to this patch. Thanks, -- Peter Xu

Re: [PATCH] userfaultfd/shmem: fix minor fault page leak

2021-03-24 Thread Peter Xu
ifferent commit ID here: commit 63c826b1372c4930f89b8a55092699fa7f0d6f4e Author: Axel Rasmussen Date: Thu Mar 18 10:20:43 2021 -0400 userfaultfd: support minor fault handling for shmem Axel, did you fetch the commit ID from your local tree, perhaps? Since I should have fetched from hnaz/linux-mm and I can see Andrew's sign-off too. Thanks, -- Peter Xu

Re: [PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-23 Thread Peter Xu
rn -EFAULT; But I didn't check other places, generally I'd return -EFAULT if I can't find a proper other replacement which has a clearer meaning. I don't think this is really helpful to user app too because no user app would start to read this -EFAULT to do anything useful.. how about I drop it too if you think the description is confusing? Thanks, -- Peter Xu

Re: [PATCH v4 2/4] userfaultfd.2: Add write-protect mode

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 07:19:12PM +0100, Alejandro Colomar (man-pages) wrote: > Hi Peter, > > Please see a few more comments below. > > Thanks, > > Alex > > On 3/22/21 11:08 PM, Peter Xu wrote: > > Write-protect mode is supported starting from Linux 5.7.

Re: [PATCH 07/23] mm: Introduce zap_details.zap_flags

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 02:11:29AM +, Matthew Wilcox wrote: > On Mon, Mar 22, 2021 at 08:48:56PM -0400, Peter Xu wrote: > > +/* Whether to check page->mapping when zapping */ > > +#define ZAP_FLAG_CHECK_MAPPING BIT(0) > > + > > /* >

Re: [PATCH 02/23] mm: Clear vmf->pte after pte_unmap_same() returns

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 10:34:45AM +0800, Miaohe Lin wrote: > Hi: > On 2021/3/23 8:48, Peter Xu wrote: > > pte_unmap_same() will always unmap the pte pointer. After the unmap, > > vmf->pte > > will not be valid any more. We should clear it. > > > > It wa

Re: [PATCH v4 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 10:27:34AM +0200, Mike Rapoport wrote: > On Mon, Mar 22, 2021 at 06:08:45PM -0400, Peter Xu wrote: > > UFFD_FEATURE_THREAD_ID is supported since Linux 4.14. > > > > Signed-off-by: Peter Xu > > --- > > man2/userfaultfd.2 | 13

Re: [PATCH] userfaultfd: Write protect when virtual memory range has no page table entry

2021-03-23 Thread Peter Xu
uming it's a zero page. QEMU plans to fix it using pre-faults as UFFDIO_COPY will complicate the live snapshot framework, but UFFD_FEATURE_WP_UNALLOCATED should be more efficient. It's just that we still need to keep the old behavior. I'll see whether I can prepare a patch for it shortly, with some test case too. Thanks, -- Peter Xu

Re: [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs

2021-03-22 Thread Peter Xu
On Mon, Mar 22, 2021 at 08:48:49PM -0400, Peter Xu wrote: > This patchset is based on tag v5.12-rc3-mmots-2021-03-17-22-26. To run the > selftest, need to apply the two patches to fix minor mode page leak: > > https://lore.kernel.org/lkml/20210322175132.36659-1-pet...@redhat.

[PATCH 23/23] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs

2021-03-22 Thread Peter Xu
/userfaultfd.h header files, because it may cause kernel header update to easily break userspace. Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing

[PATCH 19/23] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler

2021-03-22 Thread Peter Xu
swap pte too just like a none pte. Note that we also need to teach UFFDIO_COPY about this special pte across the code path so that we can safely install a new page at this special pte as long as we know it's a stall entry. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 5 - mm/hugetlb.c

[PATCH 22/23] mm/userfaultfd: Enable write protection for shmem & hugetlbfs

2021-03-22 Thread Peter Xu
with _UFFDIO_WRITEPROTECT too because all existing types now support write protection mode. Since vma_can_userfault() will be used elsewhere, move into userfaultfd_k.h. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 18 -- include/linux/userfaultfd_k.h| 14 ++ include

[PATCH 20/23] hugetlb/userfaultfd: Allow wr-protect none ptes

2021-03-22 Thread Peter Xu
ze fetcher. Signed-off-by: Peter Xu --- mm/hugetlb.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 448ef745d5ee..d4acf9d9d087 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5110,7 +5110,7 @@ uns

[PATCH 21/23] hugetlb/userfaultfd: Only drop uffd-wp special pte if required

2021-03-22 Thread Peter Xu
taken hugetlb fault mutex so that no concurrent page fault would trigger. While the call to hugetlb_vmdelete_list() in hugetlbfs_punch_hole() is not safe. That's why the previous call will be with ZAP_FLAG_DROP_FILE_UFFD_WP, while the latter one won't be able to. Signed-off-by: Peter Xu --- fs

[PATCH 18/23] mm/hugetlb: Introduce huge version of special swap pte helpers

2021-03-22 Thread Peter Xu
This is to let hugetlbfs be prepared to also recognize swap special ptes just like uffd-wp special swap ptes. Signed-off-by: Peter Xu --- mm/hugetlb.c | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fd3e87517e10

[PATCH 17/23] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT

2021-03-22 Thread Peter Xu
This starts from passing cp_flags into hugetlb_change_protection() so hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests. huge_pte_clear_uffd_wp() is introduced to handle the case where the UFFDIO_WRITEPROTECT is requested upon migrating huge page entries. Signed-off-by: Peter Xu

[PATCH 16/23] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP

2021-03-22 Thread Peter Xu
if UFFDIO_COPY_MODE_WP is provided, so that the core mm will know this page contains valid data and never drop it. Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 + include/linux/hugetlb.h | 6 -- mm/hugetlb.c | 22 +- mm/userfaultfd.c

[PATCH 13/23] shmem/userfaultfd: Handle the left-overed special swap ptes

2021-03-22 Thread Peter Xu
userfaultfd itself on either UFFDIO_COPY or handling page faults, so that everything will still work as expected. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 15 +++ mm/shmem.c | 13 - 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b

[PATCH 15/23] hugetlb/userfaultfd: Hook page faults for uffd write protection

2021-03-22 Thread Peter Xu
Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults. We do this slightly earlier than hugetlb_cow() so that we can avoid taking some extra locks that we definitely don't need. Signed-off-by: Peter Xu --- mm/hugetlb.c | 19 +++ 1 file changed, 19

[PATCH 14/23] shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()

2021-03-22 Thread Peter Xu
It should be handled similarly to other uffd-wp wr-protected ptes: we should pass it over when the dst_vma has VM_UFFD_WP armed, otherwise drop it. Signed-off-by: Peter Xu --- mm/memory.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/mm/memory.c b/mm

[PATCH 09/23] mm: Pass zap_flags into unmap_mapping_pages()

2021-03-22 Thread Peter Xu
ING, while even_cow==true means a none zap flag to pass in (though in most cases we have had even_cow==false). No functional change intended. Signed-off-by: Peter Xu --- fs/dax.c | 10 ++ include/linux/mm.h | 4 ++-- mm/khugepaged.c| 3 ++- mm/memory.c| 15 -

[PATCH 11/23] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem

2021-03-22 Thread Peter Xu
t. Note that this patch only covers the small pages (pte level) but not covering any of the transparent huge pages yet. But this will be a base for thps too. Signed-off-by: Peter Xu --- mm/mprotect.c | 48 1 file changed, 48 insertions(+) diff --git a/mm/mp

[PATCH 12/23] shmem/userfaultfd: Allows file-back mem to be uffd wr-protected on thps

2021-03-22 Thread Peter Xu
table lock. Signed-off-by: Peter Xu --- mm/mprotect.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index 6b63e3544b47..51c954afa406 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -296,8 +296,16 @@ static inline unsigned long change_pmd_ra

[PATCH 10/23] shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed

2021-03-22 Thread Peter Xu
, or punching a hole in a shmem file. For the latter, we can only drop the uffd-wp bit when holding the page lock. It means the unmap_mapping_range() in shmem_fallocate() still requires to zap without ZAP_FLAG_DROP_FILE_UFFD_WP because that's still racy with the page faults. Signed-off-by: Peter

[PATCH 07/23] mm: Introduce zap_details.zap_flags

2021-03-22 Thread Peter Xu
, it'll be very easy to grep this information by simply grepping the flag. It'll also make life easier when we want to e.g. pass in zap_flags into the callers like unmap_mapping_pages() (instead of adding new booleans besides the even_cows parameter). Signed-off-by: Peter Xu --- inc

[PATCH 08/23] mm: Introduce ZAP_FLAG_SKIP_SWAP

2021-03-22 Thread Peter Xu
tries"), but introduce ZAP_FLAG_SKIP_SWAP flag, which means the opposite of previous "details" parameter: the caller should explicitly set this to skip swap entries, otherwise swap entries will always be considered (which is still the major case here). Cc: Kirill A. Shutemov Sig

[PATCH 03/23] mm/userfaultfd: Introduce special pte for unmapped file-backed mem

2021-03-22 Thread Peter Xu
k: https://lore.kernel.org/lkml/20201126222359.8120-1-pet...@redhat.com/ Link: https://lore.kernel.org/lkml/20201130230603.46187-1-pet...@redhat.com/ Suggested-by: Andrea Arcangeli Suggested-by: Hugh Dickins Signed-off-by: Peter Xu --- arch/x86/include/asm/pgtable.h

[PATCH 05/23] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler

2021-03-22 Thread Peter Xu
-around with the new flag could confuse all the rest of pages when installing ptes from page cache when there's a cache hit. Signed-off-by: Peter Xu --- include/linux/mm.h| 2 + include/linux/userfaultfd_k.h | 11 mm/memory.c | 103

[PATCH 06/23] mm: Drop first_index/last_index in zap_details

2021-03-22 Thread Peter Xu
zap_details and let them simply be parameters of unmap_mapping_range_tree(), which is inlined. Signed-off-by: Peter Xu --- include/linux/mm.h | 2 -- mm/memory.c| 20 ++-- 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index

[PATCH 04/23] mm/swap: Introduce the idea of special swap ptes

2021-03-22 Thread Peter Xu
operly (e.g., in do_swap_page()) when we see a special swap pte - we should never call do_swap_page() upon those ptes, but just to bail out early if it happens. Signed-off-by: Peter Xu --- arch/arm64/kernel/mte.c | 2 +- fs/proc/task_mmu.c | 14 -- include/linux/swapops.h | 39 +++

[PATCH 02/23] mm: Clear vmf->pte after pte_unmap_same() returns

2021-03-22 Thread Peter Xu
f->pte first. Or, alloc_set_pte() will make sure to allocate a new pte even after calling pte_unmap_same(). Since we'll need to modify vmf->pte, directly pass in vmf into pte_unmap_same() and then we can also avoid the long parameter list. Signed-off-by: Peter Xu --- mm/memory.c | 13 +++

[PATCH 01/23] shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP

2021-03-22 Thread Peter Xu
or uffd-wp, that could lead to data loss without the dirty bit set. Note that shmem_mfill_zeropage_pte() will always call shmem_mfill_atomic_pte() with wp_copy==false because UFFDIO_ZEROPAGE does not support UFFDIO_COPY_MODE_WP. Signed-off-by: Peter Xu --- include/linux/shmem_fs.h | 5 +++--

[PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs

2021-03-22 Thread Peter Xu
efault umap only supports anonymous. So to test it we need to build [3] then [2]. Any comment would be greatly welcomed. Thanks, [1] https://github.com/xzpeter/linux/tree/uffd-wp-shmem-hugetlbfs [2] https://github.com/LLNL/umap-apps [3] https://github.com/xzpeter/umap/tree/peter-shmem-hugetlbfs Pe

[PATCH v4 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-22 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported since Linux 4.14. Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 13 + 1 file changed, 13 insertions(+) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index e7dc9f813..555e37409 100644 --- a/man2/userfaultfd.2 +++ b/man2/userfaultfd.2

[PATCH v4 2/4] userfaultfd.2: Add write-protect mode

2021-03-22 Thread Peter Xu
Write-protect mode is supported starting from Linux 5.7. Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 104 - 1 file changed, 102 insertions(+), 2 deletions(-) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index 555e37409..8ad4a71b5 100644

[PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-22 Thread Peter Xu
Userfaultfd write-protect mode is supported starting from Linux 5.7. Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 84 ++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2 index

[PATCH v4 3/4] ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-22 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported in Linux 4.14. Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 5 + 1 file changed, 5 insertions(+) diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2 index 47ae5f473..d4a8375b8 100644 --- a/man2/ioctl_userfaultfd.2 +++ b/man2

[PATCH v4 0/4] man2: update mm/userfaultfd manpages to latest

2021-03-22 Thread Peter Xu
o features missing in current manpage, namely: (1) Userfaultfd Thread-ID feature (2) Userfaultfd write protect mode There's also a 3rd one which was just contributed from Axel - Axel, I think it would be great if you can add that part too, probably after the whole hugetlbfs/shmem minor mode re

Re: [PATCH v3 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-22 Thread Peter Xu
3. Thanks for looking, I'll repost shortly. -- Peter Xu

Re: [PATCH] userfaultfd/shmem: fix minor fault page leak

2021-03-22 Thread Peter Xu
g for shmem") > Signed-off-by: Axel Rasmussen Reviewed-by: Peter Xu -- Peter Xu

[PATCH] userfaultfd/hugetlbfs: Fix minor fault page leak

2021-03-22 Thread Peter Xu
: Mike Kravetz Cc: Mike Rapoport Cc: Andrew Morton Fixes: f2bf15fb0969 ("userfaultfd: add minor fault registration mode") Signed-off-by: Peter Xu --- mm/hugetlb.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 408dbc08298a..56b78a206913 10064

Re: [PATCH] userfaultfd: Write protect when virtual memory range has no page table entry

2021-03-22 Thread Peter Xu
tain the same semantic as UFFDIO_ZEROPAGE so less data copy too (UFFDIO_ZEROPAGE does not support UFFDIO_COPY_MODE_WP so far). However we need to be careful about mixed use of these, e.g., I think UFFD_FEATURE_WP_UNALLOCATED at least shouldn't be allowed with UFFDIO_REGISTER_MODE_MISSING, otherwise the

Re: [PATCH v3 0/6] Some cleanups for huge_memory

2021-03-18 Thread Peter Xu
n bool > mm/huge_memory.c: rework the function do_huge_pmd_numa_page() slightly > mm/huge_memory.c: remove redundant PageCompound() check > mm/huge_memory.c: remove unused macro > TRANSPARENT_HUGEPAGE_DEBUG_COW_FLAG > mm/huge_memory.c: use helper function migration_entry_to_page() Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v2 1/2] mm: Allow non-VM_DONTEXPAND and VM_PFNMAP mappings with MREMAP_DONTUNMAP

2021-03-17 Thread Peter Xu
er in that case in vma_to_resize() we'll bail out even earlier than line 676 when checking against the size: https://elixir.bootlin.com/linux/v5.12-rc3/source/mm/mremap.c#L667 So IIUC we'll still need the change as Hugh suggested previously. Thanks, -- Peter Xu

Re: [PATCH v2 1/2] mm: Allow non-VM_DONTEXPAND and VM_PFNMAP mappings with MREMAP_DONTUNMAP

2021-03-17 Thread Peter Xu
vma->vm_flags & VM_SHARED)) > - return ERR_PTR(-EINVAL); > - > if (is_vm_hugetlb_page(vma)) > return ERR_PTR(-EINVAL); The code change seems to be not aligned with what the commit message said. Did you perhaps forget to add the checks against VM_DONTEXPAND | VM_PFNMAP? I'm guessing that (instead of commit message to be touched up) because you still attached the revert patch, then that check seems to be needed. Thanks, -- Peter Xu

Re: [PATCH v2 1/6] mm/huge_memory.c: rework the function vma_adjust_trans_huge()

2021-03-17 Thread Peter Xu
On Wed, Mar 17, 2021 at 10:18:40AM +0800, Miaohe Lin wrote: > Hi: > On 2021/3/17 4:40, Peter Xu wrote: > > On Tue, Mar 16, 2021 at 08:40:02AM -0400, Miaohe Lin wrote: > >> +static inline void split_huge_pmd_if_needed(struct vm_area_struct *vma, >

Re: [PATCH v2 1/6] mm/huge_memory.c: rework the function vma_adjust_trans_huge()

2021-03-16 Thread Peter Xu
be use ALIGN/ALIGN_DOWN too against HPAGE_PMD_SIZE? > + split_huge_pmd_address(vma, address, false, NULL); > +} -- Peter Xu
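For readers unfamiliar with the ALIGN/ALIGN_DOWN suggestion in this reply, here is a rough userspace sketch of the rounding those kernel macros perform for power-of-two sizes. The helper names are hypothetical, and the 2 MiB value for the huge page size is an assumption (it matches x86-64 with 4 KiB base pages; other configurations differ).

```c
#include <assert.h>

/* Assumed value: 2 MiB, the PMD-level huge page size on x86-64. */
#define SKETCH_HPAGE_PMD_SIZE (1UL << 21)

/* Round x down to a multiple of a; a must be a power of two. */
static unsigned long sketch_align_down(unsigned long x, unsigned long a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return x & ~(a - 1);
}

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long sketch_align_up(unsigned long x, unsigned long a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}
```

An address that is already a multiple of the size is returned unchanged by both helpers, so comparing an address with its rounded-down value is one way to test whether it sits on a huge page boundary.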

Re: [PATCH] mm: Allow shmem mappings with MREMAP_DONTUNMAP

2021-03-16 Thread Peter Xu
though > vma_is_anonymous() will no longer protect it. > > Was there an mremap(2) man page update for MREMAP_DONTUNMAP? > Whether or not there was before, it ought to get one now. I'm curious whether it's okay to expand MREMAP_DONTUNMAP to PFNMAP too.. E.g. vfio maps device MMIO regions with both VM_DONTEXPAND|VM_PFNMAP, to me it makes sense to allow the userspace to get such MMIO region remapped/duplicated somewhere else as long as the size won't change. With the strict check as above we kill all those possibilities. Though in that case we'll still need commits like cd544fd1dc92 to protect any customized ->mremap() when they're not supported. Thanks, -- Peter Xu

Re: [PATCH] vfio/pci: Handle concurrent vma faults

2021-03-11 Thread Peter Xu
On Thu, Mar 11, 2021 at 11:35:24AM +, Christoph Hellwig wrote: > On Wed, Mar 10, 2021 at 03:06:07PM -0500, Peter Xu wrote: > > On Wed, Mar 10, 2021 at 02:40:11PM -0400, Jason Gunthorpe wrote: > > > On Wed, Mar 10, 2021 at 11:34:06AM -0700, Alex Williamson wrote: > > &g

[PATCH v3 3/4] ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-10 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported in Linux 4.14. Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 5 + 1 file changed, 5 insertions(+) diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2 index 47ae5f473..d4a8375b8 100644 --- a/man2/ioctl_userfaultfd.2 +++ b/man2
