Re: [PATCH v3 1/6] mm: userfaultfd: generic continue for non hugetlbfs

2025-06-11 Thread Peter Xu
On Wed, Jun 11, 2025 at 01:09:32PM +0100, Nikita Kalyazin wrote: > > > On 10/06/2025 23:22, Peter Xu wrote: > > On Fri, Apr 04, 2025 at 03:43:47PM +, Nikita Kalyazin wrote: > > > Remove shmem-specific code from UFFDIO_CONTINUE implementation for > > > non-hu

Re: [PATCH v3 1/6] mm: userfaultfd: generic continue for non hugetlbfs

2025-06-10 Thread Peter Xu
lly not useful to non-userfault users, meanwhile we also don't need to hand-cook the vm_fault struct below just to suit the current fault() interface. Thanks, -- Peter Xu

Re: [PATCH v3 4/6] KVM: guest_memfd: add support for userfaultfd minor

2025-06-10 Thread Peter Xu
goto out_unlock; I thought it would fail guest-memfd already on a CONTINUE request, and it doesn't seem to be touched yet in this series. I'm not yet sure how the test worked out without hitting things like it. Highly likely I missed something. Some explanations would be welcomed.. Thanks, > vmf->page = folio_file_page(folio, vmf->pgoff); > > out_folio: > -- > 2.47.1 > -- Peter Xu

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-15 Thread Peter Xu
On Thu, Mar 13, 2025 at 03:25:16PM +, Nikita Kalyazin wrote: > > > On 12/03/2025 19:32, Peter Xu wrote: > > On Wed, Mar 12, 2025 at 05:07:25PM +, Nikita Kalyazin wrote: > > > However if MISSING is not registered, the kernel will auto-populate with a > > >

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-15 Thread Peter Xu
also prefault by writing zeros in a loop after mmap(). Thanks, -- Peter Xu
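
The prefault idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the mapping here is anonymous shared memory standing in for whatever the thread actually maps, and the helper names prefault_zero() and map_and_prefault() are hypothetical, not taken from the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write a zero byte to the first byte of every page in [addr, addr + len),
 * so that each page is faulted in up front rather than on first real use.
 * Returns the number of pages touched.
 */
size_t prefault_zero(void *addr, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	volatile unsigned char *p = addr;
	size_t touched = 0;

	assert(addr != NULL);
	for (size_t off = 0; off < len; off += page) {
		p[off] = 0;	/* the write triggers the page fault */
		touched++;
	}
	return touched;
}

/*
 * Hypothetical helper: map len bytes (anonymous memory is an assumption,
 * used as a stand-in) and prefault them. Returns NULL if mmap() fails.
 */
void *map_and_prefault(size_t len)
{
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (addr == MAP_FAILED)
		return NULL;
	prefault_zero(addr, len);
	return addr;
}
```

A caller would typically invoke map_and_prefault() once right after setup; one write per page is enough because a fault populates a whole page.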

Re: [RFC PATCH 14/39] KVM: guest_memfd: hugetlb: initialization and cleanup

2025-03-15 Thread Peter Xu
ould raise this up here anyway at least as a pure question. > + kvm_gmem_hugetlb_filemap_remove_folio(folio); > + mutex_unlock(&hugetlb_fault_mutex_table[hash]); > + > + num_freed++; > + } > + folio_batch_release(&fbatch); > + cond_resched(); > + } > + > + return num_freed; > +} -- Peter Xu

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-14 Thread Peter Xu
-be-posted too). I also have a QEMU branch ready that can boot with it (I didn't yet test more things). https://github.com/xzpeter/qemu/commits/peter-gmem-v0.2/ For example, besides guest-memfd alone, we definitely also need guest-memfd being trappable by userfaultfd, as what you are trying to do here, one way or another. Thanks, -- Peter Xu

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-13 Thread Peter Xu
mmap()ed VAs to NIC as buffers (e.g. in recvmsg(), for example, as part of iovec[]), and as long as the mmap()ed ranges are not registered by KVM memslots, there's no concern on non-atomic copy. Thanks, -- Peter Xu
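
As a rough illustration of handing an mmap()ed range to recvmsg() through iovec[], consider the sketch below. Everything in it is an assumption made for the example: the buffer is plain anonymous memory, an AF_UNIX socketpair stands in for a real NIC path, and the function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Receive up to len bytes from sock directly into an mmap()ed buffer,
 * passing the mapping as the single element of the iovec[] array.
 * Returns the byte count reported by recvmsg(), or -1 on error.
 */
ssize_t recv_into_mapping(int sock, void *mapping, size_t len)
{
	struct iovec iov = { .iov_base = mapping, .iov_len = len };
	struct msghdr msg;

	assert(mapping != NULL);
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	return recvmsg(sock, &msg, 0);
}

/*
 * Hypothetical demo: loop one datagram through a socketpair and check
 * that it landed in the mapping. Returns 0 on success, -1 otherwise.
 */
int demo_recv_into_mapping(void)
{
	static const char payload[] = "hello";
	size_t len = 4096;
	int sv[2];
	int ok = -1;
	char *buf;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == 0) {
		if (send(sv[0], payload, sizeof(payload), 0) == (ssize_t)sizeof(payload) &&
		    recv_into_mapping(sv[1], buf, len) == (ssize_t)sizeof(payload) &&
		    memcmp(buf, payload, sizeof(payload)) == 0)
			ok = 0;
		close(sv[0]);
		close(sv[1]);
	}
	munmap(buf, len);
	return ok;
}
```

The kernel copies into the user VA on behalf of the caller here, which is the kind of access the message contrasts with ranges registered in KVM memslots.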

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-12 Thread Peter Xu
ing trying to access it will be trapped. [1] -- Peter Xu

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-12 Thread Peter Xu
On Tue, Mar 11, 2025 at 04:56:47PM +, Nikita Kalyazin wrote: > > > On 10/03/2025 19:57, Peter Xu wrote: > > On Mon, Mar 10, 2025 at 06:12:22PM +, Nikita Kalyazin wrote: > > > > > > > > > On 05/03/2025 20:29, Peter Xu wrote: > > >

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-10 Thread Peter Xu
On Mon, Mar 10, 2025 at 06:12:22PM +, Nikita Kalyazin wrote: > > > On 05/03/2025 20:29, Peter Xu wrote: > > On Wed, Mar 05, 2025 at 11:35:27AM -0800, James Houghton wrote: > > > I think it might be useful to implement an fs-generic MINOR mode. The > > > fau

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-05 Thread Peter Xu
and when folio lock is frequently taken elsewhere too. It might boil down to how many more FSes would support minor fault, and whether we would in the end care about such a difference for shmem users. If gmem is the only one after the existing ones, IIUC there's still the option to implement it in gmem code. After all, I expect the change should be very under control (<20 LOCs?).. -- Peter Xu

Re: [RFC PATCH 27/39] KVM: guest_memfd: Allow mmapping guest_memfd files

2025-03-04 Thread Peter Xu
rrently it uses a lot of mm functions that are not yet exported, so AFAIU it will only build if kvm is builtin. Thanks, -- Peter Xu

Re: [RFC PATCH 0/5] KVM: guest_memfd: support for uffd missing

2025-03-03 Thread Peter Xu
e I used that to allow gmem to report huge page support on faults. That said, the above only exists in my own tree so far, so I also don't know whether something like that could be accepted (even if it'll work for you). Thanks, -- Peter Xu

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2025-02-25 Thread Peter Xu
m private info. > inode->i_op = &kvm_gmem_iops; > inode->i_mapping->a_ops = &kvm_gmem_aops; > @@ -1097,6 +1178,8 @@ static struct inode > *kvm_gmem_inode_make_secure_inode(const char *name, > > return inode; > > +free_private: > + kfree(private); > out: > iput(inode); > > -- > 2.46.0.598.g6f2099f65c-goog > -- Peter Xu

Re: [PATCH] selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation

2025-02-19 Thread Peter Xu
riginal constraints in place. > > Fixes: 2e47a445d7b3 ("selftests/mm: run_vmtests.sh: fix hugetlb mem size > calculation") > Signed-off-by: Rafael Aquini Oops.. thanks! Reviewed-by: Peter Xu -- Peter Xu

Re: [RFC PATCH 15/39] KVM: guest_memfd: hugetlb: allocate and truncate from hugetlb

2025-02-13 Thread Peter Xu
On Thu, Feb 13, 2025 at 07:52:43AM +, Ackerley Tng wrote: > Peter Xu writes: > > > On Tue, Sep 10, 2024 at 11:43:46PM +, Ackerley Tng wrote: > >> +static struct folio *kvm_gmem_hugetlb_alloc_folio(struct hstate *h, > >> +

Re: [PATCH v1 1/2] mm: Clear uffd-wp PTE/PMD state on mremap()

2025-01-23 Thread Peter Xu
n the comment too in that path: move_normal_pud(): /* * The destination pud shouldn't be established, free_pgtables() * should have released it. */ if (WARN_ON_ONCE(!pud_none(*new_pud))) return false; PMD path has similar implications. Thanks, -- Peter Xu

Re: [RFC PATCH 27/39] KVM: guest_memfd: Allow mmapping guest_memfd files

2025-01-20 Thread Peter Xu
> { > struct kvm_gmem_hugetlb *hgmem; > > + /* TODO: Check if even_cows should be 0 or 1 */ > + unmap_mapping_range(inode->i_mapping, 0, LLONG_MAX, 0); Setting to 0 is ok in both places: even_cows only applies to MAP_PRIVATE, which gmemfd doesn't support. So feel free to drop the two comment lines. Thanks, -- Peter Xu

Re: [PATCH v1 1/2] mm: Clear uffd-wp PTE/PMD state on mremap()

2025-01-15 Thread Peter Xu
rotect API to > userfaultfd ioctl") > Cc: sta...@vger.kernel.org I see nothing wrong: Reviewed-by: Peter Xu One trivial thing: some multi-line comments follow the net/ coding style rather than mm/, but well.. I don't think it's a huge deal. https://www.kernel.org/doc/html/v4.10/process/coding-style.html#commenting Thanks again. -- Peter Xu

Re: [PATCH v1 1/2] mm: Clear uffd-wp PTE/PMD state on mremap()

2025-01-15 Thread Peter Xu
esn't have any acks. I don't suppose you would > be able to do a quick review to calm the nerves?? Heh, I fully trusted you, and I appreciated your help too. I'll need to run for 1-2 hours, but I'll read it this afternoon. Side note: no review is as good as tests on reliability POV if that was the concern, but I'll try my best. Thanks, -- Peter Xu

Re: [RFC PATCH 14/39] KVM: guest_memfd: hugetlb: initialization and cleanup

2024-12-01 Thread Peter Xu
subpool() -> unlock_or_release_subpool(). > + > + spin_lock(&inode->i_lock); > + inode->i_blocks -= blocks_per_huge_page(h) * num_freed; > + spin_unlock(&inode->i_lock); > +} -- Peter Xu

Re: [RFC PATCH 15/39] KVM: guest_memfd: hugetlb: allocate and truncate from hugetlb

2024-12-01 Thread Peter Xu
hugepage_subpool_put_pages(spool, 1); > + > +err_cancel_charge: > + if (memcg_charge_was_prepared) > + mem_cgroup_cancel_charge(memcg, pages_per_huge_page(h)); > + > +err: > + folio = ERR_PTR(-ENOMEM); > + goto out; > +} -- Peter Xu

Re: [PATCH v1 2/2] selftests/mm: fix coccinelle WARNING recommending the use of ARRAY_SIZE()

2024-11-01 Thread Peter Xu
> ./tools/testing/selftests/mm/uffd-unit-tests.c:1485:30-31: WARNING: Use > ARRAY_SIZE > > Fixes: 16a45b57cbf2 ("selftests/mm: add framework for uffd-unit-test") > Cc: Andrew Morton > Cc: Shuah Khan > Cc: Peter Xu > Cc: linux...@kvack.org > Cc: linux-kselft...@vger.ke

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-17 Thread Peter Xu
s all over the places over cgroup/pool/meminfo/etc. -- Peter Xu

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-17 Thread Peter Xu
On Thu, Oct 17, 2024 at 01:47:13PM -0300, Jason Gunthorpe wrote: > On Thu, Oct 17, 2024 at 10:58:29AM -0400, Peter Xu wrote: > > > My question was more towards whether gmemfd could still expose the > > possibility to be used in VA forms to other modules that may not support

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-17 Thread Peter Xu
On Wed, Oct 16, 2024 at 08:54:24PM -0300, Jason Gunthorpe wrote: > On Wed, Oct 16, 2024 at 07:49:31PM -0400, Peter Xu wrote: > > On Wed, Oct 16, 2024 at 07:51:57PM -0300, Jason Gunthorpe wrote: > > > On Wed, Oct 16, 2024 at 04:16:17PM -0400, Peter Xu wrote: > > > >

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-16 Thread Peter Xu
On Wed, Oct 16, 2024 at 07:51:57PM -0300, Jason Gunthorpe wrote: > On Wed, Oct 16, 2024 at 04:16:17PM -0400, Peter Xu wrote: > > > > Is there chance that when !CoCo will be supported, then external modules > > (e.g. VFIO) can reuse the old user mappings, just like befor

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-16 Thread Peter Xu
On Wed, Oct 16, 2024 at 10:45:43AM +0200, David Hildenbrand wrote: > On 16.10.24 01:42, Ackerley Tng wrote: > > Peter Xu writes: > > > > > On Fri, Oct 11, 2024 at 11:32:11PM +, Ackerley Tng wrote: > > > > Peter Xu writes: > > > > > > >

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-15 Thread Peter Xu
On Fri, Oct 11, 2024 at 11:32:11PM +, Ackerley Tng wrote: > Peter Xu writes: > > > On Tue, Sep 10, 2024 at 11:43:57PM +, Ackerley Tng wrote: > >> The faultability xarray is stored on the inode since faultability is a > >> property of the guest_memfd's

Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private

2024-10-10 Thread Peter Xu
-CoCo context for 1G? I saw that you also mentioned you have working QEMU prototypes ready in another email. It'll be great if you can push your kernel/QEMU's latest tree (including all dependency patches) somewhere so anyone can have a closer look, or play with it. Thanks, -- Peter Xu

Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-11 Thread Peter Xu
On Thu, Apr 11, 2024 at 06:55:44PM +0200, Paolo Bonzini wrote: > On Mon, Apr 8, 2024 at 3:56 PM Peter Xu wrote: > > Paolo, > > > > I may miss a bunch of details here (as I still remember some change_pte > > patches previously on the list..), however not sure whether

Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-08 Thread Peter Xu
ked because I remember Andrea used to have a custom tree maintaining that part: https://github.com/aagit/aa/commit/c761078df7a77d13ddfaeebe56a0f4bc128b1968 Maybe it can't be enabled for some reason that I overlooked in the current tree, or we just decided not to? Thanks, -- Peter Xu

Re: [PATCH v4 09/10] userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_pte()

2021-04-20 Thread Peter Xu
ORMAL and _CONTINUE for both > shmem > + * and anon, and for both shared and private VMAs. > */ > -static int mcopy_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd, > - struct vm_area_struct *dst_vma, > - unsigned long dst_addr, struct page *page, > - bool newly_allocated, bool wp_copy) > +int mcopy_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd, > + struct vm_area_struct *dst_vma, > + unsigned long dst_addr, struct page *page, > + bool newly_allocated, bool wp_copy) > { > int ret; > pte_t _dst_pte, *dst_pte; > -- > 2.31.1.368.gbe11c130af-goog > -- Peter Xu

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 06:24:50PM +0200, Paolo Bonzini wrote: > On 20/04/21 17:32, Peter Xu wrote: > > On Tue, Apr 20, 2021 at 10:37:39AM -0400, Peter Xu wrote: > > > On Tue, Apr 20, 2021 at 04:16:14AM -0400, Paolo Bonzini wrote: > > > > The main thread could sta

[PATCH v4 2/2] KVM: selftests: Wait for vcpu thread before signal setup

2021-04-20 Thread Peter Xu
on receiving a SIGUSR1 without a handler (when vcpu runs far slower than main). Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 8 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm

[PATCH v4 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-20 Thread Peter Xu
.org/lkml/20210413213641.23742-1-pet...@redhat.com/ [2] https://lore.kernel.org/lkml/20210417140956.GV4440@xz-x1/ Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Andrew Jones Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 62 1 file changed, 51 inse

[PATCH v4 0/2] KVM: selftests: fix races in dirty log test

2021-04-20 Thread Peter Xu
.kernel.org/kvm/20210420081614.684787-1-pbonz...@redhat.com/ Peter Xu (2): KVM: selftests: Sync data verify of dirty logging with guest sync KVM: selftests: Wait for vcpu thread before signal setup tools/testing/selftests/kvm/dirty_log_test.c | 70 +--- 1 file changed, 59 insertions(+

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 10:37:39AM -0400, Peter Xu wrote: > On Tue, Apr 20, 2021 at 04:16:14AM -0400, Paolo Bonzini wrote: > > The main thread could start to send SIG_IPI at any time, even before signal > > blocked on vcpu thread. Therefore, start the vcpu thread with the sig

Re: [PATCH] KVM: selftests: Always run vCPU thread with blocked SIG_IPI

2021-04-20 Thread Peter Xu
_log_test could fail directly > on receiving a SIGUSR1 without a handler (when vcpu runs far slower than > main). > > Reported-by: Peter Xu > Cc: sta...@vger.kernel.org > Signed-off-by: Paolo Bonzini Yes, indeed better! :) Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-20 Thread Peter Xu
On Tue, Apr 20, 2021 at 10:07:16AM +0200, Paolo Bonzini wrote: > On 18/04/21 14:43, Peter Xu wrote: > > 8<- > > diff --git a/tools/testing/selftests/kvm/dirty_log_test.c > > b/tools/testing/selftests/kvm/dirty_log_test.c > > index 25230e799bc4..d3050d1c2cd0

Re: [PATCH] sched/isolation: don't do unbounded chomp on bootarg string

2021-04-19 Thread Peter Xu
puset. However it seems still the only place to set the new flag HK_FLAG_MANAGED_IRQ. If one day we'll finally obsolete isolcpus= we may need to think about where to put it? When I looked at it, I also noticed I see no caller to set HK_FLAG_SCHED at all. Is it really used anywhere? Reg

Re: [PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-18 Thread Peter Xu
On Sat, Apr 17, 2021 at 10:36:01AM -0400, Peter Xu wrote: > This fixes a bug that can trigger with e.g. "taskset -c 0 ./dirty_log_test" or > when the testing host is very busy. > > A similar previous attempt is done [1] but that is not enough, the reason is > stated in

[PATCH v3 2/2] KVM: selftests: Wait for vcpu thread before signal setup

2021-04-17 Thread Peter Xu
on receiving a SIGUSR1 without a handler (when vcpu runs far slower than main). Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 8 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm

[PATCH v3 1/2] KVM: selftests: Sync data verify of dirty logging with guest sync

2021-04-17 Thread Peter Xu
.org/lkml/20210413213641.23742-1-pet...@redhat.com/ [2] https://lore.kernel.org/lkml/20210417140956.GV4440@xz-x1/ Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Andrew Jones Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 60 1 file changed, 50 inse

[PATCH v3 0/2] KVM: selftests: fix races in dirty log test

2021-04-17 Thread Peter Xu
test this patch: (1) while :; do taskset -c 1 ./dirty_log_test; done (2) taskset -c 1 bash -c "while :; do :; done" Review comments are greatly welcomed. Thanks, [1] https://lore.kernel.org/lkml/20210413213641.23742-1-pet...@redhat.com/ Peter Xu (2): KVM: selftests: Sync data

Re: [PATCH v2] kvm/selftests: Fix race condition with dirty_log_test

2021-04-17 Thread Peter Xu
ay. I tested longer yesterday but haven't updated this patch yet. More below. On Sat, Apr 17, 2021 at 02:59:48PM +0200, Paolo Bonzini wrote: > On 13/04/21 23:36, Peter Xu wrote: > > This patch closes this race by allowing the main thread to give the vcpu > > thread > >

Re: [PATCH v5] hrtimer: avoid retrigger_next_event IPI

2021-04-16 Thread Peter Xu
that any subsequently armed timers on > CLOCK_REALTIME and CLOCK_TAI are evaluated with the correct offsets. > > Signed-off-by: Marcelo Tosatti > > --- > > v5: > - Add missing hrtimer_update_base (Peter Xu). > > v4: >- Drop unused code (Thomas). >

Re: [PATCH v2 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-14 Thread Peter Xu
But I might be slowly > realizing that the ioctl to add the pte (in 4/9) will do its > shmem_getpage_gfp(), and that will bring in the swap if user > did not already do so: so I was wrong to claim more robustness > the other way, this placement should be fine. I think. > > > if (xa_is_value(page)) { > > error = shmem_swapin_page(inode, index, &page, > > sgp, gfp, vma, fault_type); > > -- > > 2.31.1.295.g9ea45b61b8-goog > -- Peter Xu

[PATCH v2] kvm/selftests: Fix race condition with dirty_log_test

2021-04-13 Thread Peter Xu
help avoid this specific race condition. Cc: Andrew Jones Cc: Paolo Bonzini Cc: Vitaly Kuznetsov Cc: Sean Christopherson Signed-off-by: Peter Xu --- v2: - drop one unnecessary check on "!matched" --- tools/testing/selftests/kvm/dirty_log_test.c | 53 +++- 1 file chan

[PATCH] kvm/selftests: Fix race condition with dirty_log_test

2021-04-13 Thread Peter Xu
help avoid this specific race condition. Cc: Andrew Jones Cc: Paolo Bonzini Cc: Vitaly Kuznetsov Cc: Sean Christopherson Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 54 +++- 1 file changed, 53 insertions(+), 1 deletion(-) diff --git a/tools/t

Re: [PATCH v2 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-13 Thread Peter Xu
the UFFDIO_CONTINUE ioctl for shmem-backed > minor faults, though, so userspace doesn't yet have a way to resolve > such faults. > > Signed-off-by: Axel Rasmussen Everything looks right to me, but it'll be great if Andrea or Hugh will have a look too. Acked-by: Peter Xu -- Peter Xu

Re: [PATCH v2 6/9] userfaultfd/selftests: create alias mappings in the shmem test

2021-04-13 Thread Peter Xu
mfd alias failed"); > + > + if (is_src) > + area_src_alias = area_alias; > + else > + area_dst_alias = area_alias; > +} It would be nice if shmem_allocate_area() could merge with hugetlb_allocate_area() somehow, but not that urgent. Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v2 5/9] userfaultfd/selftests: use memfd_create for shmem test type

2021-04-13 Thread Peter Xu
ss in the right argv[] so we actually print out the > hugetlb file path. > > Signed-off-by: Axel Rasmussen Reviewed-by: Peter Xu -- Peter Xu

Re: [PATCH v2 7/9] userfaultfd/selftests: reinitialize test context in each test

2021-04-13 Thread Peter Xu
Would it look even nicer to init() at the entry of each test, and clear() after finish one test? > + > uffdio_register.range.start = (unsigned long) area_dst; > uffdio_register.range.len = nr_pages * page_size; > uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING; The rest looks good to me. Thanks, -- Peter Xu

Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem

2021-04-13 Thread Peter Xu
On Mon, Apr 12, 2021 at 09:40:22PM -0700, Axel Rasmussen wrote: > On Mon, Apr 12, 2021 at 4:17 PM Peter Xu wrote: > > > > On Thu, Apr 08, 2021 at 04:43:22PM -0700, Axel Rasmussen wrote: > > > +/* > > > + * Install PTEs, to map dst_addr (within dst_vma) to page.

Re: [PATCH v4] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-12 Thread Peter Xu
On Mon, Apr 12, 2021 at 05:51:14PM -0700, Hugh Dickins wrote: > On Mon, 12 Apr 2021, Peter Xu wrote: > > On Tue, Apr 06, 2021 at 11:14:30PM -0700, Hugh Dickins wrote: > > > > +static int mcopy_atomic_install_ptes(struct mm_struct *dst_mm, pm

Re: [PATCH 2/9] userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte

2021-04-12 Thread Peter Xu
copy_atomic()... Then it'll be further passed into shmem_mcopy_atomic_pte() now after this patch (as shmem_mfill_zeropage_pte() probably only did one good thing, which is to clear src_addr). Not a big deal, though. All the rest looks sane to me. Reviewed-by: Peter Xu I'll wait to lo

Re: [PATCH 1/9] userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h

2021-04-12 Thread Peter Xu
unsigned long address, unsigned int flags); > #ifdef CONFIG_USERFAULTFD > +enum mcopy_atomic_mode; (I'm not 100% sure, but.. maybe this can be moved even out of ifdef? Then you can define it once at the top rather than twice?) Reviewed-by: Peter Xu -- Peter Xu

[PATCH v2 4/5] userfaultfd/selftests: Only dump counts if mode enabled

2021-04-12 Thread Peter Xu
WP and MINOR modes are conditionally enabled on specific memory types. This patch avoids dumping tons of zeros for those cases when the modes are not supported at all. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 30

[PATCH v2 2/5] userfaultfd/selftests: Remove the time() check on delayed uffd

2021-04-12 Thread Peter Xu
edule latency of the resolving thread. It may not indicate an issue with uffd. Nor have I seen this error triggered in past runs. Even if it triggers, it'll be drowned in the rest of the test logs. Remove it. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/se

[PATCH v2 5/5] userfaultfd/selftests: Unify error handling

2021-04-12 Thread Peter Xu
Introduce err()/_err() and replace all the different ways to fail the program, mostly "fprintf" and "perror" with tons of exit() calls. Always stop the test program at any failure. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm

[PATCH v2 1/5] userfaultfd/selftests: Use user mode only

2021-04-12 Thread Peter Xu
Userfaultfd selftest does not need to handle kernel initiated fault. Set user mode so it can be run even if unprivileged_userfaultfd=0 (which is the default). Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 2 +- 1 file changed, 1 insertion
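
Opening a userfaultfd restricted to user-mode faults might look like the sketch below. It is a minimal illustration, not the selftest's actual code: the wrapper name open_uffd_user_mode() is hypothetical, and the fallback value for UFFD_USER_MODE_ONLY is an assumption used only when the installed kernel headers predate the flag.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for older UAPI headers; assumed to match the kernel's value. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/*
 * Ask for a userfaultfd that only handles faults raised from user mode.
 * With this flag the call does not depend on the
 * vm.unprivileged_userfaultfd sysctl being set to 1.
 * Returns the new fd, or -1 with errno set.
 */
int open_uffd_user_mode(void)
{
#ifdef SYS_userfaultfd
	return (int)syscall(SYS_userfaultfd,
			    O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
#else
	errno = ENOSYS;	/* syscall number not known to these headers */
	return -1;
#endif
}
```

On kernels that do not know the flag the call is expected to fail, so callers should treat -1 as "not available" rather than as a fatal error.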

[PATCH v2 3/5] userfaultfd/selftests: Dropping VERIFY check in locking_thread

2021-04-12 Thread Peter Xu
conditionally check the fault flag - just do it unconditionally. Reviewed-by: Axel Rasmussen Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 55 +--- 1 file changed, 1 insertion(+), 54 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultf

[PATCH v2 0/5] userfaultfd/selftests: A few cleanups

2021-04-12 Thread Peter Xu
serfaultfd selftest on fault handling, to use an err() macro instead of either fprintf() or perror() then another exit() call. The huge cleanup is done in the last patch. The first 4 patches are some other standalone cleanups for the same file, so I put them together. Please review, thanks. P

Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem

2021-04-12 Thread Peter Xu
clear; unlock_page(page); put_page(page); page = NULL; hindex = index; } I think it won't happen for your case since the page should be uptodate already (the other thread should check and modify the page before CONTINUE), but still raise this up, since if the page was allocated it smells better to still install the fallocated page (do we need to clear the page and SetUptodate)? -- Peter Xu

Re: [PATCH v4] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-12 Thread Peter Xu
t. */ if (pte_hw_dirty(pte)) pte = pte_mkdirty(pte); pte = clear_pte_bit(pte, __pgprot(PTE_WRITE)); pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); return pte; } So arm64 will explicitly set the dirty bit (from the HW dirty bit) when wr-protect. It seems to prove that at least for arm64 it's very valid to have !write && dirty pte. Thanks, -- Peter Xu
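
The "!write && dirty" outcome can be reproduced with a small userspace model of the bit manipulation quoted above. This is only a sketch: the bit positions and the helper definitions are assumptions modeled loosely on arm64, and none of it is the kernel's real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, modeled on arm64 for illustration only. */
#define PTE_RDONLY	(UINT64_C(1) << 7)	/* hardware read-only */
#define PTE_WRITE	(UINT64_C(1) << 51)	/* writable (dirty bit management) */
#define PTE_DIRTY	(UINT64_C(1) << 55)	/* software dirty bit */

typedef uint64_t pte_t;

bool pte_write(pte_t pte)
{
	return (pte & PTE_WRITE) != 0;
}

/* Hardware-dirty: writable and the read-only bit has been cleared. */
bool pte_hw_dirty(pte_t pte)
{
	return pte_write(pte) && (pte & PTE_RDONLY) == 0;
}

bool pte_dirty(pte_t pte)
{
	return (pte & PTE_DIRTY) != 0 || pte_hw_dirty(pte);
}

pte_t pte_mkdirty(pte_t pte)
{
	pte |= PTE_DIRTY;
	if (pte_write(pte))
		pte &= ~PTE_RDONLY;
	return pte;
}

/* Mirrors the quoted snippet: keep the dirty state, drop write access. */
pte_t pte_wrprotect(pte_t pte)
{
	if (pte_hw_dirty(pte))
		pte = pte_mkdirty(pte);	/* move hw dirty into the sw bit */
	pte &= ~PTE_WRITE;
	pte |= PTE_RDONLY;
	return pte;
}
```

Write-protecting a hardware-dirty entry in this model yields an entry with the write bit clear and the software dirty bit set, which is the combination the message argues is valid.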

Re: [PATCH 0/9] userfaultfd: add minor fault handling for shmem

2021-04-09 Thread Peter Xu
n your tree, without the shmem > series? And then I'll resolve any conflicts in my tree? > > It's true that we haven't tested the hugetlbfs minor faults patch > extensively *with the shmem one also applied*, but it has had more > thorough review than the shmem one at this point (e.g. by Mike > Kravetz), and they're rather separate code paths (I'd be surprised if > one breaks the other). Yes I think the hugetlb part should have got more review done. IMHO it's a matter of whether Mike would still like to do a more thorough review, or seems okay to keep them. I can repost the selftest series later if needed, as long as I figured which is the suitable base commit. Those selftest patches are definitely not urgent for this release, so we can wait for the next release. Thanks, -- Peter Xu

Re: [PATCH 3/9] userfaultfd/shmem: support minor fault registration for shmem

2021-04-09 Thread Peter Xu
's indeed a bit awkward to swap in here. Maybe move this chunk to right after pagecache_get_page() returns? Then no need to touch the rest. > + > + if (swapped) > + return 0; > + > if (page) > hindex = page->index; > if (page && sgp == SGP_WRITE) > -- > 2.31.1.295.g9ea45b61b8-goog > -- Peter Xu

Re: [PATCH 0/5] 4.14 backports of fixes for "CoW after fork() issue"

2021-04-07 Thread Peter Xu
og/?h=mapcount_deshare [3] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=mapcount_deshare&id=7c3a31caa34ac6ac4a4ec0559b1307b5edfc0821 [4] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=mapcount_deshare&id=599aa62474f51a470408b28fd4365320a5357aca -- Peter Xu

Re: [PATCH v5] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTINUE behavior

2021-04-06 Thread Peter Xu
case MCOPY_ATOMIC_CONTINUE: > - err = -EINVAL; > - break; > - } > } else { > VM_WARN_ON_ONCE(wp_copy); > err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, >src_addr, mode, page); > } > > +out: > return err; > } > > diff --git a/tools/testing/selftests/vm/userfaultfd.c > b/tools/testing/selftests/vm/userfaultfd.c > index f6c86b036d0f..d8541a59dae5 100644 > --- a/tools/testing/selftests/vm/userfaultfd.c > +++ b/tools/testing/selftests/vm/userfaultfd.c > @@ -485,6 +485,7 @@ static void wp_range(int ufd, __u64 start, __u64 len, > bool wp) > static void continue_range(int ufd, __u64 start, __u64 len) > { > struct uffdio_continue req; > + int ret; > > req.range.start = start; > req.range.len = len; > @@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 > len) > if (ioctl(ufd, UFFDIO_CONTINUE, &req)) > err("UFFDIO_CONTINUE failed for address 0x%" PRIx64, > (uint64_t)start); > + > + /* > + * Error handling within the kernel for continue is subtly different > + * from copy or zeropage, so it may be a source of bugs. Trigger an > + * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG. > + */ > + req.mapped = 0; > + ret = ioctl(ufd, UFFDIO_CONTINUE, &req); > + if (ret >= 0 || req.mapped != -EEXIST) > + err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, > mapped=%" PRId64, > + ret, req.mapped); > } > > static void *locking_thread(void *arg) > -- > 2.31.0.208.g409f899ff0-goog > -- Peter Xu

Re: [PATCH 0/5] 4.14 backports of fixes for "CoW after fork() issue"

2021-04-01 Thread Peter Xu
gle mapper, more references than us and the map? */ > > if (page_mapcount(page) == 1 && page_count(page) > 2) > > goto keep_locked; > > > > in the pre-pinning days. > > > > But I really think that there are a number of other commit

Re: [PATCH v3] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-31 Thread Peter Xu
TINUE is slightly different, since we _know_ the page cache is there.. So I'm thinking maybe you need to handle the continue request in mfill_atomic_pte() before the VM_SHARED check so as to cover both cases. -- Peter Xu

Re: [PATCH v3] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-30 Thread Peter Xu
counter_file(page)); page_add_file_rmap(page, false); set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); /* No need to invalidate - it was non-present before */ update_mmu_cache(dst_vma, dst_addr, dst_pte); pte_unmap_unlock(dst_pte, ptl); return 0; } Then at the entry of shmem_mcopy_atomic_pte(): if (is_continue) { page = find_lock_page(mapping, pgoff); if (!page) return -EFAULT; ret = shmem_install_uffd_pte(..., is_continue && !(dst_vma->vm_flags & VM_SHARED)); unlock_page(page); if (ret) put_page(page); return ret; } Do you think this would be cleaner? -- Peter Xu

Re: [PATCH v2] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE behavior

2021-03-29 Thread Peter Xu
t = start; > req.range.len = len; > @@ -493,6 +494,17 @@ static void continue_range(int ufd, __u64 start, __u64 > len) > if (ioctl(ufd, UFFDIO_CONTINUE, &req)) > err("UFFDIO_CONTINUE failed for address 0x%" PRIx64, > (uint64_t)start); > + > + /* > + * Error handling within the kernel for continue is subtly different > + * from copy or zeropage, so it may be a source of bugs. Trigger an > + * error (-EEXIST) on purpose, to verify doing so doesn't cause a BUG. > + */ > + req.mapped = 0; > + ret = ioctl(ufd, UFFDIO_CONTINUE, &req); > + if (ret >= 0 || req.mapped != -EEXIST) > + err("failed to exercise UFFDIO_CONTINUE error handling, ret=%d, > mapped=%" PRId64, > + ret, req.mapped); > } > > static void *locking_thread(void *arg) > -- > 2.31.0.291.g576ba9dcdaf-goog > -- Peter Xu

[PATCH v5 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-29 Thread Peter Xu
Userfaultfd write-protect mode is supported starting from Linux 5.7. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 84 ++-- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/man2/ioctl_userfaultfd.2 b/man2

[PATCH v5 0/4] man2: udpate mm/userfaultfd manpages to latest

2021-03-29 Thread Peter Xu
art too, probably after the whole hugetlbfs/shmem minor mode reaches the linux master branch. Please review, thanks. Peter Xu (4): userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs userfaultfd.2: Add write-protect mode ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs ioctl_userfaultfd.2: A

[PATCH v5 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-29 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported since Linux 4.14. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 13 + 1 file changed, 13 insertions(+) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index e7dc9f813..5c41e4816 100644 --- a/man2/userfaultfd.2

[PATCH v5 2/4] userfaultfd.2: Add write-protect mode

2021-03-29 Thread Peter Xu
Write-protect mode is supported starting from Linux 5.7. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/userfaultfd.2 | 108 +++-- 1 file changed, 104 insertions(+), 4 deletions(-) diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2 index

[PATCH v5 3/4] ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-29 Thread Peter Xu
UFFD_FEATURE_THREAD_ID is supported in Linux 4.14. Acked-by: Mike Rapoport Signed-off-by: Peter Xu --- man2/ioctl_userfaultfd.2 | 5 + 1 file changed, 5 insertions(+) diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2 index 47ae5f473..d4a8375b8 100644 --- a/man2

Re: [PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-29 Thread Peter Xu
On Thu, Mar 25, 2021 at 10:32:20PM +0100, Alejandro Colomar (man-pages) wrote: > Hi Peter, > > On 3/23/21 8:16 PM, Peter Xu wrote: > > On Tue, Mar 23, 2021 at 07:11:04PM +0100, Alejandro Colomar (man-pages) > > wrote: > > > > +.TP > > > > +.B UFFDIO_

Re: [PATCH] userfaultfd/shmem: fix MCOPY_ATOMIC_CONTNUE error handling + accounting

2021-03-27 Thread Peter Xu
bit for normal uffdio_copy case only if both WRITE|SHARED are set in the vma flags. E.g., shmem_mcopy_atomic_pte() of a normal uffdio-copy will install the page cache page into the pte; however, what if this mapping is privately mapped? IMHO we can't apply the write bit, otherwise the process will be writing to the page cache directly. However, I think that question is irrelevant to this patch. Thanks, -- Peter Xu

Re: [PATCH] userfaultfd/shmem: fix minor fault page leak

2021-03-24 Thread Peter Xu
do have a different commit ID here: commit 63c826b1372c4930f89b8a55092699fa7f0d6f4e Author: Axel Rasmussen Date: Thu Mar 18 10:20:43 2021 -0400 userfaultfd: support minor fault handling for shmem Axel, did you fetch the commit ID from your local tree, perhaps? Since I should have fetched from hnaz/linux-mm and I can see Andrew's sign-off too. Thanks, -- Peter Xu

Re: [PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

2021-03-23 Thread Peter Xu
ffdio_writeprotect))) return -EFAULT; But I didn't check other places; generally I'd return -EFAULT if I can't find a better replacement with a clearer meaning. I don't think this is really helpful to user apps either, because no user app would read this -EFAULT to do anything useful. How about I drop it too if you think the description is confusing? Thanks, -- Peter Xu

Re: [PATCH v4 2/4] userfaultfd.2: Add write-protect mode

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 07:19:12PM +0100, Alejandro Colomar (man-pages) wrote: > Hi Peter, > > Please see a few more comments below. > > Thanks, > > Alex > > On 3/22/21 11:08 PM, Peter Xu wrote: > > Write-protect mode is supported starting from Linux 5.7.

Re: [PATCH 07/23] mm: Introduce zap_details.zap_flags

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 02:11:29AM +, Matthew Wilcox wrote: > On Mon, Mar 22, 2021 at 08:48:56PM -0400, Peter Xu wrote: > > +/* Whether to check page->mapping when zapping */ > > +#define ZAP_FLAG_CHECK_MAPPING BIT(0) > > + > > /* > >

Re: [PATCH 02/23] mm: Clear vmf->pte after pte_unmap_same() returns

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 10:34:45AM +0800, Miaohe Lin wrote: > Hi: > On 2021/3/23 8:48, Peter Xu wrote: > > pte_unmap_same() will always unmap the pte pointer. After the unmap, > > vmf->pte > > will not be valid any more. We should clear it. > > > > It wa

Re: [PATCH v4 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-23 Thread Peter Xu
On Tue, Mar 23, 2021 at 10:27:34AM +0200, Mike Rapoport wrote: > On Mon, Mar 22, 2021 at 06:08:45PM -0400, Peter Xu wrote: > > UFFD_FEATURE_THREAD_ID is supported since Linux 4.14. > > > > Signed-off-by: Peter Xu > > --- > > man2/userfaultfd.2 | 13

Re: [PATCH] userfaultfd: Write protect when virtual memory range has no page table entry

2021-03-23 Thread Peter Xu
this page it'll skip zeroing it assuming it's a zero page. QEMU plans to fix it using pre-faults as UFFDIO_COPY will complicate the live snapshot framework, but UFFD_FEATURE_WP_UNALLOCATED should be more efficient. It's just that we still need to keep the old behavior. I'll see whether I can prepare a patch for it shortly, with some test case too. Thanks, -- Peter Xu
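
For readers following the pre-fault idea mentioned in the message above, here is a minimal sketch of what pre-faulting a fresh anonymous mapping from userspace could look like. It is not QEMU's code and not part of any patch in this thread. The helper names `prefault_range()` and `map_prefaulted()` are hypothetical. The sketch assumes a private anonymous mapping that holds no data yet, so writing a zero byte per page is harmless.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from QEMU or the kernel): write one byte to
 * every page in [addr, addr + len) so that each page is populated before
 * any write-protection is applied to the range.
 * Writing zero is only safe for memory that holds no data yet.
 */
void prefault_range(unsigned char *addr, size_t len, size_t page_size)
{
	assert(page_size > 0);
	for (size_t off = 0; off < len; off += page_size)
		((volatile unsigned char *)addr)[off] = 0;
}

/*
 * Map `len` bytes of anonymous private memory and pre-fault all of it.
 * Returns NULL on failure or when len is 0.
 */
unsigned char *map_prefaulted(size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);

	if (page_size <= 0 || len == 0)
		return NULL;

	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	prefault_range(p, len, (size_t)page_size);
	return p;
}
```

The trade-off is the one the message hints at: pre-faulting touches every page up front, whereas a kernel-side feature would let unpopulated ranges be write-protected without allocating them first.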

Re: [PATCH 00/23] userfaultfd-wp: Support shmem and hugetlbfs

2021-03-22 Thread Peter Xu
On Mon, Mar 22, 2021 at 08:48:49PM -0400, Peter Xu wrote: > This patchset is based on tag v5.12-rc3-mmots-2021-03-17-22-26. To run the > selftest, need to apply the two patches to fix minor mode page leak: > > https://lore.kernel.org/lkml/20210322175132.36659-1-pet...@redhat.

[PATCH 23/23] userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs

2021-03-22 Thread Peter Xu
linux/userfaultfd.h header files, because it may cause kernel header update to easily break userspace. Signed-off-by: Peter Xu --- tools/testing/selftests/vm/userfaultfd.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/te

[PATCH 20/23] hugetlb/userfaultfd: Allow wr-protect none ptes

2021-03-22 Thread Peter Xu
ze fetcher. Signed-off-by: Peter Xu --- mm/hugetlb.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 448ef745d5ee..d4acf9d9d087 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5110,7 +5110,7 @@ uns

[PATCH 21/23] hugetlb/userfaultfd: Only drop uffd-wp special pte if required

2021-03-22 Thread Peter Xu
ptes, because it has taken hugetlb fault mutex so that no concurrent page fault would trigger. While the call to hugetlb_vmdelete_list() in hugetlbfs_punch_hole() is not safe. That's why the previous call will be with ZAP_FLAG_DROP_FILE_UFFD_WP, while the latter one won't be able to. Si

[PATCH 19/23] hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler

2021-03-22 Thread Peter Xu
he special swap pte too just like a none pte. Note that we also need to teach UFFDIO_COPY about this special pte across the code path so that we can safely install a new page at this special pte as long as we know it's a stale entry. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 5

[PATCH 22/23] mm/userfaultfd: Enable write protection for shmem & hugetlbfs

2021-03-22 Thread Peter Xu
with _UFFDIO_WRITEPROTECT too because all existing types now support write protection mode. Since vma_can_userfault() will be used elsewhere, move into userfaultfd_k.h. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 18 -- include/linux/userfaultfd_k.h| 14 ++ in

[PATCH 18/23] mm/hugetlb: Introduce huge version of special swap pte helpers

2021-03-22 Thread Peter Xu
This is to let hugetlbfs be prepared to also recognize swap special ptes just like uffd-wp special swap ptes. Signed-off-by: Peter Xu --- mm/hugetlb.c | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fd3e87517e10

[PATCH 17/23] hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT

2021-03-22 Thread Peter Xu
This starts from passing cp_flags into hugetlb_change_protection() so hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests. huge_pte_clear_uffd_wp() is introduced to handle the case where the UFFDIO_WRITEPROTECT is requested upon migrating huge page entries. Signed-off-by: Peter Xu

[PATCH 16/23] hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP

2021-03-22 Thread Peter Xu
it even if UFFDIO_COPY_MODE_WP is provided, so that the core mm will know this page contains valid data and never drop it. Signed-off-by: Peter Xu --- include/asm-generic/hugetlb.h | 5 + include/linux/hugetlb.h | 6 -- mm/hugetlb.c | 22 +- mm/use

[PATCH 13/23] shmem/userfaultfd: Handle the left-overed special swap ptes

2021-03-22 Thread Peter Xu
d to also teach userfaultfd itself on either UFFDIO_COPY or handling page faults, so that everything will still work as expected. Signed-off-by: Peter Xu --- fs/userfaultfd.c | 15 +++ mm/shmem.c | 13 - 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/fs/

[PATCH 15/23] hugetlb/userfaultfd: Hook page faults for uffd write protection

2021-03-22 Thread Peter Xu
Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults. We do this slightly earlier than hugetlb_cow() so that we can avoid taking some extra locks that we definitely don't need. Signed-off-by: Peter Xu --- mm/hugetlb.c | 19 +++ 1 file change
