Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-17 Thread Nadav Amit
> On Jan 17, 2021, at 11:25 AM, Yu Zhao wrote: > > On Sun, Jan 17, 2021 at 02:13:43AM -0800, Nadav Amit wrote: >>> On Jan 17, 2021, at 1:16 AM, Yu Zhao wrote: >>> >>> On Sat, Jan 16, 2021 at 11:32:22PM -0800, Nadav Amit wrote: > On Jan 16, 2021, at 8:41 PM, Yu Zhao wrote: > > On T

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-17 Thread Yu Zhao
On Sun, Jan 17, 2021 at 02:13:43AM -0800, Nadav Amit wrote: > > On Jan 17, 2021, at 1:16 AM, Yu Zhao wrote: > > > > On Sat, Jan 16, 2021 at 11:32:22PM -0800, Nadav Amit wrote: > >>> On Jan 16, 2021, at 8:41 PM, Yu Zhao wrote: > >>> > >>> On Tue, Jan 12, 2021 at 09:43:38PM +, Will Deacon wro

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-17 Thread Nadav Amit
> On Jan 17, 2021, at 1:16 AM, Yu Zhao wrote: > > On Sat, Jan 16, 2021 at 11:32:22PM -0800, Nadav Amit wrote: >>> On Jan 16, 2021, at 8:41 PM, Yu Zhao wrote: >>> >>> On Tue, Jan 12, 2021 at 09:43:38PM +, Will Deacon wrote: On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: >>>

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-17 Thread Yu Zhao
On Sat, Jan 16, 2021 at 11:32:22PM -0800, Nadav Amit wrote: > > On Jan 16, 2021, at 8:41 PM, Yu Zhao wrote: > > > > On Tue, Jan 12, 2021 at 09:43:38PM +, Will Deacon wrote: > >> On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: > On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-16 Thread Nadav Amit
> On Jan 16, 2021, at 8:41 PM, Yu Zhao wrote: > > On Tue, Jan 12, 2021 at 09:43:38PM +, Will Deacon wrote: >> On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: > I

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-16 Thread Yu Zhao
On Tue, Jan 12, 2021 at 09:43:38PM +, Will Deacon wrote: > On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: > > > On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > > > On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: > > >> I will send an RFC soon for per-table deferred TLB

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Andy Lutomirski
> On Jan 12, 2021, at 2:29 PM, Nadav Amit wrote: > >  >> >> On Jan 12, 2021, at 1:43 PM, Will Deacon wrote: >> >> On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote:

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Will Deacon
On Tue, Jan 12, 2021 at 02:29:51PM -0800, Nadav Amit wrote: > > On Jan 12, 2021, at 1:43 PM, Will Deacon wrote: > > > > On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: > >>> On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > >>> On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote:

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Nadav Amit
> On Jan 12, 2021, at 1:43 PM, Will Deacon wrote: > > On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: >>> On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: >>> On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: I will send an RFC soon for per-table deferred TLB flushes track

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Nadav Amit
> On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > > On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: >>> On Jan 12, 2021, at 11:02 AM, Laurent Dufour >>> wrote: >>> >>> Le 12/01/2021 à 17:57, Peter Zijlstra a écrit : On Tue, Jan 12, 2021 at 04:47:17PM +0100, Laurent Dufour wrote

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Yu Zhao
On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: > > On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > > > > On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: > >>> On Jan 12, 2021, at 11:02 AM, Laurent Dufour > >>> wrote: > >>> > >>> Le 12/01/2021 à 17:57, Peter Zijlstra a é

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Will Deacon
On Tue, Jan 12, 2021 at 12:38:34PM -0800, Nadav Amit wrote: > > On Jan 12, 2021, at 11:56 AM, Yu Zhao wrote: > > On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: > >> I will send an RFC soon for per-table deferred TLB flushes tracking. > >> The basic idea is to save a generation in the

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Yu Zhao
On Tue, Jan 12, 2021 at 11:15:43AM -0800, Nadav Amit wrote: > > On Jan 12, 2021, at 11:02 AM, Laurent Dufour > > wrote: > > > > Le 12/01/2021 à 17:57, Peter Zijlstra a écrit : > >> On Tue, Jan 12, 2021 at 04:47:17PM +0100, Laurent Dufour wrote: > >>> Le 12/01/2021 à 12:43, Vinayak Menon a écrit

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Nadav Amit
> On Jan 12, 2021, at 11:02 AM, Laurent Dufour > wrote: > > Le 12/01/2021 à 17:57, Peter Zijlstra a écrit : >> On Tue, Jan 12, 2021 at 04:47:17PM +0100, Laurent Dufour wrote: >>> Le 12/01/2021 à 12:43, Vinayak Menon a écrit : Possibility of race against other PTE modifiers 1) For

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Laurent Dufour
Le 12/01/2021 à 17:57, Peter Zijlstra a écrit : On Tue, Jan 12, 2021 at 04:47:17PM +0100, Laurent Dufour wrote: Le 12/01/2021 à 12:43, Vinayak Menon a écrit : Possibility of race against other PTE modifiers 1) Fork - We have seen a case of SPF racing with fork marking PTEs RO and that is des

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2021 at 04:47:17PM +0100, Laurent Dufour wrote: > Le 12/01/2021 à 12:43, Vinayak Menon a écrit : > > Possibility of race against other PTE modifiers > > > > 1) Fork - We have seen a case of SPF racing with fork marking PTEs RO and > > that > > is described and fixed here https://

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Peter Zijlstra
On Tue, Jan 05, 2021 at 01:03:48PM -0500, Andrea Arcangeli wrote: > On Tue, Jan 05, 2021 at 04:37:27PM +0100, Peter Zijlstra wrote: > > (your other email clarified this point; the COW needs to copy while > > holding the PTL and we need TLBI under PTL if we're to change this) > > The COW doesn't ne

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Laurent Dufour
Le 12/01/2021 à 12:43, Vinayak Menon a écrit : On 1/5/2021 9:07 PM, Peter Zijlstra wrote: On Mon, Dec 21, 2020 at 08:16:11PM -0800, Linus Torvalds wrote: So I think the basic rule is that "if you hold mmap_sem for writing, you're always safe". And that really should be considered the "default"

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-12 Thread Vinayak Menon
On 1/5/2021 9:07 PM, Peter Zijlstra wrote: On Mon, Dec 21, 2020 at 08:16:11PM -0800, Linus Torvalds wrote: So I think the basic rule is that "if you hold mmap_sem for writing, you're always safe". And that really should be considered the "default" locking. ANY time you make a modification to t

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-05 Thread Andrea Arcangeli
On Tue, Jan 05, 2021 at 04:37:27PM +0100, Peter Zijlstra wrote: > (your other email clarified this point; the COW needs to copy while > holding the PTL and we need TLBI under PTL if we're to change this) The COW doesn't need to hold the PT lock, the TLBI broadcast doesn't need to be delivered unde

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2021-01-05 Thread Peter Zijlstra
On Mon, Dec 21, 2020 at 08:16:11PM -0800, Linus Torvalds wrote: > So I think the basic rule is that "if you hold mmap_sem for writing, > you're always safe". And that really should be considered the > "default" locking. > > ANY time you make a modification to the VM layer, you should basically >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-24 Thread Andrea Arcangeli
On Thu, Dec 24, 2020 at 01:49:45PM -0500, Andrea Arcangeli wrote: > Without the above, can't the CPU decrement the tlb_flush_pending while > the IPI to ack didn't arrive yet in csd_lock_wait? Ehm: csd_lock_wait has smp_acquire__after_ctrl_dep() so the write side looks ok after all sorry.

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-24 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 09:18:09PM -0800, Nadav Amit wrote: > I am not trying to be argumentative, and I did not think through about an > alternative solution. It sounds to me that your proposed solution is correct > and would probably be eventually (slightly) more efficient than anything > that I

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 8:01 PM, Andrea Arcangeli wrote: > >> On Wed, Dec 23, 2020 at 07:09:10PM -0800, Nadav Amit wrote: >>> Perhaps holding some small bitmap based on part of the deferred flushed >>> pages (e.g., bits 12-17 of the address or some other kind of a single >>> hash-function bloom-fil

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 7:34 PM, Yu Zhao wrote: > > On Wed, Dec 23, 2020 at 07:09:10PM -0800, Nadav Amit wrote: >>> On Dec 23, 2020, at 6:00 PM, Andrea Arcangeli wrote: >>> >>> On Wed, Dec 23, 2020 at 05:21:43PM -0800, Andy Lutomirski wrote: I don’t love this as a long term fix. AFAICT we ca

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
> On Wed, Dec 23, 2020 at 07:09:10PM -0800, Nadav Amit wrote: > > I think there are other cases in which Andy’s concern is relevant > > (MADV_PAGEOUT). I didn't try to figure how it would help MADV_PAGEOUT. MADV_PAGEOUT sounds cool feature, but maybe it'd need a way to flush the invalidates out an

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Yu Zhao
On Wed, Dec 23, 2020 at 07:09:10PM -0800, Nadav Amit wrote: > > On Dec 23, 2020, at 6:00 PM, Andrea Arcangeli wrote: > > > > On Wed, Dec 23, 2020 at 05:21:43PM -0800, Andy Lutomirski wrote: > >> I don’t love this as a long term fix. AFAICT we can have > >> mm_tlb_flush_pending set for quite a wh

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 09:00:26PM -0500, Andrea Arcangeli wrote: > One other near zero cost improvement easy to add if this would be "if > (vma->vm_flags & (VM_SOFTDIRTY|VM_UFFD_WP))" and it could be made The next worry then is if UFFDIO_WRITEPROTECT is very large then there would be a flood of w

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 7:09 PM, Nadav Amit wrote: > >> On Dec 23, 2020, at 6:00 PM, Andrea Arcangeli wrote: >> >> On Wed, Dec 23, 2020 at 05:21:43PM -0800, Andy Lutomirski wrote: >>> I don’t love this as a long term fix. AFAICT we can have >>> mm_tlb_flush_pending set for quite a while — mprote

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 6:00 PM, Andrea Arcangeli wrote: > > On Wed, Dec 23, 2020 at 05:21:43PM -0800, Andy Lutomirski wrote: >> I don’t love this as a long term fix. AFAICT we can have >> mm_tlb_flush_pending set for quite a while — mprotect seems like it can wait >> in IO while splitting a huge

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 05:21:43PM -0800, Andy Lutomirski wrote: > I don’t love this as a long term fix. AFAICT we can have mm_tlb_flush_pending > set for quite a while — mprotect seems like it can wait in IO while splitting > a huge page, for example. That gives us a window in which every write

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andy Lutomirski
> On Dec 23, 2020, at 2:29 PM, Yu Zhao wrote: > > > I was hesitant to suggest the following because it isn't that straight > forward. But since you seem to be less concerned with the complexity, > I'll just bring it on the table -- it would take care of both ufd and > clear_refs_write, wouldn

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
Hello Linus, On Wed, Dec 23, 2020 at 03:39:53PM -0800, Linus Torvalds wrote: > On Wed, Dec 23, 2020 at 1:39 PM Andrea Arcangeli wrote: > > > > On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > > > Thanks for the details. > > > > I hope we can find a way put the page_mapcount back where t

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 02:45:59PM -0800, Nadav Amit wrote: > I think it may be reasonable. Whatever solution used, there will be 2 users of it: uffd-wp will use whatever technique used by clear_refs_write to avoid the mmap_write_lock. My favorite is Yu's patch and not the group lock anymore. The

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Linus Torvalds
On Wed, Dec 23, 2020 at 1:39 PM Andrea Arcangeli wrote: > > On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > > Thanks for the details. > > I hope we can find a way put the page_mapcount back where there's a > page_count right now. I really don't think that's ever going to happen - at le

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 03:29:51PM -0700, Yu Zhao wrote: > I was hesitant to suggest the following because it isn't that straight > forward. But since you seem to be less concerned with the complexity, > I'll just bring it on the table -- it would take care of both ufd and > clear_refs_write, would

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 2:05 PM, Andrea Arcangeli wrote: > > On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: >>> On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: >>> >>> On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote:

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Yu Zhao
On Wed, Dec 23, 2020 at 04:39:00PM -0500, Andrea Arcangeli wrote: > On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > > Thanks for the details. > > I hope we can find a way put the page_mapcount back where there's a > page_count right now. > > If you're so worried about having to maintai

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: > > On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > > > > On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: > >> On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: > >>> Using mmap_write_lock() was my initial fix and there w

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > Thanks for the details. I hope we can find a way put the page_mapcount back where there's a page_count right now. If you're so worried about having to maintain a all defined well documented (or to be documented even better if you ACK it)

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 10:52:35AM -0500, Peter Xu wrote: > On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > > In your patch, do we need to take wrprotect_rwsem in > > handle_userfault() as well? Otherwise, it seems userspace would have > > to synchronize between its wrprotect ioctl and f

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Peter Xu
On Wed, Dec 23, 2020 at 12:12:23PM -0700, Yu Zhao wrote: > On Wed, Dec 23, 2020 at 11:24:16AM -0500, Peter Xu wrote: > > On Wed, Dec 23, 2020 at 03:06:30AM -0700, Yu Zhao wrote: > > > On Wed, Dec 23, 2020 at 01:44:42AM -0800, Linus Torvalds wrote: > > > > On Tue, Dec 22, 2020 at 4:01 PM Linus Torva

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Yu Zhao
On Wed, Dec 23, 2020 at 11:24:16AM -0500, Peter Xu wrote: > On Wed, Dec 23, 2020 at 03:06:30AM -0700, Yu Zhao wrote: > > On Wed, Dec 23, 2020 at 01:44:42AM -0800, Linus Torvalds wrote: > > > On Tue, Dec 22, 2020 at 4:01 PM Linus Torvalds > > > wrote: > > > > > > > > The more I look at the mprotect

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Nadav Amit
> On Dec 23, 2020, at 8:23 AM, Will Deacon wrote: > > On Tue, Dec 22, 2020 at 11:20:21AM -0800, Nadav Amit wrote: >>> On Dec 22, 2020, at 10:30 AM, Yu Zhao wrote: >>> >>> On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: > On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 01:51:59PM -0500, Andrea Arcangeli wrote: > NOTE: about the above comment, that mprotect takes > mmap_read_lock. Your above code change in the commit above, still has write Correction to avoid any confusion.

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Andrea Arcangeli
On Wed, Dec 23, 2020 at 11:24:16AM -0500, Peter Xu wrote: > I think this is not against Linus's example - where cpu2 does not have tlb > cached so it sees RO while cpu3 does have tlb cached so cpu3 can still modify > it. So IMHO there's no problem here. > > But I do think in step 2 here we overlo

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Peter Xu
On Wed, Dec 23, 2020 at 03:06:30AM -0700, Yu Zhao wrote: > On Wed, Dec 23, 2020 at 01:44:42AM -0800, Linus Torvalds wrote: > > On Tue, Dec 22, 2020 at 4:01 PM Linus Torvalds > > wrote: > > > > > > The more I look at the mprotect code, the less I like it. We seem to > > > be much better about the T

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Will Deacon
On Tue, Dec 22, 2020 at 11:20:21AM -0800, Nadav Amit wrote: > > On Dec 22, 2020, at 10:30 AM, Yu Zhao wrote: > > > > On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: > >>> On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > >>> > >>> On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Peter Xu
On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote: > In your patch, do we need to take wrprotect_rwsem in > handle_userfault() as well? Otherwise, it seems userspace would have > to synchronize between its wrprotect ioctl and fault handler? i.e., > the fault hander needs to be aware that the

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Yu Zhao
On Wed, Dec 23, 2020 at 01:44:42AM -0800, Linus Torvalds wrote: > On Tue, Dec 22, 2020 at 4:01 PM Linus Torvalds > wrote: > > > > The more I look at the mprotect code, the less I like it. We seem to > > be much better about the TLB flushes in other places (looking at > > mremap, for example). The

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-23 Thread Linus Torvalds
On Tue, Dec 22, 2020 at 4:01 PM Linus Torvalds wrote: > > The more I look at the mprotect code, the less I like it. We seem to > be much better about the TLB flushes in other places (looking at > mremap, for example). The mprotect code seems to be very laissez-faire > about the TLB flushing. No,

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Yu Zhao
On Tue, Dec 22, 2020 at 09:56:11PM -0500, Andrea Arcangeli wrote: > On Tue, Dec 22, 2020 at 04:39:46PM -0700, Yu Zhao wrote: > > We are talking about non-COW anon pages here -- they can't be mapped > > more than once. So why not just identify them by checking > > page_mapcount == 1 and then uncondi

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 05:23:39PM -0700, Yu Zhao wrote: > and 2) people are spearheading multiple efforts to reduce the mmap_lock > contention, which hopefully would make ufd users suffer less soon. In my view UFFD is an already deployed working solution that eliminates the mmap_lock_write conten

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 04:39:46PM -0700, Yu Zhao wrote: > We are talking about non-COW anon pages here -- they can't be mapped > more than once. So why not just identify them by checking > page_mapcount == 1 and then unconditionally reuse them? (This is > probably where I've missed things.) The p

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Linus Torvalds
On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds wrote: > > The rule is that the TLB flush has to be done before the page table > lock is released. I take that back. I guess it's ok as long as the mmap_sem is held for writing. Then the TLB flush can be delayed until just before releasing the mmap_s

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Yu Zhao
On Tue, Dec 22, 2020 at 04:01:45PM -0800, Linus Torvalds wrote: > On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds > wrote: > > > > See zap_pte_range() for an example of doing it right, even in the > > presence of complexities (ie that has an example of both flushing the > > TLB, and doing the actua

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Linus Torvalds
On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds wrote: > > See zap_pte_range() for an example of doing it right, even in the > presence of complexities (ie that has an example of both flushing the > TLB, and doing the actual "free the pages after flush", and it does > the two cases separately). Th

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Linus Torvalds
On Tue, Dec 22, 2020 at 3:39 PM Yu Zhao wrote: > > 2) is the false positive because of what we do, and it's causing the > memory corruption because do_wp_page() tries to make copies of pages > that seem to be RO but may have stale RW tlb entries pending flush. Yeah, that's definitely a different

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Yu Zhao
On Tue, Dec 22, 2020 at 05:02:37PM -0500, Andrea Arcangeli wrote: > On Tue, Dec 22, 2020 at 02:14:41PM -0700, Yu Zhao wrote: > > This works but I don't prefer this option because 1) this is new > > way of making pte_wrprotect safe and 2) it's specific to ufd and > > can't be applied to clear_soft_d

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 02:14:41PM -0700, Yu Zhao wrote: > This works but I don't prefer this option because 1) this is new > way of making pte_wrprotect safe and 2) it's specific to ufd and > can't be applied to clear_soft_dirty() which has no "catcher". No I didn't look into clear_soft_dirty iss

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 12:58:18PM -0800, Nadav Amit wrote: > I had somewhat similar ideas - saving in each page-struct the generation, > which would allow to: (1) extend pte_same() to detect interim changes > that were reverted (RO->RW->RO) and (2) per-PTE pending flushes. What don't you feel saf

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 12:19:49PM -0800, Nadav Amit wrote: > Perhaps any change to PTE in a page-table should increase a page-table > generation that we would save in the page-table page-struct (next to the The current rule is that by the time in the page fault we find a pte/hugepmd in certain st

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Yu Zhao
On Tue, Dec 22, 2020 at 02:31:52PM -0500, Andrea Arcangeli wrote: > On Mon, Dec 21, 2020 at 07:19:35PM -0800, Andy Lutomirski wrote: > > instance of this (with mmap_sem held for write) in x86: > > mark_screen_rdonly(). Dare I ask how broken this is? We could likely > > That one is buggy despite

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 22, 2020, at 11:31 AM, Andrea Arcangeli wrote: > > From 4ace4d1b53f5cb3b22a5c2dc33facc4150b112d6 Mon Sep 17 00:00:00 2001 > From: Andrea Arcangeli > Date: Tue, 22 Dec 2020 14:30:16 -0500 > Subject: [PATCH 1/1] mm: userfaultfd: avoid leaving stale TLB after > userfaultfd_writeprotect() >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 22, 2020, at 12:34 PM, Andy Lutomirski wrote: > > On Sat, Dec 19, 2020 at 2:06 PM Nadav Amit wrote: >>> [ I have in mind another solution, such as keeping in each page-table a >>> “table-generation” which is the mm-generation at the time of the change, >>> and only flush if “table-gener

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andy Lutomirski
On Sat, Dec 19, 2020 at 2:06 PM Nadav Amit wrote: > > [ I have in mind another solution, such as keeping in each page-table a > > “table-generation” which is the mm-generation at the time of the change, > > and only flush if “table-generation”==“mm-generation”, but it requires > > some thought on

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Tue, Dec 22, 2020 at 08:15:53PM +, Matthew Wilcox wrote: > On Tue, Dec 22, 2020 at 02:31:52PM -0500, Andrea Arcangeli wrote: > > My previous suggestion to use a mutex to serialize > > userfaultfd_writeprotect with a mutex will still work, but we can run > > as many wrprotect and un-wrprotect

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 22, 2020, at 11:44 AM, Andrea Arcangeli wrote: > > On Mon, Dec 21, 2020 at 02:55:12PM -0800, Nadav Amit wrote: >> wouldn’t mmap_write_downgrade() be executed before mprotect_fixup() (so > > I assume you mean "in" mprotect_fixup, after change_protection. > > If you would downgrade the m

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andy Lutomirski
On Mon, Dec 21, 2020 at 8:16 PM Linus Torvalds wrote: > > On Mon, Dec 21, 2020 at 7:19 PM Andy Lutomirski wrote: > > > > Ugh, this is unpleasantly complicated. > > What I *should* have said is that *because* userfaultfd is changing > the VM layout, it should always act as if it had to take the mm

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Matthew Wilcox
On Tue, Dec 22, 2020 at 02:31:52PM -0500, Andrea Arcangeli wrote: > My previous suggestion to use a mutex to serialize > userfaultfd_writeprotect with a mutex will still work, but we can run > as many wrprotect and un-wrprotect as we want in parallel, as long as > they're not simultaneous, we can d

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Mon, Dec 21, 2020 at 02:55:12PM -0800, Nadav Amit wrote: > wouldn’t mmap_write_downgrade() be executed before mprotect_fixup() (so I assume you mean "in" mprotect_fixup, after change_protection. If you would downgrade the mmap_lock to read there, then it'd severely slowdown the non contention

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Andrea Arcangeli
On Mon, Dec 21, 2020 at 07:19:35PM -0800, Andy Lutomirski wrote: > instance of this (with mmap_sem held for write) in x86: > mark_screen_rdonly(). Dare I ask how broken this is? We could likely That one is buggy despite it takes the mmap_write_lock... inverting the last two lines would fix it th

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 22, 2020, at 10:30 AM, Yu Zhao wrote: > > On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: >>> On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: >>> >>> On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: > U

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Yu Zhao
On Tue, Dec 22, 2020 at 04:40:32AM -0800, Nadav Amit wrote: > > On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > > > > On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: > >> On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: > >>> Using mmap_write_lock() was my initial fix and there w

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > > On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: >> On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: >>> Using mmap_write_lock() was my initial fix and there was a strong pushback >>> on this approach due to its potential impact

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-22 Thread Nadav Amit
> On Dec 21, 2020, at 7:19 PM, Andy Lutomirski wrote: > > On Mon, Dec 21, 2020 at 3:22 PM Linus Torvalds > wrote: >> On Mon, Dec 21, 2020 at 2:30 PM Peter Xu wrote: >>> AFAIU mprotect() is the only one who modifies the pte using the mmap write >>> lock. NUMA balancing is also using read mmap l

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 7:19 PM Andy Lutomirski wrote: > > Ugh, this is unpleasantly complicated. I probably should have phrased it differently, because the case you quote is actually a good one: > I will admit that any API that > takes an address and more-or-less-blindly marks it RO makes me q

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Andy Lutomirski
On Mon, Dec 21, 2020 at 3:22 PM Linus Torvalds wrote: > > On Mon, Dec 21, 2020 at 2:30 PM Peter Xu wrote: > > > > AFAIU mprotect() is the only one who modifies the pte using the mmap write > > lock. NUMA balancing is also using read mmap lock when changing pte > > protections, while my understan

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 04:11:01PM -0800, Linus Torvalds wrote: > On Mon, Dec 21, 2020 at 4:00 PM Yu Zhao wrote: > > > > My first instinct is to be conservative and revert 09854ba94c6a ("mm: > > do_wp_page() simplification") so people are less likely to come back > > and complain about performance

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 4:00 PM Yu Zhao wrote: > > My first instinct is to be conservative and revert 09854ba94c6a ("mm: > do_wp_page() simplification") so people are less likely to come back > and complain about performance issues from holding mmap lock for > write when clearing pte_write. Well,

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 03:33:30PM -0800, Linus Torvalds wrote: > On Mon, Dec 21, 2020 at 3:12 PM Yu Zhao wrote: > > > > I can't say I disagree with you but the man has made the call and I > > think we should just move on. > > "The man" can always be convinced by numbers. > > So if somebody come

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Nadav Amit
> On Dec 21, 2020, at 3:30 PM, Linus Torvalds > wrote: > > On Mon, Dec 21, 2020 at 2:55 PM Nadav Amit wrote: >> So as an alternative solution, I can do copying under the PTL after >> flushing, which seems to solve the problem. > > ... > Note that the "Re-validate under PTL" code in cow_user_pa

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 2:55 PM Nadav Amit wrote: > > So as an alternative solution, I can do copying under the PTL after > flushing, which seems to solve the problem. I think that's a valid model, but note that we do the "re-check ptl" in a (*completely(* different part than we do the actual PTE

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 3:12 PM Yu Zhao wrote: > > I can't say I disagree with you but the man has made the call and I > think we should just move on. "The man" can always be convinced by numbers. So if somebody comes up with an alternate patch, and explains it, and shows that it is better - go

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 2:30 PM Peter Xu wrote: > > AFAIU mprotect() is the only one who modifies the pte using the mmap write > lock. NUMA balancing is also using read mmap lock when changing pte > protections, while my understanding is mprotect() used write lock only because > it manipulates th

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 1:55 PM Peter Xu wrote: > > Frankly speaking I don't know why it's always safe to do data copy without the > pgtable lock in wp_page_copy(), since I don't know what guaranteed us from > data > changing on the original page due to any reason. So the reason it should be saf

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 05:30:41PM -0500, Peter Xu wrote: > On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote: > > BTW: In general, I think that you are right, and that changing of PTEs > > should not require taking mmap_lock for write. However, I am not sure > > cow_user_page() is not the

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Nadav Amit
> On Dec 21, 2020, at 2:30 PM, Peter Xu wrote: > > On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote: >> BTW: In general, I think that you are right, and that changing of PTEs >> should not require taking mmap_lock for write. However, I am not sure >> cow_user_page() is not the only one

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Peter Xu
On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote: > BTW: In general, I think that you are right, and that changing of PTEs > should not require taking mmap_lock for write. However, I am not sure > cow_user_page() is not the only one that poses a problem and whether a more > systematic sol

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Peter Xu
On Mon, Dec 21, 2020 at 12:23:06PM -0800, Nadav Amit wrote: > 2. Copy the page in cow_user_page() while holding the PTL and after flushing >has been done. I am not sure if there are potential problems with >special-pages (2 flushes might be necessary for special pages). This seems to be a

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Nadav Amit
> On Dec 21, 2020, at 1:24 PM, Yu Zhao wrote: > > On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: >> On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: >>> Using mmap_write_lock() was my initial fix and there was a strong pushback >>> on this approach due to its potential impact

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 12:26:22PM -0800, Linus Torvalds wrote: > On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: > > > > Using mmap_write_lock() was my initial fix and there was a strong pushback > > on this approach due to its potential impact on performance. > > From whom? > > Somebody who

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 12:23 PM Nadav Amit wrote: > > Using mmap_write_lock() was my initial fix and there was a strong pushback > on this approach due to its potential impact on performance. >From whom? Somebody who doesn't understand that correctness is more important than performance? And th

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 12:21 PM Yu Zhao wrote: > > Well, unfortunately we have places that use optimizations like > > inc_tlb_flush_pending() > lock page table > pte_wrprotect > flush_tlb_range() > dec_tlb_flush_pending() > > which complicate things. My point is, none of that ma

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Nadav Amit
> On Dec 21, 2020, at 11:55 AM, Linus Torvalds > wrote: > > On Mon, Dec 21, 2020 at 11:16 AM Yu Zhao wrote: >> Nadav Amit found memory corruptions when running userfaultfd test above. >> It seems to me the problem is related to commit 09854ba94c6a ("mm: >> do_wp_page() simplification"). Can you

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 11:55:02AM -0800, Linus Torvalds wrote: > On Mon, Dec 21, 2020 at 11:16 AM Yu Zhao wrote: > > > > Nadav Amit found memory corruptions when running userfaultfd test above. > > It seems to me the problem is related to commit 09854ba94c6a ("mm: > > do_wp_page() simplification"

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Linus Torvalds
On Mon, Dec 21, 2020 at 11:16 AM Yu Zhao wrote: > > Nadav Amit found memory corruptions when running userfaultfd test above. > It seems to me the problem is related to commit 09854ba94c6a ("mm: > do_wp_page() simplification"). Can you please take a look? Thanks. > > TL;DR: it may not safe to make

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Peter Xu
On Mon, Dec 21, 2020 at 10:31:57AM -0800, Nadav Amit wrote: > > On Dec 21, 2020, at 9:27 AM, Peter Xu wrote: > > > > Hi, Nadav, > > > > On Sun, Dec 20, 2020 at 12:06:38AM -0800, Nadav Amit wrote: > > > > [...] > > > >> So to correct myself, I think that what I really encountered was actually >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Yu Zhao
On Mon, Dec 21, 2020 at 10:31:57AM -0800, Nadav Amit wrote: > > On Dec 21, 2020, at 9:27 AM, Peter Xu wrote: > > > > Hi, Nadav, > > > > On Sun, Dec 20, 2020 at 12:06:38AM -0800, Nadav Amit wrote: > > > > [...] > > > >> So to correct myself, I think that what I really encountered was actually >

Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

2020-12-21 Thread Nadav Amit
> On Dec 21, 2020, at 9:27 AM, Peter Xu wrote: > > Hi, Nadav, > > On Sun, Dec 20, 2020 at 12:06:38AM -0800, Nadav Amit wrote: > > [...] > >> So to correct myself, I think that what I really encountered was actually >> during MM_CP_UFFD_WP_RESOLVE (i.e., when the protection is removed). The >>

  1   2   >