Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-29 Thread Linus Torvalds
On Tue, Dec 29, 2020 at 5:28 AM Kirill A. Shutemov wrote: > > I reworked do_fault_around() so it doesn't touch vmf->address and pass > original address down to ->map_pages(). No need in the new argument in > ->map_pages. filemap_map_pages() calculates address based on pgoff of the > page handled.

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-29 Thread Matthew Wilcox
On Tue, Dec 29, 2020 at 04:28:19PM +0300, Kirill A. Shutemov wrote: > > At that point, there would no longer be any need to update the > > address/pte fields in the vmf struct, and in fact I think it could be > > made a "const" pointer in this cal chain. > > Unfortunately, we would still need to N

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-29 Thread Kirill A. Shutemov
On Mon, Dec 28, 2020 at 01:58:40PM -0800, Linus Torvalds wrote: > On Mon, Dec 28, 2020 at 10:47 AM Linus Torvalds > wrote: > > > > I personally think it's wrong to update vmf->pte at all. We should > > just have a local 'ptep' pointer that we update as we walk along. But > > that requires another

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Hugh Dickins
Got it at last, sorry it's taken so long. On Tue, 29 Dec 2020, Kirill A. Shutemov wrote: > On Tue, Dec 29, 2020 at 01:05:48AM +0300, Kirill A. Shutemov wrote: > > On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote: > > > On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov > > > wrote:

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Linus Torvalds
On Mon, Dec 28, 2020 at 2:05 PM Kirill A. Shutemov wrote: > > > But I *think* we should be fine here: do_fault_around() limits start_pgoff > and end_pgoff to stay within the page table. Yeah, but I was thinking it would then update ->pte to just past the edge. But looking at that logic, you're r

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Kirill A. Shutemov
On Tue, Dec 29, 2020 at 01:05:48AM +0300, Kirill A. Shutemov wrote: > On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote: > > On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov > > wrote: > > > > > > So far I only found one more pin leak and always-true check. I don't see > > > how ca

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Kirill A. Shutemov
On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote: > On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov > wrote: > > > > So far I only found one more pin leak and always-true check. I don't see > > how can it lead to crash or corruption. Keep looking. > > Well, I noticed that the no

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Linus Torvalds
On Mon, Dec 28, 2020 at 10:47 AM Linus Torvalds wrote: > > I personally think it's wrong to update vmf->pte at all. We should > just have a local 'ptep' pointer that we update as we walk along. But > that requires another change to the calling convention, namely to > "do_set_pte()". Actually, I t

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Linus Torvalds
On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov wrote: > > So far I only found one more pin leak and always-true check. I don't see > how can it lead to crash or corruption. Keep looking. Well, I noticed that the nommu.c version of filemap_map_pages() needs fixing, but that's obviously not the

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-28 Thread Kirill A. Shutemov
On Sun, Dec 27, 2020 at 10:43:44PM -0800, Hugh Dickins wrote: > On Sun, 27 Dec 2020, Linus Torvalds wrote: > > On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov > > wrote: > > > > > > I did what Hugh proposed and it got clear to my eyes. It gets somewhat > > > large, but take a look. > > > > Ok

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Hugh Dickins
On Sun, 27 Dec 2020, Linus Torvalds wrote: > On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov > wrote: > > > > I did what Hugh proposed and it got clear to my eyes. It gets somewhat > > large, but take a look. > > Ok, it's not that much bigger, and the end result is certainly much > clearer wr

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Linus Torvalds
On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov wrote: > > I did what Hugh proposed and it got clear to my eyes. It gets somewhat > large, but take a look. Ok, it's not that much bigger, and the end result is certainly much clearer wrt locking. So that last version of yours with the fix for t

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Kirill A. Shutemov
On Sun, Dec 27, 2020 at 03:40:36PM -0800, Linus Torvalds wrote: > I think Kirill was intending to move the "if (ret)" up into the path > that sets it, IOW something like > > + if (!(vma->vm_flags & VM_SHARED)) { > + ret = check_stable_address_space(vma->vm_mm); > +

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Kirill A. Shutemov
On Sun, Dec 27, 2020 at 11:38:22AM -0800, Linus Torvalds wrote: > On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote: > > > > This patch (like its antecedents) moves the pte_unmap_unlock() from > > after do_fault_around()'s "check if the page fault is solved" into > > filemap_map_pages() itself (w

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Linus Torvalds
On Sun, Dec 27, 2020 at 3:12 PM Linus Torvalds wrote: > > Ok, your fix for that folded in, and here's yet another version. Still not good. I don't know what happened, but the change of - vm_fault_t ret = 0; + vm_fault_t ret; is very very wrong. The next user is + if (!(vma->

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Linus Torvalds
On Sun, Dec 27, 2020 at 2:35 PM Hugh Dickins wrote: > > Yes, this one passes my testing on x86_64 and on i386. Ok, good. > But... > > ... Damian very helpfully reports that it does not build when > CONFIG_TRANSPARENT_HUGEPAGE is not set, since the "static " has > not been removed from the altern

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Hugh Dickins
On Sun, 27 Dec 2020, Damian Tometzki wrote: > On Sun, 27. Dec 11:38, Linus Torvalds wrote: > > On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote: > > > > > > This patch (like its antecedents) moves the pte_unmap_unlock() from > > > after do_fault_around()'s "check if the page fault is solved" int

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Damian Tometzki
On Sun, 27. Dec 11:38, Linus Torvalds wrote: > On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote: > > > > This patch (like its antecedents) moves the pte_unmap_unlock() from > > after do_fault_around()'s "check if the page fault is solved" into > > filemap_map_pages() itself (which apparently doe

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-27 Thread Linus Torvalds
On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote: > > This patch (like its antecedents) moves the pte_unmap_unlock() from > after do_fault_around()'s "check if the page fault is solved" into > filemap_map_pages() itself (which apparently does not NULLify vmf->pte > after unmapping it, which is p

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Hugh Dickins
On Sat, 26 Dec 2020, Hugh Dickins wrote: > On Sun, 27 Dec 2020, Kirill A. Shutemov wrote: > > > > Here's the fixup I have so far. It doesn't blow up immediately, but please > > take a closer look. Who knows what stupid mistake I did this time. :/ > > It's been running fine on x86_64 for a couple

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Hugh Dickins
On Sun, 27 Dec 2020, Kirill A. Shutemov wrote: > > Here's the fixup I have so far. It doesn't blow up immediately, but please > take a closer look. Who knows what stupid mistake I did this time. :/ It's been running fine on x86_64 for a couple of hours (but of course my testing is deficient, in n

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Kirill A. Shutemov
On Sat, Dec 26, 2020 at 01:16:09PM -0800, Linus Torvalds wrote: > On Sat, Dec 26, 2020 at 1:04 PM Hugh Dickins wrote: > > > > > > Hold on. I guess this one will suffer from the same bug as the previous. > > I was about to report back, after satisfactory overnight testing of that > > version - prov

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Matthew Wilcox
On Sat, Dec 26, 2020 at 11:43:35PM +0300, Kirill A. Shutemov wrote: > +static struct page *next_stable_page(struct page *page, struct vm_fault *vmf, > + struct xa_state *xas, pgoff_t end_pgoff) A "stable page" means one that doesn't currently have an outstanding wr

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Linus Torvalds
On Sat, Dec 26, 2020 at 1:04 PM Hugh Dickins wrote: > > > Hold on. I guess this one will suffer from the same bug as the previous. > I was about to report back, after satisfactory overnight testing of that > version - provided that one big little bug is fixed: > > --- a/mm/filemap.c > +++ b/mm/fil

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Linus Torvalds
I was going to just apply this patch, because I like it so much, but then I decided to take one last look, and: On Sat, Dec 26, 2020 at 12:43 PM Kirill A. Shutemov wrote: > > +static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page) > +{ > + struct mm_struct *mm = vmf->vma->vm_m

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Hugh Dickins
On Sat, 26 Dec 2020, Kirill A. Shutemov wrote: > On Sat, Dec 26, 2020 at 09:57:13AM -0800, Linus Torvalds wrote: > > Because not only does that get rid of the "if (page)" test, I think it > > would make things a bit clearer. When I read the patch first, the > > initial "next_page()" call confused m

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Kirill A. Shutemov
On Sat, Dec 26, 2020 at 09:57:13AM -0800, Linus Torvalds wrote: > Because not only does that get rid of the "if (page)" test, I think it > would make things a bit clearer. When I read the patch first, the > initial "next_page()" call confused me. Agreed. Here we go: >From d12dea4abe94dbc24b794532

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-26 Thread Linus Torvalds
On Fri, Dec 25, 2020 at 3:31 AM Kirill A. Shutemov wrote: > > The new helper next_page() returns a stablized page, so filemap_map_pmd() > can clearly decide if we should set up a page table or a huge page. I really like that next_page() abstraction, my only comment is that I think it should be re

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-25 Thread Kirill A. Shutemov
On Wed, Dec 23, 2020 at 08:04:54PM -0800, Hugh Dickins wrote: > It's not ready yet. Thanks for the feedback. It's very helpful. The patch below should address the problems you've found. The new helper next_page() returns a stablized page, so filemap_map_pmd() can clearly decide if we should set

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-23 Thread Hugh Dickins
On Tue, 22 Dec 2020, Kirill A. Shutemov wrote: > > Updated patch is below. > > From 0ec1bc1fe95587350ac4f4c866d6482383740b36 Mon Sep 17 00:00:00 2001 > From: "Kirill A. Shutemov" > Date: Sat, 19 Dec 2020 15:19:23 +0300 > Subject: [PATCH] mm: Cleanup faultaround and finish_fault() codepaths > >

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-22 Thread Kirill A. Shutemov
On Sat, Dec 19, 2020 at 12:34:17PM -0800, Linus Torvalds wrote: > On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov > wrote: > > > > @@ -2884,19 +2966,18 @@ void filemap_map_pages(struct vm_fault *vmf, > > if (vmf->pte) > > vmf->pte += xas.xa_index - last_

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-19 Thread Linus Torvalds
On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov wrote: > > @@ -2884,19 +2966,18 @@ void filemap_map_pages(struct vm_fault *vmf, > if (vmf->pte) > vmf->pte += xas.xa_index - last_pgoff; > last_pgoff = xas.xa_index; > - if (all

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-19 Thread Linus Torvalds
On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov wrote: > > Okay, but we only win the NULL check. xas_retry() and xa_is_value() has to > be repeated in the beginning of the loop. Yeah. I wonder if it might make sense to have a "xas_next_entry_rcu()" function that does something like while

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-19 Thread Kirill A. Shutemov
On Fri, Dec 18, 2020 at 10:56:55AM -0800, Linus Torvalds wrote: > No? Okay, but we only win the NULL check. xas_retry() and xa_is_value() has to be repeated in the beginning of the loop. >From b4f4c0d32e654b8459a1e439a453373499b8946a Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Sat,

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-18 Thread Linus Torvalds
On Fri, Dec 18, 2020 at 3:04 AM Kirill A. Shutemov wrote: > > This should do. See below. Looks fine. > > Then that second loop very naturally becomes a "do { } while ()" one. > > I don't see it. I haven't found a reasonable way to rework it do-while. Now that you return early for the "HEAD == N

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-18 Thread Kirill A. Shutemov
On Thu, Dec 17, 2020 at 10:22:33AM -0800, Linus Torvalds wrote: > + head = xas_find(&xas, end_pgoff); > + for (; ; head = xas_next_entry(&xas, end_pgoff)) { > + if (!head) { > + rcu_read_unlock(); > + return; > + }

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-17 Thread Linus Torvalds
On Thu, Dec 17, 2020 at 2:54 AM Kirill A. Shutemov wrote: > > Also if the range doesn't have a mappable page we would setup a page > table into the PMD entry. It means we cannot have huge page mapped there > later. It may be a bummer: getting the page table out of page table tree > requires mmap_w

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-17 Thread Kirill A. Shutemov
On Wed, Dec 16, 2020 at 10:41:36AM -0800, Linus Torvalds wrote: > On Wed, Dec 16, 2020 at 9:07 AM Kirill A. Shutemov > wrote: > > > > If this looks fine, I'll submit a proper patch. > > That patch looks good to me. > > It would be good if somebody else looked it through - maybe I like it > just

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-16 Thread Linus Torvalds
On Wed, Dec 16, 2020 at 9:07 AM Kirill A. Shutemov wrote: > > If this looks fine, I'll submit a proper patch. That patch looks good to me. It would be good if somebody else looked it through - maybe I like it just because I got to pee in the snow and make my mark. But i think that filemap_map_pa

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-16 Thread Kirill A. Shutemov
On Mon, Dec 14, 2020 at 09:54:06AM -0800, Linus Torvalds wrote: > Could we have a "filemap_map_pmd()" that does it all, and then the > filemap_map_pages() logic would be more along the lines of > > if (filemap_map_pmd(vmf, xas)) { > rcu_read_unlock(); > return; > }

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-14 Thread Matthew Wilcox
On Mon, Dec 14, 2020 at 09:54:06AM -0800, Linus Torvalds wrote: > > I expected to hate it more, but it looks reasonable. Opencoded > > xas_for_each() smells bad, but... > > I think the open-coded xas_for_each() per se isn't a problem, but I > agree that the startup condition is a bit ugly. And I'm

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-14 Thread Linus Torvalds
On Mon, Dec 14, 2020 at 8:07 AM Kirill A. Shutemov wrote: > > Here it is. Still barely tested. Ok, from looking at the patch (not applying it and looking at the end result), I think the locking - at least for the filemap_map_pages() case - is a lot easier to understand. So you seem to have fixed

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-14 Thread Kirill A. Shutemov
On Thu, Dec 10, 2020 at 09:23:53AM -0800, Linus Torvalds wrote: > Can we please move that part to the callers too - possibly with a > separate helper function? Here it is. Still barely tested. I expected to hate it more, but it looks reasonable. Opencoded xas_for_each() smells bad, but... And di

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-10 Thread Linus Torvalds
On Thu, Dec 10, 2020 at 7:08 AM Kirill A. Shutemov wrote: > > See lightly tested patch below. Is it something you had in mind? This is closer, in that at least it removes the ostensibly blocking allocation (that can't happen) from the prefault path. But the main issue remains: > > At that point

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-10 Thread Kirill A. Shutemov
On Wed, Dec 09, 2020 at 11:04:13AM -0800, Linus Torvalds wrote: > > Kirill - I think you know this code best. Would you be willing to look > at splitting out that "allocate and map" from alloc_set_pte() and into > the callers? See lightly tested patch below. Is it something you had in mind? I do

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Linus Torvalds
On Wed, Dec 9, 2020 at 12:32 PM Matthew Wilcox wrote: > > If a filesystem has put an Uptodate page into the page cache, the > rest of the kernel can read it without telling the filesystem. XFS does the same thing for xfs_file_read_iter() too. Not that I disagree with you - when you mmap a file,

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Matthew Wilcox
On Wed, Dec 09, 2020 at 11:04:13AM -0800, Linus Torvalds wrote: > In particular, it made it a nightmare to read what do_fault_around() > does: it does that odd > > if (pmd_none(*vmf->pmd)) { > vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm); > > and then it calls ->map_

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Linus Torvalds
On Wed, Dec 9, 2020 at 10:40 AM Will Deacon wrote: > > > And yes, that probably means that you need to change "alloc_set_pte()" > > to actually pass in the real address, and leave "vmf->address" alone - > > so that it can know which ones are prefaulted and which one is real, > > but that sounds li

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Will Deacon
On Wed, Dec 09, 2020 at 09:58:12AM -0800, Linus Torvalds wrote: > On Wed, Dec 9, 2020 at 8:40 AM Will Deacon wrote: > > > > @@ -3978,8 +3994,17 @@ static vm_fault_t do_fault_around(struct vm_fault > > *vmf) > > > > /* check if the page fault is solved */ > > vmf->pte -= (vmf->addr

Re: [PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Linus Torvalds
On Wed, Dec 9, 2020 at 8:40 AM Will Deacon wrote: > > @@ -3978,8 +3994,17 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) > > /* check if the page fault is solved */ > vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT); > - if (!pte_none(*vmf->pte))

[PATCH 1/2] mm: Allow architectures to request 'old' entries when prefaulting

2020-12-09 Thread Will Deacon
Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed the "faultaround" behaviour to initialise prefaulted PTEs as 'old', since this avoids vmscan wrongly assuming that they are hot, despite having never been explicitly accessed by userspace. The change has been shown to benefit num