On Tue, Dec 29, 2020 at 5:28 AM Kirill A. Shutemov wrote:
>
> I reworked do_fault_around() so it doesn't touch vmf->address and pass
> original address down to ->map_pages(). No need in the new argument in
> ->map_pages. filemap_map_pages() calculates address based on pgoff of the
> page handled.
On Tue, Dec 29, 2020 at 04:28:19PM +0300, Kirill A. Shutemov wrote:
> > At that point, there would no longer be any need to update the
> > address/pte fields in the vmf struct, and in fact I think it could be
> > made a "const" pointer in this cal chain.
>
> Unfortunately, we would still need to N
On Mon, Dec 28, 2020 at 01:58:40PM -0800, Linus Torvalds wrote:
> On Mon, Dec 28, 2020 at 10:47 AM Linus Torvalds
> wrote:
> >
> > I personally think it's wrong to update vmf->pte at all. We should
> > just have a local 'ptep' pointer that we update as we walk along. But
> > that requires another
Got it at last, sorry it's taken so long.
On Tue, 29 Dec 2020, Kirill A. Shutemov wrote:
> On Tue, Dec 29, 2020 at 01:05:48AM +0300, Kirill A. Shutemov wrote:
> > On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote:
> > > On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov
> > > wrote:
On Mon, Dec 28, 2020 at 2:05 PM Kirill A. Shutemov wrote:
>
>
> But I *think* we should be fine here: do_fault_around() limits start_pgoff
> and end_pgoff to stay within the page table.
Yeah, but I was thinking it would then update ->pte to just past the edge.
But looking at that logic, you're r
On Tue, Dec 29, 2020 at 01:05:48AM +0300, Kirill A. Shutemov wrote:
> On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote:
> > On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov
> > wrote:
> > >
> > > So far I only found one more pin leak and always-true check. I don't see
> > > how ca
On Mon, Dec 28, 2020 at 10:47:36AM -0800, Linus Torvalds wrote:
> On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov
> wrote:
> >
> > So far I only found one more pin leak and always-true check. I don't see
> > how can it lead to crash or corruption. Keep looking.
>
> Well, I noticed that the no
On Mon, Dec 28, 2020 at 10:47 AM Linus Torvalds
wrote:
>
> I personally think it's wrong to update vmf->pte at all. We should
> just have a local 'ptep' pointer that we update as we walk along. But
> that requires another change to the calling convention, namely to
> "do_set_pte()".
Actually, I t
On Mon, Dec 28, 2020 at 4:53 AM Kirill A. Shutemov wrote:
>
> So far I only found one more pin leak and always-true check. I don't see
> how can it lead to crash or corruption. Keep looking.
Well, I noticed that the nommu.c version of filemap_map_pages() needs
fixing, but that's obviously not the
On Sun, Dec 27, 2020 at 10:43:44PM -0800, Hugh Dickins wrote:
> On Sun, 27 Dec 2020, Linus Torvalds wrote:
> > On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov
> > wrote:
> > >
> > > I did what Hugh proposed and it got clear to my eyes. It gets somewhat
> > > large, but take a look.
> >
> > Ok
On Sun, 27 Dec 2020, Linus Torvalds wrote:
> On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov
> wrote:
> >
> > I did what Hugh proposed and it got clear to my eyes. It gets somewhat
> > large, but take a look.
>
> Ok, it's not that much bigger, and the end result is certainly much
> clearer wr
On Sun, Dec 27, 2020 at 3:48 PM Kirill A. Shutemov wrote:
>
> I did what Hugh proposed and it got clear to my eyes. It gets somewhat
> large, but take a look.
Ok, it's not that much bigger, and the end result is certainly much
clearer wrt locking.
So that last version of yours with the fix for t
On Sun, Dec 27, 2020 at 03:40:36PM -0800, Linus Torvalds wrote:
> I think Kirill was intending to move the "if (ret)" up into the path
> that sets it, IOW something like
>
> + if (!(vma->vm_flags & VM_SHARED)) {
> + ret = check_stable_address_space(vma->vm_mm);
> +
On Sun, Dec 27, 2020 at 11:38:22AM -0800, Linus Torvalds wrote:
> On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote:
> >
> > This patch (like its antecedents) moves the pte_unmap_unlock() from
> > after do_fault_around()'s "check if the page fault is solved" into
> > filemap_map_pages() itself (w
On Sun, Dec 27, 2020 at 3:12 PM Linus Torvalds
wrote:
>
> Ok, your fix for that folded in, and here's yet another version.
Still not good.
I don't know what happened, but the change of
- vm_fault_t ret = 0;
+ vm_fault_t ret;
is very very wrong. The next user is
+ if (!(vma->
On Sun, Dec 27, 2020 at 2:35 PM Hugh Dickins wrote:
>
> Yes, this one passes my testing on x86_64 and on i386.
Ok, good.
> But...
>
> ... Damian very helpfully reports that it does not build when
> CONFIG_TRANSPARENT_HUGEPAGE is not set, since the "static " has
> not been removed from the altern
On Sun, 27 Dec 2020, Damian Tometzki wrote:
> On Sun, 27. Dec 11:38, Linus Torvalds wrote:
> > On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote:
> > >
> > > This patch (like its antecedents) moves the pte_unmap_unlock() from
> > > after do_fault_around()'s "check if the page fault is solved" int
On Sun, 27. Dec 11:38, Linus Torvalds wrote:
> On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote:
> >
> > This patch (like its antecedents) moves the pte_unmap_unlock() from
> > after do_fault_around()'s "check if the page fault is solved" into
> > filemap_map_pages() itself (which apparently doe
On Sat, Dec 26, 2020 at 6:38 PM Hugh Dickins wrote:
>
> This patch (like its antecedents) moves the pte_unmap_unlock() from
> after do_fault_around()'s "check if the page fault is solved" into
> filemap_map_pages() itself (which apparently does not NULLify vmf->pte
> after unmapping it, which is p
On Sat, 26 Dec 2020, Hugh Dickins wrote:
> On Sun, 27 Dec 2020, Kirill A. Shutemov wrote:
> >
> > Here's the fixup I have so far. It doesn't blow up immediately, but please
> > take a closer look. Who knows what stupid mistake I did this time. :/
>
> It's been running fine on x86_64 for a couple
On Sun, 27 Dec 2020, Kirill A. Shutemov wrote:
>
> Here's the fixup I have so far. It doesn't blow up immediately, but please
> take a closer look. Who knows what stupid mistake I did this time. :/
It's been running fine on x86_64 for a couple of hours (but of course
my testing is deficient, in n
On Sat, Dec 26, 2020 at 01:16:09PM -0800, Linus Torvalds wrote:
> On Sat, Dec 26, 2020 at 1:04 PM Hugh Dickins wrote:
> >
> >
> > Hold on. I guess this one will suffer from the same bug as the previous.
> > I was about to report back, after satisfactory overnight testing of that
> > version - prov
On Sat, Dec 26, 2020 at 11:43:35PM +0300, Kirill A. Shutemov wrote:
> +static struct page *next_stable_page(struct page *page, struct vm_fault *vmf,
> + struct xa_state *xas, pgoff_t end_pgoff)
A "stable page" means one that doesn't currently have an outstanding
wr
On Sat, Dec 26, 2020 at 1:04 PM Hugh Dickins wrote:
>
>
> Hold on. I guess this one will suffer from the same bug as the previous.
> I was about to report back, after satisfactory overnight testing of that
> version - provided that one big little bug is fixed:
>
> --- a/mm/filemap.c
> +++ b/mm/fil
I was going to just apply this patch, because I like it so much, but
then I decided to take one last look, and:
On Sat, Dec 26, 2020 at 12:43 PM Kirill A. Shutemov
wrote:
>
> +static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
> +{
> + struct mm_struct *mm = vmf->vma->vm_m
On Sat, 26 Dec 2020, Kirill A. Shutemov wrote:
> On Sat, Dec 26, 2020 at 09:57:13AM -0800, Linus Torvalds wrote:
> > Because not only does that get rid of the "if (page)" test, I think it
> > would make things a bit clearer. When I read the patch first, the
> > initial "next_page()" call confused m
On Sat, Dec 26, 2020 at 09:57:13AM -0800, Linus Torvalds wrote:
> Because not only does that get rid of the "if (page)" test, I think it
> would make things a bit clearer. When I read the patch first, the
> initial "next_page()" call confused me.
Agreed. Here we go:
>From d12dea4abe94dbc24b794532
On Fri, Dec 25, 2020 at 3:31 AM Kirill A. Shutemov wrote:
>
> The new helper next_page() returns a stablized page, so filemap_map_pmd()
> can clearly decide if we should set up a page table or a huge page.
I really like that next_page() abstraction, my only comment is that I
think it should be re
On Wed, Dec 23, 2020 at 08:04:54PM -0800, Hugh Dickins wrote:
> It's not ready yet.
Thanks for the feedback. It's very helpful.
The patch below should address the problems you've found.
The new helper next_page() returns a stablized page, so filemap_map_pmd()
can clearly decide if we should set
On Tue, 22 Dec 2020, Kirill A. Shutemov wrote:
>
> Updated patch is below.
>
> From 0ec1bc1fe95587350ac4f4c866d6482383740b36 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov"
> Date: Sat, 19 Dec 2020 15:19:23 +0300
> Subject: [PATCH] mm: Cleanup faultaround and finish_fault() codepaths
>
>
On Sat, Dec 19, 2020 at 12:34:17PM -0800, Linus Torvalds wrote:
> On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov
> wrote:
> >
> > @@ -2884,19 +2966,18 @@ void filemap_map_pages(struct vm_fault *vmf,
> > if (vmf->pte)
> > vmf->pte += xas.xa_index - last_
On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov wrote:
>
> @@ -2884,19 +2966,18 @@ void filemap_map_pages(struct vm_fault *vmf,
> if (vmf->pte)
> vmf->pte += xas.xa_index - last_pgoff;
> last_pgoff = xas.xa_index;
> - if (all
On Sat, Dec 19, 2020 at 4:41 AM Kirill A. Shutemov wrote:
>
> Okay, but we only win the NULL check. xas_retry() and xa_is_value() has to
> be repeated in the beginning of the loop.
Yeah.
I wonder if it might make sense to have a "xas_next_entry_rcu()"
function that does something like
while
On Fri, Dec 18, 2020 at 10:56:55AM -0800, Linus Torvalds wrote:
> No?
Okay, but we only win the NULL check. xas_retry() and xa_is_value() has to
be repeated in the beginning of the loop.
>From b4f4c0d32e654b8459a1e439a453373499b8946a Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov"
Date: Sat,
On Fri, Dec 18, 2020 at 3:04 AM Kirill A. Shutemov wrote:
>
> This should do. See below.
Looks fine.
> > Then that second loop very naturally becomes a "do { } while ()" one.
>
> I don't see it. I haven't found a reasonable way to rework it do-while.
Now that you return early for the "HEAD == N
On Thu, Dec 17, 2020 at 10:22:33AM -0800, Linus Torvalds wrote:
> + head = xas_find(&xas, end_pgoff);
> + for (; ; head = xas_next_entry(&xas, end_pgoff)) {
> + if (!head) {
> + rcu_read_unlock();
> + return;
> + }
On Thu, Dec 17, 2020 at 2:54 AM Kirill A. Shutemov wrote:
>
> Also if the range doesn't have a mappable page we would setup a page
> table into the PMD entry. It means we cannot have huge page mapped there
> later. It may be a bummer: getting the page table out of page table tree
> requires mmap_w
On Wed, Dec 16, 2020 at 10:41:36AM -0800, Linus Torvalds wrote:
> On Wed, Dec 16, 2020 at 9:07 AM Kirill A. Shutemov
> wrote:
> >
> > If this looks fine, I'll submit a proper patch.
>
> That patch looks good to me.
>
> It would be good if somebody else looked it through - maybe I like it
> just
On Wed, Dec 16, 2020 at 9:07 AM Kirill A. Shutemov wrote:
>
> If this looks fine, I'll submit a proper patch.
That patch looks good to me.
It would be good if somebody else looked it through - maybe I like it
just because I got to pee in the snow and make my mark. But i think
that filemap_map_pa
On Mon, Dec 14, 2020 at 09:54:06AM -0800, Linus Torvalds wrote:
> Could we have a "filemap_map_pmd()" that does it all, and then the
> filemap_map_pages() logic would be more along the lines of
>
> if (filemap_map_pmd(vmf, xas)) {
> rcu_read_unlock();
> return;
> }
On Mon, Dec 14, 2020 at 09:54:06AM -0800, Linus Torvalds wrote:
> > I expected to hate it more, but it looks reasonable. Opencoded
> > xas_for_each() smells bad, but...
>
> I think the open-coded xas_for_each() per se isn't a problem, but I
> agree that the startup condition is a bit ugly. And I'm
On Mon, Dec 14, 2020 at 8:07 AM Kirill A. Shutemov wrote:
>
> Here it is. Still barely tested.
Ok, from looking at the patch (not applying it and looking at the end
result), I think the locking - at least for the filemap_map_pages()
case - is a lot easier to understand.
So you seem to have fixed
On Thu, Dec 10, 2020 at 09:23:53AM -0800, Linus Torvalds wrote:
> Can we please move that part to the callers too - possibly with a
> separate helper function?
Here it is. Still barely tested.
I expected to hate it more, but it looks reasonable. Opencoded
xas_for_each() smells bad, but...
And di
On Thu, Dec 10, 2020 at 7:08 AM Kirill A. Shutemov
wrote:
>
> See lightly tested patch below. Is it something you had in mind?
This is closer, in that at least it removes the ostensibly blocking
allocation (that can't happen) from the prefault path.
But the main issue remains:
> > At that point
On Wed, Dec 09, 2020 at 11:04:13AM -0800, Linus Torvalds wrote:
>
> Kirill - I think you know this code best. Would you be willing to look
> at splitting out that "allocate and map" from alloc_set_pte() and into
> the callers?
See lightly tested patch below. Is it something you had in mind?
I do
On Wed, Dec 9, 2020 at 12:32 PM Matthew Wilcox wrote:
>
> If a filesystem has put an Uptodate page into the page cache, the
> rest of the kernel can read it without telling the filesystem.
XFS does the same thing for xfs_file_read_iter() too.
Not that I disagree with you - when you mmap a file,
On Wed, Dec 09, 2020 at 11:04:13AM -0800, Linus Torvalds wrote:
> In particular, it made it a nightmare to read what do_fault_around()
> does: it does that odd
>
> if (pmd_none(*vmf->pmd)) {
> vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
>
> and then it calls ->map_
On Wed, Dec 9, 2020 at 10:40 AM Will Deacon wrote:
>
> > And yes, that probably means that you need to change "alloc_set_pte()"
> > to actually pass in the real address, and leave "vmf->address" alone -
> > so that it can know which ones are prefaulted and which one is real,
> > but that sounds li
On Wed, Dec 09, 2020 at 09:58:12AM -0800, Linus Torvalds wrote:
> On Wed, Dec 9, 2020 at 8:40 AM Will Deacon wrote:
> >
> > @@ -3978,8 +3994,17 @@ static vm_fault_t do_fault_around(struct vm_fault
> > *vmf)
> >
> > /* check if the page fault is solved */
> > vmf->pte -= (vmf->addr
On Wed, Dec 9, 2020 at 8:40 AM Will Deacon wrote:
>
> @@ -3978,8 +3994,17 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>
> /* check if the page fault is solved */
> vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT);
> - if (!pte_none(*vmf->pte))
Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed
the "faultaround" behaviour to initialise prefaulted PTEs as 'old',
since this avoids vmscan wrongly assuming that they are hot, despite
having never been explicitly accessed by userspace. The change has been
shown to benefit num
51 matches
Mail list logo