Re: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-04-20 Thread 堀口 直也
On Tue, Apr 20, 2021 at 12:16:57PM +0200, Borislav Petkov wrote:
> On Tue, Apr 20, 2021 at 07:46:26AM +, HORIGUCHI NAOYA(堀口 直也) wrote:
> > If you have any other suggestion, please let me know.
> 
> Looks almost ok...
> 
> > From: Tony Luck 
> > Date: Tue, 20 Apr 2021 16:42:01 +0900
> > Subject: [PATCH 1/3] mm/memory-failure: Use a mutex to avoid 
> > memory_failure()
> >  races
> > 
> > There can be races when multiple CPUs consume poison from the same
> > page. The first into memory_failure() atomically sets the HWPoison
> > page flag and begins hunting for tasks that map this page. Eventually
> > it invalidates those mappings and may send a SIGBUS to the affected
> > tasks.
> > 
> > But while all that work is going on, other CPUs see a "success"
> > return code from memory_failure() and so they believe the error
> > has been handled and continue executing.
> > 
> > Fix by wrapping most of the internal parts of memory_failure() in
> > a mutex.
> > 
> > Along with introducing an additional goto label, this patch also
> 
> ... avoid having "This patch" or "This commit" in the commit message.
> It is tautologically useless. Also, you don't have to explain what the
> patch does - that's visible hopefully. :-)

OK, I'll drop this paragraph from next version.

Thanks,
Naoya Horiguchi

Re: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-04-20 Thread Borislav Petkov
On Tue, Apr 20, 2021 at 07:46:26AM +, HORIGUCHI NAOYA(堀口 直也) wrote:
> If you have any other suggestion, please let me know.

Looks almost ok...

> From: Tony Luck 
> Date: Tue, 20 Apr 2021 16:42:01 +0900
> Subject: [PATCH 1/3] mm/memory-failure: Use a mutex to avoid memory_failure()
>  races
> 
> There can be races when multiple CPUs consume poison from the same
> page. The first into memory_failure() atomically sets the HWPoison
> page flag and begins hunting for tasks that map this page. Eventually
> it invalidates those mappings and may send a SIGBUS to the affected
> tasks.
> 
> But while all that work is going on, other CPUs see a "success"
> return code from memory_failure() and so they believe the error
> has been handled and continue executing.
> 
> Fix by wrapping most of the internal parts of memory_failure() in
> a mutex.
> 
> Along with introducing an additional goto label, this patch also

... avoid having "This patch" or "This commit" in the commit message.
It is tautologically useless. Also, you don't have to explain what the
patch does - that's visible hopefully. :-)

Other than that, it makes sense and the "sandwitching" looks correct:

mutex_lock
lock_page

...

unlock_page
mutex_unlock

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-04-20 Thread 堀口 直也
On Mon, Apr 19, 2021 at 07:05:38PM +0200, Borislav Petkov wrote:
> On Tue, Apr 13, 2021 at 07:43:18AM +0900, Naoya Horiguchi wrote:
> > From: Tony Luck 
> > 
> > There can be races when multiple CPUs consume poison from the same
> > page. The first into memory_failure() atomically sets the HWPoison
> > page flag and begins hunting for tasks that map this page. Eventually
> > it invalidates those mappings and may send a SIGBUS to the affected
> > tasks.
> > 
> > But while all that work is going on, other CPUs see a "success"
> > return code from memory_failure() and so they believe the error
> > has been handled and continue executing.
> > 
> > Fix by wrapping most of the internal parts of memory_failure() in
> > a mutex.
> > 
> > Signed-off-by: Tony Luck 
> > Signed-off-by: Naoya Horiguchi 
> > ---
> >  mm/memory-failure.c | 19 +--
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> > 
> > diff --git v5.12-rc5/mm/memory-failure.c 
> > v5.12-rc5_patched/mm/memory-failure.c
> > index 24210c9bd843..c1509f4b565e 100644
> > --- v5.12-rc5/mm/memory-failure.c
> > +++ v5.12-rc5_patched/mm/memory-failure.c
> > @@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long 
> > pfn, int flags,
> > return rc;
> >  }
> >  
> > +static DEFINE_MUTEX(mf_mutex);
> > +
> >  /**
> >   * memory_failure - Handle memory failure of a page.
> >   * @pfn: Page Number of the corrupted page
> > @@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
> > return -ENXIO;
> > }
> 
> So the locking patterns are done in two different ways, which are
> confusing when following this code:
>
> > +   mutex_lock(_mutex);
> > +
> >  try_again:
> > -   if (PageHuge(p))
> > -   return memory_failure_hugetlb(pfn, flags);
> > +   if (PageHuge(p)) {
> > +   res = memory_failure_hugetlb(pfn, flags);
> > +   goto out2;
> > +   }
> 
> You have the goto to a label where you do the unlocking (btw, pls do
> s/out2/out_unlock/g;)...
> 
> > +
> > if (TestSetPageHWPoison(p)) {
> > pr_err("Memory failure: %#lx: already hardware poisoned\n",
> > pfn);
> > +   mutex_unlock(_mutex);
> > return 0;
> 
> ... and you have the other case where you unlock before returning.
> 
> Since you've added the label, I think *all* the unlocking should do
> "goto out_unlock" instead of doing either/or.

Make sense, so I updated the patch like below.  The existing label "out"
could be renamed in the same manner, so I replaced it with "unlock_page"
and "out2" with "unlock_mutex". I thought of using "out_unlock_{page,mutex}"
but maybe it's clear enough without "out_".

If you have any other suggestion, please let me know.

Thanks,
Naoya Horiguchi

---
From 19dbd0e9cd2c7febf67370ac9957bdde83084316 Mon Sep 17 00:00:00 2001
From: Tony Luck 
Date: Tue, 20 Apr 2021 16:42:01 +0900
Subject: [PATCH 1/3] mm/memory-failure: Use a mutex to avoid memory_failure()
 races

There can be races when multiple CPUs consume poison from the same
page. The first into memory_failure() atomically sets the HWPoison
page flag and begins hunting for tasks that map this page. Eventually
it invalidates those mappings and may send a SIGBUS to the affected
tasks.

But while all that work is going on, other CPUs see a "success"
return code from memory_failure() and so they believe the error
has been handled and continue executing.

Fix by wrapping most of the internal parts of memory_failure() in
a mutex.

Along with introducing an additional goto label, this patch also
aligns the returning code with "goto out" pattern and renames
the existing label "out' with clearer one to make code clearer.

Signed-off-by: Tony Luck 
Signed-off-by: Naoya Horiguchi 
---
 mm/memory-failure.c | 37 -
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 24210c9bd843..4087308e4b32 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long pfn, 
int flags,
return rc;
 }
 
+static DEFINE_MUTEX(mf_mutex);
+
 /**
  * memory_failure - Handle memory failure of a page.
  * @pfn: Page Number of the corrupted page
@@ -1404,7 +1406,7 @@ int memory_failure(unsigned long pfn, int flags)
struct page *hpage;
struct page *orig_head;
struct dev_pagemap *pgmap;
-   int res;
+   int res = 0;
unsigned long page_flags;
bool retry = true;
 
@@ -1424,13 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
return -ENXIO;
}
 
+   mutex_lock(_mutex);
+
 try_again:
-   if (PageHuge(p))
-   return memory_failure_hugetlb(pfn, flags);
+   if (PageHuge(p)) {
+   res = memory_failure_hugetlb(pfn, flags);
+   goto unlock_mutex;
+   }
+
if (TestSetPageHWPoison(p)) {
pr_err("Memory failure: 

Re: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-04-19 Thread Borislav Petkov
On Tue, Apr 13, 2021 at 07:43:18AM +0900, Naoya Horiguchi wrote:
> From: Tony Luck 
> 
> There can be races when multiple CPUs consume poison from the same
> page. The first into memory_failure() atomically sets the HWPoison
> page flag and begins hunting for tasks that map this page. Eventually
> it invalidates those mappings and may send a SIGBUS to the affected
> tasks.
> 
> But while all that work is going on, other CPUs see a "success"
> return code from memory_failure() and so they believe the error
> has been handled and continue executing.
> 
> Fix by wrapping most of the internal parts of memory_failure() in
> a mutex.
> 
> Signed-off-by: Tony Luck 
> Signed-off-by: Naoya Horiguchi 
> ---
>  mm/memory-failure.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c
> index 24210c9bd843..c1509f4b565e 100644
> --- v5.12-rc5/mm/memory-failure.c
> +++ v5.12-rc5_patched/mm/memory-failure.c
> @@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long 
> pfn, int flags,
>   return rc;
>  }
>  
> +static DEFINE_MUTEX(mf_mutex);
> +
>  /**
>   * memory_failure - Handle memory failure of a page.
>   * @pfn: Page Number of the corrupted page
> @@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
>   return -ENXIO;
>   }

So the locking patterns are done in two different ways, which are
confusing when following this code:

> + mutex_lock(_mutex);
> +
>  try_again:
> - if (PageHuge(p))
> - return memory_failure_hugetlb(pfn, flags);
> + if (PageHuge(p)) {
> + res = memory_failure_hugetlb(pfn, flags);
> + goto out2;
> + }

You have the goto to a label where you do the unlocking (btw, pls do
s/out2/out_unlock/g;)...

> +
>   if (TestSetPageHWPoison(p)) {
>   pr_err("Memory failure: %#lx: already hardware poisoned\n",
>   pfn);
> + mutex_unlock(_mutex);
>   return 0;

... and you have the other case where you unlock before returning.

Since you've added the label, I think *all* the unlocking should do
"goto out_unlock" instead of doing either/or.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

2021-04-12 Thread Naoya Horiguchi
From: Tony Luck 

There can be races when multiple CPUs consume poison from the same
page. The first into memory_failure() atomically sets the HWPoison
page flag and begins hunting for tasks that map this page. Eventually
it invalidates those mappings and may send a SIGBUS to the affected
tasks.

But while all that work is going on, other CPUs see a "success"
return code from memory_failure() and so they believe the error
has been handled and continue executing.

Fix by wrapping most of the internal parts of memory_failure() in
a mutex.

Signed-off-by: Tony Luck 
Signed-off-by: Naoya Horiguchi 
---
 mm/memory-failure.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c
index 24210c9bd843..c1509f4b565e 100644
--- v5.12-rc5/mm/memory-failure.c
+++ v5.12-rc5_patched/mm/memory-failure.c
@@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long pfn, 
int flags,
return rc;
 }
 
+static DEFINE_MUTEX(mf_mutex);
+
 /**
  * memory_failure - Handle memory failure of a page.
  * @pfn: Page Number of the corrupted page
@@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags)
return -ENXIO;
}
 
+   mutex_lock(_mutex);
+
 try_again:
-   if (PageHuge(p))
-   return memory_failure_hugetlb(pfn, flags);
+   if (PageHuge(p)) {
+   res = memory_failure_hugetlb(pfn, flags);
+   goto out2;
+   }
+
if (TestSetPageHWPoison(p)) {
pr_err("Memory failure: %#lx: already hardware poisoned\n",
pfn);
+   mutex_unlock(_mutex);
return 0;
}
 
@@ -1463,9 +1471,11 @@ int memory_failure(unsigned long pfn, int flags)
res = MF_FAILED;
}
action_result(pfn, MF_MSG_BUDDY, res);
+   mutex_unlock(_mutex);
return res == MF_RECOVERED ? 0 : -EBUSY;
} else {
action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, 
MF_IGNORED);
+   mutex_unlock(_mutex);
return -EBUSY;
}
}
@@ -1473,6 +1483,7 @@ int memory_failure(unsigned long pfn, int flags)
if (PageTransHuge(hpage)) {
if (try_to_split_thp_page(p, "Memory Failure") < 0) {
action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
+   mutex_unlock(_mutex);
return -EBUSY;
}
VM_BUG_ON_PAGE(!page_count(p), p);
@@ -1517,6 +1528,7 @@ int memory_failure(unsigned long pfn, int flags)
num_poisoned_pages_dec();
unlock_page(p);
put_page(p);
+   mutex_unlock(_mutex);
return 0;
}
if (hwpoison_filter(p)) {
@@ -1524,6 +1536,7 @@ int memory_failure(unsigned long pfn, int flags)
num_poisoned_pages_dec();
unlock_page(p);
put_page(p);
+   mutex_unlock(_mutex);
return 0;
}
 
@@ -1559,6 +1572,8 @@ int memory_failure(unsigned long pfn, int flags)
res = identify_page_state(pfn, p, page_flags);
 out:
unlock_page(p);
+out2:
+   mutex_unlock(_mutex);
return res;
 }
 EXPORT_SYMBOL_GPL(memory_failure);
-- 
2.25.1