Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-21 Thread Guenter Roeck
On Mon, Aug 20, 2018 at 01:18:43PM -0700, Andi Kleen wrote:
> On Mon, Aug 20, 2018 at 11:03:53AM -0700, Andi Kleen wrote:
> > On Mon, Aug 20, 2018 at 06:29:38PM +0200, Michal Hocko wrote:
> > > On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> > > > Hi,
> > > > 
> > > > the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> > > > with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.
> > > 
> > > Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
> > > macros"). I do not see it in stable 4.4 tree and it has been introduced
> > > much later in 4.14. This one gave us quite some headache because it is
> > > so easy to overlook.
> > 
> > Good catch!
> > 
> > I tested that with 4.9 and backporting the patch indeed fixes the
> > syzkaller test case running in a KVM VM. Backported patch appended.
> 
> Tested on 4.4 too and it fixes the syzkaller test case there too.
> 

Confirmed that the problem is fixed in v4.4.151-rc1 and v4.9.123-rc1.

Thanks a lot for tracking this down and backporting the fix!

Guenter


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-20 Thread Andi Kleen
On Mon, Aug 20, 2018 at 11:03:53AM -0700, Andi Kleen wrote:
> On Mon, Aug 20, 2018 at 06:29:38PM +0200, Michal Hocko wrote:
> > On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> > > with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.
> > 
> > Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
> > macros"). I do not see it in stable 4.4 tree and it has been introduced
> > much later in 4.14. This one gave us quite some headache because it is
> > so easy to overlook.
> 
> Good catch!
> 
> I tested that with 4.9 and backporting the patch indeed fixes the
> syzkaller test case running in a KVM VM. Backported patch appended.

Tested on 4.4 too and it fixes the syzkaller test case there too.

-Andi


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-20 Thread Michal Hocko
On Mon 20-08-18 11:03:53, Andi Kleen wrote:
> On Mon, Aug 20, 2018 at 06:29:38PM +0200, Michal Hocko wrote:
> > On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> > > with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.
> > 
> > Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
> > macros"). I do not see it in stable 4.4 tree and it has been introduced
> > much later in 4.14. This one gave us quite some headache because it is
> > so easy to overlook.
> 
> Good catch!
> 
> I tested that with 4.9 and backporting the patch indeed fixes the
> syzkaller test case running in a KVM VM. Backported patch appended.
> 
> Should probably go into 4.4 and 4.9.
> 
> Cannot explain the 4.17 report unfortunately.

I haven't seen that one yet and likely won't get to it tomorrow either,
but I would start by looking for a direct pte_val usage. We have had some
out-of-tree Xen code which was doing exactly this. It is not really easy to
find by code inspection.
-- 
Michal Hocko
SUSE Labs


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-20 Thread Andi Kleen
On Mon, Aug 20, 2018 at 06:29:38PM +0200, Michal Hocko wrote:
> On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> > Hi,
> > 
> > the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> > with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.
> 
> Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
> macros"). I do not see it in stable 4.4 tree and it has been introduced
> much later in 4.14. This one gave us quite some headache because it is
> so easy to overlook.

Good catch!

I tested that with 4.9 and backporting the patch indeed fixes the
syzkaller test case running in a KVM VM. Backported patch appended.

Should probably go into 4.4 and 4.9.

Cannot explain the 4.17 report unfortunately.

I'll resend it as an email too

---

x86/mm: Simplify p[g4um]d_page() macros

Create a pgd_pfn() macro similar to the p[4um]d_pfn() macros and then
use the p[g4um]d_pfn() macros in the p[g4um]d_page() macros instead of
duplicating the code.

[Needed to fix crashes caused by earlier backports in 4.9 stable, likely 4.4 too]

Signed-off-by: Tom Lendacky 
Reviewed-by: Thomas Gleixner 
Reviewed-by: Borislav Petkov 
Cc: Alexander Potapenko 
Cc: Andrey Ryabinin 
Cc: Andy Lutomirski 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Brijesh Singh 
Cc: Dave Young 
Cc: Dmitry Vyukov 
Cc: Jonathan Corbet 
Cc: Konrad Rzeszutek Wilk 
Cc: Larry Woodman 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Peter Zijlstra 
Cc: Radim Krčmář 
Cc: Rik van Riel 
Cc: Toshimitsu Kani 
Cc: kasan-...@googlegroups.com
Cc: k...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/e61eb533a6d0aac941db2723d8aa63ef6b882dee.1500319216.git.thomas.lenda...@amd.com
[Backported to 4.9 stable by AK, suggested by Michal Hocko]
Signed-off-by: Ingo Molnar 
Signed-off-by: Andi Kleen 

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 4de6c282c02a..68a55273ce0f 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -173,6 +173,11 @@ static inline unsigned long pud_pfn(pud_t pud)
    return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pgd_pfn(pgd_t pgd)
+{
+   return (pgd_val(pgd) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)  pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
@@ -578,8 +583,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
  * Currently stuck as a macro due to indirect forward reference to
  * linux/mmzone.h's __section_mem_map_addr() definition:
  */
-#define pmd_page(pmd)  \
-   pfn_to_page((pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT)
+#define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
 
 /*
  * the pmd page can be thought of an array like this: pmd_t[PTRS_PER_PMD]
@@ -647,8 +651,7 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
  * Currently stuck as a macro due to indirect forward reference to
  * linux/mmzone.h's __section_mem_map_addr() definition:
  */
-#define pud_page(pud)  \
-   pfn_to_page((pud_val(pud) & pud_pfn_mask(pud)) >> PAGE_SHIFT)
+#define pud_page(pud)  pfn_to_page(pud_pfn(pud))
 
 /* Find an entry in the second-level page table.. */
 static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
@@ -688,7 +691,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
  * Currently stuck as a macro due to indirect forward reference to
  * linux/mmzone.h's __section_mem_map_addr() definition:
  */
-#define pgd_page(pgd)  pfn_to_page(pgd_val(pgd) >> PAGE_SHIFT)
+#define pgd_page(pgd)  pfn_to_page(pgd_pfn(pgd))
 
 /* to find an entry in a page-table-directory. */
 static inline unsigned long pud_index(unsigned long address)


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-20 Thread Michal Hocko
On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> Hi,
> 
> the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.

Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
macros"). I do not see it in stable 4.4 tree and it has been introduced
much later in 4.14. This one gave us quite some headache because it is
so easy to overlook.
-- 
Michal Hocko
SUSE Labs


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-17 Thread Andi Kleen
> Plus I'd have expected the problem to have been in mainline too, and
> apparently it's just the 4.4 and 4.9 backports.

There's another problem in 4.17, but not 4.18, see 
https://bugzilla.redhat.com/show_bug.cgi?id=1618792

Could be the same or different.

-Andi

> 
> Your test-case does have mprotect with PROT_NONE. Which together with
> that mask that *might* be PHYSICAL_PMD_PAGE_MASK makes me think it
> might be related.
> 
>  Linus


Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-17 Thread Guenter Roeck

On 08/17/2018 05:25 PM, Linus Torvalds wrote:
> On Fri, Aug 17, 2018 at 3:27 PM Guenter Roeck  wrote:
> >
> > [6.649970] random: crng init done
> > [6.689002] BUG: unable to handle kernel paging request at eafffa1a0020
>
> Hmm. Lots of bits set.
>
> > [6.689082] RIP: 0010:[]  [] page_remove_rmap+0x10/0x230
> > [6.689082] RSP: 0018:c97abc18  EFLAGS: 0296
> > [6.689082] RAX: ea0005e58000 RBX: eafffa1a RCX: 2020
> > [6.689082] RDX: 3fe0 RSI: 0001 RDI: eafffa1a
>
> Is that RDX value the same value as PHYSICAL_PMD_PAGE_MASK?
>
> If I did my math right, it would be, if your CPU has 46 bits of
> physical memory. Might that be the case?

Yes.

> The reason I mention that is because we had the bug with spurious
> inversion of the zero pte/pmd, fixed by
>
>    f19f5c49bbc3 ("x86/speculation/l1tf: Exempt zeroed PTEs from inversion")

I applied that patch, but it didn't help. I get exactly the same crash and
traceback.

> and that would make a zeroed pmd entry be inverted by
> PHYSICAL_PMD_PAGE_MASK, and then you get odd garbage page pointers
> etc.
>
> Maybe. I could have gotten the math wrong too, but it sounds like the
> register contents _potentially_ might match up with something like
> this, and then we'd zap a bogus hugepage because of some confusion.
>
> Although then I'd have expected the bisection to hit
> "x86/speculation/l1tf: Invert all not present mappings" instead of the
> one you hit, so I don't know.
>
> Plus I'd have expected the problem to have been in mainline too, and
> apparently it's just the 4.4 and 4.9 backports.

Personally I suspect that something went wrong or is missing in the backport
from 4.14 to 4.9. 5-level paging was introduced in between, and thp support
was extended to support additional architectures. With all those changes,
it is easy to miss something. Only I have no idea what that might be.

Guenter



Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

2018-08-17 Thread Linus Torvalds
On Fri, Aug 17, 2018 at 3:27 PM Guenter Roeck  wrote:
>
> [6.649970] random: crng init done
> [6.689002] BUG: unable to handle kernel paging request at eafffa1a0020

Hmm. Lots of bits set.

> [6.689082] RIP: 0010:[]  [] page_remove_rmap+0x10/0x230
> [6.689082] RSP: 0018:c97abc18  EFLAGS: 0296
> [6.689082] RAX: ea0005e58000 RBX: eafffa1a RCX: 2020
> [6.689082] RDX: 3fe0 RSI: 0001 RDI: eafffa1a

Is that RDX value the same value as PHYSICAL_PMD_PAGE_MASK?

If I did my math right, it would be, if your CPU has 46 bits of
physical memory. Might that be the case?

The reason I mention that is because we had the bug with spurious
inversion of the zero pte/pmd, fixed by

  f19f5c49bbc3 ("x86/speculation/l1tf: Exempt zeroed PTEs from inversion")

and that would make a zeroed pmd entry be inverted by
PHYSICAL_PMD_PAGE_MASK, and then you get odd garbage page pointers
etc.

Maybe. I could have gotten the math wrong too, but it sounds like the
register contents _potentially_ might match up with something like
this, and then we'd zap a bogus hugepage because of some confusion.

Although then I'd have expected the bisection to hit
"x86/speculation/l1tf: Invert all not present mappings" instead of the
one you hit, so I don't know.

Plus I'd have expected the problem to have been in mainline too, and
apparently it's just the 4.4 and 4.9 backports.

Your test-case does have mprotect with PROT_NONE. Which together with
that mask that *might* be PHYSICAL_PMD_PAGE_MASK makes me think it
might be related.

 Linus

