Re: Bug with report THP eligibility for each vma

2018-12-24 Thread Michal Hocko
On Mon 24-12-18 14:12:51, Mike Rapoport wrote:
> On Mon, Dec 24, 2018 at 08:49:16AM +0100, Michal Hocko wrote:
> > [Cc-ing mailing list and people involved in the original patch]
> > 
> > On Fri 21-12-18 13:42:24, Paul Oppenheimer wrote:
> > > Hello! I've never reported a kernel bug before, and since its on the
> > > "next" tree I was told to email the author of the relevant commit.
> > > Please redirect me to the correct place if I've made a mistake.
> > > 
> > > When opening firefox or chrome, and using it for a good 7 seconds, it
> > > hangs in "uninterruptible sleep" and I recieve a "BUG" in dmesg. This
> > > doesn't occur when reverting this commit:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=48cf516f8c.
> > > Ive attached the output of decode_stacktrace.sh and the relevant dmesg
> > > log to this email.
> > > 
> > > Thanks
> > 
> > > BUG: unable to handle kernel NULL pointer dereference at 00e8
> > 
> > Thanks for the bug report! This is offset 232 and that matches
> > file->f_mapping as per pahole
> > pahole -C file ./vmlinux | grep f_mapping
> > struct address_space * f_mapping;/*   232 8 */
> > 
> > I thought that each file really has to have a mapping. But the following
> > should heal the issue and add an extra care.
> > 
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index f64733c23067..fc9d70a9fbd1 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -66,6 +66,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct 
> > *vma)
> >  {
> > if (vma_is_anonymous(vma))
> > return __transparent_hugepage_enabled(vma);
> > +   if (!vma->vm_file || !vma->vm_file->f_mapping)
> > +   return false;
> > if (shmem_mapping(vma->vm_file->f_mapping) && shmem_huge_enabled(vma))
> > return __transparent_hugepage_enabled(vma);
> 
> We have vma_is_shmem(), it can be used to replace shmem_mapping() without
> adding the check for !vma->vm_file

Yes, this looks like a much better choice. Thanks! Andrew, could you
fold this in instead.

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f64733c23067..e093cf5e4640 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -66,7 +66,7 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
if (vma_is_anonymous(vma))
return __transparent_hugepage_enabled(vma);
-   if (shmem_mapping(vma->vm_file->f_mapping) && shmem_huge_enabled(vma))
+   if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
return __transparent_hugepage_enabled(vma);
 
return false;
-- 
Michal Hocko
SUSE Labs


Re: Bug with report THP eligibility for each vma

2018-12-24 Thread Mike Rapoport
On Mon, Dec 24, 2018 at 08:49:16AM +0100, Michal Hocko wrote:
> [Cc-ing mailing list and people involved in the original patch]
> 
> On Fri 21-12-18 13:42:24, Paul Oppenheimer wrote:
> > Hello! I've never reported a kernel bug before, and since its on the
> > "next" tree I was told to email the author of the relevant commit.
> > Please redirect me to the correct place if I've made a mistake.
> > 
> > When opening firefox or chrome, and using it for a good 7 seconds, it
> > hangs in "uninterruptible sleep" and I recieve a "BUG" in dmesg. This
> > doesn't occur when reverting this commit:
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=48cf516f8c.
> > Ive attached the output of decode_stacktrace.sh and the relevant dmesg
> > log to this email.
> > 
> > Thanks
> 
> > BUG: unable to handle kernel NULL pointer dereference at 00e8
> 
> Thanks for the bug report! This is offset 232 and that matches
> file->f_mapping as per pahole
> pahole -C file ./vmlinux | grep f_mapping
> struct address_space * f_mapping;/*   232 8 */
> 
> I thought that each file really has to have a mapping. But the following
> should heal the issue and add an extra care.
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f64733c23067..fc9d70a9fbd1 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -66,6 +66,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct 
> *vma)
>  {
>   if (vma_is_anonymous(vma))
>   return __transparent_hugepage_enabled(vma);
> + if (!vma->vm_file || !vma->vm_file->f_mapping)
> + return false;
>   if (shmem_mapping(vma->vm_file->f_mapping) && shmem_huge_enabled(vma))
>   return __transparent_hugepage_enabled(vma);

We have vma_is_shmem(), it can be used to replace shmem_mapping() without
adding the check for !vma->vm_file

>  
> Andrew, could you fold it to the original patch please?
> 
> Keeping the rest for the reference.
> 
> > #PF error: [normal kernel read fault]
> > PGD 0 P4D 0
> > Oops:  [#1] PREEMPT SMP PTI
> > CPU: 7 PID: 2687 Comm: StreamTrans #56 Tainted: G U
> > 4.20.0-rc7-next-20181221-beppy+ #15
> > Hardware name: Dell Inc. XPS 13 9360/0TPN17, BIOS 2.10.0 09/27/2018
> > RIP: 0010:transparent_hugepage_enabled (??:?) 
> > Code: 17 fd 00 e9 20 ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 
> > 89 fb 48 83 bf 90 00 00 00 00 74 27 48 8b 87 a0 00 00 00 <48> 8b b8 e8 00 
> > 00 00 e8 7a cc fa ff 84 c0 75 04 31 c0 5b c3 48 89
> > All code
> > 
> >0:   17  (bad)  
> >1:   fd  std
> >2:   00 e9   add%ch,%cl
> >4:   20 ff   and%bh,%bh
> >6:   ff  (bad)  
> >7:   ff 0f   decl   (%rdi)
> >9:   1f  (bad)  
> >a:   84 00   test   %al,(%rax)
> >c:   00 00   add%al,(%rax)
> >e:   00 00   add%al,(%rax)
> >   10:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> >   15:   53  push   %rbx
> >   16:   48 89 fbmov%rdi,%rbx
> >   19:   48 83 bf 90 00 00 00cmpq   $0x0,0x90(%rdi)
> >   20:   00 
> >   21:   74 27   je 0x4a
> >   23:   48 8b 87 a0 00 00 00mov0xa0(%rdi),%rax
> >   2a:*  48 8b b8 e8 00 00 00mov0xe8(%rax),%rdi  <-- 
> > trapping instruction
> >   31:   e8 7a cc fa ff  callq  0xfffaccb0
> >   36:   84 c0   test   %al,%al
> >   38:   75 04   jne0x3e
> >   3a:   31 c0   xor%eax,%eax
> >   3c:   5b  pop%rbx
> >   3d:   c3  retq   
> >   3e:   48  rex.W
> >   3f:   89  .byte 0x89
> > 
> > Code starting with the faulting instruction
> > ===
> >0:   48 8b b8 e8 00 00 00mov0xe8(%rax),%rdi
> >7:   e8 7a cc fa ff  callq  0xfffacc86
> >c:   84 c0   test   %al,%al
> >e:   75 04   jne0x14
> >   10:   31 c0   xor%eax,%eax
> >   12:   5b  pop%rbx
> >   13:   c3  retq   
> >   14:   48  rex.W
> >   15:   89  .byte 0x89
> > RSP: 0018:b79744f17d28 EFLAGS: 00010282
> > RAX:  RBX: 8948c17aff00 RCX: 
> > RDX: 0004 RSI: ab1165ba RDI: 8948c17aff00
> > RBP: 8948c17aff00 R08: 0007 R09: 894927e547b2
> > R10:  R11: 894927e549da R12: b79744f17d38
> > R13: 8948c17aff00 R14: 89489bef9400 R15: 89488b775a80
> > FS:  

Re: Bug with report THP eligibility for each vma

2018-12-24 Thread William Kucharski



> On Dec 24, 2018, at 12:49 AM, Michal Hocko  wrote:
> 
> [Cc-ing mailing list and people involved in the original patch]
> 
> On Fri 21-12-18 13:42:24, Paul Oppenheimer wrote:
>> Hello! I've never reported a kernel bug before, and since its on the
>> "next" tree I was told to email the author of the relevant commit.
>> Please redirect me to the correct place if I've made a mistake.
>> 
>> When opening firefox or chrome, and using it for a good 7 seconds, it
>> hangs in "uninterruptible sleep" and I recieve a "BUG" in dmesg. This
>> doesn't occur when reverting this commit:
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=48cf516f8c.
>> Ive attached the output of decode_stacktrace.sh and the relevant dmesg
>> log to this email.
>> 
>> Thanks
> 
>> BUG: unable to handle kernel NULL pointer dereference at 00e8
> 
> Thanks for the bug report! This is offset 232 and that matches
> file->f_mapping as per pahole
> pahole -C file ./vmlinux | grep f_mapping
>struct address_space * f_mapping;/*   232 8 */
> 
> I thought that each file really has to have a mapping. But the following
> should heal the issue and add an extra care.
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f64733c23067..fc9d70a9fbd1 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -66,6 +66,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct 
> *vma)
> {
>   if (vma_is_anonymous(vma))
>   return __transparent_hugepage_enabled(vma);
> + if (!vma->vm_file || !vma->vm_file->f_mapping)
> + return false;
>   if (shmem_mapping(vma->vm_file->f_mapping) && shmem_huge_enabled(vma))
>   return __transparent_hugepage_enabled(vma);

From what I see in code in mm/mmap.c, it seems if vma->vm_file is non-zero
vma->vm_file->f_mapping may be assumed to be non-NULL; see unlink_file_vma()
and __vma_link_file() for two examples, which both use the construct:

file = vma->vm_file;
if (file) {
struct address_space *mapping = file->f_mapping;

[ ... ]

[ code that dereferences "mapping" without further checks ]
}

I see nothing wrong with your second check but a few extra instructions
performed, but depending upon how often transparent_hugepage_enabled() is called
there may be at least theoretical performance concerns.

William Kucharski
william.kuchar...@oracle.com



Re: Bug with report THP eligibility for each vma

2018-12-23 Thread Michal Hocko
[Cc-ing mailing list and people involved in the original patch]

On Fri 21-12-18 13:42:24, Paul Oppenheimer wrote:
> Hello! I've never reported a kernel bug before, and since its on the
> "next" tree I was told to email the author of the relevant commit.
> Please redirect me to the correct place if I've made a mistake.
> 
> When opening firefox or chrome, and using it for a good 7 seconds, it
> hangs in "uninterruptible sleep" and I recieve a "BUG" in dmesg. This
> doesn't occur when reverting this commit:
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=48cf516f8c.
> Ive attached the output of decode_stacktrace.sh and the relevant dmesg
> log to this email.
> 
> Thanks

> BUG: unable to handle kernel NULL pointer dereference at 00e8

Thanks for the bug report! This is offset 232 and that matches
file->f_mapping as per pahole
pahole -C file ./vmlinux | grep f_mapping
struct address_space * f_mapping;/*   232 8 */

I thought that each file really has to have a mapping. But the following
should heal the issue and add an extra care.

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f64733c23067..fc9d70a9fbd1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -66,6 +66,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
if (vma_is_anonymous(vma))
return __transparent_hugepage_enabled(vma);
+   if (!vma->vm_file || !vma->vm_file->f_mapping)
+   return false;
if (shmem_mapping(vma->vm_file->f_mapping) && shmem_huge_enabled(vma))
return __transparent_hugepage_enabled(vma);
 
Andrew, could you fold it to the original patch please?

Keeping the rest for the reference.

> #PF error: [normal kernel read fault]
> PGD 0 P4D 0
> Oops:  [#1] PREEMPT SMP PTI
> CPU: 7 PID: 2687 Comm: StreamTrans #56 Tainted: G U
> 4.20.0-rc7-next-20181221-beppy+ #15
> Hardware name: Dell Inc. XPS 13 9360/0TPN17, BIOS 2.10.0 09/27/2018
> RIP: 0010:transparent_hugepage_enabled (??:?) 
> Code: 17 fd 00 e9 20 ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 89 
> fb 48 83 bf 90 00 00 00 00 74 27 48 8b 87 a0 00 00 00 <48> 8b b8 e8 00 00 00 
> e8 7a cc fa ff 84 c0 75 04 31 c0 5b c3 48 89
> All code
> 
>0: 17  (bad)  
>1: fd  std
>2: 00 e9   add%ch,%cl
>4: 20 ff   and%bh,%bh
>6: ff  (bad)  
>7: ff 0f   decl   (%rdi)
>9: 1f  (bad)  
>a: 84 00   test   %al,(%rax)
>c: 00 00   add%al,(%rax)
>e: 00 00   add%al,(%rax)
>   10: 0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
>   15: 53  push   %rbx
>   16: 48 89 fbmov%rdi,%rbx
>   19: 48 83 bf 90 00 00 00cmpq   $0x0,0x90(%rdi)
>   20: 00 
>   21: 74 27   je 0x4a
>   23: 48 8b 87 a0 00 00 00mov0xa0(%rdi),%rax
>   2a:*48 8b b8 e8 00 00 00mov0xe8(%rax),%rdi  <-- 
> trapping instruction
>   31: e8 7a cc fa ff  callq  0xfffaccb0
>   36: 84 c0   test   %al,%al
>   38: 75 04   jne0x3e
>   3a: 31 c0   xor%eax,%eax
>   3c: 5b  pop%rbx
>   3d: c3  retq   
>   3e: 48  rex.W
>   3f: 89  .byte 0x89
> 
> Code starting with the faulting instruction
> ===
>0: 48 8b b8 e8 00 00 00mov0xe8(%rax),%rdi
>7: e8 7a cc fa ff  callq  0xfffacc86
>c: 84 c0   test   %al,%al
>e: 75 04   jne0x14
>   10: 31 c0   xor%eax,%eax
>   12: 5b  pop%rbx
>   13: c3  retq   
>   14: 48  rex.W
>   15: 89  .byte 0x89
> RSP: 0018:b79744f17d28 EFLAGS: 00010282
> RAX:  RBX: 8948c17aff00 RCX: 
> RDX: 0004 RSI: ab1165ba RDI: 8948c17aff00
> RBP: 8948c17aff00 R08: 0007 R09: 894927e547b2
> R10:  R11: 894927e549da R12: b79744f17d38
> R13: 8948c17aff00 R14: 89489bef9400 R15: 89488b775a80
> FS:  7fa54ad43700() GS:8949363c() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 00e8 CR3: 00025c0ee003 CR4: 003606e0
> Call Trace:
> show_smap (/home/bep/.opt/kernel/linux-e/fs/proc/task_mmu.c:805) 
> seq_read (/home/bep/.opt/kernel/linux-e/fs/seq_file.c:269) 
> __vfs_read (/home/bep/.opt/kernel/linux-e/fs/read_write.c:421) 
> vfs_read (/home/bep/.opt/kernel/linux-e/fs/read_write.c:452 
> /home/bep/.opt/kernel/linux-e/fs/read_write.c:437) 
> ksys_read