RE: Boot failure caused by "mm/cma.c: free the reserved memblock when free cma pages"

2014-09-23 Thread Wang, Yalin
Hi,

Very strange. I tested it on ARM, and it works well there.

I think reverting it is OK. I had a discussion with Russell, and his
concern makes sense, so removing this patch is fine:
https://lkml.org/lkml/2014/9/18/171

The crash is interesting, though. I will have a look.
Sorry about that.

-----Original Message-----
From: Sasha Levin [mailto:sasha.le...@oracle.com] 
Sent: Wednesday, September 24, 2014 7:03 AM
To: Wang, Yalin; Minchan Kim
Cc: Michal Hocko; Hugh Dickins; Joonsoo Kim; Andrew Morton; LKML; 
linux...@kvack.org
Subject: Boot failure caused by "mm/cma.c: free the reserved memblock when free 
cma pages"

Hi Yalin,

I'm seeing the following BUG when booting the latest -next kernel. I've 
bisected it down to "mm/cma.c: free the reserved memblock when free cma pages".

[2.438701] BUG: unable to handle kernel paging request at 880972493000
[2.438701] IP: memblock_isolate_range (mm/memblock.c:624)
[2.438701] PGD 34b51067 PUD 34b54067 PMD 976c56067 PTE 800972493060
[2.438701] Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
[2.438701] Dumping ftrace buffer:
[2.438701]    (ftrace buffer empty)
[2.438701] Modules linked in:
[2.438701] CPU: 17 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc6-next-20140923-sasha-00037-gc40eca4 #1213
[2.438701] task: 88076d7d ti: 880048d4 task.ti: 880048d4
[2.438701] RIP: memblock_isolate_range (mm/memblock.c:624)
[2.438701] RSP: :880048d43cf8  EFLAGS: 00010286
[2.438828] RAX: 880972493000 RBX: 00096260 RCX: 880048d43d50
[2.439590] RDX: 0020 RSI: 00096240 RDI: b2fcaa30
[2.44] RBP: 880048d43d38 R08: 880048d43d54 R09: 0001
[2.44] R10:  R11: 0001 R12: 880048d43d54
[2.44] R13: 00096240 R14:  R15: b2fcaa30
[2.44] FS:  () GS:880567c0() 
knlGS:
[2.44] CS:  0010 DS:  ES:  CR0: 8005003b
[2.44] CR2: 880972493000 CR3: 31e2f000 CR4: 06a0
[2.44] Stack:
[2.44]  ad29cda2 880048d43d50 ea002eeb4000 
00096240
[2.44]  b2fcaa30  a000 
ea002eeb4008
[2.44]  880048d43d68 b049e64a 00096240 

[2.44] Call Trace:
[2.44] ? adjust_managed_page_count (mm/page_alloc.c:5430)
[2.44] memblock_remove_range (mm/memblock.c:672)
[2.44] memblock_free (mm/memblock.c:695)
[2.44] init_cma_reserved_pageblock (mm/page_alloc.c:840)
[2.44] cma_init_reserved_areas (mm/cma.c:118 mm/cma.c:133)
[2.44] ? kfree (mm/slub.c:2674 mm/slub.c:3339)
[2.44] ? early_memunmap (mm/cma.c:129)
[2.44] do_one_initcall (init/main.c:792)
[2.44] kernel_init_freeable (init/main.c:857 init/main.c:865 init/main.c:884 init/main.c:1005)
[2.44] ? rest_init (init/main.c:932)
[2.44] kernel_init (init/main.c:937)
[2.44] ret_from_fork (arch/x86/kernel/entry_64.S:348)
[2.44] ? rest_init (init/main.c:932)
[2.44] Code: 89 ff e8 ec fa ff ff 85 c0 79 e1 b8 f4 ff ff ff e9 c0 00 00 
00 4c 01 eb 45 31 f6 49 63 c6 49 3b 07 73 b9 48 c1 e0 05 49 03 47 18 <48> 8b 10 
48 8b 48 08 48 8d 34 11 48 39 d3 76 a1 49 39 f5 0f 83

All code
   0:   89 ff                   mov    %edi,%edi
   2:   e8 ec fa ff ff          callq  0xfaf3
   7:   85 c0                   test   %eax,%eax
   9:   79 e1                   jns    0xffec
   b:   b8 f4 ff ff ff          mov    $0xfff4,%eax
  10:   e9 c0 00 00 00          jmpq   0xd5
  15:   4c 01 eb                add    %r13,%rbx
  18:   45 31 f6                xor    %r14d,%r14d
  1b:   49 63 c6                movslq %r14d,%rax
  1e:   49 3b 07                cmp    (%r15),%rax
  21:   73 b9                   jae    0xffdc
  23:   48 c1 e0 05             shl    $0x5,%rax
  27:   49 03 47 18             add    0x18(%r15),%rax
  2b:*  48 8b 10                mov    (%rax),%rdx      <-- trapping instruction
  2e:   48 8b 48 08             mov    0x8(%rax),%rcx
  32:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
  36:   48 39 d3                cmp    %rdx,%rbx
  39:   76 a1                   jbe    0xffdc
  3b:   49 39 f5                cmp    %rsi,%r13
  3e:   0f                      .byte 0xf
  3f:   83                      .byte 0x83
...

Code starting with the faulting instruction 
===
   0:   48 8b 10                mov    (%rax),%rdx
   3:   48 8b 48 08             mov    0x8(%rax),%rcx
   7:   48 8d 34 11             lea    (%rcx,%rdx,1),%rsi
   b:   48 39 d3                cmp    %rdx,%rbx
   e:   76 a1                   jbe    0xffb1
  10:   49 39 f5                cmp    %rsi,%r13
  13:   0f                      .byte 0xf
  14:   83
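
For readers following the disassembly: the trapping instruction appears to be the load of a region's base address inside the loop that walks the region array in memblock_isolate_range. The index is shifted left by 5 (32-byte entries), added to a pointer stored at offset 0x18 of the table, and the first 8 bytes of that entry are read. Below is a minimal user-space sketch of that access pattern, assuming the layout implied by those offsets. The names `struct region`, `struct region_table` and `first_overlap` are hypothetical stand-ins for illustration, not the kernel's definitions, and the sketch does not establish why the array was unmapped at the time of the call.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-in for one region entry. The field order is an
 * assumption chosen so that base sits at offset 0 and size at offset 8,
 * matching the two loads in the disassembly. */
struct region {
    uint64_t base;    /* mov (%rax),%rdx     <-- the trapping load */
    uint64_t size;    /* mov 0x8(%rax),%rcx  */
    uint64_t flags;
    uint64_t nid;
};

/* Hypothetical stand-in for the table that owns the region array. */
struct region_table {
    uint64_t cnt;            /* cmp (%r15),%rax     */
    uint64_t max;
    uint64_t total_size;
    struct region *regions;  /* add 0x18(%r15),%rax */
};

/* The offsets the disassembly relies on: 32-byte entries (shl $0x5)
 * and the array pointer at offset 0x18. */
static_assert(sizeof(struct region) == 32, "entry size must be 32 bytes");
static_assert(offsetof(struct region_table, regions) == 0x18,
              "array pointer must sit at offset 0x18");

/* Return the index of the first region overlapping [base, end), or -1.
 * The entries are assumed to be sorted by base address. */
int first_overlap(const struct region_table *type, uint64_t base, uint64_t end)
{
    for (uint64_t i = 0; i < type->cnt; i++) {
        /* Address computed as regions + (i << 5). */
        const struct region *rgn = &type->regions[i];
        /* If regions points at memory that is no longer mapped,
         * this first read is where the fault is taken. */
        uint64_t rbase = rgn->base;
        uint64_t rend = rbase + rgn->size;

        if (rbase >= end)   /* cmp %rdx,%rbx ; jbe */
            break;
        if (rend <= base)   /* cmp %rsi,%r13 ; jae */
            continue;
        return (int)i;
    }
    return -1;
}
```

In the oops, RAX and CR2 hold the same address, which is consistent with the fault being taken on the very first entry read (R14, the index, is zero).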
