------- Comment From laurent.duf...@fr.ibm.com 2017-02-23 08:51 EDT-------
A new set of 2 patches have been sent to the community :
https://patchwork.kernel.org/patch/9588337/
https://patchwork.kernel.org/patch/9588335/

I'm waiting for these 2 patches to be accepted upstream.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1665113

Title:
  [Ubuntu 17.04] Kernel panics when a large number of hugepages is
  passed as a boot argument to the kernel.

Status in linux package in Ubuntu:
  Triaged

Bug description:
  Issue:
  -----------
  Kernel is unable to handle a paging request and panics when a large
  number of hugepages is passed as a boot argument to the kernel.

  Environment:
  ----------------------
  Power NV : Habanero bare metal
  OS : Ubuntu 17.04
  Kernel Version : 4.9.0-11-generic

  Steps To reproduce:
  -----------------------------------

  1 - Boot the Ubuntu kernel with the boot argument
  'hugepages=12000000'.
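
  The steps above can be sketched as follows; this assumes GRUB is the
  bootloader with the default Ubuntu config paths (the usual case), and
  uses the value from this report:

  ```shell
  # Sketch (assumed Ubuntu/GRUB setup): persistently add the boot argument.
  # 1) Append the parameter to the default kernel command line in
  #    /etc/default/grub:
  #      GRUB_CMDLINE_LINUX_DEFAULT="... hugepages=12000000"
  # 2) Regenerate the GRUB configuration and reboot:
  sudo update-grub
  sudo reboot
  ```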

  The kernel panics and displays a call trace like the one below.

  [    5.030274] Unable to handle kernel paging request for data at address 0x00000000
  [    5.030323] Faulting instruction address: 0xc000000000302848
  [    5.030366] Oops: Kernel access of bad area, sig: 11 [#1]
  [    5.030399] SMP NR_CPUS=2048
  [    5.030416] NUMA
  [    5.039443] PowerNV
  [    5.039461] Modules linked in:
  [    5.050091] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 4.9.0-11-generic #12-Ubuntu
  [    5.053266] Workqueue: events pcpu_balance_workfn
  [    5.080647] task: c000003c8fe9b800 task.stack: c000003ffb118000
  [    5.090876] NIP: c000000000302848 LR: c0000000002709d4 CTR: c00000000016cef0
  [    5.094175] REGS: c000003ffb11b410 TRAP: 0300   Not tainted  (4.9.0-11-generic)
  [    5.103040] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>
  [    5.114466]   CR: 22424222  XER: 00000000
  [    5.124932] CFAR: c000000000008a60 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
  GPR00: c0000000002709d4 c000003ffb11b690 c00000000141a400 c000003fff50e300
  GPR04: 0000000000000000 00000000024001c2 c000003ffb11b780 000000219df50000
  GPR08: 0000003ffb090000 c000000001454fd8 0000000000000000 0000000000000000
  GPR12: 0000000000004400 c000000007b60000 00000000024001c2 00000000024001c2
  GPR16: 00000000024001c2 0000000000000000 0000000000000000 0000000000000002
  GPR20: 000000000000000c 0000000000000000 0000000000000000 00000000024200c0
  GPR24: c0000000016eef48 0000000000000000 c000003fff50fd00 00000000024001c2
  GPR28: 0000000000000000 c000003fff50fd00 c000003fff50e300 c000003ffb11b820
  NIP [c000000000302848] mem_cgroup_soft_limit_reclaim+0xf8/0x4f0
  [    5.213613] LR [c0000000002709d4] do_try_to_free_pages+0x1b4/0x450
  [    5.230521] Call Trace:
  [    5.230643] [c000003ffb11b760] [c0000000002709d4] do_try_to_free_pages+0x1b4/0x450
  [    5.254184] [c000003ffb11b800] [c000000000270d68] try_to_free_pages+0xf8/0x270
  [    5.281896] [c000003ffb11b890] [c000000000259b88] __alloc_pages_nodemask+0x7a8/0xff0
  [    5.321407] [c000003ffb11bab0] [c000000000282cd0] pcpu_populate_chunk+0x110/0x520
  [    5.336262] [c000003ffb11bb50] [c0000000002841b8] pcpu_balance_workfn+0x758/0x960
  [    5.351526] [c000003ffb11bc50] [c0000000000ecdd0] process_one_work+0x2b0/0x5a0
  [    5.362561] [c000003ffb11bce0] [c0000000000ed168] worker_thread+0xa8/0x660
  [    5.374007] [c000003ffb11bd80] [c0000000000f5320] kthread+0x110/0x130
  [    5.385160] [c000003ffb11be30] [c00000000000c0e8] ret_from_kernel_thread+0x5c/0x74
  [    5.389456] Instruction dump:
  [    5.410036] eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 3d230001 e9499a42 3d220004
  [    5.423598] 3929abd8 794a1f24 7d295214 eac90100 <e9360000> 2fa90000 419eff74 3b200000
  [    5.436503] ---[ end trace 23b650e96be5c549 ]---

  This is purely a negative scenario in which the system does not have
  enough memory because the hugepages parameter is given a very large
  value.

  Free output in a system:
  free -h
                total        used        free      shared  buff/cache   available
  Mem:           251G        2.1G        248G        5.2M        502M        248G
  Swap:          2.0G        159M        1.8G

  The same scenario, tried after Linux is up, as in:

  echo 12000000 > /proc/sys/vm/nr_hugepages

  HugePages_Total:   15069
  HugePages_Free:    15069
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:      16384 kB
  root@ltc-haba2:~# free -h
                total        used        free      shared  buff/cache   available
  Mem:           251G        237G         13G        5.6M        311M         13G
  Swap:          2.0G        159M        1.8G

  In this case the kernel is able to allocate around 237 GB for hugetlb.
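
  The hugepage counters shown above come from /proc/meminfo; on a live
  Linux system they can be checked at any time with:

  ```shell
  # Print the kernel's hugepage accounting (Linux: /proc/meminfo)
  grep Huge /proc/meminfo
  ```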

  But while the system is booting it panics, so please let us know
  whether this scenario is expected to be handled.
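
  A quick back-of-the-envelope calculation (plain shell arithmetic,
  using the numbers from this report) shows why the boot-time request
  can never be satisfied:

  ```shell
  # 12,000,000 hugepages at 16384 kB each vs. the 251 GB actually fitted
  pagesz_kb=16384
  requested=12000000
  echo "requested: $((requested * pagesz_kb / 1024 / 1024)) GB"   # 187500 GB, ~183 TiB
  # the runtime path above stopped at 15069 pages:
  echo "allocated: $((15069 * pagesz_kb / 1024 / 1024)) GB"       # 235 GB, close to the 237G "used"
  ```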

  I identified the root cause of the panic. When the system is running
  with low memory during mem cgroup initialisation, because most of the
  pages have been grabbed to be huge pages, we hit a chicken-and-egg
  issue: when trying to allocate memory for the node's cgroup
  descriptor, we try to free some memory, and in this reclaim path
  cgroup services are called which assume the node's cgroup descriptor
  is already allocated.

  I'm working on a patch which fixes this panic, but I think it is
  expected that the system fails due to OOM when all the pages are
  assigned to huge pages.

  Patch sent upstream, waiting for review : 
  https://patchwork.kernel.org/patch/9573799/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1665113/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
