You have been subscribed to a public bug:

Problem Description
============================
We ran stress tests on seedlp2. After a while, OOM messages were printed to the console over and over, and the system hung:

[ 8331.537440] Out of memory (oom_kill_allocating_task): Kill process 27466 (fork12) score 0 or sacrifice child
[ 8331.537447] Killed process 27466 (fork12) total-vm:3072kB, anon-rss:0kB, file-rss:512kB, shmem-rss:0kB
[ 8331.543871] oom_reaper: reaped process 27466 (fork12), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 8331.544167] fork12 invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 8331.544174] fork12 cpuset=/ mems_allowed=3
[ 8331.544184] CPU: 19 PID: 20947 Comm: fork12 Tainted: G OE 4.8.0-31-generic #33~16.04.1-Ubuntu
[ 8331.544189] Call Trace:
[ 8331.544197] [c0000000b7fdb630] [c000000000b56554] dump_stack+0xb0/0xf0 (unreliable)
[ 8331.544204] [c0000000b7fdb670] [c000000000b53db4] dump_header+0x88/0x228
[ 8331.544211] [c0000000b7fdb740] [c000000000258194] oom_kill_process+0x464/0x570
[ 8331.544217] [c0000000b7fdb800] [c0000000002588a4] out_of_memory+0x574/0x590
[ 8331.544223] [c0000000b7fdb8a0] [c00000000025fb18] __alloc_pages_nodemask+0xe98/0xee0
[ 8331.544230] [c0000000b7fdba60] [c0000000002da458] alloc_pages_vma+0x108/0x360
[ 8331.544235] [c0000000b7fdbb00] [c0000000002c3958] __read_swap_cache_async+0x1b8/0x2c0
[ 8331.544241] [c0000000b7fdbb70] [c0000000002c3a8c] read_swap_cache_async+0x2c/0x60
[ 8331.544246] [c0000000b7fdbbb0] [c0000000002c3cb0] swapin_readahead+0x1f0/0x2e0
[ 8331.544253] [c0000000b7fdbc50] [c0000000002a0b78] do_swap_page+0x338/0x9a0
[ 8331.544258] [c0000000b7fdbcd0] [c0000000002a550c] handle_mm_fault+0x98c/0x14c0
[ 8331.544264] [c0000000b7fdbd80] [c000000000b4d2d0] do_page_fault+0x350/0x7d0
[ 8331.553521] [c0000000b7fdbe30] [c000000000008948] handle_page_fault+0x10/0x30
[ 8331.553552] Mem-Info:
[ 8331.553567] active_anon:121 inactive_anon:6417 isolated_anon:742
[ 8331.553567] active_file:167 inactive_file:120 isolated_file:0
[ 8331.553567] unevictable:230 dirty:0 writeback:141 unstable:0
[ 8331.553567] slab_reclaimable:2483 slab_unreclaimable:24404
[ 8331.553567] mapped:122 shmem:0 pagetables:11341 bounce:0
[ 8331.553567] free:2812 free_pcp:0 free_cma:0
[ 8331.553611] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? yes
[ 8331.553642] Node 3 active_anon:7744kB inactive_anon:404544kB active_file:10688kB inactive_file:7680kB unevictable:14720kB isolated(anon):53632kB isolated(file):0kB mapped:7808kB dirty:0kB writeback:9024kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:78640 all_unreclaimable? yes
[ 8331.556952] Node 3 DMA free:179968kB min:180224kB low:225280kB high:270336kB active_anon:5824kB inactive_anon:419840kB active_file:6720kB inactive_file:3776kB unevictable:14720kB writepending:9024kB present:4194304kB managed:3624960kB mlocked:14720kB slab_reclaimable:158912kB slab_unreclaimable:1561856kB kernel_stack:202992kB pagetables:725824kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 8331.556978] lowmem_reserve[]: 0 0 0 0
[ 8331.556988] Node 3 DMA: 1124*64kB (UME) 172*128kB (UME) 127*256kB (UM) 95*512kB (M) 6*1024kB (M) 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 181248kB
[ 8331.557022] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
[ 8331.557032] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
[ 8331.557040] 7760 total pagecache pages
[ 8331.563978] 7356 pages in swap cache
[ 8331.563985] Swap cache stats: add 1612469, delete 1605113, find 1293795/2233630
[ 8331.563992] Free swap = 9015616kB
[ 8331.563997] Total swap = 12096448kB
[ 8331.564003] 65536 pages RAM
[ 8331.564007] 0 pages HighMem/MovableOnly
[ 8331.564013] 8896 pages reserved
[ 8331.564019] 0 pages cma reserved
[ 8331.564024] 0 pages hwpoisoned

---uname output---
4.8.0-31-generic #33~16.04.1-Ubuntu

Machine Type = lpar

---System Hang---
The OOM messages keep repeating on the console, so we cannot log in to seedlp2 through it. We cannot ssh to it either.

== Comment: #2 - Ping Tian Han <pt...@cn.ibm.com> - 2016-12-14 22:58:23 ==
seedlp2 finally dropped into xmon:

^C^C^C[ 9770.439387] systemd[1]: systemd-journald.service: Failed with result 'signal'.
[ 9770.439490] systemd[1]: cron.service: Main process exited, code=killed, status=9/KILL
[ 9770.439692] systemd[1]: cron.service: Unit entered failed state.
[ 9770.439707] systemd[1]: cron.service: Failed with result 'signal'.
[ 9770.439746] systemd[1]: rsyslog.service: Main process exited, code=killed, status=9/KILL
[ 9770.439977] systemd[1]: rsyslog.service: Unit entered failed state.
[ 9770.439993] systemd[1]: rsyslog.service: Failed with result 'signal'.
[ 9770.440048] systemd[1]: systemd-logind.service: Main process exited, code=killed, status=9/KILL
[ 9770.440380] systemd[1]: systemd-logind.service: Unit entered failed state.
[ 9770.440402] systemd[1]: systemd-logind.service: Failed with result 'signal'.
[ 9771.632945] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
[ 9771.633012] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
[ 9771.633088] Unable to handle kernel paging request for data at address 0x00002260
[ 9771.633097] Faulting instruction address: 0xc0000000006c3020
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca = 0xc000000007b3f300 softe: 0 irq_happened: 0x01
    pid = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
enter ? for help
[c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
[c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
[c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
[c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
[c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
[c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
--- Exception: 0 at 0000000000000000
1b:mon> Unrecognized command: \x1be (type ? for help)
1b:mon> e
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca = 0xc000000007b3f300 softe: 0 irq_happened: 0x01
    pid = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
1b:mon> t
[c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
[c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
[c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
[c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
[c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
[c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
--- Exception: 0 at 0000000000000000
1b:mon> r
R00 = c0000000006c2ffc   R16 = 0000000000000000
R01 = c0000000f52cfb00   R17 = 0000000000000000
R02 = c0000000014a6600   R18 = 0000000000000000
R03 = 0000000000000000   R19 = 0000000000000001
R04 = c0000000ea40685f   R20 = 0000000000000000
R05 = c0000000ea40695f   R21 = 0000000000000000
R06 = 0000000000000012   R22 = c0000000013fc787
R07 = 0000000000000001   R23 = 0000000100000000
R08 = c0000000f4fd08d8   R24 = 0000000000000000
R09 = c0000000f4fd0a20   R25 = 0000000000000000
R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
R11 = c0000000ffce3130   R27 = c0000000ea40695f
R12 = 0000000024a42828   R28 = 0000000000000012
R13 = c000000007b3f300   R29 = c0000000ec53ce50
R14 = c0000000000f8118   R30 = 0000000000000001
R15 = c0000000e644bec0   R31 = c0000000f4fd0800
pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
cfar= c000000000008750 slb_miss_realmode+0x50/0x78
lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
msr = 800000010280b033   cr = 24a42828
ctr = c0000000006c3b30   xer = 0000000000000000   trap = 300
dar = 0000000000002260   dsisr = 40000000
1b:mon>

== Comment: #4 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2016-12-16 04:57:55 ==
1b:mon> e
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca = 0xc000000007b3f300 softe: 0 irq_happened: 0x01
    pid = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
1b:mon> di c0000000006c3020 10
c0000000006c3020 e9192260 ld r8,8800(r25) -----------> r25 is a Null pointer
c0000000006c3024 7c2004ac lwsync
c0000000006c3028 80ff0130 lwz r7,304(r31)
c0000000006c302c e8d90000 ld r6,0(r25)
c0000000006c3030 78e5efe3 rldicl. r5,r7,61,63
c0000000006c3034 7d464050 subf r10,r6,r8
c0000000006c3038 392a1000 addi r9,r10,4096
c0000000006c303c 7d2907b4 extsw r9,r9
c0000000006c3040 41820020 beq c0000000006c3060 # n_tty_receive_buf_common+0x100/0xbd0
c0000000006c3044 3ca05555 lis r5,21845
c0000000006c3048 394a1002 addi r10,r10,4098
c0000000006c304c 60a55556 ori r5,r5,21846
c0000000006c3050 7d2a2896 mulhw r9,r10,r5
c0000000006c3054 7d4afe70 srawi r10,r10,31
c0000000006c3058 7d2a4850 subf r9,r10,r9
c0000000006c305c 7d2907b4 extsw r9,r9
1b:mon> r
R00 = c0000000006c2ffc   R16 = 0000000000000000
R01 = c0000000f52cfb00   R17 = 0000000000000000
R02 = c0000000014a6600   R18 = 0000000000000000
R03 = 0000000000000000   R19 = 0000000000000001
R04 = c0000000ea40685f   R20 = 0000000000000000
R05 = c0000000ea40695f   R21 = 0000000000000000
R06 = 0000000000000012   R22 = c0000000013fc787
R07 = 0000000000000001   R23 = 0000000100000000
R08 = c0000000f4fd08d8   R24 = 0000000000000000
R09 = c0000000f4fd0a20   R25 = 0000000000000000
R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
R11 = c0000000ffce3130   R27 = c0000000ea40695f
R12 = 0000000024a42828   R28 = 0000000000000012
R13 = c000000007b3f300   R29 = c0000000ec53ce50
R14 = c0000000000f8118   R30 = 0000000000000001
R15 = c0000000e644bec0   R31 = c0000000f4fd0800
pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
cfar= c000000000008750 slb_miss_realmode+0x50/0x78
lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
msr = 800000010280b033   cr = 24a42828
ctr = c0000000006c3b30   xer = 0000000000000000   trap = 300
dar = 0000000000002260   dsisr = 40000000
1b:mon> d $linux_banner
c000000000b60158 4c696e7578207665 7273696f6e20342e |Linux version 4.|
c000000000b60168 382e302d33312d67 656e657269632028 |8.0-31-generic (|
c000000000b60178 6275696c64644062 6f7330312d707063 |buildd@bos01-ppc|
c000000000b60188 3634656c2d303231 2920286763632076 |64el-021) (gcc v|
1b:mon> mi
[123137.575504] Mem-Info:
[123137.575522] active_anon:888 inactive_anon:320 isolated_anon:0
[123137.575522] active_file:1135 inactive_file:12230 isolated_file:0
[123137.575522] unevictable:230 dirty:2283 writeback:42 unstable:0
[123137.575522] slab_reclaimable:2477 slab_unreclaimable:8349
[123137.575522] mapped:331 shmem:1 pagetables:119 bounce:0
[123137.575522] free:23060 free_pcp:120 free_cma:0
[123137.575567] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? yes
[123137.575600] Node 3 active_anon:56832kB inactive_anon:20480kB active_file:72640kB inactive_file:782720kB unevictable:14720kB isolated(anon):0kB isolated(file):0kB mapped:21184kB dirty:146112kB writeback:2688kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 64kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
[123137.575622] Node 3 DMA free:1475840kB min:180224kB low:225280kB high:270336kB active_anon:56832kB inactive_anon:20480kB active_file:72640kB inactive_file:782720kB unevictable:14720kB writepending:148800kB present:4194304kB managed:3624960kB mlocked:14720kB slab_reclaimable:158528kB slab_unreclaimable:534336kB kernel_stack:10512kB pagetables:7616kB bounce:0kB free_pcp:7680kB local_pcp:320kB free_cma:0kB
[123137.575653] lowmem_reserve[]: 0 0 0 0
[123137.575666] Node 3 DMA: 4091*64kB (UME) 2410*128kB (UE) 1275*256kB (UME) 583*512kB (UE) 165*1024kB (UME) 47*2048kB (UM) 3*4096kB (M) 0*8192kB 0*16384kB = 1472704kB
[123137.575708] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
[123137.575718] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
[123137.575727] 14186 total pagecache pages
[123137.575733] 598 pages in swap cache
[123137.575739] Swap cache stats: add 2114491, delete 2113893, find 1715624/2933446
[123137.575746] Free swap = 11950336kB
[123137.575752] Total swap = 12096448kB
[123137.575757] 65536 pages RAM
[123137.575762] 0 pages HighMem/MovableOnly
[123137.575767] 8896 pages reserved
[123137.575772] 0 pages cma reserved
[123137.575777] 0 pages hwpoisoned

Mirroring for Canonical's awareness. Thank you.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New

** Tags: architecture-ppc64le bugnameltc-149962 severity-high targetmilestone-inin---

--
ISST-LTE:pVM:seedlp2:ubuntu 16.04.2: oom occurs when running stress tests
https://bugs.launchpad.net/bugs/1651376
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
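
Editorial cross-check (not part of the original report): the Python sketch below re-derives a few numbers from the logs above so they can be sanity-checked by hand. It assumes a 64 kB base page size, which is inferred from "65536 pages RAM" against present:4194304kB and is not stated in the report. It also assumes the standard PowerPC DS-form layout for the `ld` instruction. The helper name `decode_ds_form` and all variable names are hypothetical.

```python
PAGE_KB = 64  # assumption: 64 kB pages, inferred from 65536 pages == 4194304 kB


def decode_ds_form(insn):
    """Split a 32-bit PowerPC DS-form word into (opcode, rt, ra, displacement).

    Hypothetical helper: handles only non-negative displacements, which is
    all the instruction in this report needs.
    """
    opcode = (insn >> 26) & 0x3F
    rt = (insn >> 21) & 0x1F
    ra = (insn >> 16) & 0x1F
    disp = insn & 0xFFFC  # low two bits select the DS-form variant
    return opcode, rt, ra, disp


# Faulting instruction from "di c0000000006c3020 10": e9192260  ld r8,8800(r25)
opcode, rt, ra, disp = decode_ds_form(0xE9192260)
r25 = 0x0  # R25 value from the xmon register dump
effective_address = r25 + disp  # should equal dar in the exception record

# First Mem-Info dump, Node 3 DMA zone (values copied from the log)
free_pages, free_kb, min_kb = 2812, 179968, 180224
below_min_watermark = free_kb < min_kb

# Buddy free list from the same dump: block size in kB -> number of free blocks
buddy = {64: 1124, 128: 172, 256: 127, 512: 95, 1024: 6}
buddy_total_kb = sum(size * count for size, count in buddy.items())

print(f"ld r{rt},{disp}(r{ra}) -> effective address {effective_address:#x}")
print(f"free {free_pages * PAGE_KB} kB, below min watermark: {below_min_watermark}")
print(f"buddy free list total: {buddy_total_kb} kB")
```

With R25 = 0, the computed effective address equals the reported dar (0x2260), which is consistent with the "r25 is a Null pointer" note in comment #4. The watermark check shows that free memory in the Node 3 DMA zone was below the min watermark at the time of the first dump.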