You have been subscribed to a public bug:

Problem Description
============================
We ran stress tests on seedlp2. After a while, OOM messages were printed on 
the console over and over, and the system hung:


[ 8331.537440] Out of memory (oom_kill_allocating_task): Kill process 27466 (fork12) score 0 or sacrifice child
[ 8331.537447] Killed process 27466 (fork12) total-vm:3072kB, anon-rss:0kB, file-rss:512kB, shmem-rss:0kB
[ 8331.543871] oom_reaper: reaped process 27466 (fork12), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 8331.544167] fork12 invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 8331.544174] fork12 cpuset=/ mems_allowed=3
[ 8331.544184] CPU: 19 PID: 20947 Comm: fork12 Tainted: G           OE   4.8.0-31-generic #33~16.04.1-Ubuntu
[ 8331.544189] Call Trace:
[ 8331.544197] [c0000000b7fdb630] [c000000000b56554] dump_stack+0xb0/0xf0 (unreliable)
[ 8331.544204] [c0000000b7fdb670] [c000000000b53db4] dump_header+0x88/0x228
[ 8331.544211] [c0000000b7fdb740] [c000000000258194] oom_kill_process+0x464/0x570
[ 8331.544217] [c0000000b7fdb800] [c0000000002588a4] out_of_memory+0x574/0x590
[ 8331.544223] [c0000000b7fdb8a0] [c00000000025fb18] __alloc_pages_nodemask+0xe98/0xee0
[ 8331.544230] [c0000000b7fdba60] [c0000000002da458] alloc_pages_vma+0x108/0x360
[ 8331.544235] [c0000000b7fdbb00] [c0000000002c3958] __read_swap_cache_async+0x1b8/0x2c0
[ 8331.544241] [c0000000b7fdbb70] [c0000000002c3a8c] read_swap_cache_async+0x2c/0x60
[ 8331.544246] [c0000000b7fdbbb0] [c0000000002c3cb0] swapin_readahead+0x1f0/0x2e0
[ 8331.544253] [c0000000b7fdbc50] [c0000000002a0b78] do_swap_page+0x338/0x9a0
[ 8331.544258] [c0000000b7fdbcd0] [c0000000002a550c] handle_mm_fault+0x98c/0x14c0
[ 8331.544264] [c0000000b7fdbd80] [c000000000b4d2d0] do_page_fault+0x350/0x7d0
[ 8331.553521] [c0000000b7fdbe30] [c000000000008948] handle_page_fault+0x10/0x30
[ 8331.553552] Mem-Info:
[ 8331.553567] active_anon:121 inactive_anon:6417 isolated_anon:742
[ 8331.553567]  active_file:167 inactive_file:120 isolated_file:0
[ 8331.553567]  unevictable:230 dirty:0 writeback:141 unstable:0
[ 8331.553567]  slab_reclaimable:2483 slab_unreclaimable:24404
[ 8331.553567]  mapped:122 shmem:0 pagetables:11341 bounce:0
[ 8331.553567]  free:2812 free_pcp:0 free_cma:0
[ 8331.553611] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 
0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 
all_unreclaimable? yes
[ 8331.553642] Node 3 active_anon:7744kB inactive_anon:404544kB 
active_file:10688kB inactive_file:7680kB unevictable:14720kB 
isolated(anon):53632kB isolated(file):0kB mapped:7808kB dirty:0kB 
writeback:9024kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB 
writeback_tmp:0kB unstable:0kB pages_scanned:78640 all_unreclaimable? yes
[ 8331.556952] Node 3 DMA free:179968kB min:180224kB low:225280kB high:270336kB 
active_anon:5824kB inactive_anon:419840kB active_file:6720kB 
inactive_file:3776kB unevictable:14720kB writepending:9024kB present:4194304kB 
managed:3624960kB mlocked:14720kB slab_reclaimable:158912kB 
slab_unreclaimable:1561856kB kernel_stack:202992kB pagetables:725824kB 
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 8331.556978] lowmem_reserve[]: 0 0 0 0
[ 8331.556988] Node 3 DMA: 1124*64kB (UME) 172*128kB (UME) 127*256kB (UM) 
95*512kB (M) 6*1024kB (M) 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 181248kB
[ 8331.557022] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=16384kB
[ 8331.557032] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=16777216kB
[ 8331.557040] 7760 total pagecache pages
[ 8331.563978] 7356 pages in swap cache
[ 8331.563985] Swap cache stats: add 1612469, delete 1605113, find 
1293795/2233630
[ 8331.563992] Free swap  = 9015616kB
[ 8331.563997] Total swap = 12096448kB
[ 8331.564003] 65536 pages RAM
[ 8331.564007] 0 pages HighMem/MovableOnly
[ 8331.564013] 8896 pages reserved
[ 8331.564019] 0 pages cma reserved
[ 8331.564024] 0 pages hwpoisoned
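
As a rough cross-check of the Mem-Info dump above (a sketch added for illustration, not part of the original report), the snippet below parses an excerpt of the "Node 3 DMA" zone line and the buddy free-list line. It checks whether free memory is below the min watermark, re-adds the free-list total, and estimates how much of the managed memory sits in slab_unreclaimable, kernel_stack and pagetables. The helper names (parse_kb_fields, buddy_total_kb) are hypothetical, and grouping those three counters as "kernel-side" memory is an assumption made here.

```python
import re

# Excerpt of the "Node 3 DMA" zone line from the first OOM report
# (only the fields used below; wrapped lines rejoined).
ZONE_LINE = (
    "Node 3 DMA free:179968kB min:180224kB low:225280kB high:270336kB "
    "managed:3624960kB slab_unreclaimable:1561856kB "
    "kernel_stack:202992kB pagetables:725824kB"
)

# Buddy free-list line from the same report (zero-count orders omitted).
BUDDY_LINE = (
    "1124*64kB (UME) 172*128kB (UME) 127*256kB (UM) "
    "95*512kB (M) 6*1024kB (M)"
)


def parse_kb_fields(line):
    """Return a dict of name -> kB for every 'name:<number>kB' field."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}


def buddy_total_kb(line):
    """Sum count*size over every '<count>*<size>kB' entry."""
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


zone = parse_kb_fields(ZONE_LINE)

# Free memory below the min watermark means ordinary allocations cannot proceed.
below_min = zone["free"] < zone["min"]

# Assumption: treat these three counters together as kernel-side memory.
kernel_kb = zone["slab_unreclaimable"] + zone["kernel_stack"] + zone["pagetables"]
kernel_share = kernel_kb / zone["managed"]

print("free below min watermark:", below_min)
print("buddy free-list total (kB):", buddy_total_kb(BUDDY_LINE))
print("kernel-side share of managed memory: %.2f" % kernel_share)
```

With the values in this report, free (179968kB) is just under min (180224kB), and the three kernel-side counters account for well over half of the managed memory, which fits a fork-heavy stress test (fork12).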

 
---uname output---
4.8.0-31-generic #33~16.04.1-Ubuntu
 
Machine Type = lpar 
 
---System Hang---
 On the console, the OOM message is printed again and again, so we cannot log 
in to seedlp2 through it. We cannot ssh to it either.

== Comment: #2 - Ping Tian Han <pt...@cn.ibm.com> - 2016-12-14 22:58:23 ==
seedlp2 eventually dropped into xmon:

^C^C^C[ 9770.439387] systemd[1]: systemd-journald.service: Failed with result 
'signal'.
[ 9770.439490] systemd[1]: cron.service: Main process exited, code=killed, 
status=9/KILL
[ 9770.439692] systemd[1]: cron.service: Unit entered failed state.
[ 9770.439707] systemd[1]: cron.service: Failed with result 'signal'.
[ 9770.439746] systemd[1]: rsyslog.service: Main process exited, code=killed, 
status=9/KILL
[ 9770.439977] systemd[1]: rsyslog.service: Unit entered failed state.
[ 9770.439993] systemd[1]: rsyslog.service: Failed with result 'signal'.
[ 9770.440048] systemd[1]: systemd-logind.service: Main process exited, 
code=killed, status=9/KILL
[ 9770.440380] systemd[1]: systemd-logind.service: Unit entered failed state.
[ 9770.440402] systemd[1]: systemd-logind.service: Failed with result 'signal'.
[ 9771.632945] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) 
terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) 
thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) 
show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) 
show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) 
show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
[ 9771.633012] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) 
terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) 
thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) 
show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) 
show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) 
show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
[ 9771.633088] Unable to handle kernel paging request for data at address 
0x00002260
[ 9771.633097] Faulting instruction address: 0xc0000000006c3020
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca    = 0xc000000007b3f300   softe: 0        irq_happened: 0x01
    pid   = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 
20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 
16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
enter ? for help
[c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
[c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
[c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
[c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
[c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
[c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
--- Exception: 0  at 0000000000000000
1b:mon>
Unrecognized command: \x1be (type ? for help)
1b:mon> e
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca    = 0xc000000007b3f300   softe: 0        irq_happened: 0x01
    pid   = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 
20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 
16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
1b:mon> t
[c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
[c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
[c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
[c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
[c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
[c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
--- Exception: 0  at 0000000000000000
1b:mon> r
R00 = c0000000006c2ffc   R16 = 0000000000000000
R01 = c0000000f52cfb00   R17 = 0000000000000000
R02 = c0000000014a6600   R18 = 0000000000000000
R03 = 0000000000000000   R19 = 0000000000000001
R04 = c0000000ea40685f   R20 = 0000000000000000
R05 = c0000000ea40695f   R21 = 0000000000000000
R06 = 0000000000000012   R22 = c0000000013fc787
R07 = 0000000000000001   R23 = 0000000100000000
R08 = c0000000f4fd08d8   R24 = 0000000000000000
R09 = c0000000f4fd0a20   R25 = 0000000000000000
R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
R11 = c0000000ffce3130   R27 = c0000000ea40695f
R12 = 0000000024a42828   R28 = 0000000000000012
R13 = c000000007b3f300   R29 = c0000000ec53ce50
R14 = c0000000000f8118   R30 = 0000000000000001
R15 = c0000000e644bec0   R31 = c0000000f4fd0800
pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
cfar= c000000000008750 slb_miss_realmode+0x50/0x78
lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
msr = 800000010280b033   cr  = 24a42828
ctr = c0000000006c3b30   xer = 0000000000000000   trap =  300
dar = 0000000000002260   dsisr = 40000000
1b:mon>

== Comment: #4 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2016-12-16 04:57:55 ==
1b:mon> e
cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
    pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
    lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
    sp: c0000000f52cfb00
   msr: 800000010280b033
   dar: 2260
 dsisr: 40000000
  current = 0xc00000009e61a200
  paca    = 0xc000000007b3f300   softe: 0        irq_happened: 0x01
    pid   = 32533, comm = kworker/u64:0
Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 
20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 
16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)

1b:mon> di c0000000006c3020 10
c0000000006c3020  e9192260      ld      r8,8800(r25)  -----------> r25 is a NULL pointer (R25 = 0, so the load targets 8800 = 0x2260, the dar value)
c0000000006c3024  7c2004ac      lwsync
c0000000006c3028  80ff0130      lwz     r7,304(r31)
c0000000006c302c  e8d90000      ld      r6,0(r25)
c0000000006c3030  78e5efe3      rldicl. r5,r7,61,63
c0000000006c3034  7d464050      subf    r10,r6,r8
c0000000006c3038  392a1000      addi    r9,r10,4096
c0000000006c303c  7d2907b4      extsw   r9,r9
c0000000006c3040  41820020      beq     c0000000006c3060        # n_tty_receive_buf_common+0x100/0xbd0
c0000000006c3044  3ca05555      lis     r5,21845
c0000000006c3048  394a1002      addi    r10,r10,4098
c0000000006c304c  60a55556      ori     r5,r5,21846
c0000000006c3050  7d2a2896      mulhw   r9,r10,r5
c0000000006c3054  7d4afe70      srawi   r10,r10,31
c0000000006c3058  7d2a4850      subf    r9,r10,r9
c0000000006c305c  7d2907b4      extsw   r9,r9
1b:mon> r
R00 = c0000000006c2ffc   R16 = 0000000000000000
R01 = c0000000f52cfb00   R17 = 0000000000000000
R02 = c0000000014a6600   R18 = 0000000000000000
R03 = 0000000000000000   R19 = 0000000000000001
R04 = c0000000ea40685f   R20 = 0000000000000000
R05 = c0000000ea40695f   R21 = 0000000000000000
R06 = 0000000000000012   R22 = c0000000013fc787
R07 = 0000000000000001   R23 = 0000000100000000
R08 = c0000000f4fd08d8   R24 = 0000000000000000
R09 = c0000000f4fd0a20   R25 = 0000000000000000
R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
R11 = c0000000ffce3130   R27 = c0000000ea40695f
R12 = 0000000024a42828   R28 = 0000000000000012
R13 = c000000007b3f300   R29 = c0000000ec53ce50
R14 = c0000000000f8118   R30 = 0000000000000001
R15 = c0000000e644bec0   R31 = c0000000f4fd0800
pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
cfar= c000000000008750 slb_miss_realmode+0x50/0x78
lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
msr = 800000010280b033   cr  = 24a42828
ctr = c0000000006c3b30   xer = 0000000000000000   trap =  300
dar = 0000000000002260   dsisr = 40000000
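
To make the NULL-pointer reading of the faulting instruction explicit, here is a small sketch (added for illustration, not part of the original comment). It decodes the instruction word e9192260 as a Power DS-form load and recomputes the effective address from the R25 value in the register dump. The helper names (decode_ds_form, effective_address) are hypothetical. The field layout assumed is the standard DS-form one: 6-bit primary opcode, 5-bit RT, 5-bit RA, 14-bit DS, 2-bit XO.

```python
# Values copied from the xmon session above.
INSN_WORD = 0xE9192260  # instruction word at pc c0000000006c3020
DAR = 0x2260            # faulting data address reported by xmon
REGS = {25: 0x0}        # R25 from the register dump


def decode_ds_form(word):
    """Split a 32-bit Power DS-form instruction word into its fields."""
    # The low two bits are the extended opcode; the displacement is the
    # 16-bit field with those bits cleared, sign-extended.
    displacement = word & 0xFFFC
    if displacement & 0x8000:
        displacement -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "displacement": displacement,
        "xo": word & 0x3,
    }


def effective_address(fields, regs):
    """EA = (RA|0) + displacement, truncated to 64 bits."""
    base = 0 if fields["ra"] == 0 else regs[fields["ra"]]
    return (base + fields["displacement"]) & 0xFFFFFFFFFFFFFFFF


fields = decode_ds_form(INSN_WORD)
ea = effective_address(fields, REGS)

print(fields)
print("effective address:", hex(ea), "matches DAR:", ea == DAR)
```

Primary opcode 58 with XO 0 is ld, RT is 8 and RA is 25, which agrees with the xmon disassembly "ld r8,8800(r25)". With R25 = 0 the computed address equals the dar value, so the fault is a load through a NULL base pointer at offset 0x2260.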

1b:mon> d $linux_banner
c000000000b60158 4c696e7578207665 7273696f6e20342e  |Linux version 4.|
c000000000b60168 382e302d33312d67 656e657269632028  |8.0-31-generic (|
c000000000b60178 6275696c64644062 6f7330312d707063  |buildd@bos01-ppc|
c000000000b60188 3634656c2d303231 2920286763632076  |64el-021) (gcc v|

1b:mon> mi
[123137.575504] Mem-Info:
[123137.575522] active_anon:888 inactive_anon:320 isolated_anon:0
[123137.575522]  active_file:1135 inactive_file:12230 isolated_file:0
[123137.575522]  unevictable:230 dirty:2283 writeback:42 unstable:0
[123137.575522]  slab_reclaimable:2477 slab_unreclaimable:8349
[123137.575522]  mapped:331 shmem:1 pagetables:119 bounce:0
[123137.575522]  free:23060 free_pcp:120 free_cma:0
[123137.575567] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 
0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 
all_unreclaimable? yes
[123137.575600] Node 3 active_anon:56832kB inactive_anon:20480kB 
active_file:72640kB inactive_file:782720kB unevictable:14720kB 
isolated(anon):0kB isolated(file):0kB mapped:21184kB dirty:146112kB 
writeback:2688kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 64kB 
writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
[123137.575622] Node 3 DMA free:1475840kB min:180224kB low:225280kB 
high:270336kB active_anon:56832kB inactive_anon:20480kB active_file:72640kB 
inactive_file:782720kB unevictable:14720kB writepending:148800kB 
present:4194304kB managed:3624960kB mlocked:14720kB slab_reclaimable:158528kB 
slab_unreclaimable:534336kB kernel_stack:10512kB pagetables:7616kB bounce:0kB 
free_pcp:7680kB local_pcp:320kB free_cma:0kB
[123137.575653] lowmem_reserve[]: 0 0 0 0
[123137.575666] Node 3 DMA: 4091*64kB (UME) 2410*128kB (UE) 1275*256kB (UME) 
583*512kB (UE) 165*1024kB (UME) 47*2048kB (UM) 3*4096kB (M) 0*8192kB 0*16384kB 
= 1472704kB
[123137.575708] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=16384kB
[123137.575718] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=16777216kB
[123137.575727] 14186 total pagecache pages
[123137.575733] 598 pages in swap cache
[123137.575739] Swap cache stats: add 2114491, delete 2113893, find 
1715624/2933446
[123137.575746] Free swap  = 11950336kB
[123137.575752] Total swap = 12096448kB
[123137.575757] 65536 pages RAM
[123137.575762] 0 pages HighMem/MovableOnly
[123137.575767] 8896 pages reserved
[123137.575772] 0 pages cma reserved
[123137.575777] 0 pages hwpoisoned

Mirroring for Canonical's awareness.
Thank you.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-149962 severity-high 
targetmilestone-inin---
-- 
ISST-LTE:pVM:seedlp2:ubuntu 16.04.2: oom occurs when running stress tests
https://bugs.launchpad.net/bugs/1651376
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to linux in Ubuntu.
