Default Comment by Bridge

** Attachment added: "sosreport of seedlp2"
   
https://bugs.launchpad.net/bugs/1651376/+attachment/4794245/+files/sosreport-seedlp2.149962-20161219003834.tar.xz

** Changed in: ubuntu
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

** Package changed: ubuntu => linux (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1651376

Title:
  ISST-LTE:pVM:seedlp2:ubuntu 16.04.2: oom occurs when running stress
  tests

Status in linux package in Ubuntu:
  New

Bug description:
  Problem Description
  ============================
  We ran stress tests on seedlp2. After a while, OOM messages were printed to
  the console over and over, and the system hung:

  
  [ 8331.537440] Out of memory (oom_kill_allocating_task): Kill process 27466 (fork12) score 0 or sacrifice child
  [ 8331.537447] Killed process 27466 (fork12) total-vm:3072kB, anon-rss:0kB, file-rss:512kB, shmem-rss:0kB
  [ 8331.543871] oom_reaper: reaped process 27466 (fork12), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  [ 8331.544167] fork12 invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
  [ 8331.544174] fork12 cpuset=/ mems_allowed=3
  [ 8331.544184] CPU: 19 PID: 20947 Comm: fork12 Tainted: G           OE   4.8.0-31-generic #33~16.04.1-Ubuntu
  [ 8331.544189] Call Trace:
  [ 8331.544197] [c0000000b7fdb630] [c000000000b56554] dump_stack+0xb0/0xf0 (unreliable)
  [ 8331.544204] [c0000000b7fdb670] [c000000000b53db4] dump_header+0x88/0x228
  [ 8331.544211] [c0000000b7fdb740] [c000000000258194] oom_kill_process+0x464/0x570
  [ 8331.544217] [c0000000b7fdb800] [c0000000002588a4] out_of_memory+0x574/0x590
  [ 8331.544223] [c0000000b7fdb8a0] [c00000000025fb18] __alloc_pages_nodemask+0xe98/0xee0
  [ 8331.544230] [c0000000b7fdba60] [c0000000002da458] alloc_pages_vma+0x108/0x360
  [ 8331.544235] [c0000000b7fdbb00] [c0000000002c3958] __read_swap_cache_async+0x1b8/0x2c0
  [ 8331.544241] [c0000000b7fdbb70] [c0000000002c3a8c] read_swap_cache_async+0x2c/0x60
  [ 8331.544246] [c0000000b7fdbbb0] [c0000000002c3cb0] swapin_readahead+0x1f0/0x2e0
  [ 8331.544253] [c0000000b7fdbc50] [c0000000002a0b78] do_swap_page+0x338/0x9a0
  [ 8331.544258] [c0000000b7fdbcd0] [c0000000002a550c] handle_mm_fault+0x98c/0x14c0
  [ 8331.544264] [c0000000b7fdbd80] [c000000000b4d2d0] do_page_fault+0x350/0x7d0
  [ 8331.553521] [c0000000b7fdbe30] [c000000000008948] handle_page_fault+0x10/0x30
  [ 8331.553552] Mem-Info:
  [ 8331.553567] active_anon:121 inactive_anon:6417 isolated_anon:742
  [ 8331.553567]  active_file:167 inactive_file:120 isolated_file:0
  [ 8331.553567]  unevictable:230 dirty:0 writeback:141 unstable:0
  [ 8331.553567]  slab_reclaimable:2483 slab_unreclaimable:24404
  [ 8331.553567]  mapped:122 shmem:0 pagetables:11341 bounce:0
  [ 8331.553567]  free:2812 free_pcp:0 free_cma:0
  [ 8331.553611] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? yes
  [ 8331.553642] Node 3 active_anon:7744kB inactive_anon:404544kB active_file:10688kB inactive_file:7680kB unevictable:14720kB isolated(anon):53632kB isolated(file):0kB mapped:7808kB dirty:0kB writeback:9024kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:78640 all_unreclaimable? yes
  [ 8331.556952] Node 3 DMA free:179968kB min:180224kB low:225280kB high:270336kB active_anon:5824kB inactive_anon:419840kB active_file:6720kB inactive_file:3776kB unevictable:14720kB writepending:9024kB present:4194304kB managed:3624960kB mlocked:14720kB slab_reclaimable:158912kB slab_unreclaimable:1561856kB kernel_stack:202992kB pagetables:725824kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
  [ 8331.556978] lowmem_reserve[]: 0 0 0 0
  [ 8331.556988] Node 3 DMA: 1124*64kB (UME) 172*128kB (UME) 127*256kB (UM) 95*512kB (M) 6*1024kB (M) 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 181248kB
  [ 8331.557022] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
  [ 8331.557032] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
  [ 8331.557040] 7760 total pagecache pages
  [ 8331.563978] 7356 pages in swap cache
  [ 8331.563985] Swap cache stats: add 1612469, delete 1605113, find 1293795/2233630
  [ 8331.563992] Free swap  = 9015616kB
  [ 8331.563997] Total swap = 12096448kB
  [ 8331.564003] 65536 pages RAM
  [ 8331.564007] 0 pages HighMem/MovableOnly
  [ 8331.564013] 8896 pages reserved
  [ 8331.564019] 0 pages cma reserved
  [ 8331.564024] 0 pages hwpoisoned
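
The figures in the Mem-Info dump above can be cross-checked with a little arithmetic. The sketch below is a reader's aid, not part of the original report. It assumes a 64kB base page (inferred from the smallest block size in the buddy list) and uses numbers copied by hand from the "Node 3 DMA" lines; the function and variable names are illustrative only.

```python
# Values copied by hand from the "Node 3 DMA" lines of the OOM report above.
PAGE_KB = 64  # assumption: 64kB base pages, inferred from the smallest buddy block

# Zone counters in kB, as printed on the "Node 3 DMA free:..." line.
zone = {"free": 179968, "min": 180224, "low": 225280, "high": 270336}

# Buddy free-list entries as (block count, block size in kB).
buddy = [(1124, 64), (172, 128), (127, 256), (95, 512), (6, 1024)]


def buddy_total_kb(entries):
    """Sum the free memory described by a buddy free-list line."""
    return sum(count * size_kb for count, size_kb in entries)


def below_min_watermark(z):
    """Return True when the zone's free memory is under its min watermark."""
    return z["free"] < z["min"]


print(buddy_total_kb(buddy))      # free memory according to the buddy list, in kB
print(below_min_watermark(zone))  # whether free is under the min watermark
print(65536 * PAGE_KB)            # "65536 pages RAM" expressed in kB
print(2812 * PAGE_KB)             # "free:2812" pages expressed in kB
```

Under that page-size assumption, the page counts agree with the kB figures in the log ("present" and "free"), the buddy list adds up to the total printed at the end of its line, and free memory in the only populated zone sits just under the min watermark.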

   
  ---uname output---
  4.8.0-31-generic #33~16.04.1-Ubuntu
   
  Machine Type = lpar 
   
  ---System Hang---
   On the console, the OOM messages repeat continuously, so we cannot log in
   to seedlp2 through it. We cannot ssh to it either.

  == Comment: #2 - Ping Tian Han <pt...@cn.ibm.com> - 2016-12-14 22:58:23 ==
  seedlp2 finally dropped into xmon:

  ^C^C^C[ 9770.439387] systemd[1]: systemd-journald.service: Failed with result 'signal'.
  [ 9770.439490] systemd[1]: cron.service: Main process exited, code=killed, status=9/KILL
  [ 9770.439692] systemd[1]: cron.service: Unit entered failed state.
  [ 9770.439707] systemd[1]: cron.service: Failed with result 'signal'.
  [ 9770.439746] systemd[1]: rsyslog.service: Main process exited, code=killed, status=9/KILL
  [ 9770.439977] systemd[1]: rsyslog.service: Unit entered failed state.
  [ 9770.439993] systemd[1]: rsyslog.service: Failed with result 'signal'.
  [ 9770.440048] systemd[1]: systemd-logind.service: Main process exited, code=killed, status=9/KILL
  [ 9770.440380] systemd[1]: systemd-logind.service: Unit entered failed state.
  [ 9770.440402] systemd[1]: systemd-logind.service: Failed with result 'signal'.
  [ 9771.632945] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
  [ 9771.633012] sysrq: SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) xmon(x) dump-ftrace-buffer(z)
  [ 9771.633088] Unable to handle kernel paging request for data at address 0x00002260
  [ 9771.633097] Faulting instruction address: 0xc0000000006c3020
  cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
      pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
      lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
      sp: c0000000f52cfb00
     msr: 800000010280b033
     dar: 2260
   dsisr: 40000000
    current = 0xc00000009e61a200
    paca    = 0xc000000007b3f300   softe: 0        irq_happened: 0x01
      pid   = 32533, comm = kworker/u64:0
  Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
  enter ? for help
  [c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
  [c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
  [c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
  [c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
  [c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
  [c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
  --- Exception: 0  at 0000000000000000
  1b:mon>
  Unrecognized command: \x1be (type ? for help)
  1b:mon> e
  cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
      pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
      lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
      sp: c0000000f52cfb00
     msr: 800000010280b033
     dar: 2260
   dsisr: 40000000
    current = 0xc00000009e61a200
    paca    = 0xc000000007b3f300   softe: 0        irq_happened: 0x01
      pid   = 32533, comm = kworker/u64:0
  Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)
  1b:mon> t
  [c0000000f52cfbd0] c0000000006c7e48 tty_ldisc_receive_buf+0x48/0xe0
  [c0000000f52cfc00] c0000000006c872c flush_to_ldisc+0x13c/0x160
  [c0000000f52cfc50] c0000000000ef5e8 process_one_work+0x1e8/0x5b0
  [c0000000f52cfce0] c0000000000efa58 worker_thread+0xa8/0x650
  [c0000000f52cfd80] c0000000000f8224 kthread+0x114/0x140
  [c0000000f52cfe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
  --- Exception: 0  at 0000000000000000
  1b:mon> r
  R00 = c0000000006c2ffc   R16 = 0000000000000000
  R01 = c0000000f52cfb00   R17 = 0000000000000000
  R02 = c0000000014a6600   R18 = 0000000000000000
  R03 = 0000000000000000   R19 = 0000000000000001
  R04 = c0000000ea40685f   R20 = 0000000000000000
  R05 = c0000000ea40695f   R21 = 0000000000000000
  R06 = 0000000000000012   R22 = c0000000013fc787
  R07 = 0000000000000001   R23 = 0000000100000000
  R08 = c0000000f4fd08d8   R24 = 0000000000000000
  R09 = c0000000f4fd0a20   R25 = 0000000000000000
  R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
  R11 = c0000000ffce3130   R27 = c0000000ea40695f
  R12 = 0000000024a42828   R28 = 0000000000000012
  R13 = c000000007b3f300   R29 = c0000000ec53ce50
  R14 = c0000000000f8118   R30 = 0000000000000001
  R15 = c0000000e644bec0   R31 = c0000000f4fd0800
  pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
  cfar= c000000000008750 slb_miss_realmode+0x50/0x78
  lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
  msr = 800000010280b033   cr  = 24a42828
  ctr = c0000000006c3b30   xer = 0000000000000000   trap =  300
  dar = 0000000000002260   dsisr = 40000000
  1b:mon>

  == Comment: #4 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2016-12-16 04:57:55 ==
  1b:mon> e
  cpu 0x1b: Vector: 300 (Data Access) at [c0000000f52cf880]
      pc: c0000000006c3020: n_tty_receive_buf_common+0xc0/0xbd0
      lr: c0000000006c2ffc: n_tty_receive_buf_common+0x9c/0xbd0
      sp: c0000000f52cfb00
     msr: 800000010280b033
     dar: 2260
   dsisr: 40000000
    current = 0xc00000009e61a200
    paca    = 0xc000000007b3f300         softe: 0        irq_happened: 0x01
      pid   = 32533, comm = kworker/u64:0
  Linux version 4.8.0-31-generic (buildd@bos01-ppc64el-021) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #33~16.04.1-Ubuntu SMP Wed Dec 7 16:15:11 UTC 2016 (Ubuntu 4.8.0-31.33~16.04.1-generic 4.8.11)

  1b:mon> di c0000000006c3020 10
  c0000000006c3020  e9192260    ld      r8,8800(r25)  -----------> r25 is a NULL pointer
  c0000000006c3024  7c2004ac    lwsync
  c0000000006c3028  80ff0130    lwz     r7,304(r31)
  c0000000006c302c  e8d90000    ld      r6,0(r25)
  c0000000006c3030  78e5efe3    rldicl. r5,r7,61,63
  c0000000006c3034  7d464050    subf    r10,r6,r8
  c0000000006c3038  392a1000    addi    r9,r10,4096
  c0000000006c303c  7d2907b4    extsw   r9,r9
  c0000000006c3040  41820020    beq     c0000000006c3060        # n_tty_receive_buf_common+0x100/0xbd0
  c0000000006c3044  3ca05555    lis     r5,21845
  c0000000006c3048  394a1002    addi    r10,r10,4098
  c0000000006c304c  60a55556    ori     r5,r5,21846
  c0000000006c3050  7d2a2896    mulhw   r9,r10,r5
  c0000000006c3054  7d4afe70    srawi   r10,r10,31
  c0000000006c3058  7d2a4850    subf    r9,r10,r9
  c0000000006c305c  7d2907b4    extsw   r9,r9
  1b:mon> r
  R00 = c0000000006c2ffc   R16 = 0000000000000000
  R01 = c0000000f52cfb00   R17 = 0000000000000000
  R02 = c0000000014a6600   R18 = 0000000000000000
  R03 = 0000000000000000   R19 = 0000000000000001
  R04 = c0000000ea40685f   R20 = 0000000000000000
  R05 = c0000000ea40695f   R21 = 0000000000000000
  R06 = 0000000000000012   R22 = c0000000013fc787
  R07 = 0000000000000001   R23 = 0000000100000000
  R08 = c0000000f4fd08d8   R24 = 0000000000000000
  R09 = c0000000f4fd0a20   R25 = 0000000000000000
  R10 = c0000000f4fd0a48   R26 = c0000000ea40685f
  R11 = c0000000ffce3130   R27 = c0000000ea40695f
  R12 = 0000000024a42828   R28 = 0000000000000012
  R13 = c000000007b3f300   R29 = c0000000ec53ce50
  R14 = c0000000000f8118   R30 = 0000000000000001
  R15 = c0000000e644bec0   R31 = c0000000f4fd0800
  pc  = c0000000006c3020 n_tty_receive_buf_common+0xc0/0xbd0
  cfar= c000000000008750 slb_miss_realmode+0x50/0x78
  lr  = c0000000006c2ffc n_tty_receive_buf_common+0x9c/0xbd0
  msr = 800000010280b033   cr  = 24a42828
  ctr = c0000000006c3b30   xer = 0000000000000000   trap =  300
  dar = 0000000000002260   dsisr = 40000000
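
The link between the faulting instruction, R25, and the dar value can be checked by decoding the instruction word by hand. The following is a minimal sketch, assuming the standard Power ISA DS-form layout (6-bit primary opcode, 5-bit RT, 5-bit RA, 14-bit DS field, 2-bit extended opcode). The helper name `decode_ds_form` is hypothetical, and the instruction word and register value are copied from the xmon output above.

```python
def decode_ds_form(word):
    """Split a 32-bit Power ISA DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F  # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F      # target register
    ra = (word >> 16) & 0x1F      # base register
    xo = word & 0x3               # extended opcode, low 2 bits
    disp = word & 0xFFFC          # byte displacement (DS field with two zero bits appended)
    if disp & 0x8000:             # sign-extend the 16-bit displacement
        disp -= 0x10000
    return opcode, rt, ra, xo, disp


# Instruction word at the faulting pc, taken from the `di` output.
opcode, rt, ra, xo, disp = decode_ds_form(0xE9192260)

r25 = 0x0  # value of R25 in the register dump
effective_address = r25 + disp

print(opcode, rt, ra, xo, disp)  # primary opcode 58 with xo 0 is `ld`
print(hex(effective_address))    # compare with dar in the dump
```

With R25 equal to zero, the effective address is just the displacement 8800 (0x2260), which is the address reported in dar and in the "Unable to handle kernel paging request" message.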

  1b:mon> d $linux_banner
  c000000000b60158 4c696e7578207665 7273696f6e20342e  |Linux version 4.|
  c000000000b60168 382e302d33312d67 656e657269632028  |8.0-31-generic (|
  c000000000b60178 6275696c64644062 6f7330312d707063  |buildd@bos01-ppc|
  c000000000b60188 3634656c2d303231 2920286763632076  |64el-021) (gcc v|

  1b:mon> mi
  [123137.575504] Mem-Info:
  [123137.575522] active_anon:888 inactive_anon:320 isolated_anon:0
  [123137.575522]  active_file:1135 inactive_file:12230 isolated_file:0
  [123137.575522]  unevictable:230 dirty:2283 writeback:42 unstable:0
  [123137.575522]  slab_reclaimable:2477 slab_unreclaimable:8349
  [123137.575522]  mapped:331 shmem:1 pagetables:119 bounce:0
  [123137.575522]  free:23060 free_pcp:120 free_cma:0
  [123137.575567] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? yes
  [123137.575600] Node 3 active_anon:56832kB inactive_anon:20480kB active_file:72640kB inactive_file:782720kB unevictable:14720kB isolated(anon):0kB isolated(file):0kB mapped:21184kB dirty:146112kB writeback:2688kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 64kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
  [123137.575622] Node 3 DMA free:1475840kB min:180224kB low:225280kB high:270336kB active_anon:56832kB inactive_anon:20480kB active_file:72640kB inactive_file:782720kB unevictable:14720kB writepending:148800kB present:4194304kB managed:3624960kB mlocked:14720kB slab_reclaimable:158528kB slab_unreclaimable:534336kB kernel_stack:10512kB pagetables:7616kB bounce:0kB free_pcp:7680kB local_pcp:320kB free_cma:0kB
  [123137.575653] lowmem_reserve[]: 0 0 0 0
  [123137.575666] Node 3 DMA: 4091*64kB (UME) 2410*128kB (UE) 1275*256kB (UME) 583*512kB (UE) 165*1024kB (UME) 47*2048kB (UM) 3*4096kB (M) 0*8192kB 0*16384kB = 1472704kB
  [123137.575708] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
  [123137.575718] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
  [123137.575727] 14186 total pagecache pages
  [123137.575733] 598 pages in swap cache
  [123137.575739] Swap cache stats: add 2114491, delete 2113893, find 1715624/2933446
  [123137.575746] Free swap  = 11950336kB
  [123137.575752] Total swap = 12096448kB
  [123137.575757] 65536 pages RAM
  [123137.575762] 0 pages HighMem/MovableOnly
  [123137.575767] 8896 pages reserved
  [123137.575772] 0 pages cma reserved
  [123137.575777] 0 pages hwpoisoned
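
Comparing this dump with the one taken at OOM time gives a rough picture of where the memory went. The sketch below is a reader's aid under stated assumptions: it only looks at three kernel-side counters from the "Node 3 DMA" line (slab_unreclaimable, kernel_stack, pagetables), with values copied by hand from the two dumps, and the helper name `kernel_share` is illustrative.

```python
MANAGED_KB = 3624960  # "managed" for Node 3 DMA, identical in both dumps

# Counters in kB from the dump printed when the OOM killer fired.
at_oom = {"slab_unreclaimable": 1561856, "kernel_stack": 202992, "pagetables": 725824}

# The same counters from the later `mi` dump taken inside xmon.
in_xmon = {"slab_unreclaimable": 534336, "kernel_stack": 10512, "pagetables": 7616}


def kernel_share(snapshot, managed_kb=MANAGED_KB):
    """Fraction of managed memory held by the selected kernel-side counters."""
    return sum(snapshot.values()) / managed_kb


print(sum(at_oom.values()), round(kernel_share(at_oom), 2))
print(sum(in_xmon.values()), round(kernel_share(in_xmon), 2))
```

These three counters account for roughly two thirds of managed memory at OOM time and far less in the later dump. That is consistent with a fork-heavy workload holding most memory in kernel stacks, page tables, and slab, which swapping cannot relieve, although the dumps alone do not prove it.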

  Mirroring for Canonical's awareness.
  Thank you.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1651376/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
