7 day log rotate:

| hloeung@floette:~$ zgrep -h SEGV /var/log/syslog*
| Oct 28 14:46:34 floette kernel: [1351174.845829] init: rsyslog main process (2652) killed by SEGV signal
| hloeung@bagon:~$ zgrep -h SEGV /var/log/syslog*
| Nov 2 22:17:03 bagon kernel: [2401829.665556] init: neutron-plugin-openvswitch-agent main process (124418) killed by SEGV signal
| Nov 3 00:20:46 bagon kernel: [2409252.286349] init: nova-compute main process (125673) killed by SEGV signal
| Oct 26 11:51:25 bagon kernel: [1759496.565022] init: nova-compute main process (94922) killed by SEGV signal
| Oct 26 11:56:08 bagon kernel: [1759778.693294] init: neutron-plugin-openvswitch-agent main process (18574) killed by SEGV signal
| Oct 26 11:56:23 bagon kernel: [1759794.417232] init: neutron-plugin-openvswitch-agent main process (95171) killed by SEGV signal
| hloeung@gligar:~$ zgrep -h SEGV /var/log/syslog*
| Oct 30 18:32:56 gligar kernel: [745109.275184] init: neutron-plugin-openvswitch-agent main process (4705) killed by SEGV signal
| Oct 30 18:32:56 gligar kernel: [745109.776233] init: neutron-plugin-openvswitch-agent main process (88517) killed by SEGV signal
| Oct 30 18:32:57 gligar kernel: [745110.335622] init: neutron-plugin-openvswitch-agent main process (88527) killed by SEGV signal
| hloeung@patrat:~$ zgrep -h SEGV /var/log/syslog*
| Oct 27 08:18:29 patrat kernel: [508926.329315] init: neutron-plugin-openvswitch-agent main process (51113) killed by SEGV signal

I've disabled KSM as suggested. I'll try to get wgrant or cjwatson to trigger a full rebuild and get some load on these compute nodes.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to apparmor in Ubuntu.
https://bugs.launchpad.net/bugs/1508767

Title: IBM POWER8 unhandled signal 11 / SEGV

Status in Ubuntu Cloud Archive: New
Status in apparmor package in Ubuntu: Invalid
Status in linux package in Ubuntu: Confirmed
Status in linux-meta-lts-vivid package in Ubuntu: New

Bug description:

Hi,

We have a few IBM POWER8 servers which we're currently using as OpenStack nova compute nodes.
It seems we're regularly running into issues where processes are segfaulting:

| hloeung@gligar:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.5.gz
| Oct 16 23:31:38 gligar kernel: [88351.465559] neutron-openvsw[29733]: unhandled signal 11 at 88f9010000000000 nip 00000000100ba0d8 lr 00000000101ad860 code 30001
| Oct 16 23:31:38 gligar kernel: [88351.566909] init: neutron-plugin-openvswitch-agent main process (29733) killed by SEGV signal
| Oct 16 23:31:38 gligar kernel: [88351.746611] apport[29500]: unhandled signal 11 at 8850e467250040a8 nip 0000000010201f80 lr 0000000010202984 code 30001
| Oct 16 23:31:39 gligar kernel: [88352.245829] neutron-rootwra[29749]: unhandled signal 11 at 0809c4b610000000 nip 000000001014ae4c lr 000000001014b544 code 30001
| Oct 16 23:31:50 gligar kernel: [88364.040340] neutron-rootwra[30060]: unhandled signal 11 at 08a305c12b000000 nip 00000000100b74d0 lr 00000000100b73e4 code 30001
| Oct 16 23:31:51 gligar kernel: [88364.174218] neutron-rootwra[30065]: unhandled signal 11 at 088eb28e2f004078 nip 00000000100b5974 lr 00000000100aa794 code 30001
| Oct 16 23:31:52 gligar kernel: [88365.195380] neutron-rootwra[30098]: unhandled signal 11 at 88c939e322000008 nip 00000000100c8b28 lr 0000000010060384 code 30001
| Oct 16 23:31:52 gligar kernel: [88365.362374] neutron-rootwra[30106]: unhandled signal 11 at 882c58ad2800f04f nip 00003fffaef81220 lr 00003fffaef811a0 code 30001
| Oct 16 23:32:27 gligar kernel: [88400.966976] neutron-rootwra[30341]: unhandled signal 11 at 88d1fbe922001008 nip 00000000100c8b28 lr 0000000010060384 code 30001
| Oct 16 23:32:47 gligar kernel: [88420.953053] neutron-rootwra[30412]: unhandled signal 11 at 11b6629054008000 nip 00003fff9a864ac4 lr 00003fff9a84c42c code 30001
| Oct 16 23:34:49 gligar kernel: [88542.778503] neutron-rootwra[30977]: unhandled signal 11 at 88540f00000010a8 nip 00000000100aa768 lr 00000000100b74e8 code 30001
| Oct 16 23:35:23 gligar kernel: [88576.700721] neutron-openvsw[29739]: unhandled signal 11 at 08bfcbf7210000a8 nip 00000000100ab390 lr 00000000100b7c38 code 30001
| Oct 16 23:35:23 gligar kernel: [88576.804961] init: neutron-plugin-openvswitch-agent main process (29739) killed by SEGV signal
| Oct 16 23:36:01 gligar kernel: [88614.995497] nova-compute[31662]: unhandled signal 11 at 8846c1c81f004008 nip 000000001014c2f0 lr 0000000010151080 code 30001
| Oct 16 23:36:02 gligar kernel: [88615.110735] nova-compute[4331]: unhandled signal 11 at 88befae9220010a8 nip 00000000100b5c8c lr 000000001014c734 code 30001
| Oct 16 23:36:02 gligar kernel: [88615.219436] init: nova-compute main process (4331) killed by SEGV signal
| Oct 17 03:59:56 gligar kernel: [104449.890256] landscape-packa[63283]: unhandled signal 11 at 02f0000000000008 nip 00000000101abeac lr 00000000100a8738 code 30001
| Oct 17 04:05:00 gligar kernel: [104753.718195] sudo[63915]: unhandled signal 11 at 08e06105d1dcfff8 nip 00003fffb15cf7e4 lr 00003fffb15cfa00 code 30001
| hloeung@floette:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.7.gz
| Oct 14 16:55:30 floette kernel: [149326.697938] rsync[9915]: unhandled signal 11 at 00003ffff7cb0000 nip 00003fffa242d054 lr 00003fffa2426560 code 30001
| Oct 14 21:05:57 floette kernel: [164353.333697] apparmor_parser[102284]: unhandled signal 11 at 08680f0000000000 nip 000000001004bbf8 lr 0000000010028de4 code 30001
| Oct 14 22:21:24 floette kernel: [168880.481778] neutron-rootwra[153488]: unhandled signal 11 at 8860fbe21f0000a8 nip 00000000100aa768 lr 00000000100b74e8 code 30001
| Oct 14 22:21:26 floette kernel: [168882.078608] neutron-openvsw[4546]: unhandled signal 11 at 8822cbf03d000008 nip 00000000100aa764 lr 00000000100e6900 code 30001
| Oct 14 22:21:37 floette kernel: [168893.597834] init: neutron-plugin-openvswitch-agent main process (4546) killed by SEGV signal
| Oct 14 22:21:39 floette kernel: [168894.949777] nova-rootwrap[153708]: unhandled signal 11 at 88d495c93c0000a8 nip 00000000100a57d4 lr 00000000100ab42c code 30001
| Oct 14 22:21:43 floette kernel: [168898.973700] neutron-rootwra[153847]: unhandled signal 11 at 08c90df318000020 nip 00000000101ac260 lr 00000000101ad92c code 30001
| Oct 14 22:21:44 floette kernel: [168900.785421] neutron-rootwra[153850]: unhandled signal 11 at 88d87b783f0000a8 nip 00000000101abf40 lr 00000000100d9cac code 30001
| Oct 14 22:21:46 floette kernel: [168902.724121] neutron-openvsw[153852]: unhandled signal 11 at 882b78783f0000a8 nip 00000000100b5c8c lr 000000001014c734 code 30001
| hloeung@patrat:~$ zgrep -E '(SEGV)|(unhandled signal 11)' /var/log/syslog.7.gz
| Oct 15 00:48:13 patrat kernel: [553143.677075] rsync[89656]: unhandled signal 11 at 00003fffe6a50000 nip 00003fff77e0d054 lr 00003fff77e06560 code 30001
| Oct 16 02:42:03 wailmer kernel: [862104.157449] nova-compute[11431]: unhandled signal 11 at 081169bc370000a8 nip 00000000100ac164 lr 00000000100b7d6c code 30001
| Oct 16 02:42:03 wailmer kernel: [862104.264242] init: nova-compute main process (11431) killed by SEGV signal
| Oct 16 06:38:22 wailmer kernel: [876282.603855] qemu-img[78662]: unhandled signal 11 at 11b625104e000000 nip 00003fffb6224bb4 lr 00003fffb620c42c code 30001
| Oct 16 06:38:23 wailmer kernel: [876283.336045] qemu-system-ppc[78609]: unhandled signal 11 at ffffffc10000009a nip 00003fffae1a7124 lr 0000000010314874 code 30001
| Oct 16 06:39:40 wailmer kernel: [876360.399550] neutron-rootwra[79380]: unhandled signal 11 at 0800c20428000000 nip 00000000100a6c14 lr 00000000100a6d4c code 30001
| Oct 16 06:39:47 wailmer kernel: [876367.577184] neutron-rootwra[79676]: unhandled signal 11 at 0878a100000040a8 nip 00000000100aa768 lr 000000001004ed6c code 30001
| Oct 16 06:39:49 wailmer kernel: [876369.478066] neutron-openvsw[12655]: unhandled signal 11 at 088e47f11f000008 nip 00000000100db46c lr 00000000100db424 code 30001
| Oct 16 06:39:58 wailmer kernel: [876378.286827] init: neutron-plugin-openvswitch-agent main process (12655) killed by SEGV signal
| Oct 16 06:39:59 wailmer kernel: [876379.211801] sudo[79703]: unhandled signal 11 at 886baddd38005000 nip 886baddd38005000 lr 00003fff7da870a8 code 30001
| Oct 16 06:40:00 wailmer kernel: [876380.344562] libvirtd[109725]: unhandled signal 11 at 88806be02f000000 nip 00003fff78a70684 lr 00003fff78ab7a5c code 30001
| Oct 16 06:40:06 wailmer kernel: [876386.781123] init: libvirt-bin main process (109725) killed by SEGV signal
| Oct 16 06:40:06 wailmer kernel: [876386.818672] sudo[79919]: unhandled signal 11 at 11bda1eb70000000 nip 00003fff82094ac4 lr 00003fff8207c42c code 30001
| Oct 16 06:40:06 wailmer kernel: [876386.921414] neutron-openvsw[79689]: unhandled signal 11 at 88f8010000005000 nip 00000000100ba0d8 lr 00000000100c97c8 code 30001
| Oct 16 06:40:06 wailmer kernel: [876387.024431] init: neutron-plugin-openvswitch-agent main process (79689) killed by SEGV signal

These servers are all running Trusty with hwe-v kernel (3.19.0-31-generic #36~14.04.1-Ubuntu).

ProblemType: Crash
DistroRelease: Ubuntu 14.04
Package: nova-compute 1:2015.1.1-0ubuntu1~cloud2 [origin: Canonical]
ProcVersionSignature: Ubuntu 3.19.0-30.34~14.04.1-generic 3.19.8-ckt6
Uname: Linux 3.19.0-30-generic ppc64le
ApportVersion: 2.14.1-0ubuntu3.16
Architecture: ppc64el
CrashDB: { "impl": "launchpad", "project": "cloud-archive", "bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml", }
Date: Fri Oct 16 23:30:00 2015
ExecutablePath: /usr/bin/nova-compute
InterpreterPath: /usr/bin/python2.7
PackageArchitecture: all
ProcCmdline: /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
ProcLoadAvg: 1.98 1.32 1.28 3/1516 7754
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -1
ProcVersion: Linux version 3.19.0-30-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015
Signal: 6
SourcePackage: nova
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: libvirtd
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_smt: SMT is off
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 22 03:34 seq
 crw-rw---- 1 root audio 116, 33 Oct 22 03:34 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.18
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 14.04
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Package: linux-meta-lts-vivid
PciMultimedia:
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_GB
 SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: root=UUID=fcd256a9-8aa6-4805-95ae-f8c635967753 ro console=ttyS1
ProcLoadAvg: 3.77 2.83 2.55 3/1574 89091
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -1
ProcVersion: Linux version 3.19.0-31-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #36~14.04.1-Ubuntu SMP Thu Oct 8 10:25:49 UTC 2015
ProcVersionSignature: Ubuntu 3.19.0-31.36~14.04.1-generic 3.19.8-ckt7
RelatedPackageVersions:
 linux-restricted-modules-3.19.0-31-generic N/A
 linux-backports-modules-3.19.0-31-generic N/A
 linux-firmware 1.127.16
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty uec-images
Uname: Linux 3.19.0-31-generic ppc64le
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm
_MarkForUpload: True
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq: min: 2.016 GHz (cpu 80) max: 3.691 GHz (cpu 32) avg: 3.527 GHz
cpu_runmode: Could not retrieve current diagnostics mode, No firmware implementation of function
cpu_smt: SMT is off

To manage notifications
about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1508767/+subscriptions