apport information

** Tags added: apport-collected xenial

** Description changed:

  I have been trying to find an easy way to reproduce this for days.

  We initially observed it in OPNFV Armband, when we tried to upgrade
  our Ubuntu Xenial installation kernel to
  linux-image-generic-hwe-16.04 (4.8).

  In our environment, this was easily triggered on compute nodes when
  launching multiple VMs (we suspected OVS, QEMU etc.).

  However, in order to rule out our specifics, we looked for a simple
  way to reproduce it on all ThunderX nodes we have access to, and we
  finally found it:

  $ apt-get install stress-ng
  $ stress-ng --hdd 1024

  We tested different FW versions, provided by both chip/board
  manufacturers, and with all of them the result is 100% reproducible,
  leading to a kernel Oops [1]:

  [ 726.070531] INFO: task kworker/0:1:312 blocked for more than 120 seconds.
  [ 726.077908] Tainted: G W I 4.8.0-41-generic #44~16.04.1-Ubuntu
  [ 726.085850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 726.094383] kworker/0:1 D ffff0000080861bc 0 312 2 0x00000000
  [ 726.094401] Workqueue: events vmstat_shepherd
  [ 726.094404] Call trace:
  [ 726.094411] [<ffff0000080861bc>] __switch_to+0x94/0xa8
  [ 726.094418] [<ffff0000089854f4>] __schedule+0x224/0x718
  [ 726.094421] [<ffff000008985a20>] schedule+0x38/0x98
  [ 726.094425] [<ffff000008985d84>] schedule_preempt_disabled+0x14/0x20
  [ 726.094428] [<ffff000008987644>] __mutex_lock_slowpath+0xd4/0x168
  [ 726.094431] [<ffff000008987730>] mutex_lock+0x58/0x70
  [ 726.094437] [<ffff0000080c552c>] get_online_cpus+0x44/0x70
  [ 726.094440] [<ffff00000820ca24>] vmstat_shepherd+0x3c/0xe8
  [ 726.094446] [<ffff0000080e1c60>] process_one_work+0x150/0x478
  [ 726.094449] [<ffff0000080e1fd8>] worker_thread+0x50/0x4b8
  [ 726.094453] [<ffff0000080e8eac>] kthread+0xec/0x100
  [ 726.094456] [<ffff000008083690>] ret_from_fork+0x10/0x40

  Over the last few days, I tested all 4.8-* and 4.10 (zesty backport)
  kernels; the soft lockup happens with each and every one of them.
  On the other hand, 4.4.0-45-generic seems to work perfectly fine
  (probably newer 4.4.0-* too, but due to a regression in the ethernet
  drivers after 4.4.0-45, we can't test those with ease) under normal
  conditions, yet running stress-ng leads to the same oops.

  [1] http://paste.ubuntu.com/24172516/
+ ---
+ AlsaDevices:
+  total 0
+  crw-rw---- 1 root audio 116, 1 Mar 13 19:27 seq
+  crw-rw---- 1 root audio 116, 33 Mar 13 19:27 timer
+ AplayDevices: Error: [Errno 2] No such file or directory
+ ApportVersion: 2.20.1-0ubuntu2.5
+ Architecture: arm64
+ ArecordDevices: Error: [Errno 2] No such file or directory
+ AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
+ DistroRelease: Ubuntu 16.04
+ IwConfig: Error: [Errno 2] No such file or directory
+ MachineType: GIGABYTE R120-T30
+ Package: linux (not installed)
+ PciMultimedia:
+
+ ProcEnviron:
+  TERM=vt220
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=<set>
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB: 0 astdrmfb
+ ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.8.0-41-generic root=/dev/mapper/os-root ro console=tty0 console=ttyS0,115200 console=ttyAMA0,115200 net.ifnames=1 biosdevname=0 rootdelay=90 nomodeset quiet splash vt.handoff=7
+ ProcVersionSignature: Ubuntu 4.8.0-41.44~16.04.1-generic 4.8.17
+ RelatedPackageVersions:
+  linux-restricted-modules-4.8.0-41-generic N/A
+  linux-backports-modules-4.8.0-41-generic N/A
+  linux-firmware 1.157.8
+ RfKill: Error: [Errno 2] No such file or directory
+ Tags: xenial
+ Uname: Linux 4.8.0-41-generic aarch64
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups:
+
+ _MarkForUpload: True
+ dmi.bios.date: 11/22/2016
+ dmi.bios.vendor: GIGABYTE
+ dmi.bios.version: T22
+ dmi.board.asset.tag: 01234567890123456789AB
+ dmi.board.name: MT30-GS0
+ dmi.board.vendor: GIGABYTE
+ dmi.board.version: 01234567
+ dmi.chassis.asset.tag: 01234567890123456789AB
+ dmi.chassis.type: 17
+ dmi.chassis.vendor: GIGABYTE
+ dmi.chassis.version: 01234567
+ dmi.modalias: dmi:bvnGIGABYTE:bvrT22:bd11/22/2016:svnGIGABYTE:pnR120-T30:pvr0100:rvnGIGABYTE:rnMT30-GS0:rvr01234567:cvnGIGABYTE:ct17:cvr01234567:
+ dmi.product.name: R120-T30
+ dmi.product.version: 0100
+ dmi.sys.vendor: GIGABYTE

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/1672521/+attachment/4837212/+files/CRDA.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1672521

Title:
  ThunderX: soft lockup on 4.8+ kernels

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs