On an idle Xenial cloud image I'm seeing:
[ 1485.236760] [<ffff800000086ad0>] __switch_to+0x90/0xa8
[ 1485.236772] [<ffff800000143e80>] __tick_nohz_idle_enter+0x50/0x3f0
[ 1485.236776] [<ffff800000144478>] tick_nohz_idle_enter+0x40/0x70
[ 1485.236785] [<ffff80000010baf0>] cpu_startup_entry+0x288/0x2d8
[ 1485.236791] [<ffff80000008fca8>] secondary_start_kernel+0x120/0x130
[ 1485.236795] [<000000004008290c>] 0x4008290c
After a while I get:
[ 2462.806971] rcu_sched kthread starved for 15002 jiffies! g2579 c2578 f0x0 s3 ->state=0x1
[ 2667.835351] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2667.836918] 0-...: (66 GPs behind) idle=cf0/0/0 softirq=5177/5177 fqs=0
[ 2667.838801] 2-...: (0 ticks this GP) idle=73a/0/0 softirq=4570/4570 fqs=0
[ 2667.840696] 3-...: (64 GPs behind) idle=eba/0/0 softirq=4654/4654 fqs=0
[ 2667.842533] (detected by 1, t=15002 jiffies, g=2638, c=2637, q=4389)
At this point sleeps block indefinitely. For example, strace on sleep(1)
in the VM shows nanosleep({1, 0}) never returning; it never times out,
so one has to interrupt it with SIGINT.
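To check for the sleep hang without watching strace by hand, something like the following could be run inside the guest. It is only a sketch: the function name `timed_sleep` and the 10 second limit are my own choices, and on an affected guest the watchdog itself may also stall, since it depends on the same kernel timers.

```python
import subprocess
import time


def timed_sleep(seconds=1, limit=10.0):
    """Run `sleep <seconds>` in a child process.

    Returns the elapsed wall time in seconds, or None if the child
    had not exited after `limit` seconds (it is killed in that case).
    """
    start = time.monotonic()
    try:
        subprocess.run(["sleep", str(seconds)], timeout=limit, check=True)
    except subprocess.TimeoutExpired:
        # The child overran the limit: the symptom described above.
        return None
    return time.monotonic() - start


if __name__ == "__main__":
    elapsed = timed_sleep()
    if elapsed is None:
        print("sleep 1 did not return within the limit")
    else:
        print("sleep 1 returned after %.2f s" % elapsed)
```

Running this in a loop while the host is loaded would give a rough count of how often a one-second sleep fails to return.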
Also, secondary_start_kernel() in the trace indicates that the VM puts
CPUs to sleep and wakes them on a timer.
I can trigger this more often with more CPUs in the VM and also by
loading the host: for example, generating a lot of cache or memory
activity on the host triggers the initial hangs more frequently than
leaving the host idle.
So I suspect a CPU hotplug and nohz combination is causing the issue here.
--
https://bugs.launchpad.net/bugs/1531768
Title:
[arm64] lockups some time after booting
Status in Auto Package Testing:
Triaged
Status in linux package in Ubuntu:
Confirmed
Bug description:
I created an 8 CPU arm64 instance on Canonical's Scalingstack (which I
want to use for armhf autopkgtesting in LXD). I started with wily as
that has lxd available (it's not yet available in trusty nor the PPA
for arm64).
However, pretty much any LXD task that I do (I haven't tried much
else) on this machine takes unbearably long. A simple "lxc profile set
default raw.lxc lxc.seccomp=" or "lxc list" takes several minutes.
I see tons of
[ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
[ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.
in dmesg (the attached apport info has the complete dmesg).
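For anyone wanting to count these messages in a saved dmesg, a minimal sketch follows. The regular expressions are written against the two sample lines quoted above only, and the names `STARVED`, `HUNG` and `summarize` are hypothetical; other kernel versions may word the messages differently.

```python
import re

# Patterns modelled on the two dmesg lines quoted in this report.
STARVED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] rcu_sched kthread starved for "
    r"(?P<jiffies>\d+) jiffies"
)
HUNG = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] INFO: task (?P<task>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize(dmesg_text):
    """Return (starved, hung) lists extracted from dmesg text.

    starved: list of (timestamp, jiffies)
    hung:    list of (timestamp, task name, pid)
    """
    starved = [
        (float(m["ts"]), int(m["jiffies"]))
        for m in STARVED.finditer(dmesg_text)
    ]
    hung = [
        (float(m["ts"]), m["task"], int(m["pid"]))
        for m in HUNG.finditer(dmesg_text)
    ]
    return starved, hung


sample = """\
[ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
[ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    starved, hung = summarize(sample)
    print(len(starved), "starvation message(s),", len(hung), "hung task(s)")
```

Feeding it the complete dmesg from the apport attachment would show how soon after boot the first starvation message appears.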
ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: linux-image-4.2.0-22-generic 4.2.0-22.27
ProcVersionSignature: Ubuntu 4.2.0-22.27-generic 4.2.6
Uname: Linux 4.2.0-22-generic aarch64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Jan 7 09:18 seq
crw-rw---- 1 root audio 116, 33 Jan 7 09:18 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.19.1-0ubuntu5
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
Date: Thu Jan 7 09:24:01 2016
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
lxcbr0 no wireless extensions.
Lspci:
00:00.0 Host bridge [0600]: Red Hat, Inc. Device [1b36:0008]
Subsystem: Red Hat, Inc Device [1af4:1100]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
PciMultimedia:
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-22-generic root=LABEL=cloudimg-rootfs earlyprintk
RelatedPackageVersions:
linux-restricted-modules-4.2.0-22-generic N/A
linux-backports-modules-4.2.0-22-generic N/A
linux-firmware 1.149.3
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)