Today's "hang" involved a zombie compiz consuming 100% of a cpu, along with an emacs instance consuming another 100%. Load average around 11, and climbing. Only 22 zombies currently, but it was 4 when I managed to get on with ssh.
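For anyone who wants to track the zombie count over time rather than eyeballing it, here is a minimal sketch. It assumes the standard `ps aux` column layout (STAT is the eighth field, COMMAND the eleventh); the function name `summarize_ps` is hypothetical and not part of any tool mentioned in this report. The sample rows are copied from the `ps aux | grep defunct` output in this comment.

```python
from collections import Counter


def summarize_ps(ps_aux_text):
    """Tally zombies per command and list D-state processes from `ps aux` text."""
    zombies = Counter()
    blocked = []
    for line in ps_aux_text.splitlines():
        # ps aux has 11 columns; limit the split so COMMAND keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header line and malformed rows
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("Z"):      # zombie (defunct)
            zombies[command] += 1
        elif stat.startswith("D"):    # uninterruptible sleep
            blocked.append((pid, command))
    return zombies, blocked


sample = """\
arlie 2503 0.8 0.0 0 0 ? Zsl Apr28 55:08 [compiz] <defunct>
sshd 24480 0.0 0.0 0 0 ? Z 07:52 0:00 [sshd] <defunct>
sshd 24489 0.0 0.0 0 0 ? Z 07:52 0:00 [sshd] <defunct>
arlie 26946 0.0 0.0 14228 964 pts/4 S+ 08:00 0:00 grep defunct
"""

zombies, blocked = summarize_ps(sample)
print(sum(zombies.values()))  # 3
print(blocked)                # []
```

Filtering on the STAT column rather than grepping for "defunct" means the grep process itself is never counted, and the same pass picks up anything stuck in state D.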
I was in the process of installing software updates, using the GUI tool (rather than direct use of apt-get from the shell) when this happened. Parts of the update still seem to be running.

arlie@ansuz$ ps -Fa -p1 -www
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
root 1 0 0 30034 4656 2 Apr28 ? 00:00:08 /sbin/init splash
root 25826 25775 0 1127 1712 0 07:57 pts/18 00:00:00 /bin/sh -e /var/lib/dpkg/info/udev.postrm upgrade 229-4ubuntu17
root 25843 25826 0 6542 1352 0 07:57 pts/18 00:00:00 systemctl --system daemon-reload
arlie 25846 22284 0 9342 3232 2 07:57 pts/4 00:00:00 ps -Fa -p1 -www

I'm wondering now whether my first guess of a kernel issue is dead wrong, and the root cause is actually compiz. Or perhaps we have multiple causes for the same basic symptom.

Here's the current crop of defunct processes:

arlie@ansuz$ ps aux | grep defunct
arlie 2488 0.0 0.0 0 0 ? Z<l Apr28 0:00 [pulseaudio] <defunct>
arlie 2503 0.8 0.0 0 0 ? Zsl Apr28 55:08 [compiz] <defunct>
arlie 2692 0.0 0.0 0 0 ? Z Apr28 0:00 [gconf-helper] <defunct>
root 22212 0.0 0.0 0 0 ? Z 07:42 0:00 [check-new-relea] <defunct>
sshd 24480 0.0 0.0 0 0 ? Z 07:52 0:00 [sshd] <defunct>
sshd 24489 0.0 0.0 0 0 ? Z 07:52 0:00 [sshd] <defunct>
sshd 24491 0.0 0.0 0 0 ? Z 07:52 0:00 [sshd] <defunct>
sshd 24494 0.0 0.0 0 0 ? Z 07:53 0:00 [sshd] <defunct>
sshd 24496 0.0 0.0 0 0 ? Z 07:53 0:00 [sshd] <defunct>
sshd 24500 0.0 0.0 0 0 ? Z 07:53 0:00 [sshd] <defunct>
sshd 24504 0.0 0.0 0 0 ? Z 07:53 0:00 [sshd] <defunct>
sshd 24508 0.0 0.0 0 0 ? Z 07:53 0:00 [sshd] <defunct>
sshd 24510 0.0 0.0 0 0 ? Z 07:54 0:00 [sshd] <defunct>
sshd 24514 0.0 0.0 0 0 ? Z 07:54 0:00 [sshd] <defunct>
sshd 24518 0.0 0.0 0 0 ? Z 07:54 0:00 [sshd] <defunct>
sshd 24523 0.0 0.0 0 0 ? Z 07:54 0:00 [sshd] <defunct>
sshd 24532 0.0 0.0 0 0 ? Z 07:54 0:00 [sshd] <defunct>
sshd 24538 0.0 0.0 0 0 ? Z 07:55 0:00 [sshd] <defunct>
sshd 24541 0.0 0.0 0 0 ? Z 07:55 0:00 [sshd] <defunct>
sshd 24543 0.0 0.0 0 0 ? Z 07:55 0:00 [sshd] <defunct>
sshd 25708 0.0 0.0 0 0 ? Z 07:55 0:00 [sshd] <defunct>
sshd 25711 0.0 0.0 0 0 ? Z 07:56 0:00 [sshd] <defunct>
arlie 26946 0.0 0.0 14228 964 pts/4 S+ 08:00 0:00 grep defunct

Systemd is in top's state "D" - just like last time. That's an uninterruptible sleep. It does not appear to have accumulated any CPU time since I got in via ssh.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2503 arlie 20 0 0 0 0 Z 100.0 0.0 61:12.34 compiz
7 root 20 0 0 0 0 S 0.3 0.0 0:57.09 rcu_sched
1 root 20 0 120136 4656 3204 D 0.0 0.1 0:08.94 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd

So the root cause might be systemd blocking on something.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1680502

Title:
Hang after upgrade to 16.04

Status in linux package in Ubuntu:
Incomplete
Status in linux source package in Xenial:
Incomplete

Bug description:
Last week I upgraded from 12.04 LTS to 14.04 LTS and then immediately to 16.04 LTS. 12.04 was not entirely stable; something was crashing regularly, and the Ubuntu tools make it hard for a user to determine what. The upgrade went moderately well; I now get error messages during system startup (about an unnamed file not being found) and a couple of other bits of flakiness, but I counted it as a success and the system as functional.

This morning I tried to wake up my screen, and nothing much happened. I then attempted to ssh to the Ubuntu box from another system. This requested my password almost instantly, as normal - but then nothing else happened, and the connection eventually dropped.

I conclude that IP and TCP are functional, and it's possible for some processes to respond, but not many. So it's not a complete kernel hang. (In particular, I'm seeing evidence that it's getting beyond things done at interrupt level.)

I don't have any debugging aids installed, so I don't believe I can get a kernel crash dump, which is what I'd want if I were debugging this. I *can* potentially retrieve and attach logs, but you'll have to tell me which ones are relevant, and do so before they roll over. (Also, logging will have to be functioning; IIRC, there were syslog issues in 12.04, and while I'd implemented whatever fix was recommended at the time, I haven't looked at my logs since the upgrade.)

This is a desktop system originally from System76 - i.e. built for linux - that's also running a bunch of server software (postfix, apache, ...)

I was not (knowingly) running anything unusual at the time - probably Unity, a few shells, firefox, maybe gnucash and/or emacs - and all the usual daemons.

IIRC, I was not at the very latest versions of all software installed - some new versions had come out since I upgraded, and I was going to deal with installing them on the weekend.

I'm going to hard reboot the system now. I can then gather identifying info. If I have time this AM before work, I'll check for standard things you want in all bugs, and add them. (Right now I'm posting from my Mac laptop ;-))
---
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: arlie 2507 F.... pulseaudio
 /dev/snd/controlC0: arlie 2507 F.... pulseaudio
CurrentDesktop: Unity
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=e206b01d-6cec-4b56-b469-25b106536f09
InstallationDate: Installed on 2012-04-26 (1811 days ago)
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: System76, Inc. Wild Dog Performance
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-72-generic root=UUID=96551326-e461-4071-ab9c-0e81ad7015d7 ro quiet splash
ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-72-generic N/A
 linux-backports-modules-4.4.0-72-generic N/A
 linux-firmware 1.157.8
RfKill:
 0: phy0: Wireless LAN
 Soft blocked: yes
 Hard blocked: no
Tags: xenial
Uname: Linux 4.4.0-72-generic x86_64
UpgradeStatus: Upgraded to xenial on 2017-03-31 (11 days ago)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 02/24/2012
dmi.bios.vendor: Intel Corp.
dmi.bios.version: KCH7710H.86A.0069.2012.0224.1825
dmi.board.name: DH77KC
dmi.board.vendor: Intel Corporation
dmi.board.version: AAG39641-400
dmi.chassis.type: 3
dmi.chassis.vendor: System76, Inc.
dmi.chassis.version: WilP9
dmi.modalias: dmi:bvnIntelCorp.:bvrKCH7710H.86A.0069.2012.0224.1825:bd02/24/2012:svnSystem76,Inc.:pnWildDogPerformance:pvrwilp9:rvnIntelCorporation:rnDH77KC:rvrAAG39641-400:cvnSystem76,Inc.:ct3:cvrWilP9:
dmi.product.name: Wild Dog Performance
dmi.product.version: wilp9
dmi.sys.vendor: System76, Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1680502/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp