Hi Ivan. Can you post steps to reproduce this issue, or at least steps to get a similar environment running, so that those of us who are not familiar with HDInsight can look into this bug? Thanks!
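In case it helps while putting the steps together: below is a rough sketch, not taken from the bug report, of one way to count the zombies on an affected node and see which parent processes are failing to reap them. It assumes a Linux /proc filesystem; the helper names (parse_stat, find_zombies, zombies_by_parent) are made up for illustration.

```python
import os
from collections import Counter


def parse_stat(stat_text):
    """Return (comm, state, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    left, _, rest = stat_text.rpartition(")")
    comm = left.split("(", 1)[1]
    fields = rest.split()
    return comm, fields[0], int(fields[1])


def find_zombies(proc_root="/proc"):
    """List (pid, comm, ppid) for every process currently in state Z."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state, ppid = parse_stat(f.read())
        except (OSError, IndexError, ValueError):
            continue  # process exited during the scan, or entry is unreadable
        if state == "Z":
            zombies.append((int(entry), comm, ppid))
    return zombies


def zombies_by_parent(zombies):
    """Count zombies per parent pid; the parent is what has to reap them."""
    return Counter(ppid for _pid, _comm, ppid in zombies)
```

Calling zombies_by_parent(find_zombies()).most_common(5) on a worker node would list the parent pids holding the most zombies, which could then be compared against the su and systemd-logind entries in the auth log.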
-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1788643

Title:
  zombies pile up, system becomes unresponsive

Status in systemd package in Ubuntu:
  New

Bug description:
  Description:    Ubuntu 16.04.5 LTS
  Release:        16.04

  systemd:
    Installed: 229-4ubuntu21.4
    Candidate: 229-4ubuntu21.4
    Version table:
   *** 229-4ubuntu21.4 500
          500 http://azure.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          100 /var/lib/dpkg/status
       229-4ubuntu21.1 500
          500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
       229-4ubuntu4 500
          500 http://azure.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

  This problem is in Azure. We are seeing these problems on different
  systems. Worker nodes (Ubuntu 16.04) in a Hadoop cluster start piling
  up zombies and become unresponsive.

  The syslog and the kernel logs don't provide much information. The
  only error we could correlate with what we are seeing was in the audit
  logs. See at the end of this message, the "Connection timed out" and
  the "Cannot create session: Already running in a session" messages.

  Our first suspect was memory pressure on the machines. We added
  logging and settings to reboot on out of memory, but all these turned
  out to be red herrings.

  Aug 18 19:11:08 wn2-d3ncsp su[112600]: Successful su for root by root
  Aug 18 19:11:08 wn2-d3ncsp su[112600]: + ??? root:root
  Aug 18 19:11:08 wn2-d3ncsp su[112600]: pam_unix(su:session): session opened for user root by (uid=0)
  Aug 18 19:11:08 wn2-d3ncsp systemd-logind[1486]: New session c8 of user root.
  Aug 18 19:11:26 wn2-d3ncsp sshd[112690]: Did not receive identification string from 10.84.93.35
  Aug 18 19:11:34 wn2-d3ncsp su[112600]: pam_systemd(su:session): Failed to create session: Connection timed out
  Aug 18 19:11:34 wn2-d3ncsp su[112600]: pam_unix(su:session): session closed for user root
  Aug 18 19:11:34 wn2-d3ncsp systemd-logind[1486]: Removed session c8.

  Aug 18 19:12:03 wn2-d3ncsp sudo: ehiadmin : TTY=pts/1 ; PWD=/home/ehiadmin ; USER=root ; COMMAND=/bin/su -
  Aug 18 19:12:03 wn2-d3ncsp sudo: pam_unix(sudo:session): session opened for user root by ehiadmin(uid=0)
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: Successful su for root by root
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: + /dev/pts/1 root:root
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: pam_unix(su:session): session opened for user root by ehiadmin(uid=0)
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: pam_systemd(su:session): Cannot create session: Already running in a session
  Aug 18 19:12:42 wn2-d3ncsp sshd[113274]: Did not receive identification string from 10.84.93.42
  Aug 18 19:13:37 wn2-d3ncsp su[113085]: pam_unix(su:session): session closed for user root
  Aug 18 19:13:37 wn2-d3ncsp sudo: pam_unix(sudo:session): session closed for user root
  Aug 18 19:13:37 wn2-d3ncsp sshd[112285]: pam_unix(sshd:session): session closed for user ehiadmin
  Aug 18 19:13:37 wn2-d3ncsp systemd-logind[1486]: Removed session 1291.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1788643/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp