I'm also having this issue on my file server running Ubuntu 18.10. The server gets multiple SFTP connections every second, and when the load is high some connections fail with the following error in the logs:
sshd[84268]: pam_systemd(sshd:session): Failed to create session: Start job for unit user-1001.slice failed with 'canceled'

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1798883

Title:
  Ubuntu 18.04 LTS: pam_systemd: Failure to create session under heavy I/O load

Status in dbus package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  New

Bug description:
  I have a file server running Ubuntu 18.04 LTS which has strange behavior with systemd and dbus during heavy I/O load. The file server is set up with RAID60 with mdadm serving via NFS.

  When I/O load becomes high, it appears that dbus and systemd-logind are slowing down for SSH connections and I lose SSH connectivity due to login failure with pam_systemd:

  Oct 19 11:14:56 nfsserver sshd[20635]: pam_systemd(sshd:session): Failed to create session: Connection timed out
  Oct 19 11:15:06 nfsserver sshd[20821]: pam_systemd(sshd:session): Failed to create session: Connection timed out
  Oct 19 11:15:06 nfsserver sshd[20837]: pam_systemd(sshd:session): Failed to create session: Connection timed out
  Oct 19 12:05:58 nfsserver sshd[25143]: pam_systemd(sshd:session): Failed to create session: Connection timed out
  Oct 19 13:03:07 nfsserver sshd[47296]: pam_systemd(sshd:session): Failed to release session: Connection timed out
  Oct 19 13:05:39 nfsserver sshd[48532]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without replying
  Oct 19 13:05:39 nfsserver sshd[48570]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without rep

  CPU load gets near 50 or 60%, but not 100%.

  Looking through bug reports and issues that others have experienced with previous versions, it seems that when systemd-logind doesn't process the IPC queue from dbus fast enough, dbus will kick systemd-logind resulting in this failure.
  This appears to be so in socket statistics whereby the dbus IPC Send-Q fills and does not move during heavy load:

  Netid  State  Recv-Q  Send-Q  Local Address:Port                           Peer Address:Port
  u_str  ESTAB  0       153088  /var/run/dbus/system_bus_socket -2002184190  * -2002209306

  The Recv-Q on the other end does not keep up:

  Netid  State  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
  u_str  ESTAB  37892   0       * -2002209306       * -2002184190

  I have tried to renice dbus, systemd, systemd-logind without success. The only way to resolve this issue is to reduce I/O load.

  Any thoughts?
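
For anyone who wants to watch for the same queue backlog on their own server, here is a minimal Python sketch that parses socket statistics in the column layout quoted in the bug description and flags sockets whose Recv-Q or Send-Q is large. The names `parse_ss_unix` and `backlogged`, and the 32768-byte threshold, are my own assumptions for illustration and are not part of any tool. The sample text is copied from the report above.

```python
from typing import List, NamedTuple


class SocketQueue(NamedTuple):
    netid: str
    state: str
    recv_q: int
    send_q: int
    local: str
    peer: str


def parse_ss_unix(output: str) -> List[SocketQueue]:
    """Parse socket-statistics lines into SocketQueue records.

    Assumes the eight whitespace-separated columns shown in the report:
    Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything with non-numeric queues.
        if len(fields) < 8 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        rows.append(SocketQueue(
            netid=fields[0],
            state=fields[1],
            recv_q=int(fields[2]),
            send_q=int(fields[3]),
            local=f"{fields[4]} {fields[5]}",
            peer=f"{fields[6]} {fields[7]}",
        ))
    return rows


def backlogged(rows: List[SocketQueue], threshold: int = 32768) -> List[SocketQueue]:
    """Return sockets whose Recv-Q or Send-Q is at or above the threshold.

    The default threshold is an arbitrary illustrative value, not a tuned one.
    """
    return [r for r in rows if max(r.recv_q, r.send_q) >= threshold]


# Sample lines taken from the bug description.
SAMPLE = """\
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
u_str ESTAB 0 153088 /var/run/dbus/system_bus_socket -2002184190 * -2002209306
u_str ESTAB 37892 0 * -2002209306 * -2002184190
"""

rows = parse_ss_unix(SAMPLE)
for sock in backlogged(rows):
    print(f"{sock.local} -> {sock.peer}: Recv-Q={sock.recv_q} Send-Q={sock.send_q}")
```

On a live system the same functions could be fed the output of `ss -x` captured during high I/O load. A Send-Q on `/var/run/dbus/system_bus_socket` that stays large across several samples would match the behaviour described in the report.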