Re: [systemd-devel] umount NFS problem
On Fri, 05.04.19 12:42, Harald Dunkel (harald.dun...@aixigo.de) wrote:

> On 4/5/19 12:21 PM, Lennart Poettering wrote:
> > On Fri, 05.04.19 11:53, Harald Dunkel (harald.dun...@aixigo.de) wrote:
> > >
> > > This is a VNC session, started via crontab @reboot.
> >
> > IIRC debian/ubuntu do not have pam-systemd in their PAM configuration
> > for cron, which means these services are not tracked by
> > logind/systemd, and hence only killed when crond likes to do that.
> >
> > It's a configuration bug in debian/ubuntu.
>
> No, it was just a sample. Surely systemd is sufficiently stable
> to recover from some lost processes?

Sure, it is. I mean, your system did shut down in the end, didn't it?
After the timeouts are hit it will go down anyway, ignoring those
left-over processes.

> The point is that rpcbind (and maybe others) are stopped before
> the NFS umounts come up. Hopefully you agree that this is
> unrelated to some cron jobs?

IIRC rpcbind is not needed for NFS to work after the mount is
established. If rpcbind is necessary for NFS mounts, it should be
ordered before remote-fs.target, so that NFS mounts are shut down
before rpcbind goes down.

But all of that is really something for the distro to figure out.
systemd upstream doesn't really care about NFS; we just provide the
hooks so that distros can order their service files to the right
places.

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
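The ordering Lennart describes could be tried locally with a drop-in for rpcbind.service, sketched below. The drop-in path and file name are hypothetical. Lennart names remote-fs.target; the sketch uses remote-fs-pre.target instead, which systemd.special(7) documents as being ordered before remote mount units. That substitution is an assumption, not something stated in the thread, and the distribution's rpcbind.service may already carry equivalent ordering.

```ini
# /etc/systemd/system/rpcbind.service.d/before-remote-mounts.conf
# (hypothetical drop-in path and file name)
[Unit]
# Start rpcbind before remote mounts are set up. systemd stops units in
# the reverse of their start order, so rpcbind would then be stopped
# only after the remote mounts have been unmounted.
Before=remote-fs-pre.target
# remote-fs-pre.target is not pulled in automatically, so request it.
Wants=remote-fs-pre.target
```

After adding a drop-in like this, `systemctl daemon-reload` makes systemd pick it up.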
Re: [systemd-devel] umount NFS problem
On 4/5/19 12:21 PM, Lennart Poettering wrote:
> On Fri, 05.04.19 11:53, Harald Dunkel (harald.dun...@aixigo.de) wrote:
> >
> > This is a VNC session, started via crontab @reboot.
>
> IIRC debian/ubuntu do not have pam-systemd in their PAM configuration
> for cron, which means these services are not tracked by
> logind/systemd, and hence only killed when crond likes to do that.
>
> It's a configuration bug in debian/ubuntu.

No, it was just a sample. Surely systemd is sufficiently stable
to recover from some lost processes?

The point is that rpcbind (and maybe others) are stopped before
the NFS umounts come up. Hopefully you agree that this is
unrelated to some cron jobs?

Regards
Harri
Re: [systemd-devel] umount NFS problem
On Fri, Apr 5, 2019 at 12:53 PM Harald Dunkel wrote:

> Hi Lennart,
>
> On 4/5/19 10:28 AM, Lennart Poettering wrote:
> >
> > For some reason a number of X session processes stick around to the
> > very end and thus keep your /home busy.
> >
> > [82021.052357] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
> > [82021.101976] systemd-shutdown[1]: Sending SIGKILL to PID 2513 (gpg-agent).
> > [82021.130507] systemd-shutdown[1]: Sending SIGKILL to PID 2886 (xstartup).
> > [82021.158510] systemd-shutdown[1]: Sending SIGKILL to PID 2896 (xstartup).
> > [82021.186052] systemd-shutdown[1]: Sending SIGKILL to PID 2959 (xterm).
> > [82021.213129] systemd-shutdown[1]: Sending SIGKILL to PID 2960 (xterm).
> > [82021.239971] systemd-shutdown[1]: Sending SIGKILL to PID 2961 (xterm).
> > [82021.266285] systemd-shutdown[1]: Sending SIGKILL to PID 2966 (bash).
> > [82021.292234] systemd-shutdown[1]: Sending SIGKILL to PID 2967 (bash).
> > [82021.318061] systemd-shutdown[1]: Sending SIGKILL to PID 9146 (utempter).
> > [82021.343331] systemd-shutdown[1]: Sending SIGKILL to PID 9147 (utempter).
> >
> > The question is how though. How do you start your X session? gdm?
> > startx from the console?
>
> This is a VNC session, started via crontab @reboot.

Well that's a pretty significant omission...

If (which I'm guessing is the case) the processes are started without
going through regular PAM modules like a user login would, then they
won't be tracked *as* a user session – they just remain as part of the
'cron' service, and systemd has no knowledge about them needing to be
killed before stopping NFS.

This is the problem if you see all the X11 processes in `systemctl status
cron` but no session entry in `loginctl`.

(While cron has its own PAM stack, due to being a non-interactive tool it
uses a different module configuration – IIRC it's also special-cased in
pam_systemd, but some distros don't even include pam_systemd in cron's
config to avoid other issues.)

So if your VNC server doesn't call pam_open_session() on startup, you
should write a wrapper that does, similar to how existing display
managers xdm/gdm/sddm work. (It doesn't need to call any of the
auth/account functions from PAM.)

Alternatively you could convert this into a system service that has User=
and PAMName= settings; I've seen people do that to automatically start
Xorg sessions and it should work all the same with VNC. (Generally, what's
the point of using @reboot if you can write a .service?)

You can also order the entire cron.service after remote-fs.target in order
to at least avoid the race.

--
Mantas Mikulėnas
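The two unit-based alternatives Mantas mentions might look roughly like the sketch below. Everything concrete in it is an assumption for illustration: the unit name, the user name `harri`, the PAM service name `login`, and the wrapper path `/usr/local/bin/start-vnc-session` (standing in for whatever the @reboot crontab entry ran, and assumed to stay in the foreground). The thread does not say which VNC server or command is in use.

```ini
# --- File 1: /etc/systemd/system/vnc-session.service (hypothetical name) ---
[Unit]
Description=VNC session started as a system service (sketch)
# /home is on NFS: require the mount and start after it, so that on
# shutdown this service is stopped before /home is unmounted.
RequiresMountsFor=/home

[Service]
# Hypothetical user; the session processes run under this account.
User=harri
# Open a PAM session for the service so that pam_systemd can register it.
# "login" is an assumed PAM service name; pick one whose stack
# includes pam_systemd on the system in question.
PAMName=login
# Hypothetical foreground wrapper replacing the @reboot crontab entry.
ExecStart=/usr/local/bin/start-vnc-session

[Install]
WantedBy=multi-user.target

# --- File 2: /etc/systemd/system/cron.service.d/after-remote-fs.conf ---
# (hypothetical drop-in; may be redundant if the distribution's
# cron.service already orders itself this way)
[Unit]
After=remote-fs.target
```

The first file replaces the crontab entry altogether; the second keeps the crontab entry and only addresses the ordering race, so leftover processes in cron's cgroup are stopped along with cron.service before the remote mounts go away.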
Re: [systemd-devel] umount NFS problem
On Fri, 05.04.19 11:53, Harald Dunkel (harald.dun...@aixigo.de) wrote:

> Hi Lennart,
>
> On 4/5/19 10:28 AM, Lennart Poettering wrote:
> >
> > For some reason a number of X session processes stick around to the
> > very end and thus keep your /home busy.
> >
> > [82021.052357] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
> > [82021.101976] systemd-shutdown[1]: Sending SIGKILL to PID 2513 (gpg-agent).
> > [82021.130507] systemd-shutdown[1]: Sending SIGKILL to PID 2886 (xstartup).
> > [82021.158510] systemd-shutdown[1]: Sending SIGKILL to PID 2896 (xstartup).
> > [82021.186052] systemd-shutdown[1]: Sending SIGKILL to PID 2959 (xterm).
> > [82021.213129] systemd-shutdown[1]: Sending SIGKILL to PID 2960 (xterm).
> > [82021.239971] systemd-shutdown[1]: Sending SIGKILL to PID 2961 (xterm).
> > [82021.266285] systemd-shutdown[1]: Sending SIGKILL to PID 2966 (bash).
> > [82021.292234] systemd-shutdown[1]: Sending SIGKILL to PID 2967 (bash).
> > [82021.318061] systemd-shutdown[1]: Sending SIGKILL to PID 9146 (utempter).
> > [82021.343331] systemd-shutdown[1]: Sending SIGKILL to PID 9147 (utempter).
> >
> > The question is how though. How do you start your X session? gdm?
> > startx from the console?
>
> This is a VNC session, started via crontab @reboot.

IIRC debian/ubuntu do not have pam-systemd in their PAM configuration
for cron, which means these services are not tracked by
logind/systemd, and hence only killed when crond likes to do that.

It's a configuration bug in debian/ubuntu.

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] umount NFS problem
Hi Lennart,

On 4/5/19 10:28 AM, Lennart Poettering wrote:
>
> For some reason a number of X session processes stick around to the
> very end and thus keep your /home busy.
>
> [82021.052357] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
> [82021.101976] systemd-shutdown[1]: Sending SIGKILL to PID 2513 (gpg-agent).
> [82021.130507] systemd-shutdown[1]: Sending SIGKILL to PID 2886 (xstartup).
> [82021.158510] systemd-shutdown[1]: Sending SIGKILL to PID 2896 (xstartup).
> [82021.186052] systemd-shutdown[1]: Sending SIGKILL to PID 2959 (xterm).
> [82021.213129] systemd-shutdown[1]: Sending SIGKILL to PID 2960 (xterm).
> [82021.239971] systemd-shutdown[1]: Sending SIGKILL to PID 2961 (xterm).
> [82021.266285] systemd-shutdown[1]: Sending SIGKILL to PID 2966 (bash).
> [82021.292234] systemd-shutdown[1]: Sending SIGKILL to PID 2967 (bash).
> [82021.318061] systemd-shutdown[1]: Sending SIGKILL to PID 9146 (utempter).
> [82021.343331] systemd-shutdown[1]: Sending SIGKILL to PID 9147 (utempter).
>
> The question is how though. How do you start your X session? gdm?
> startx from the console?

This is a VNC session, started via crontab @reboot.

Regards
Harri
Re: [systemd-devel] umount NFS problem
On 4/5/19 8:45 AM, Mantas Mikulėnas wrote:
>
> Normally I'd expect user sessions (user-*.slice, session-*.scope,
> user@*.service) to be killed before mount units are stopped; I wonder how
> random gpg-agent processes have managed to escape that. (Actually, doesn't
> Debian now manage gpg-agent via user@.service? That *really* should be
> cleaning up everything properly...)

Probably a remote login.

@Michael, libpam-systemd is installed. UsePAM is enabled, too.

> You might also try to enable [Mount] LazyUnmount= for home.mount so that
> umounts appear to succeed immediately and the kernel cleans them up when it
> can. It mostly just hides the problem though.

Looking at the log file I have the impression that rpcbind has been
stopped even before the first umount attempt of /home. Can you confirm
this?

Regards
Harri
Re: [systemd-devel] umount NFS problem
On Fri, 05.04.19 08:28, Harald Dunkel (harald.dun...@aixigo.de) wrote:

> Hi folks,
>
> I've got a device-busy-problem with /home, mounted via NFS.
> Shutdown of the host takes more than 180 secs. See attached
> log file.
>
> Apparently the umount of /home at 81925.154995 failed, (device
> busy, in my case it was a lost gpg-agent). This error was
> ignored, the NFS framework was shut down, the network was
> stopped, and then it was too late to properly handle the /home
> mount point.
>
> AFAIK the mount units are generated from /etc/fstab, so I wonder
> if this could be improved?
>
> The hosts (about 50 developer PCs) are running Debian 9, systemd
> 232-25+deb9u9. Unfortunately we are bound to this platform at
> least for another year.
>
>
> Every helpful hint is highly appreciated.

For some reason a number of X session processes stick around to the
very end and thus keep your /home busy.

[82021.052357] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[82021.101976] systemd-shutdown[1]: Sending SIGKILL to PID 2513 (gpg-agent).
[82021.130507] systemd-shutdown[1]: Sending SIGKILL to PID 2886 (xstartup).
[82021.158510] systemd-shutdown[1]: Sending SIGKILL to PID 2896 (xstartup).
[82021.186052] systemd-shutdown[1]: Sending SIGKILL to PID 2959 (xterm).
[82021.213129] systemd-shutdown[1]: Sending SIGKILL to PID 2960 (xterm).
[82021.239971] systemd-shutdown[1]: Sending SIGKILL to PID 2961 (xterm).
[82021.266285] systemd-shutdown[1]: Sending SIGKILL to PID 2966 (bash).
[82021.292234] systemd-shutdown[1]: Sending SIGKILL to PID 2967 (bash).
[82021.318061] systemd-shutdown[1]: Sending SIGKILL to PID 9146 (utempter).
[82021.343331] systemd-shutdown[1]: Sending SIGKILL to PID 9147 (utempter).

The question is how though. How do you start your X session? gdm?
startx from the console?

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] umount NFS problem
On Fri, 5 Apr 2019 at 08:45, Mantas Mikulėnas wrote:

> The job order (home.mount vs nfs-client.target) already looks correct, so
> fstab options probably won't help much; I'd try to ensure that the umount
> doesn't fail in the first place.
>
> Normally I'd expect user sessions (user-*.slice, session-*.scope,
> user@*.service) to be killed before mount units are stopped; I wonder how
> random gpg-agent processes have managed to escape that. (Actually, doesn't
> Debian now manage gpg-agent via user@.service? That *really* should be
> cleaning up everything properly...)

I would check if libpam-systemd is installed and enabled.
Do you have remote logins via SSH? Does sshd_config have "UsePAM yes"?

--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Re: [systemd-devel] umount NFS problem
On Fri, Apr 5, 2019 at 9:28 AM Harald Dunkel wrote:

> Hi folks,
>
> I've got a device-busy-problem with /home, mounted via NFS.
> Shutdown of the host takes more than 180 secs. See attached
> log file.
>
> Apparently the umount of /home at 81925.154995 failed, (device
> busy, in my case it was a lost gpg-agent). This error was
> ignored, the NFS framework was shut down, the network was
> stopped, and then it was too late to properly handle the /home
> mount point.
>
> AFAIK the mount units are generated from /etc/fstab, so I wonder
> if this could be improved?

The job order (home.mount vs nfs-client.target) already looks correct, so
fstab options probably won't help much; I'd try to ensure that the umount
doesn't fail in the first place.

Normally I'd expect user sessions (user-*.slice, session-*.scope,
user@*.service) to be killed before mount units are stopped; I wonder how
random gpg-agent processes have managed to escape that. (Actually, doesn't
Debian now manage gpg-agent via user@.service? That *really* should be
cleaning up everything properly...)

You might also try to enable [Mount] LazyUnmount= for home.mount so that
umounts appear to succeed immediately and the kernel cleans them up when it
can. It mostly just hides the problem though.

--
Mantas Mikulėnas
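Because home.mount is generated from /etc/fstab, the LazyUnmount= setting Mantas mentions would most likely go into a drop-in rather than into the generated unit itself. A minimal sketch, assuming the installed systemd supports the option and using a hypothetical drop-in file name:

```ini
# /etc/systemd/system/home.mount.d/lazy-unmount.conf
# (hypothetical drop-in file name; the unit name home.mount is from the thread)
[Mount]
# Detach /home from the file system hierarchy immediately on unmount and
# let the kernel clean up once it is no longer busy.
LazyUnmount=yes
```

As Mantas notes, this only hides the busy mount; the leftover processes are still there until they are killed at the end of shutdown.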
[systemd-devel] umount NFS problem
Hi folks,

I've got a device-busy-problem with /home, mounted via NFS.
Shutdown of the host takes more than 180 secs. See attached
log file.

Apparently the umount of /home at 81925.154995 failed, (device
busy, in my case it was a lost gpg-agent). This error was
ignored, the NFS framework was shut down, the network was
stopped, and then it was too late to properly handle the /home
mount point.

AFAIK the mount units are generated from /etc/fstab, so I wonder
if this could be improved?

The hosts (about 50 developer PCs) are running Debian 9, systemd
232-25+deb9u9. Unfortunately we are bound to this platform at
least for another year.

Every helpful hint is highly appreciated.

Harri
--
aixigo AG, Karl-Friedrich-Strasse 68, 52072 Aachen, Germany
phone: +49 241 559709-79, fax: +49 241 559709-99
eMail: harald.dun...@aixigo.de, web: http://www.aixigo.de
Amtsgericht Aachen - HRB 8057, Vorstand: Erich Borsch,
Christian Friedrich, Tobias Haustein, Vors. des Aufsichtsrates:
Prof. Dr. Ruediger von Nitzsch

Attachment: shutdown-log.txt.gz (application/gzip)