On 2016-10-26 09:42:29 +0200, Marc Haber wrote:
> Hi Vincent,
>
> can you try stopping the atop daemons, verifying that they're really
> dead, and restart?
root@cventin:/home/vlefevre# ps -aef|grep atop
root 908 1 0 10:17 ? 00:00:00 /usr/sbin/atopacctd
root 963 1 0 10:17 ? 00:00:00 /usr/bin/atop -a -R -w /var/log/atop/atop_20161026 600
root 1897 1755 0 10:18 pts/0 00:00:00 grep atop
root@cventin:/home/vlefevre# invoke-rc.d atopacct stop
This one takes time to stop. In the journal, I can see:
Oct 26 10:18:56 cventin systemd[1]: atopacct.service: State 'stop-sigterm' timed out. Killing.
Oct 26 10:18:56 cventin systemd[1]: atopacct.service: Killing process 908 (atopacctd) with signal SIGKILL.
Oct 26 10:18:56 cventin systemd[1]: atopacct.service: Main process exited, code=killed, status=9/KILL
So atopacctd does not exit on SIGTERM: systemd waits for the stop timeout
(1 min 30 s by default), then kills it with SIGKILL, which gives the daemon
no chance to clean up.
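
To illustrate the clean-up point, here is a minimal Python sketch. It is not
atopacctd's code; the function name and the directory (a stand-in for
/run/pacct_shadow.d, created under a temporary directory) are made up for the
example. A child process creates a directory and removes it in a SIGTERM
handler; SIGKILL cannot be caught, so in that case the directory stays behind.

```python
import os
import shutil
import signal
import tempfile


def directory_left_behind(sig):
    """Start a child that creates a directory, stop it with `sig`, and
    report whether the directory is still there afterwards."""
    base = tempfile.mkdtemp()
    shadow = os.path.join(base, "pacct_shadow.d")  # hypothetical stand-in path
    ready_r, ready_w = os.pipe()

    pid = os.fork()
    if pid == 0:  # child: plays the role of the daemon
        try:
            os.close(ready_r)

            def on_sigterm(signum, frame):
                os.rmdir(shadow)  # clean-up, only reachable via SIGTERM
                os._exit(0)

            signal.signal(signal.SIGTERM, on_sigterm)
            os.mkdir(shadow)
            os.write(ready_w, b"1")  # tell the parent we are set up
            while True:
                signal.pause()  # sleep until a signal arrives
        finally:
            os._exit(1)  # never fall back into the parent's code

    # parent: wait until the child is set up, then stop it
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)
    os.kill(pid, sig)
    os.waitpid(pid, 0)
    left = os.path.isdir(shadow)
    shutil.rmtree(base)  # tidy up the temporary directory
    return left
```

With signal.SIGTERM the handler runs and nothing is left behind; with
signal.SIGKILL the directory survives, which would match the stale
/run/pacct_shadow.d seen at the next start.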
root@cventin:/home/vlefevre# invoke-rc.d atop stop
root@cventin:/home/vlefevre# ps -aef|grep atop
root 2087 1755 0 10:19 pts/0 00:00:00 grep atop
root@cventin:/home/vlefevre# invoke-rc.d atop start
root@cventin:/home/vlefevre# invoke-rc.d atopacct start
Job for atopacct.service failed because of unavailable resources or another system error.
See "systemctl status atopacct.service" and "journalctl -xe" for details.
invoke-rc.d: initscript atopacct, action "start" failed.
● atopacct.service - Atop process accounting daemon
Loaded: loaded (/lib/systemd/system/atopacct.service; enabled; vendor preset: enabled)
Active: failed (Result: resources) since Wed 2016-10-26 10:19:36 CEST; 9ms ago
Docs: man:atopacctd(8)
Process: 2142 ExecStart=/usr/sbin/atopacctd (code=exited, status=0/SUCCESS)
Main PID: 908 (code=killed, signal=KILL)
Oct 26 10:19:36 cventin systemd[1]: Starting Atop process accounting daemon...
Oct 26 10:19:36 cventin atopacctd[2142]: /run/pacct_shadow.d: File exists
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: PID file /run/atopacctd.pid not readable (yet?) after start: No such file or directory
Oct 26 10:19:36 cventin systemd[1]: Failed to start Atop process accounting daemon.
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: Unit entered failed state.
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: Failed with result 'resources'.
Isn't the problem due to the "File exists" error?
> Btw, this is a new issue, we should have a dedicated bug report for
> this. Failed with result 'resources' looks like a local issue though.
There are 2 issues:
1. The fact that atopacctd doesn't quit after SIGTERM. This is exactly
the same issue as before.
2. The "File exists" problem at start. Perhaps a consequence of (1)
since there was no clean-up. But I don't think that this should
cause atopacct to fail to start.
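
For (2), this is roughly the behaviour I would expect at start, written as a
Python sketch. It is only an illustration, not atopacctd's implementation:
prepare_shadow_dir is a hypothetical helper, and it assumes that no other
instance is still using the directory. An already existing directory is
treated as a leftover from a killed instance and reused instead of being a
fatal error.

```python
import os
import shutil


def prepare_shadow_dir(path):
    """Create `path`; if it already exists, empty it and reuse it.

    Assumption: no other running instance owns the directory, so whatever
    is in it is stale.
    """
    try:
        os.mkdir(path, 0o755)
    except FileExistsError:
        # Leftover from an instance that was killed without clean-up:
        # remove the stale entries and keep the directory.
        for name in os.listdir(path):
            entry = os.path.join(path, name)
            if os.path.isdir(entry) and not os.path.islink(entry):
                shutil.rmtree(entry)
            else:
                os.unlink(entry)
    return path
```

A real daemon would also have to check that the existing directory is not in
use by a live process (for instance via its PID file) before clearing it.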
> Is your system under load, running with resource limits, having limited
> resources?
No.
> Can you reproduce the issue after rebooting?
This was just after a reboot.
--
Vincent Lefèvre <[email protected]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)