On 2016-10-26 09:42:29 +0200, Marc Haber wrote:
> Hi Vincent,
> 
> can you try stopping the atop daemons, verifying that they're really
> dead, and restart?

root@cventin:/home/vlefevre# ps -aef|grep atop
root       908     1  0 10:17 ?        00:00:00 /usr/sbin/atopacctd
root       963     1  0 10:17 ?        00:00:00 /usr/bin/atop -a -R -w /var/log/atop/atop_20161026 600
root      1897  1755  0 10:18 pts/0    00:00:00 grep atop
root@cventin:/home/vlefevre# invoke-rc.d atopacct stop

This one takes time to stop. In the journal, I can see:

Oct 26 10:18:56 cventin systemd[1]: atopacct.service: State 'stop-sigterm' timed out. Killing.
Oct 26 10:18:56 cventin systemd[1]: atopacct.service: Killing process 908 (atopacctd) with signal SIGKILL.
Oct 26 10:18:56 cventin systemd[1]: atopacct.service: Main process exited, code=killed, status=9/KILL

So there's a timeout (90 seconds by default), after which systemd kills the
process with SIGKILL (thus without clean-up).
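
To illustrate why the kind of signal matters here, a minimal sketch with a
hypothetical stand-in worker (not atopacctd itself): it creates a state
directory and removes it only in its SIGTERM handler. The paths and names
are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon, NOT atopacctd: it creates a state
# directory and removes it only when it receives SIGTERM.
work=$(mktemp -d)

start_worker() {
    # $1 = state directory the worker creates and should clean up
    sh -c '
        trap "rmdir \"$1\"; exit 0" TERM   # install the handler first
        mkdir "$1"
        while :; do sleep 1; done
    ' worker "$1" &
    worker_pid=$!
    # Wait until the worker has created its directory.
    while [ ! -d "$1" ]; do sleep 1; done
}

# Case 1: SIGTERM, the handler runs and removes the directory.
start_worker "$work/term"
kill -TERM "$worker_pid"
wait "$worker_pid"
if [ -d "$work/term" ]; then term_result=left-behind; else term_result=cleaned; fi

# Case 2: SIGKILL, no handler can run, so the directory stays.
start_worker "$work/kill"
kill -KILL "$worker_pid"
wait "$worker_pid"
if [ -d "$work/kill" ]; then kill_result=left-behind; else kill_result=cleaned; fi

echo "after SIGTERM: $term_result"
echo "after SIGKILL: $kill_result"
rm -rf "$work"
```

With SIGKILL the process gets no chance to run any handler, so whatever it
created under /run would still be there at the next start.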

root@cventin:/home/vlefevre# invoke-rc.d atop stop
root@cventin:/home/vlefevre# ps -aef|grep atop
root      2087  1755  0 10:19 pts/0    00:00:00 grep atop
root@cventin:/home/vlefevre# invoke-rc.d atop start
root@cventin:/home/vlefevre# invoke-rc.d atopacct start
Job for atopacct.service failed because of unavailable resources or another system error.
See "systemctl status atopacct.service" and "journalctl -xe" for details.
invoke-rc.d: initscript atopacct, action "start" failed.
● atopacct.service - Atop process accounting daemon
   Loaded: loaded (/lib/systemd/system/atopacct.service; enabled; vendor preset: enabled)
   Active: failed (Result: resources) since Wed 2016-10-26 10:19:36 CEST; 9ms ago
     Docs: man:atopacctd(8)
  Process: 2142 ExecStart=/usr/sbin/atopacctd (code=exited, status=0/SUCCESS)
 Main PID: 908 (code=killed, signal=KILL)

Oct 26 10:19:36 cventin systemd[1]: Starting Atop process accounting daemon...
Oct 26 10:19:36 cventin atopacctd[2142]: /run/pacct_shadow.d: File exists
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: PID file /run/atopacctd.pid not readable (yet?) after start: No such file or directory
Oct 26 10:19:36 cventin systemd[1]: Failed to start Atop process accounting daemon.
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: Unit entered failed state.
Oct 26 10:19:36 cventin systemd[1]: atopacct.service: Failed with result 'resources'.

Isn't the problem due to the "File exists" error?

> Btw, this is a new issue, we should have a dedicated bug report for
> this. Failed with result 'resources' looks like a local issue though.

There are 2 issues:

1. The fact that atopacctd doesn't quit after SIGTERM. This is exactly
   the same issue as before.

2. The "File exists" problem at start. Perhaps a consequence of (1)
   since there was no clean-up. But I don't think that this should
   cause atopacct to fail to start.
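
A minimal sketch of what (2) may look like, assuming the "File exists"
message comes from a plain mkdir-like call on /run/pacct_shadow.d (I have
not checked the atopacctd source, so this is only a guess). The temporary
path below is a hypothetical stand-in for the real directory.

```shell
#!/bin/sh
# Hypothetical stand-in path; the real directory is /run/pacct_shadow.d.
base=$(mktemp -d)
shadow="$base/pacct_shadow.d"

mkdir "$shadow"    # first start: the directory is created

# Simulated restart after SIGKILL: the directory is still there.
# A plain mkdir fails with "File exists" (EEXIST).
if mkdir "$shadow" 2>/dev/null; then
    strict=started
else
    strict=failed
fi

# Tolerant variant: mkdir -p accepts an existing directory.
if mkdir -p "$shadow"; then
    tolerant=started
else
    tolerant=failed
fi

echo "strict: $strict, tolerant: $tolerant"
rm -rf "$base"
```

Whether reusing a stale directory is actually safe for atopacctd is a
separate question; the sketch only shows that a leftover directory need not
be fatal in itself.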

> Is your system under load, running with resource limits, having limited
> resources?

No.

> Can you reproduce the issue after rebooting?

This was just after a reboot.

-- 
Vincent Lefèvre <[email protected]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
