Hello there!

I am running a daemon on three machines running Ubuntu 11.04 Server, and it gets restarted on all of them at approximately the same time.

$ grep 'Restart' /tmp/celeryd.log
[2011-10-22 06:45:12,376: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
[2011-10-22 06:45:12,381: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)

$ grep 'Restart' /tmp/celeryd.log
[2011-10-22 06:47:17,771: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
[2011-10-22 06:47:17,775: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)

$ grep 'Restart' /tmp/celeryd.log
[2011-10-22 06:44:06,012: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
[2011-10-22 06:44:06,012: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)

All machines have plenty of disk space, 64 GB of RAM and 32 CPU cores (AMD Opteron Processor 8356).

At the time, these machines are performing pretty heavy seismic calculations and the load on them is around 20. AFAICT memory is not an issue; the swap is barely used.

I am at a loss to find out why and how these restarts occur. Any advice on how to analyse/diagnose this problem would be very much appreciated.

Please note also that the daemon in question is not started via an /etc/init.d script but manually:

    cd /usr/openquake && nohup celeryd --purge -l DEBUG -B > /tmp/celeryd.log 2>&1 3>&1 &

P.S.: logrotate is not being used.

Best regards/Mit freundlichen Grüßen

-- 
Muharem Hrnjadovic <m...@foldr3.com>
Public key id   : B2BBFCFC
Key fingerprint : A5A3 CC67 2B87 D641 103F 5602 219F 6B60 B2BB FCFC

-- 
ubuntu-server mailing list
ubuntu-server@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
More info: https://wiki.ubuntu.com/ServerTeam