Hello there!

I am running a daemon on three machines running Ubuntu 11.04 Server, and
it gets restarted on all of them at approximately the same time.

    $ grep 'Restart' /tmp/celeryd.log
    [2011-10-22 06:45:12,376: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
    [2011-10-22 06:45:12,381: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)

    $ grep 'Restart' /tmp/celeryd.log
    [2011-10-22 06:47:17,771: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
    [2011-10-22 06:47:17,775: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)

    $ grep 'Restart' /tmp/celeryd.log
    [2011-10-22 06:44:06,012: WARNING/Beat] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
    [2011-10-22 06:44:06,012: WARNING/MainProcess] Restarting celeryd (/usr/bin/celeryd --purge -l DEBUG -B)
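
To put a number on "approximately the same time", here is a minimal
Python sketch that parses the MainProcess restart lines from the grep
output above and computes the spread between the three machines. The
helper name restart_times and the regular expression are my own
assumptions based on the log format shown; they are not part of celeryd.

```python
import re
from datetime import datetime

# Pattern assumed from the log format shown in the grep output above.
STAMP = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+: "
    r"WARNING/MainProcess\] Restarting celeryd"
)


def restart_times(log_text):
    """Return the timestamps of MainProcess restart messages (hypothetical helper)."""
    return [
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        for match in STAMP.finditer(log_text)
    ]


# One MainProcess line per machine, copied from the grep output above.
logs = [
    "[2011-10-22 06:45:12,381: WARNING/MainProcess] Restarting celeryd",
    "[2011-10-22 06:47:17,775: WARNING/MainProcess] Restarting celeryd",
    "[2011-10-22 06:44:06,012: WARNING/MainProcess] Restarting celeryd",
]

times = sorted(t for log in logs for t in restart_times(log))
spread = times[-1] - times[0]  # gap between earliest and latest restart
print(spread)  # 0:03:11
```

So the three restarts fall within a window of roughly three minutes
(milliseconds are ignored by the sketch).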

All machines have plenty of disk space, 64 GB of RAM and 32 CPU cores
(AMD Opteron Processor 8356).

When the restarts occur, these machines are performing pretty heavy
seismic calculations and the load on them is around 20.
AFAICT memory is not an issue; the swap is barely used.

I am at a loss to find out why and how these restarts occur. Any advice
on how to analyse/diagnose this problem would be very much appreciated.

Please note also that the daemon in question is not started via an
/etc/init.d script but manually:

    cd /usr/openquake && nohup celeryd --purge -l DEBUG -B > /tmp/celeryd.log 2>&1 3>&1 &


P.S.: logrotate is not being used


Best regards/Mit freundlichen Grüßen

-- 
Muharem Hrnjadovic <m...@foldr3.com>
Public key id   : B2BBFCFC
Key fingerprint : A5A3 CC67 2B87 D641 103F  5602 219F 6B60 B2BB FCFC

-- 
ubuntu-server mailing list
ubuntu-server@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
More info: https://wiki.ubuntu.com/ServerTeam
