Thanks for the explanation.
In some testing I did on Red Hat Enterprise Linux 4 Update 4 (at least I
believe I'm at Update 4), monit behaves otherwise.
Here's a clip of the stop in my init.d: (ignore the crudeness of the beefy
test to ensure the proc is dead)
    rm -f /var/lock/subsys/app
    if [ -f "/var/run/app.pid" ]; then
        PID=`cat /var/run/app.pid`
        kill $PID 2>/dev/null 1>&2 && success || failure
        RETVAL=$?
        CNT=0
        LMT=10
        while [ $CNT -lt $LMT ]; do
            sleep 2
            if [ "`kill -0 $PID 2>&1`" == "" ]; then
                let CNT=10
            else
                let CNT=CNT+1
            fi
        done
        if [ $CNT -eq $LMT ]; then
            kill -9 $PID 2>/dev/null 1>&2 && success || failure
            RETVAL=$?
        fi
        rm -f /var/run/app.pid
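
For comparison, here is a minimal sketch of the same stop sequence (TERM,
poll, then KILL) written as plain functions. The names is_alive and stop_pid
are hypothetical and not part of the init script above, and the defaults
(10 polls, 2 seconds apart) are only assumed from the clip. Note that
kill -0 prints nothing when the process still exists, so the sketch relies
on its exit status rather than on its output.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the original script.
# Variables are global because POSIX sh has no "local".

# Return 0 if a process with the given pid exists.
# Signal 0 only probes; nothing is delivered to the process.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# stop_pid PID [TRIES] [INTERVAL]
# Send TERM, poll up to TRIES times INTERVAL seconds apart, then send KILL.
# Returns 0 once the process is gone, 1 if it survives even KILL.
stop_pid() {
    pid=$1
    tries=${2:-10}
    interval=${3:-2}

    # Nothing to do if the process is already gone.
    is_alive "$pid" || return 0

    kill "$pid" 2>/dev/null || return 0

    i=0
    while [ "$i" -lt "$tries" ]; do
        sleep "$interval"
        is_alive "$pid" || return 0
        i=$((i + 1))
    done

    # Still running after TRIES * INTERVAL seconds: escalate to KILL.
    kill -9 "$pid" 2>/dev/null || return 0
    sleep "$interval"
    if is_alive "$pid"; then
        return 1
    fi
    return 0
}
```

In an init script this could be called as stop_pid "$PID", with the pid file
removed only after the function returns.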
However, when I call monit stop for the process, it executes and returns
immediately. Checking ps in a separate console, I see the app take about
2 seconds to die after monit stop has returned. In fact, the 2-second sleep
alone should keep monit from returning immediately.
I don't remember this being a problem on RHEL3 at all. The kernel used is
2.6.9-42.ELsmp.
The logic used to test the pid file/pid after a stop seems fine to me.
Thanks!
- Chris
From: Jan-Henrik Haukeland <[EMAIL PROTECTED]>
Reply-To: This is the general mailing list for monit
<[email protected]>
To: This is the general mailing list for monit <[email protected]>
Subject: Re: Can monit block/wait on start/stop exec at all? (Chris
McKenzie
Date: Thu, 17 May 2007 23:05:07 +0200
On 17 May 2007, at 19:38, Chris McKenzie wrote:
I want to know if I can get monit to wait for the program exec
On stop and restart, monit will in fact wait for the program to stop (see
control.c and the function do_stop). On restart, monit waits until the
program is stopped before it starts the program again. The way monit does
this is first to call the process stop program and then go into a loop and
check if either the pid in the pid file is gone or if the pid file itself
is gone. If this is the case, it goes on to call the start program again;
otherwise an alert error is raised.
There will be a problem if the program to be stopped removes its pid file
before it is actually stopped. Normally, one of the last things a daemon
program should do is remove its pid file. If it does this earlier in
the shutdown process, there is going to be a problem, since monit will then
assume that the process is gone (because the pid file is gone) and
continue and call the start program.
Looking at the monit code now, I can see that this can be improved by
caching the pid before calling stop and, instead of testing for the
existence of both the pid file and the process id, testing only for the
process id. I'll see if I can hack a solution and I'll let you know when it
can be tested.
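
The idea can be sketched in shell as follows. This is only an illustration
of caching the pid and then polling the process alone. It is not the code in
control.c, and the function name stop_and_wait, its arguments, and its
return codes are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of "cache the pid, then test only the process".
# Not monit's implementation; names and return codes are made up here.

# stop_and_wait PIDFILE TIMEOUT STOP_COMMAND [ARGS...]
# Returns 0 if the process exited within TIMEOUT seconds, 1 on timeout,
# 2 if no usable pid could be read, 3 if the stop command itself failed.
stop_and_wait() {
    pidfile=$1
    timeout=$2
    shift 2

    # Cache the pid first: the stop program may delete the pid file early.
    [ -r "$pidfile" ] || return 2
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 2 ;;
    esac

    # Run the stop program.
    "$@" || return 3

    # From here on the pid file is ignored; only the cached pid is tested.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

With this approach a stop program that removes its pid file too early no
longer makes the caller believe the process is already gone.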
Best regards
--
Jan-Henrik Haukeland
http://tildeslash.com/
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general