Eric, 

That's where I started. The problem is that it will start ospfd every
time apache fails to restart. I end up with entries in the log like:

Dec 6 08:47:39 tecate monit[9988]: 'apache' process is not running
Dec 6 08:47:39 tecate monit[9988]: 'apache' trying to restart
Dec 6 08:47:39 tecate monit[9988]: 'ospfd' stop: /etc/init.d/ospfd
Dec 6 08:47:39 tecate monit[9988]: 'apache' start: /etc/init.d/httpd
Dec 6 08:47:40 tecate monit[9988]: 'ospfd' unmonitor on user request
Dec 6 08:47:40 tecate monit[9988]: monit daemon at 9988 awakened
Dec 6 08:48:09 tecate monit[9988]: 'apache' failed to start
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' start: /etc/init.d/ospfd
Dec 6 08:48:09 tecate monit[9988]: 'ospfd' unmonitor action done
Dec 6 08:48:09 tecate monit[9988]: Awakened by User defined signal 1

The biggest problem is that when this happens it leaves ospfd running
even if apache isn't. Martin commented that dependencies are "soft":
they define the start/stop order but don't wait for the parent to
recover before starting the dependent service.
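
To make that ordering concrete, here is a toy Python model of what the
log above suggests is happening. It is only a sketch of my reading of the
log, not monit's actual code; the function name and the event tuples are
made up for illustration.

```python
# Toy model of the "soft" dependency behaviour seen in the log above.
# This is NOT monit's implementation; restart_with_soft_deps and the
# (service, action) tuples are hypothetical, for illustration only.

def restart_with_soft_deps(parent, dependents, parent_starts_ok):
    """Return the ordered list of actions for one restart attempt."""
    events = []
    # Dependents are stopped first, then the parent is started.
    for dep in dependents:
        events.append((dep, "stop"))
    events.append((parent, "start"))
    events.append((parent, "running" if parent_starts_ok else "failed to start"))
    # "Soft" ordering: dependents are started again regardless of
    # whether the parent actually came up.
    for dep in dependents:
        events.append((dep, "start"))
    return events


if __name__ == "__main__":
    attempt = restart_with_soft_deps("apache", ["ospfd"], parent_starts_ok=False)
    for service, action in attempt:
        print(f"'{service}' {action}")
```

With parent_starts_ok=False the last action is still the ospfd start,
which is the state I end up in: ospfd running while apache is down.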

I'm going to take a look at the code today. The problem I'm seeing right
now looks like a race condition. My guess is that when I call "monit stop
ospfd" it hasn't yet marked apache as not existing, so the "if does not
exist" block is being executed again and again and again.

Here is the config I am working with now:

check process apache with pidfile /var/run/httpd.pid
  start program = "/etc/init.d/httpd start"
  stop program = "/etc/init.d/httpd stop"
  if does not exist
    then exec "/usr/bin/monit stop ospfd"
    else if recovered then exec "/usr/bin/monit monitor ospfd"
  if failed host localhost port 80 protocol http
    and request "/" then restart
  if children > 50 then restart
  if 2 restarts within 2 cycles then timeout
  group server

check process ospfd with pidfile /var/run/quagga/ospfd.pid
  start program = "/etc/init.d/ospfd start"
  stop program = "/etc/init.d/ospfd stop"
  group network

On 08.12.2011 00:10, Eric Pailleau wrote:

> Hello,
> did you simply try this?
> 
> ---8 50 then restart
> if 2 restarts within 2 cycles then timeout
> group server
> depends on tomcat
> check process ospfd with pidfile /var/run/quagga/ospfd.pid
> start program = "/etc/init.d/ospfd start"
> stop program = "/etc/init.d/ospfd stop"
> depends on apache
> depends on fcserver
> depends on mysql
> depends on tomcat
> group network
> ---8<---
> Taking out the depends doesn't make a difference; it still stays in
> that loop where it is spewing to the logs.
>
> I'm off-site today, I'll look at this more tomorrow morning when I can
> pay attention to it rather than to the lecture I'm supposed to be
> listening to. :-)
>

> On 07.12.2011 13:13, Martin Pala wrote:
>
>> Yes, Eric is correct. The "monit stop…" in the exec action cannot be
>> combined in this case with the "depends on…"

-- 

Dan Rich 

http://www.employees.org/~drich/
 "Step up to red alert!" "Are you sure, sir?
 It means changing the bulb in the sign..."
 - Red Dwarf (BBC)