On 09/01/2011 09:40 AM, Götz Reinicke wrote:
> On 01.09.11 15:08, Prentice Bisbal wrote:
>> On 09/01/2011 08:36 AM, Götz Reinicke wrote:
>>> Hi,
>>>
>>> recently I updated our ldapd on our RH EL 6.1 to the most recent version
>>> openldap-2.4.23-15.el6_1.1.x86_64 (from 2.4.19-15)
>>>
>>> Since then the daemon has died twice in the middle of the night, leaving no
>>> trace of why.
>>>
>>> The 2.4.19-15-version never died ...
>>>
>>
>> I can't offer any advice as to why it died, but if you haven't done so,
>> I recommend creating a 'watchdog' script that checks to make sure
>> 'slapd' is running, and if it isn't, restart it. Run it from cron every
>> couple of minutes and have it e-mail you every time it needs to restart
>> slapd.
>>
>> This will protect your sanity and avoid phone calls in the middle of the
>> night, as well as automatically collect statistics on how often and when
>> it's dying, which may help you correlate it with another event occurring
>> at the same time that is the root cause. If your watchdog script runs
>> frequently enough, users might not even notice it's down.
> 
> Thanks for your suggestion. Maybe you could give me a hint on how to
> set this up? In the example config, there is a check for an existing
> .pid-file.
> 
> That would not work in my case, as the pid file is still there, but the
> slapd-process died.
> 


Use pgrep to look for a running slapd process, and then check the exit
value returned by pgrep. Something like this. (Do not copy exactly - not
tested, sure to have syntax errors. No warranties implied, etc.)

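#!/bin/bash

# Exit status 0 means a process named exactly "slapd" is running.
pgrep -x slapd > /dev/null
retval=$?

if [ "$retval" -ne 0 ]; then
        # remove the stale PID file
        rm -f /path/to/pid
        # restart slapd (the init script is "slapd" on RHEL 6; it was "ldap" on RHEL 5)
        service slapd start
        echo "LDAP server restarted at $(date)" | mail -s "LDAP restarted" [email protected]
fi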
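
To run the check every couple of minutes, an entry like the one below in
root's crontab (crontab -e) should do. The script path is only an
assumption for illustration; use wherever you actually save the script.

```
# Hypothetical path: run the watchdog every 2 minutes
*/2 * * * * /usr/local/sbin/slapd-watchdog.sh
```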

The exit statuses of pgrep are documented in the man page:


 EXIT STATUS
       0      One or more processes matched the criteria.

       1      No processes matched.

       2      Syntax error in the command line.

       3      Fatal error: out of memory etc.
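
Since statuses 2 and 3 mean pgrep itself failed, you may want to restart
only on status 1 and treat anything else as "could not check". A rough
sketch of that idea follows; the function names are made up for
illustration and, like the script above, this is not tested against a real
slapd.

```bash
#!/bin/bash

# Map a pgrep exit status to a watchdog decision.
# (Function names are illustrative, not part of any tool.)
pgrep_status_to_action() {
    case "$1" in
        0) echo "running" ;;   # one or more processes matched
        1) echo "restart" ;;   # no processes matched: slapd is down
        *) echo "error" ;;     # 2 = syntax error, 3 = fatal error: do not restart blindly
    esac
}

# Look for a process named exactly "slapd" and print the decision.
check_slapd() {
    pgrep -x slapd > /dev/null
    pgrep_status_to_action "$?"
}
```

In the watchdog you would then call action=$(check_slapd) and restart only
when $action is "restart", mailing yourself a different message when it is
"error".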


--
Prentice



_______________________________________________
rhelv6-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv6-list
