If you look more closely at what killproc() in the CentOS 6 init script
actually does, you will probably spot this:

    # TERM first, then KILL if not dead
    kill -TERM $pid >/dev/null 2>&1
    usleep 100000
    if checkpid $pid ; then
        try=0
        while [ $try -lt $delay ] ; do
            checkpid $pid || break
            sleep 1
            let try+=1
        done
        if checkpid $pid ; then
            kill -KILL $pid >/dev/null 2>&1
            usleep 100000
        fi
    fi

I think this pretty much explains what is happening, and it confirms my
fears about TERM followed by KILL. Since these two signals are separated by
an interval of about 3 seconds (see the 'delay' variable in killproc()),
sec is killed if the database saving procedure lasts longer than that. The
simplest way to resolve this issue is to edit the init script to use
'kill -TERM <sec_pid>' instead of killproc(). If possible, you could also
try to make the database saving procedure more efficient, since 3 seconds
is a lot of time.
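
To illustrate the first suggestion, here is a minimal sketch of a helper
that sends TERM and then simply waits, without ever escalating to KILL. The
function name term_and_wait and the timeout are my own assumptions and not
part of the stock init script, so please treat this as a starting point
rather than a tested replacement:

```shell
# Hypothetical helper (not part of /etc/rc.d/init.d/functions):
# send TERM to a process and wait up to <timeout> seconds for it to exit.
# Unlike killproc(), it never sends KILL.
term_and_wait() {
    local pid=$1 timeout=${2:-60} waited=0

    # If TERM cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 delivers no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        # Still running after <timeout> seconds: report failure, do not KILL.
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In stop() you could then call something like
'term_and_wait "$(cat /var/run/sec.pid)" 60' in place of 'killproc $prog',
where the pid file path is an assumption that depends on how sec is started
on your system. Judging from the killproc() code quoted below, passing a
longer delay with 'killproc -d 60 $prog' should also postpone the KILL
signal, but I have not tried that myself.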
kind regards,
risto
2015-10-21 23:00 GMT+03:00 Bond Masuda <[email protected]>:
>
>
> On 10/21/2015 12:54 PM, Risto Vaarandi wrote:
>
> hi Bond,
>
> there is no time limit for the shutdown procedure. In fact, since sec is a
> single-threaded tool, it would be impossible to impose such a timeout. In
> your rule example, the execution of the 'action' field prevents sec from
> doing anything else, and since your 'action' field does not seem to contain
> any actions that would fork background processes, the entire action list is
> executed before sec can continue with other activities.
>
> The only timeout that sec applies is for child processes which are running
> at the moment of termination. The logic works as follows -- firstly, sec
> processes the SEC_SHUTDOWN event (that would also include your rule), then
> the sec process will sleep for 3 seconds, and finally the TERM signal will
> be sent to all child processes and sec will call exit(0). However, since
> the database disconnect is not done in a child process, the 3 second
> timeout has no effect on your rule.
>
> What I am suspecting is one of the following:
> 1) the SEC_SHUTDOWN event does not reach your rule under some
> circumstances (there might be a preceding rule in your rule sequence which
> produces occasional matches),
>
>
> Ok, thank you for the suggestion. I'm going to look into this.
>
> 2) the TERM signal is sent by a script or application which delivers the
> KILL signal to the sec process, once it has discovered after a couple of
> seconds that the sec process is still running.
>
> If you want to check scenario 1, you can start the action list with a
> 'logonly' statement and see whether this produces a message about the
> start of execution.
> Just out of curiosity -- how exactly is the TERM signal delivered to the
> sec process?
>
>
> I'm on CentOS 6, and I run 'service sec stop', which has this content in
> the init script:
>
> stop() {
> echo -n $"Stopping $prog: "
> killproc $prog
> RETVAL=$?
> echo
> [ $RETVAL -eq 0 ] && rm -f $lockfile
> return $RETVAL
> }
>
> where killproc is defined in /etc/rc.d/init.d/functions:
>
> killproc() {
> local RC killlevel= base pid pid_file= delay try binary=
>
> RC=0; delay=3; try=0
> # Test syntax.
> if [ "$#" -eq 0 ]; then
> echo $"Usage: killproc [-p pidfile] [ -d delay] {program} [-signal]"
> return 1
> fi
> if [ "$1" = "-p" ]; then
> pid_file=$2
> shift 2
> fi
> if [ "$1" = "-b" ]; then
> if [ -z $pid_file ]; then
> echo $"-b option can be used only with -p"
> echo $"Usage: killproc -p pidfile -b binary program"
> return 1
> fi
> binary=$2
> shift 2
> fi
> if [ "$1" = "-d" ]; then
> delay=$(echo $2 | awk -v RS=' ' -v IGNORECASE=1 '{if($1!~/^[0-9.]+[smhd]?$/) exit 1;d=$1~/s$|^[0-9.]*$/?1:$1~/m$/?60:$1~/h$/?60*60:$1~/d$/?24*60*60:-1;if(d==-1) exit 1;delay+=d*$1} END {printf("%d",delay+0.5)}')
> if [ "$?" -eq 1 ]; then
> echo $"Usage: killproc [-p pidfile] [ -d delay] {program} [-signal]"
> return 1
> fi
> shift 2
> fi
>
>
> # check for second arg to be kill level
> [ -n "${2:-}" ] && killlevel=$2
>
> # Save basename.
> base=${1##*/}
>
> # Find pid.
> __pids_var_run "$1" "$pid_file" "$binary"
> RC=$?
> if [ -z "$pid" ]; then
> if [ -z "$pid_file" ]; then
> pid="$(__pids_pidof "$1")"
> else
> [ "$RC" = "4" ] && { failure $"$base shutdown" ; return $RC ;}
> fi
> fi
>
> # Kill it.
> if [ -n "$pid" ] ; then
> [ "$BOOTUP" = "verbose" -a -z "${LSB:-}" ] && echo -n "$base "
> if [ -z "$killlevel" ] ; then
> if checkpid $pid 2>&1; then
> # TERM first, then KILL if not dead
> kill -TERM $pid >/dev/null 2>&1
> usleep 100000
> if checkpid $pid ; then
> try=0
> while [ $try -lt $delay ] ; do
> checkpid $pid || break
> sleep 1
> let try+=1
> done
> if checkpid $pid ; then
> kill -KILL $pid >/dev/null 2>&1
> usleep 100000
> fi
> fi
> fi
> checkpid $pid
> RC=$?
> [ "$RC" -eq 0 ] && failure $"$base shutdown" || success $"$base shutdown"
> RC=$((! $RC))
> # use specified level only
> else
> if checkpid $pid; then
> kill $killlevel $pid >/dev/null 2>&1
> RC=$?
> [ "$RC" -eq 0 ] && success $"$base $killlevel" || failure $"$base $killlevel"
> elif [ -n "${LSB:-}" ]; then
> RC=7 # Program is not running
> fi
> fi
> else
> if [ -n "${LSB:-}" -a -n "$killlevel" ]; then
> RC=7 # Program is not running
> else
> failure $"$base shutdown"
> RC=0
> fi
> fi
>
> # Remove pid file if any.
> if [ -z "$killlevel" ]; then
> rm -f "${pid_file:-/var/run/$base.pid}"
> fi
> return $RC
> }
>
> Thank you Risto.
> Bond
>
>
> kind regards,
> risto
>
>
> 2015-10-21 22:23 GMT+03:00 Bond Masuda <[email protected]>:
>
>> In my SEC rule set, I am using an SQLite in-memory database to cache
>> data. When I shut down SEC, I save this SQLite database to disk and
>> reload it into memory when SEC starts.
>>
>> I've now observed several times, and it seems to be when the database is
>> large, that the save to disk procedure during SEC_SHUTDOWN doesn't
>> complete. In fact, I try to log messages so I have an idea of success or
>> failure of the $dbh->sqlite_backup_to_file() call; and I sometimes get
>> neither success nor failure log messages; SEC just shuts down. Here is
>> the log when this happens:
>>
>> Wed Oct 21 15:00:50 2015: SIGTERM received: shutting down SEC
>>
>> This is what I expect, and when it works normally:
>>
>> Tue Oct 20 22:28:24 2015: SIGTERM received: shutting down SEC
>> Tue Oct 20 22:28:26 2015: INFO: database saved to disk on attempt 1.
>> Tue Oct 20 22:28:26 2015: INFO: database disconnect successful.
>>
>> This is my rule during SEC_SHUTDOWN:
>>
>> # save database to disk
>> type=Single
>> ptype=SubStr
>> pattern=SEC_SHUTDOWN
>> context=[SEC_INTERNAL_EVENT]
>> continue=TakeNext
>> desc=Save database to disk
>> action= lcall %ret -> ( sub{ \
>> my $db_backup = '/var/lib/sec/cache.sqlite3'; \
>> my $tries = 0; \
>> my $ret; \
>> my $msg; \
>> my @return; \
>> do{ \
>> $ret = $dbh->sqlite_backup_to_file($db_backup); \
>> $tries++; \
>> } until ( $ret && ($tries <= 5) ); \
>> push(@return,$ret); \
>> if( $ret == 1 ){ \
>> $msg = "database saved to disk on attempt $tries."; \
>> } else { \
>> $msg = $DBI::errstr; \
>> } \
>> push(@return,$msg); \
>> return @return; \
>> } ); \
>> lcall %is_success %ret -> ( sub{ \
>> my ($rc, $msg) = split(/\n/,$_[0]); \
>> return $rc; \
>> } ); \
>> lcall %msg %ret -> ( sub{ \
>> my ($rc, $msg) = split(/\n/,$_[0]); \
>> return $msg; \
>> } ); \
>> if %is_success ( logonly INFO: %msg ) \
>> else ( logonly CRIT: database failed to save to disk ); \
>> lcall %ret -> ( sub{ \
>> my $ret = $dbh->disconnect(); \
>> return $ret; \
>> } ); \
>> if %ret ( logonly INFO: database disconnect successful. ) \
>> else ( logonly CRIT: database disconnect failed. )
>>
>> As you can see above, either success or failure should log a message,
>> but when this problem occurs, I get nothing. So, I'm wondering whether
>> there is a time limit on how long the SEC shutdown procedure may take
>> before SEC just exits completely. Could it be that when the database is
>> large, the save to disk procedure takes too long and SEC just exits
>> without allowing it to complete?
>>
>> Thanks
>> Bond
>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Simple-evcorr-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
>>
>
>
>