Tom,

Let me clarify: I meant "shutdown" in the sense of issuing a stop against postgres, not shutting down the OS. Sorry if I confused things.

The scripts we are using to issue start, stop, etc. for postgres seem to be causing the issue. I changed the config to put timestamps in the log, and simply stopping and starting the server caused the same error to occur. :-(

From the scripts we are using:

StartService ()
{
       if [ "${POSTGRES:=-YES-}" = "-YES-" ]; then
               ConsoleMessage "Starting PostgreSQL database server"
               su - postgres -c '/usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql/data -l /usr/local/pgsql/data/logfile -o -i'
       fi
}

StopService()
{
       ConsoleMessage "Stopping PostgreSQL database services"
       /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
       x=`/bin/ps axc | /usr/bin/grep postgres`
       if /bin/test "$x"
       then
               set $x
               kill -9 $x
       fi
}
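
For comparison, here is a minimal sketch of a stop routine that leaves out the ps/grep/kill -9 step. It is an untested assumption, not a confirmed fix. It relies only on pg_ctl's -m fast and -w options. The PG_CTL and PGDATA variables are names introduced here for illustration. ConsoleMessage is replaced by echo so the fragment runs on its own, and pg_ctl would normally be run as the postgres user, as in StartService.

```sh
#!/bin/sh
# Sketch only: PG_CTL and PGDATA are illustrative names, not part of
# the original script. The defaults match the paths used above.
PG_CTL="${PG_CTL:-/usr/local/pgsql/bin/pg_ctl}"
PGDATA="${PGDATA:-/usr/local/pgsql/data}"

StopService ()
{
    echo "Stopping PostgreSQL database services"
    # -m fast : ask the postmaster to disconnect clients and shut down cleanly
    # -w      : wait for the shutdown to finish before returning
    if "$PG_CTL" stop -D "$PGDATA" -m fast -w; then
        return 0
    fi
    # Deliberately no kill -9 fallback: report the failure instead.
    echo "pg_ctl stop failed; not sending SIGKILL" >&2
    return 1
}
```

In the StopService above, `set $x` followed by `kill -9 $x` hands every word of the ps output to kill, not just process IDs, and SIGKILL gives the processes it does hit no chance to clean up.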


Thanks.


--sean


Tom Lane wrote:


"Ed L." <[EMAIL PROTECTED]> writes:


Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request an indication of a SIGTERM? I'm just speculating about what happened, but isn't that what you'd see during a system shutdown? The kernel sending SIGTERMs?



Yes, the trace is sort of consistent with the idea of a system shutdown: you'd see SIGTERMs issued, followed some time later by SIGKILL. I thought Sean had said that the machine did not shut down during this interval, and so mentally eliminated that theory --- but based on his latest comment I guess that is what happened after all.

So that does leave me with a question: why didn't it work more cleanly?
Our signal responses are designed around the assumption that during
shutdown the kernel will send SIGTERM to *all* the Postgres processes.
Backends interpret that as an immediate shutdown and should exit quickly
enough to avoid getting SIGKILL'd later.  It looks like either the
postmaster was sent SIGTERM but the backends weren't, or the interval
between SIGTERM and SIGKILL was unreasonably short.  I don't think I
believe the latter; the last time I checked this on Darwin, it seemed to
be using the traditional 20-second grace period.

Another question: if that was a shutdown we were looking at, how did the
postmaster live long enough to record the final log lines?  It shoulda
gotten SIGKILL'd at the same time as its children.

In short, there's something pretty odd about the way these signals are
being passed around.  It looks something like a standard system shutdown
sequence, but not enough like it.

regards, tom lane



