Re: [HACKERS] [bug fix] "pg_ctl stop" times out when it should respond quickly

Tom Lane Tue, 03 Dec 2013 14:36:21 -0800

"MauMau" writes:
> The problem occurs in the sequence below:

> 1. postmaster creates $PGDATA/postmaster.pid.
> 2. postmaster tries to resolve the value of listen_addresses to IP 
> addresses.  This took about 15 seconds in my failure scenario.
> 3. During 2, pg_ctl sends SIGTERM to postmaster.
> 4. postmaster terminates immediately without deleting 
> $PGDATA/postmaster.pid.  This is because it hasn't set signal handlers yet.
> 5. "pg_ctl stop" waits in a loop until $PGDATA/postmaster.pid disappears. 
> But the file does not disappear and it times out.

Hm.  I wonder if we shouldn't block SIGTERM etc. earlier.  It hardly seems
improbable that such signals would arrive during a slow startup.

> *** 907,913 ****
>   
>               for (cnt = 0; cnt < wait_seconds; cnt++)
>               {
> !                     if ((pid = get_pgpid()) != 0)
>                       {
>                               print_msg(".");
>                               pg_usleep(1000000);             /* 1 sec */
> --- 907,914 ----
>   
>               for (cnt = 0; cnt < wait_seconds; cnt++)
>               {
> !                     if ((pid = get_pgpid()) != 0 &&
> !                             postmaster_is_alive((pid_t) pid))
>                       {
>                               print_msg(".");
>                               pg_usleep(1000000);             /* 1 sec */

If you're going to do a postmaster_is_alive check, why bother with
repeated get_pgpid()?

I think the reason why it was coded like that was that we hadn't written
postmaster_is_alive() yet, or maybe we had but didn't want to trust it.
However, with the coding you have here, we're fully exposed to any failure
modes postmaster_is_alive() may have; so there's not a lot of value in
accepting those and get_pgpid's failure modes too.
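
Roughly, the alternative is to read the pid once and then loop on the liveness check alone. The sketch below illustrates that shape under stated assumptions: process_is_alive() is a hypothetical stand-in for postmaster_is_alive() built on kill(pid, 0), and wait_for_exit() is an invented name, not a function in pg_ctl.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for postmaster_is_alive(): probe the process with
 * signal 0, which performs the existence and permission checks without
 * delivering anything.  EPERM means the process exists but belongs to
 * someone else, so it is treated as alive here.
 */
static bool
process_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM;
}

/*
 * Wait up to wait_seconds for the process to go away, polling once per
 * second.  The pid is supplied by the caller, who reads it a single time
 * before the loop.  Returns true if the process is gone.
 */
static bool
wait_for_exit(pid_t pid, int wait_seconds)
{
    for (int cnt = 0; cnt < wait_seconds; cnt++)
    {
        if (!process_is_alive(pid))
            return true;
        sleep(1);               /* 1 sec */
    }
    return !process_is_alive(pid);
}
```

With this shape the loop depends only on the liveness check, so a stale postmaster.pid no longer keeps it spinning until the timeout.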

                        regards, tom lane


