On 6 October 2015 at 22:16, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Perform an immediate shutdown if the postmaster.pid file is removed.
>
> The postmaster now checks every minute or so (worst case, at most two
> minutes) that postmaster.pid is still there and still contains its own PID.
> If not, it performs an immediate shutdown, as though it had received
> SIGQUIT.
>
> The original goal behind this change was to ensure that failed buildfarm
> runs would get fully cleaned up, even if the test scripts had left a
> postmaster running, which is not an infrequent occurrence.  When the
> buildfarm script removes a test postmaster's $PGDATA directory, its next
> check on postmaster.pid will fail and cause it to exit.  Previously, manual
> intervention was often needed to get rid of such orphaned postmasters,
> since they'd block new test postmasters from obtaining the expected socket
> address.
>
> However, by checking postmaster.pid and not something else, we can provide
> additional robustness: manual removal of postmaster.pid is a frequent DBA
> mistake, and now we can at least limit the damage that will ensue if a new
> postmaster is started while the old one is still alive.
>
> Back-patch to all supported branches, since we won't get the desired
> improvement in buildfarm reliability otherwise.
>
> Branch
> ------
> REL9_3_STABLE
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/31bc563b9be306623c5e9a52816b432945fa6df9
>
> Modified Files
> --------------
> src/backend/postmaster/postmaster.c |   52 ++++++++++++++++++++------
> src/backend/utils/init/miscinit.c   |   70 +++++++++++++++++++++++++++++++++++
> src/include/miscadmin.h             |    1 +
> 3 files changed, 112 insertions(+), 11 deletions(-)

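For anyone following along, the check described in the commit message can be sketched roughly as below. This is only an illustration, not the code from the commit: the function name lock_file_is_valid is hypothetical, and it assumes the PID is the first line of the lock file and that any failure to open or read the file counts as "invalid".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, NOT PostgreSQL's actual implementation.
 *
 * Returns 1 if the lock file at "path" still exists and its first line
 * holds "my_pid"; returns 0 otherwise.  Assumption: every open or read
 * failure is treated as an invalid lock file.
 */
int
lock_file_is_valid(const char *path, long my_pid)
{
    FILE *fp;
    char  line[64];
    char *end;
    long  file_pid;

    assert(path != NULL);

    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;               /* file was removed or cannot be opened */

    if (fgets(line, sizeof(line), fp) == NULL)
    {
        fclose(fp);
        return 0;               /* file is empty or unreadable */
    }
    fclose(fp);

    file_pid = strtol(line, &end, 10);
    if (end == line)
        return 0;               /* first line does not start with a number */

    return file_pid == my_pid;  /* 0 if another process now owns the file */
}
```

A caller would run this roughly once a minute against postmaster.pid and, on a 0 result, begin an immediate shutdown as though SIGQUIT had been received.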

The log contains misleading output following the removal of the pid file:

2015-10-09 15:39:32 BST [31507]: [4-1] user=,db=,client= LOG:  could not open file "postmaster.pid": No such file or directory
2015-10-09 15:39:32 BST [31507]: [5-1] user=,db=,client= LOG:  performing immediate shutdown because data directory lock file is invalid
2015-10-09 15:39:32 BST [31507]: [6-1] user=,db=,client= LOG:  received immediate shutdown request
2015-10-09 15:39:32 BST [31556]: [1-1] user=,db=,client= WARNING:  terminating connection because of crash of another server process
2015-10-09 15:39:32 BST [31556]: [2-1] user=,db=,client= DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-10-09 15:39:32 BST [31556]: [3-1] user=,db=,client= HINT:  In a moment you should be able to reconnect to the database and repeat your command.

Is this anything we need to worry about?

-- 
Thom