Using the patch I posted a little while ago to reduce pg_ctl's reaction
time, I noticed that there were semi-reproducible ten-second delays in
postmaster shutdown, in some cases in the recovery tests where we quickly
stop and restart and again stop the postmaster.  The cause turned out to
be that the autovacuum launcher is failing to exit until after it times
out trying to load the global stats file.  It goes like this:

* at startup, the stats collector reads and deletes the permanent
stats files, which have stats_timestamp equal to when they were written
during the prior shutdown.

* the autovac launcher tries to read the temp stats file.  There isn't
any, so it sends an inquiry message with cutoff_time equal to the current
time less PGSTAT_STAT_INTERVAL.

* the collector notes that the cutoff_time is before what it has recorded
as the stats_timestamp, and does nothing.

* the autovac launcher keeps waiting, and occasionally retransmitting
the inquiry message.  The collector keeps ignoring it.

* when the postmaster sends SIGTERM to the autovac launcher, that isn't
enough to break it out of the loop in backend_read_statsfile.  So only
after PGSTAT_MAX_WAIT_TIME (10s) does the loop give up, and only after
that does the autovac launcher realize it should exit.

We don't usually see this in the regression tests without the pg_ctl
patch, because without that, we typically spend enough time twiddling
our thumbs after the first postmaster shutdown for PGSTAT_STAT_INTERVAL
to elapse, so that the reloaded stats files appear stale already.

So this opens a number of issues:

* The stats collector, when reading in the permanent stats files,
ought not believe that their timestamps have anything to do with
when it needs to write out temp stats files.  This seems like an
ancient thinko dating back to when we introduced a separation
between temp and permanent stats files.  It seems easily fixable,
though, as in the attached patch.

* It's not great that the autovac launcher's SIGTERM handler doesn't
trigger query cancel handling; if it did, that would have allowed the
CHECK_FOR_INTERRUPTS in backend_read_statsfile to get us out of this
situation.  But I'm not (yet) sufficiently enthused about this to do
the research needed on whether it'd be safe.

* Perhaps it's not such a great idea that the loop in
backend_read_statsfile doesn't advance min_ts from its initial value.
If it had done so, that would've triggered a stats file update after at
most PGSTAT_STAT_INTERVAL, getting us out of this situation a lot sooner.
But that behavior is quite intentional right now, and again I'm lacking
the energy to investigate what would be the downsides of changing it.

Anyway, I present the attached proposed patch, which fixes this
problem by zeroing the stats_timestamp fields when the collector
reads them.  (The same code is also used in backends and the autovac
launcher to read temp stats files, and there we don't want to reset
the fields.)  I also noticed that the code wasn't being careful to
reset globalStats and archiverStats to zeroes if they were partially
filled from a corrupted stats file, so this deals with that too.

This seems like a backpatchable bug fix to me; any objections?

                        regards, tom lane

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 65b7b32..a0b0eec 100644
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
*************** pgstat_read_statsfiles(Oid onlydb, bool 
*** 4947,4962 ****
--- 4947,4974 ----
  	{
  		ereport(pgStatRunningInCollector ? LOG : WARNING,
  				(errmsg("corrupted statistics file \"%s\"", statfile)));
+ 		memset(&globalStats, 0, sizeof(globalStats));
  		goto done;
  	}
  
  	/*
+ 	 * In the collector, disregard the timestamp we read from the permanent
+ 	 * stats file; we should be willing to write a temp stats file immediately
+ 	 * upon the first request from any backend.  This only matters if the old
+ 	 * file's timestamp is less than PGSTAT_STAT_INTERVAL ago, but that's not
+ 	 * an unusual scenario.
+ 	 */
+ 	if (pgStatRunningInCollector)
+ 		globalStats.stats_timestamp = 0;
+ 
+ 	/*
  	 * Read archiver stats struct
  	 */
  	if (fread(&archiverStats, 1, sizeof(archiverStats), fpin) != sizeof(archiverStats))
  	{
  		ereport(pgStatRunningInCollector ? LOG : WARNING,
  				(errmsg("corrupted statistics file \"%s\"", statfile)));
+ 		memset(&archiverStats, 0, sizeof(archiverStats));
  		goto done;
  	}
  
*************** pgstat_read_statsfiles(Oid onlydb, bool 
*** 5002,5007 ****
--- 5014,5028 ----
  				dbentry->functions = NULL;
  
  				/*
+ 				 * In the collector, disregard the timestamp we read from the
+ 				 * permanent stats file; we should be willing to write a temp
+ 				 * stats file immediately upon the first request from any
+ 				 * backend.
+ 				 */
+ 				if (pgStatRunningInCollector)
+ 					dbentry->stats_timestamp = 0;
+ 
+ 				/*
  				 * Don't create tables/functions hashtables for uninteresting
  				 * databases.
  				 */
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
