On Thu, Jun 9, 2011 at 5:14 AM, Daniel Farina <dan...@heroku.com> wrote:
> The first fact is that turning off hot standby will let the cluster
> start up, but only after seeing a spate of messages like these (a
> dozen or dozens, not thousands):
>
> 2011-06-09 08:02:32 UTC LOG: restored log file
> "000000020000002C000000C0" from archive
> 2011-06-09 08:02:33 UTC WARNING: xlog min recovery request
> 2C/C1F09658 is past current point 2C/C037B278
> 2011-06-09 08:02:33 UTC CONTEXT: writing block 0 of relation
> base/16385/16784_vm
> xlog redo insert: rel 1663/16385/128029; tid 114321/63
> 2011-06-09 08:02:33 UTC LOG: restartpoint starting: xlog
>
> Most importantly, *all* such messages are in visibility map forks
> (_vm). I am reasonably confident that my code does not start reading
> data until pg_start_backup() has returned, and blocks on
> pg_stop_backup() after having read all the data. Also, the mailing
> list correspondence at
> http://archives.postgresql.org/pgsql-hackers/2010-11/msg02034.php
> suggests that the visibility map is not flushed at checkpoints, so
> perhaps with some poor timing an old page can wander onto disk even
> after the checkpoint barrier that pg_start_backup() waits for. (I have
> not yet found the critical section that makes visibilitymap buffers
> immune to checkpoints, though.)
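For reference, the backup sequence being described amounts to the usual
pg_start_backup()/pg_stop_backup() bracket. A sketch, with an illustrative
label and copy step (not taken from the original report):

```sql
-- Sketch of the sequence described above; the label is illustrative.
SELECT pg_start_backup('nightly');  -- performs/waits for a checkpoint, then returns
-- ... file-level copy of the data directory happens here, outside SQL
-- (tar, rsync, or Daniel's streaming code); torn pages are expected and
-- are repaired by WAL replay during recovery ...
SELECT pg_stop_backup();            -- waits until the required WAL is archived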
I don't believe there's anything that makes visibilitymap buffers immune
to checkpoints, but I might be wrong. You could probably find out by
inserting a few lines of code into the buffer manager to inspect each
buffer tag as the buffer is written out, and elog() if the fork number is
VISIBILITYMAP_FORKNUM.

There is some weirdness around how the LSNs of visibility map pages are
set, however. See visibilitymap_set()/clear(). Clearing a visibility map
bit is done without touching the LSN at all; setting the bit advances the
LSN to that of the heap page, if it currently precedes that value. It's
not obvious to me whether or how that's related.

I'm a bit puzzled by the warning message itself, too. I think that in
order to generate that warning, you need to try to flush a page to disk
whose LSN is higher than the current minimum recovery point. And the
minimum recovery point is initially set to the redo pointer from the last
checkpoint. But during an online backup, it's expected that there may be
pages on disk with LSNs higher than the redo pointer from the last
checkpoint. So now I'm not sure why people don't see this warning whenever
a lengthy recovery is required to reach consistency, especially with small
values of shared_buffers. I'm probably missing something here...

One thing that might make it easier to understand what's going on is to
crank the log level up to DEBUG2 and post the full output from a recovery
that exhibits this problem, rather than just a few lines.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers