[HACKERS] Avoiding shutdown checkpoint at failover

Simon Riggs Tue, 01 Nov 2011 05:12:08 -0700

When a server fails, we need to promote a standby as quickly as possible.

Currently, when we promote a standby to a primary, we must run a
shutdown checkpoint before users can begin write transactions, and
that checkpoint can in many cases take minutes.


The reason we run a shutdown checkpoint is to prevent needing to
re-enter recovery if we crash after promotion. When we only had
file-based replication, all WAL files were reloaded from archive each time,
so the restartpoint prior to the end of recovery was not guaranteed to
be available in pg_xlog. Once we had exited archive recovery it would
be difficult to re-access the archive.

Now with streaming replication, we keep the WAL files in pg_xlog
directly, so the last restartpoint is always available if we should
crash.

So if streaming replication is active at the point we promote, then we
can skip the shutdown checkpoint. It's that simple.
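
As a rough sketch of that rule (a toy model, not PostgreSQL source; the
PromotionState class and its field names are hypothetical), the decision
comes down to whether the WAL from the last restartpoint onward is already
sitting in pg_xlog:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionState:
    # Hypothetical flags, named for illustration only.
    streaming_active: bool             # WAL was arriving via streaming replication
    restartpoint_wal_in_pg_xlog: bool  # WAL since the last restartpoint is local


def needs_shutdown_checkpoint(state: PromotionState) -> bool:
    """Return True when promotion must wait for a shutdown checkpoint."""
    # If we crash after promotion, recovery restarts from the last
    # restartpoint, so the WAL from that point on must be readable locally.
    if state.streaming_active and state.restartpoint_wal_in_pg_xlog:
        return False
    # File-based case: the archive may no longer be reachable, so keep
    # the checkpoint on the critical path.
    return True
```

With streaming replication both flags are expected to be true at promotion,
which is the case where the checkpoint can be skipped.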

To make it even simpler, I suggest we also change file de-archiving so
that it writes normal WAL files, not RECOVERYXLOG, so that we can
avoid the checkpoint in all cases.

There are comments saying we can only increment a timeline via a
shutdown checkpoint, but if we were smart we'd have noticed the
timeline change via the WAL file numbering anyway. The best way seems to
be to have an XLOG_TIMELINE_CHANGE record written instead of the
shutdown checkpoint.
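
To illustrate the point about WAL file numbering: WAL segment file names
are 24 hex digits, and the first 8 are the timeline ID, so a timeline
switch is visible from the names alone. A minimal sketch follows (the
helper names and the sample file names are made up for illustration):

```python
def timeline_of(wal_filename: str) -> int:
    """Extract the timeline ID from a 24-hex-digit WAL segment file name."""
    if len(wal_filename) != 24:
        raise ValueError(f"not a WAL segment file name: {wal_filename!r}")
    # The first 8 hex digits encode the timeline ID.
    return int(wal_filename[:8], 16)


def timeline_changes(filenames):
    """Return (old_tli, new_tli, filename) for each timeline switch seen."""
    changes = []
    previous = None
    for name in filenames:
        tli = timeline_of(name)
        if previous is not None and tli != previous:
            changes.append((previous, tli, name))
        previous = tli
    return changes


segments = [
    "000000010000000000000003",
    "000000010000000000000004",
    "000000020000000000000004",  # same segment number, new timeline
]
print(timeline_changes(segments))
```

This only shows that the information is present in the file names; the
proposal above is to record the switch explicitly in WAL as well.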

When I say skip the shutdown checkpoint, I mean remove it from the
critical path of required actions at the end of recovery. We can still
have a normal checkpoint kicked off at that time, but that no longer
needs to be on the critical path.
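
The ordering described here can be sketched as below. This is a toy model
only, assuming the proposed XLOG_TIMELINE_CHANGE record exists; promote()
and its arguments are hypothetical names, and a Python thread stands in for
whatever process would actually run the checkpoint.

```python
import threading


def promote(log, run_checkpoint):
    """Finish recovery without waiting for a checkpoint to complete."""
    # Critical path: record the timeline switch (proposed record type,
    # not an existing one), then open the server for writes.
    log.append("write XLOG_TIMELINE_CHANGE")
    log.append("accept write transactions")
    # Off the critical path: start a normal checkpoint and return
    # without waiting for it.
    worker = threading.Thread(target=run_checkpoint)
    worker.start()
    return worker


if __name__ == "__main__":
    events = []
    worker = promote(events, lambda: events.append("checkpoint complete"))
    worker.join()
    print(events)
```

The point of the sketch is only that write transactions are accepted before
the checkpoint finishes, not after it.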

Any problems foreseen? If not, looks like a quick patch.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
