Hi,

On 09/07/2010 05:17 PM, Tom Lane wrote:
> Oh yes it is.  If the slave replays WAL that didn't happen on the
> master, it might for instance have heap tuples in TID slots that are
> empty on the master, or index pages laid out differently from the
> master.  Trying to apply additional WAL from the master will fail badly.

Sure. Reverting to the master's state would be required to be able to safely proceed. Granted, that's far from simple.

Robert's argument about read queries on the standby convinced me that you always need to recover to the node with the newest transactions applied (i.e. better to advance than to revert). Making sure the standby can never be ahead of the master is certainly the simplest way to guarantee that. It comes at a cost for normal operation, though.

How about a master failure that leads to a fail-over, immediately followed by a failure of the former standby (now the master)? The old master might then be in the very same situation: having WAL applied that the new master doesn't have. Do we require former masters to fetch a base backup? How does a former master know the difference once it gets back up?
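
One conceivable way to tell the difference, as a rough Python sketch: assume the new master records the WAL position at which it was promoted, and that WAL positions can be compared as plain integers. Both are assumptions for illustration, and `classify_returning_master` is a hypothetical name, not an existing PostgreSQL function.

```python
def classify_returning_master(old_flushed: int, promotion_point: int) -> str:
    """Decide what a former master may do when it comes back up.

    old_flushed     -- last WAL position the old master wrote before failing
                       (hypothetical integer representation)
    promotion_point -- WAL position at which the standby was promoted
                       (assumed to be recorded by the new master)
    """
    if old_flushed > promotion_point:
        # The old master holds WAL the new master never saw: it must
        # revert those changes or fetch a fresh base backup.
        return "diverged"
    # The old master is at or behind the promotion point and can simply
    # follow the new master's WAL stream.
    return "can_follow"


print(classify_returning_master(old_flushed=120, promotion_point=100))
print(classify_returning_master(old_flushed=100, promotion_point=100))
```

This only detects the divergence; it says nothing about how to undo the extra WAL, which is the hard part.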

> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master.  The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.

Agreed.

(It's also worth pointing out that this approach imposes a pretty nasty requirement after a full-cluster crash: every node that was a synchronous replication target needs to come back up, so that we can reliably determine which one has the newest transaction.)
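
To illustrate why every node has to report in, here is a minimal Python sketch of the compare-notes step. It assumes each node can report the last WAL position it flushed to disk as a plain integer and that all nodes share the same WAL history up to their respective positions. The function names and node names are hypothetical, not anything that exists in PostgreSQL.

```python
from typing import Dict, Optional, Set, Tuple


def pick_most_advanced(expected: Set[str], reported: Dict[str, int]) -> Optional[str]:
    """Return the node with the highest flushed WAL position.

    Returns None while any expected node is still missing, because a
    missing node might hold newer transactions than all the others.
    """
    if expected - reported.keys():
        return None
    # Sort the names first so that ties are broken deterministically.
    return max(sorted(expected), key=lambda node: reported[node])


def wal_to_ship(reported: Dict[str, int], winner: str) -> Dict[str, Tuple[int, int]]:
    """For every node behind the winner, the (from, to) WAL range it lacks."""
    target = reported[winner]
    return {
        node: (pos, target)
        for node, pos in reported.items()
        if node != winner and pos < target
    }


positions = {"master": 100, "standby1": 250, "standby2": 180}
winner = pick_most_advanced({"master", "standby1", "standby2"}, positions)
print(winner, wal_to_ship(positions, winner))
```

Note that the sketch refuses to pick a winner until all expected nodes have reported, which is exactly the availability cost mentioned above.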

> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.

My feeling is that WAL is the wrong format for replication. But that's another story.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers