Ants Aasma <a...@cybertec.at> writes:
> We already rely on WAL-before-data to ensure correct recovery. What is
> proposed here is to slightly redefine it to require WAL to be
> replicated before it is considered to be flushed. This ensures that no
> data page on disk differs from the WAL that the slave has. The
> machinery to do this is already mostly there, we already wait for WAL
> flushes and we know the write location on the slave. The second
> requirement is that we never start up as master and we don't trust any
> local WAL. This is actually how pacemaker clusters work, you would
> only need to amend the RA to wipe the WAL and configure postgresql
> with restart_after_crash = false.
>
> It would be very helpful in restoring HA capability after failover if
> we wouldn't have to read through the whole database after a VM goes
> down and is migrated with the shared disk onto a new host.

The problem with this is it's making an idealistic assumption that a
crashed master didn't do anything wrong or lose/corrupt any data during
its crash. As soon as you realize that's an unsafe assumption, the whole
thing becomes worthless to you.

If the idea had zero implementation cost, I would say "sure, let people
play with it until they find out (probably the hard way) that it's a bad
idea". But it's going to introduce, at the very least, additional
complexity into a portion of the system that is critical and plenty
complicated enough already. That being the case, I don't want it there
at all, not even as an option.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)