On 14.06.2013 12:11, Samrat Revagade wrote:
We have already started a discussion on pgsql-hackers about the problem of
taking a fresh backup during the failback operation. Here is the link:

http://www.postgresql.org/message-id/caf8q-gxg3pqtf71nvece-6ozraew5pwhk7yqtbjgwrfu513...@mail.gmail.com

Let me again summarize the problem we are trying to address.

When the master fails, the last few WAL files may not have reached the
standby. But the master may have gone ahead and made changes to its local
file system after flushing WAL to local storage.  So the master contains
some file-system-level changes that the standby does not have.  At this
point, the data directory of the master is ahead of the standby's data
directory.

Subsequently, the standby is promoted to be the new master.  Later, when the
old master wants to become a standby of the new master, it can't just join
the setup, since the two servers are inconsistent. We need to take a fresh
backup from the new master.  This can happen with both synchronous and
asynchronous replication.
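
The divergence described above can be pictured as a comparison of two WAL positions: how far the old master had flushed WAL locally when it failed, and how far the standby had received WAL when it was promoted. The following is only a rough sketch of that idea, not how any PostgreSQL tool decides this. The function names and the example LSN values are hypothetical, and it assumes positions are written in the usual "X/Y" hexadecimal LSN notation.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'X/Y' (two hex numbers) to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def old_master_is_ahead(master_flush_lsn: str, standby_receive_lsn: str) -> bool:
    """Return True if the old master flushed WAL the standby never received.

    Hypothetical helper: in that case the old master's data directory may
    contain changes the new master does not know about, so it cannot simply
    rejoin as a standby.
    """
    return parse_lsn(master_flush_lsn) > parse_lsn(standby_receive_lsn)


# Illustrative values only: the master flushed a little past what the
# standby had received before the failure.
if old_master_is_ahead("0/3000060", "0/3000000"):
    print("old master has diverged; it cannot rejoin as-is")
```

This simplification ignores timelines and hint bit updates that are not WAL-logged, so it only illustrates why the two data directories can differ.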

Did you see the thread on the little tool I wrote called pg_rewind?

http://www.postgresql.org/message-id/519df910.4020...@vmware.com

It solves that problem, for both clean and unexpected shutdown. It needs some more work and a lot more testing, but requires no changes to the backend. Robert Haas pointed out in that thread that it has a problem with hint bits that are not WAL-logged, but it will still work if you also enable the new checksums feature, which forces hint bit updates to be WAL-logged. Perhaps we could add a GUC to enable hint bits to be WAL-logged, regardless of checksums, to make pg_rewind work.

I think that's a more flexible approach to solving this problem. It doesn't require an online feedback loop from the standby to the master, for starters.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
