On 14.06.2013 12:11, Samrat Revagade wrote:
> We have already started a discussion on pgsql-hackers about the problem of
> taking a fresh backup during the failback operation. Here is the link:
> http://www.postgresql.org/message-id/caf8q-gxg3pqtf71nvece-6ozraew5pwhk7yqtbjgwrfu513...@mail.gmail.com
>
> Let me summarize again the problem we are trying to address.
>
> When the master fails, the last few WAL files may not have reached the
> standby. But the master may have gone ahead and made changes to its local
> file system after flushing WAL to local storage. So the master contains
> some file-system-level changes that the standby does not have. At this
> point, the data directory of the master is ahead of the standby's data
> directory.
>
> Subsequently, the standby will be promoted to be the new master. Later,
> when the old master wants to become a standby of the new master, it can't
> just join the setup, since the two servers are inconsistent. We need to
> take a fresh backup from the new master. This can happen with both
> synchronous and asynchronous replication.
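
To make the divergence described above concrete, here is a rough sketch in
Python. It only compares two WAL positions (LSNs) in PostgreSQL's usual
"high/low" hexadecimal text form. The function names and the sample LSN values
are made up for illustration, and a real check would also have to consider
timeline history, so treat this as a simplified assumption rather than what
any tool actually does.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/3000060' into a 64-bit integer.

    The text form is two hexadecimal numbers separated by a slash: the high
    32 bits and the low 32 bits of the WAL position.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def old_master_has_diverged(old_master_flush_lsn: str,
                            standby_received_lsn: str) -> bool:
    """Hypothetical helper: True if the old master flushed WAL locally that
    the standby never received before it was promoted."""
    return parse_lsn(old_master_flush_lsn) > parse_lsn(standby_received_lsn)


if __name__ == "__main__":
    # Made-up positions: the old master flushed further than the standby got.
    print(old_master_has_diverged("0/3000158", "0/3000060"))
```

The LSNs are compared as integers rather than as strings because a position
like "1/0" is later than "0/FFFFFFFF", which a plain string comparison would
not reliably reflect.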
Did you see the thread on the little tool I wrote called pg_rewind?
http://www.postgresql.org/message-id/519df910.4020...@vmware.com
It solves that problem, for both clean and unexpected shutdowns. It needs
some more work and a lot more testing, but requires no changes to the
backend. Robert Haas pointed out in that thread that it has a problem
with hint bits that are not WAL-logged, but it will still work if you
also enable the new checksums feature, which forces hint bit updates to
be WAL-logged. Perhaps we could add a GUC to enable hint bits to be
WAL-logged, regardless of checksums, to make pg_rewind work.
I think that's a more flexible approach to solving this problem. It
doesn't require an online feedback loop from the standby to the master,
for starters.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)