On Sun, Sep 2, 2012 at 4:12 PM, Bruce Momjian <br...@momjian.us> wrote: > Do we ever want to document a way to connect slaves to a new master, > rather than recreating the slave?
I have written an instruction for myself and I am planning to publish it on http://code.google.com/p/pgcookbook/. See the attachment. Hope you will find it useful. If anybody would like to provide any criticism I will highly appreciate it. > > --------------------------------------------------------------------------- > > On Tue, Mar 27, 2012 at 10:47:48AM -0700, Ken Brush wrote: >> Hello everyone, >> >> I notice that the documentation at: >> http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial >> >> Doesn't contain steps in a Multiple Slave setup for re-establishing >> them after a slave has become the new master. >> >> Based on the documentation, here are the most fail-proof steps I came up >> with: >> >> 1. Master dies :( >> 2. Touch the trigger file on the most caught up slave. >> 3. Slave is now the new master :) >> 4. use pg_basebackup or other binary replication trick (rsync, tar >> over ssh, etc...) to bring the other slaves up to speed with the new >> master. >> 5. start the other slaves pointing to the new master. >> >> But, that can take time (about 1-2 hours) with my medium sized DB >> (580GB currently). >> >> After testing a few different ideas that I gleaned from posts on the >> mail list, I came up with this alternative method: >> >> 1. Master dies :( >> 2. Touch the trigger file on the most caught up slave >> 3. Slave is now the new master. >> 4. On the other slaves do the following: >> 5. Shutdown postgres on the slave >> 6. Delete every file in /data/pgsql/data/pg_xlog >> 7. Modify the recovery.conf file to point to the new master and >> include the line "recovery_target_timeline='latest'" >> 8. Copy the history file from the new master to the slave (it's the >> most recent #.history file in the xlog directory) >> 9. Startup postgres on the slave and watch it sync up to the new >> master (about 1-5 minutes usually) >> >> My question is this. Is the alternative method adequate? I tested it a >> bit and couldn't find any problems with data loss or inconsistency. >> >> I still use the fail-proof method above to re-incorporate the old >> master as a new slave. >> >> Sincerely, >> -Ken >> >> -- >> Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- > Bruce Momjian <br...@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. + > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sergey Konoplev a database and software architect http://www.linkedin.com/in/grayhemp Jabber: gray...@gmail.com Skype: gray-hemp Phone: +79160686204
Streaming replication based failover Let us suppose that there is a hot standby replication set up in a cluster. The db1 server is a master and dbX are replicas. Also suppose that we need to do a failover on one of our slaves. First we need to define what replica will be a new master. In case of the master failure you need to find the most caught up replica. To do this compare WAL replay locations on replicas and chose the one with the biggest value. postgres=# SELECT pg_last_xlog_replay_location(); pg_last_xlog_replay_location ------------------------------ BAA/37DA2888 (1 row) Note that If you chose not the most caught up one than other replicas that have replayed later WAL entries must be reconfigured from scratch. Otherwise their data can be corrupted and you will not get any warnings about it. In case of the planned switchover choose one that will be a new master on your own. Then stop all the slaves except the new master. Let say the remaining slave is db2. Use the command below to help the master and the remaining slave get ahead of other slaves if you are not sure that they already are. The command creates a minimal WAL entry. postgres=# SELECT txid_current(); Now touch the failover file on the remaining slave to promote it as a new master. postgres@db2: ~ $ touch /path/to/failover And you will see similar things in logs. 2011-04-05 11:46:48 MSD @ 69971 [4d9ac05d.11153-5]:LOG: trigger file found: /db/data/failover 2011-04-05 11:46:48 MSD @ 69974 [4d9ac05d.11156-2]:FATAL: terminating walreceiver process due to administrator command 2011-04-05 11:46:48 MSD @ 69971 [4d9ac05d.11153-6]:LOG: redo done at 0/4012A68 2011-04-05 11:46:48 MSD @ 69971 [4d9ac05d.11153-7]:LOG: last completed transaction was at log time 2011-04-05 11:41:29.199406+04 2011-04-05 11:46:48 MSD @ 69971 [4d9ac05d.11153-8]:LOG: selected new timeline ID: 2 2011-04-05 11:46:49 MSD @ 69971 [4d9ac05d.11153-9]:LOG: archive recovery complete 2011-04-05 11:46:49 MSD @ 69972 [4d9ac05d.11154-1]:LOG: checkpoint starting: end-of-recovery immediate wait 2011-04-05 11:46:49 MSD @ 69972 [4d9ac05d.11154-2]:LOG: checkpoint complete: wrote 22 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=0.010 s, sync=0.044 s, total=0.060 s 2011-04-05 11:46:49 MSD @ 69969 [4d9ac05c.11151-2]:LOG: database system is ready to accept connections The failover file will be removed automatically and recovery.conf will be renamed to recovery.done. On the stopped slaves delete everything in the pg_xlog directory. postgres@dbX: ~ $ rm -rf /db/data/pg_xlog/* And copy the history file from the new master. postgres@dbX: ~ $ scp db2:/db/data/pg_xlog/*.history /db/data/pg_xlog/ Then change the DSN to the new master and add the following instruction in recovery.conf. This will make replicas to follow the latest created timeline. recovery_target_timeline = 'latest' Now start the postgres service and wait until it is synced up to the new master. dbX: ~ # /etc/init.d/postgresql start
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers