Hi,

2014-10-29 17:46 GMT+01:00 Robert Haas <robertmh...@gmail.com>:


   Yes, but after the restart, the slave will also rewind to the most
   recent restart-point to begin replay, and some of the sanity checks
   that recovery.conf enforces will be lost during that replay.  A safe
   way to do this might be to shut down the master, make a note of the
   ending WAL position on the master, and then promote the slave (without
   shutting it down) once it's reached that point in replay.
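
Concretely, I think the check you describe would look something like this (the data directory paths are placeholders, and pg_last_xlog_replay_location() is the 9.x name of the replay-position function):

```shell
# On the stopped master: note the final WAL location written by the
# shutdown checkpoint.
pg_controldata /path/to/master/data | grep 'Latest checkpoint location'

# On the slave: poll until replay has reached at least that location...
psql -Atc "SELECT pg_last_xlog_replay_location();"

# ...and only then promote.
pg_ctl promote -D /path/to/slave/data
```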


As far as I remember (I can’t test it right now, but I am 99% sure), promoting the slave makes it impossible to connect the old master to the new one without taking a fresh base backup. The reason is the timeline change: the old master complains about it.

The only way to do this is:
1. Stop the master.
2. Restart the slave without recovery.conf.
3. Restart the old master with a recovery.conf.

I have done this a couple of times back and forth, and it "worked", in the sense that nothing complained.
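
In shell terms, the three steps are roughly this (paths, host name and conninfo are placeholders from my setup, not something to copy verbatim):

```shell
# 1. Stop the master cleanly, so no WAL is generated after the
#    slave's last replayed position.
pg_ctl stop -m fast -D /path/to/old_master/data

# 2. On the slave: move recovery.conf out of the way and restart.
#    Starting without recovery.conf brings it up as a normal primary,
#    with no promote and therefore no timeline switch.
mv /path/to/slave/data/recovery.conf /path/to/slave/data/recovery.conf.bak
pg_ctl restart -D /path/to/slave/data

# 3. On the old master: drop in a recovery.conf pointing at the new
#    master and start it as a standby.
cat > /path/to/old_master/data/recovery.conf <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=new-master port=5432 user=repl'
EOF
pg_ctl start -D /path/to/old_master/data
```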


    > I also thought that if there was a crash on the original master
   and it
    > applied WAL entries on itself that are not presented on the slave
   then it
    > will throw an error when I try to connect it to the new master
   (to the old
    > slave).

   I don't think you're going to be that lucky.

    > It would be nice to know as creating a base_backup takes much time.

   rsync can speed things up by copying only changed data, but yes,
   it's a problem.


Actually I am more afraid of rsyncing the database data files between the nodes than of trusting the PostgreSQL error log. There is no technical reason for that; it's more psychological.

Is it possible that the new master is missing changes that were never replicated from the old master, and that neither side notices this when the old master connects to it? I thought WAL records might have unique identifiers, but I don't know the details.
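
On the "unique identifiers" point: every WAL record has a log sequence number (LSN), a 64-bit position in the WAL stream printed as two hex halves, e.g. 1/6B374D8. Comparing two LSNs tells you which side is ahead. A toy illustration of the comparison (the positions below are made up, this is not a PostgreSQL tool):

```shell
# Convert an LSN of the form HI/LO (two hex halves) into a single
# 64-bit integer: HI is the high 32 bits, LO the low 32 bits.
lsn_to_int() {
    local hi=${1%%/*} lo=${1##*/}
    echo $(( (0x$hi << 32) | 0x$lo ))
}

master_lsn=$(lsn_to_int 1/6B374D8)  # hypothetical end-of-WAL on the old master
slave_lsn=$(lsn_to_int 1/6B37000)   # hypothetical replay position on the slave

# If the old master's final LSN is ahead of what the slave replayed,
# the old master has WAL the new master never saw.
if [ "$master_lsn" -gt "$slave_lsn" ]; then
    echo "old master has WAL the slave never replayed"
fi
```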
