I am working with PostgreSQL 9.1.3. I set up a master and a standby,
initiated replication and verified that it was occurring. I then failed
over from the master to the standby and verified that the database could
be updated on the new master. Next I configured the former standby as a
master and the former master as a standby, removed all previous data on
the new standby with rm -rf *, and used the following command from the
new master to transfer data to the new standby:

rsync -av --exclude postgresql.conf --exclude pg_xlog /var/lib/pgsql/9.1/data/* 192.7.143.213:/var/lib/pgsql/9.1/data

I then started the new standby first and, after a short wait, the new
master. The standby starts and initially logs connection errors, but
once the master is started the standby appears to begin streaming
replication and then fails with a timeline error. On the new master
everything looks good and the database can be accessed via psql.
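
Since the failure is a timeline error, one thing that could be compared on the two nodes is the timeline ID recorded at the latest checkpoint, which pg_controldata reports. The following is only a minimal sketch: the helper name checkpoint_timeline is made up for illustration, and it assumes pg_controldata is on the PATH and prints a "Latest checkpoint's TimeLineID:" line, as it does on 9.1.

```shell
#!/bin/sh
# Hypothetical helper: reads pg_controldata output on stdin and prints the
# timeline ID recorded at the latest checkpoint.
checkpoint_timeline() {
    # "." stands in for the apostrophe in "checkpoint's" to keep quoting simple.
    awk -F: '/^Latest checkpoint.s TimeLineID:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example use on each node (assumes the data directory from this post):
#   pg_controldata /var/lib/pgsql/9.1/data | checkpoint_timeline
```

Running this on the new master and on the new standby would show whether the two data directories agree on the timeline before the standby is started.
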
Here is the content of my recovery.conf file:

standby_mode=on
primary_conninfo='host=192.7.143.111 port=5432 user=ruser password=ruserpass'
trigger_file='/var/lib/pgsql/9.1/data/failover'
recovery_target_timeline = 'latest'

(Removing the recovery_target_timeline line has the same effect.)

Before New Master Starts:
2012-07-11 10:43:04.476 EDT---10876-LOCATION: libpqrcv_connect, libpqwalreceiver.c:102
2012-07-11 10:43:09.476 EDT---10877-FATAL: XX000: could not connect to the primary server: could not connect to server: Connection refused
        Is the server running on host 192.7.143.111 and accepting
        TCP/IP connections on port 5432?
2012-07-11 10:43:09.476 EDT---10877-LOCATION: libpqrcv_connect, libpqwalreceiver.c:102
2012-07-11 10:43:14.476 EDT---10878-FATAL: XX000: could not connect to the primary server: could not connect to server: Connection refused
        Is the server running on host 192.7.143.111 and accepting
        TCP/IP connections on port 5432?
2012-07-11 10:43:14.476 EDT---10878-LOCATION: libpqrcv_connect, libpqwalreceiver.c:102
After New Master Starts:
2012-07-11 10:43:19.479 EDT---10879-LOG: 0: streaming replication successfully connected to primary
2012-07-11 10:43:19.479 EDT---10879-LOCATION: libpqrcv_connect, libpqwalreceiver.c:171
2012-07-11 10:43:20.749 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:20.749 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:20.749 EDT---10879-FATAL: 57P01: terminating walreceiver process due to administrator command
2012-07-11 10:43:20.749 EDT---10879-LOCATION: ProcessWalRcvInterrupts, walreceiver.c:150
2012-07-11 10:43:20.849 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:20.849 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:24.849 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:24.849 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:29.849 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:29.849 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:34.850 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:34.851 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:38.415 EDT---10731-LOG: 0: received fast shutdown request
2012-07-11 10:43:38.415 EDT---10731-LOCATION: pmdie, postmaster.c:2251
2012-07-11 10:43:38.415 EDT---10738-LOG: 0: unexpected timeline ID 1 in log file 0, segment 2, offset 0
2012-07-11 10:43:38.415 EDT---10738-LOCATION: ValidXLOGHeader, xlog.c:4123
2012-07-11 10:43:38.416 EDT---10731-LOG: 0: startup process (PID 10738) exited with exit code 1
2012-07-11 10:43:38.416 EDT---10731-LOCATION: LogChildExit, postmaster.c:2867
2012-07-11 10:43:38.416 EDT---10731-LOG: 0: aborting startup due to startup process failure
2012-07-11 10:43:38.416 EDT---10731-LOCATION: reaper, postmaster.c:2377
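
To relate the "unexpected timeline ID 1 in log file 0, segment 2" message to a file on disk: as far as I understand, WAL segment file names in pg_xlog are 24 hex digits made up of the timeline ID, the log file number and the segment number (8 digits each). The small sketch below rests on that assumption; the helper names wal_file_name and wal_timeline are made up for illustration.

```shell
#!/bin/sh
# Assumed layout of a WAL segment file name: timeline, log file, segment,
# each printed as 8 upper-case hex digits.

# Build the segment file name for a given timeline, log file and segment.
wal_file_name() {
    printf '%08X%08X%08X\n' "$1" "$2" "$3"
}

# Extract the timeline ID (as a decimal number) from a segment file name.
wal_timeline() {
    printf '%d\n' "0x$(printf '%s' "$1" | cut -c1-8)"
}

wal_file_name 1 0 2                      # prints 000000010000000000000002
wal_timeline 000000020000000000000002    # prints 2
```

Listing pg_xlog on the standby and passing the file names through wal_timeline would show which timeline each segment there belongs to.
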
Al Gregorio
Volt Delta