On 2013-01-18 08:24:31 +0900, Michael Paquier wrote:
> On Fri, Jan 18, 2013 at 3:05 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>
> > I encountered the problem that the timeline switch is not performed
> > expectedly.
> > I set up one master, one standby and one cascade standby. All the servers
> > share the archive directory. restore_command is specified in the
> > recovery.conf in those two standbys.
> >
> > I shut down the master, and then promoted the standby. In this case, the
> > cascade standby should switch to new timeline and replication should be
> > successfully restarted. But the timeline was never changed, and the
> > following log messages were kept outputting.
> >
> > sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
> > sby2 LOG: replication terminated by primary server
> > sby2 DETAIL: End of WAL reached on timeline 1
> > sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
> > sby2 LOG: replication terminated by primary server
> > sby2 DETAIL: End of WAL reached on timeline 1
> > sby2 LOG: restarted WAL streaming at 0/3000000 on timeline 1
> > sby2 LOG: replication terminated by primary server
> > sby2 DETAIL: End of WAL reached on timeline 1
> >
> I am seeing similar issues with master at 88228e6.
> This is easily reproducible by setting up 2 slaves under a master, then
> kill the master. Promote slave 1 and reconnect slave 2 to slave 1, then
> you will notice that the timeline jump is not done.

Can you reproduce that one with 7fcbf6a^ (i.e. before xlogreader got split
off)?

> The replication delays are still here.

That one is caused by this nice bug, courtesy of yours truly:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 90ba32e..1174493 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8874,7 +8874,7 @@ retry:
     /* See if we need to retrieve more data */
     if (readFile < 0 ||
         (readSource == XLOG_FROM_STREAM &&
-         receivedUpto <= targetPagePtr + reqLen))
+         receivedUpto < targetPagePtr + reqLen))
     {
         if (StandbyMode)
         {

I didn't notice because I had a test script inserting stuff continuously, so
it caused a lag of at most one record...

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
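
For anyone following along, the boundary case behind the one-character change
in the diff above can be sketched in isolation. This is a simplified model
under stated assumptions, not the xlog.c code: the XLogRecPtr typedef here is a
plain 64-bit integer, and the helper names need_more_data_old,
need_more_data_fixed and check_boundary_case are invented for illustration.
Only receivedUpto, targetPagePtr and reqLen come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a WAL position; an assumption for this sketch. */
typedef uint64_t XLogRecPtr;

/* Old comparison: reports "need more data" even when the stream has
 * delivered exactly up to the end of the requested range. */
static bool
need_more_data_old(XLogRecPtr receivedUpto, XLogRecPtr targetPagePtr,
                   uint32_t reqLen)
{
    return receivedUpto <= targetPagePtr + reqLen;
}

/* Patched comparison: only reports "need more data" when the received
 * position is strictly short of the end of the requested range. */
static bool
need_more_data_fixed(XLogRecPtr receivedUpto, XLogRecPtr targetPagePtr,
                     uint32_t reqLen)
{
    return receivedUpto < targetPagePtr + reqLen;
}

/* Boundary case: the received position equals the end of the request. */
static void
check_boundary_case(void)
{
    XLogRecPtr targetPagePtr = 60;
    uint32_t   reqLen = 40;
    XLogRecPtr receivedUpto = targetPagePtr + reqLen;

    /* Old check keeps waiting although all requested bytes are there. */
    assert(need_more_data_old(receivedUpto, targetPagePtr, reqLen));
    /* Patched check lets the reader proceed. */
    assert(!need_more_data_fixed(receivedUpto, targetPagePtr, reqLen));
}
```

In this model the two checks differ only when receivedUpto lands exactly on
targetPagePtr + reqLen, which would fit the symptom described above: with
continuous inserts more WAL soon arrives and hides the wait, while an idle
stream leaves the last record unread.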