> On Tuesday, September 25, 2012 6:29 PM Heikki Linnakangas wrote: > On 25.09.2012 10:08, Heikki Linnakangas wrote: > > On 24.09.2012 16:33, Amit Kapila wrote: > >> In any case, it will be better if you can split it into multiple > patches: > >> 1. Having new functionality of "Switching timeline over streaming > >> replication" > >> 2. Refactoring related changes.
> Ok, here you go. xlog-c-split-1.patch contains the refactoring of existing code, with no user-visible changes. > streaming-tli-switch-2.patch applies over xlog-c-split-1.patch, and contains the new functionality. Please find the initial review of the patch. Still more review is pending, but I thought whatever is done I shall post Basic stuff: ---------------------- - Patch applies OK - Compiles cleanly with no warnings - Regression tests pass. - Documentation changes are mostly fine. - Basic replication tests works. Testing --------- Start primary server Start standby server Start cascade standby server Stopped the primary server Promoted the standby server with ./pg_ctl -D data_repl promote In postgresql.conf file archive_mode = off The following logs are observing in the cascade standby server. LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 LOG: walreceiver ended streaming and awaits new instructions LOG: record with zero length at 0/17E3888 LOG: re-handshaking at position 0/1000000 on tli 1 LOG: fetching timeline history file for timeline 2 from primary server LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 LOG: walreceiver ended streaming and awaits new instructions LOG: re-handshaking at position 0/1000000 on tli 1 LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 In postgresql.conf file archive_mode = on The following logs are observing in the cascade standby server. LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 LOG: walreceiver ended streaming and awaits new instructions sh: /home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000 00000002: No such file or directory LOG: record with zero length at 0/20144B8 sh: /home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000 00000002: No such file or directory LOG: re-handshaking at position 0/2000000 on tli 1 LOG: fetching timeline history file for timeline 2 from primary server LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 LOG: walreceiver ended streaming and awaits new instructions sh: /home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000 00000002: No such file or directory sh: /home/amit/installation/bin/data_sub/pg_xlog/archive_status/0000000100000000 00000002: No such file or directory LOG: re-handshaking at position 0/2000000 on tli 1 LOG: replication terminated by primary server DETAIL: End of WAL reached on timeline 1 LOG: walreceiver ended streaming and awaits new instructions Verified that files are present in respective directories. Code Review ---------------- 1. In function readTimeLineHistory(), two mechanisms are used to fetch timeline from history file + sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi, &switchpoint_lo); + + /* expect a numeric timeline ID as first field of line */ + tli = (TimeLineID) strtoul(ptr, &endptr, 0); If we use new mechanism, it will not be able to detect error as it is doing in current case. 2. In function readTimeLineHistory(), + fd = AllocateFile(path, "r"); + if (fd == NULL) + { + if (errno != ENOENT) + ereport(FATAL, + (errcode_for_file_access(), + errmsg("could not open file \"%s\": %m", path))); + /* Not there, so assume no parents */ + return list_make1_int((int) targetTLI); + } still return list_make1_int((int) targetTLI); is used. 3. Function timelineOfPointInHistory(), should return the timeline of recptr passed to it. a. is it okay to decide based on xlog recordpointer that which timeline it belongs to, as different timelines can have same xlog recordpointer? b. it seems from logic that it will return timeline previous to the timeline of recptr passed. For example if the timeline 3's switchpoint is equal to recptr passed then it will return timeline 2. 4. In writeTimeLineHistory function variable endTLI is never used. 5. In header of function writeTimeLineHistory(), can give explanation about XLogRecPtr switchpoint 6. @@ -6869,11 +5947,35 @@ StartupXLOG(void) */ if (InArchiveRecovery) { + char reason[200]; + + /* + * Write comment to history file to explain why and where timeline + * changed. Comment varies according to the recovery target used. + */ + if (recoveryTarget == RECOVERY_TARGET_XID) + snprintf(reason, sizeof(reason), + "%s transaction %u", + recoveryStopAfter ? "after" : "before", + recoveryStopXid); In the comment above this line you mentioned why and where timeline changed. However in the reason field only specifies about where part. 7. + * Returns the redo pointer of the "previous" checkpoint. +GetOldestRestartPoint(XLogRecPtr *oldrecptr, TimeLineID *oldtli) +{ + if (InRedo) + { + LWLockAcquire(ControlFileLock, LW_SHARED); + *oldrecptr = ControlFile->checkPointCopy.redo; + *oldtli = ControlFile->checkPointCopy.ThisTimeLineID; + LWLockRelease(ControlFileLock); + } a. In this function, is it required to take ControlFileLock as earlier also there was no lock to protect this read when it get called from RestoreArchivedFile, and I think at this point no one else can modify these values. However for code consistency purpose like whenever or wherever read the controlfile values, read it with read lock. b. As per your comment it should have returned "previous" checkpoint, however the function returns values of latest checkpoint. 8. In function writeTimeLineHistoryFile(), will it not be better to directly write rather than to later do pg_fsync(). as it is just one time write. 9. +XLogRecPtr +timeLineSwitchPoint(XLogRecPtr startpoint, TimeLineID tli) .. .. + * starting point. This is because the client can legimately spelling of legitimately needs to be corrected. 10.+XLogRecPtr +timeLineSwitchPoint(XLogRecPtr startpoint, TimeLineID tli) .. .. + if (tli < ThisTimeLineID) + { + if (!nexttle) + elog(ERROR, "could not find history entry for child of timeline %u", tli); /* shouldn't happen */ + } I don't understand the meaning of the above check, as I think this situation can occur when this function gets called from StartReplication, because always tli sent by standby to new master will be less than ThisTimeLineID and it can be first in list. Documentation --------------- 1. In explanation of TIMELINE_HISTORY: Filename of the timeline history file. This is always of the form [insert correct example here]. Give example. 2. In protocol.sgml change, I feel better explain when the COPYDONE message will be initiated. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers