> Consider a scenario like this,
>
> Server A: primary
> Server B: replica of A
> Server C: replica of B
>
> and somehow A down, so B gets promoted.
> Server A: down
> Server B: new primary
> Server C: replica of B
>
> In this case, pg_rewind can be used to reconstruct the cascade; the source is C and the target is A.
> However, we get error as belows by running pg_rewind.
>
> ```
> pg_rewind: fetched file "global/pg_control", length 8192
> pg_rewind: source and target cluster are on the same timeline
> pg_rewind: no rewind required
> ```

To fix the pg_rewind behavior described above, I suggest updating the cascade standby's (i.e. server C's) minRecoveryPointTLI when it receives the new timeline information from the new primary (i.e. server B).

When server B is promoted, it creates an end-of-recovery record by calling CreateEndOfRecoveryRecord() (in xlog.c), which also updates B's minRecoveryPoint and minRecoveryPointTLI:

```
	/*
	 * Update the control file so that crash recovery can follow the timeline
	 * changes to this point.
	 */
	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
	ControlFile->minRecoveryPoint = recptr;
	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
	UpdateControlFile();
	LWLockRelease(ControlFileLock);
```

Since C is a replica of B, the end-of-recovery record is replicated from B to C and is then replayed on C by xlog_redo(). The attached patch updates C's minRecoveryPoint and minRecoveryPointTLI at this point, mimicking CreateEndOfRecoveryRecord(). With this patch, you can run pg_rewind with a cascade standby as the source immediately, without waiting for a checkpoint.

Thoughts?

Masaki Kuwamura
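
P.S. For readers who want the idea without opening the patch, here is a standalone toy model in C. It is only a sketch and not the patch itself. ToyControlFile, toy_replay_end_of_recovery() and toy_rewind_needed() are hypothetical names invented for illustration, and the timeline comparison is a deliberate simplification that assumes the check only compares the timeline IDs recorded in the two control files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the PostgreSQL types; NOT the real definitions. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

/* Hypothetical, heavily reduced view of the control file. */
typedef struct ToyControlFile
{
	XLogRecPtr	minRecoveryPoint;
	TimeLineID	minRecoveryPointTLI;
} ToyControlFile;

/*
 * Hypothetical helper: what a standby could do while replaying an
 * end-of-recovery record, mirroring the control file update that the
 * promoted server performs in CreateEndOfRecoveryRecord().
 * Locking and the actual write to disk are omitted in this model.
 */
static void
toy_replay_end_of_recovery(ToyControlFile *cf, XLogRecPtr recptr,
						   TimeLineID new_tli)
{
	assert(cf != NULL);
	cf->minRecoveryPoint = recptr;
	cf->minRecoveryPointTLI = new_tli;
}

/*
 * Assumed simplification of the timeline check: a rewind is considered
 * only when the recorded timeline IDs differ.
 */
static int
toy_rewind_needed(const ToyControlFile *source, const ToyControlFile *target)
{
	return source->minRecoveryPointTLI != target->minRecoveryPointTLI;
}
```

In this model, if A and C both still record timeline 1, toy_rewind_needed(&c, &a) returns 0, which corresponds to the "no rewind required" case quoted above. After toy_replay_end_of_recovery(&c, recptr, 2), the same call returns 1.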
v1-0001-pg_rewind-Fix-bug-using-cascade-standby-as-source.patch