On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Tue, Sep 8, 2015 at 1:14 AM, Alexander Korotkov wrote: > > On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier wrote: > >> The code above looks correct to me when scanning the WAL history onwards > >> though, which is what is done when extracting the page map, but not > >> backwards when we try to find the last common checkpoint record. This > code > >> actually fails trying to open 2/0/2 that does not exist in the promoted > >> standby's pg_xlog in my test case. > >> > >> Attached is a small script I have used to reproduce the failure. > > > > > > Right, thanks! It should be fixed in the attached version of patch. > > So, instead of a code review, I have been torturing your patch and did > advanced tests on it with the attached script, that creates a cluster > as follows: > master (5432) > / \ > 1 (5433) 2 (5434) > | > 3 (5435) > Once cluster is created, nodes are promoted in a certain order giving > them different timeline jump properties: > - master, stays on tli 1 > - standby 1, tli 1->2 > - standby 2, tli 1->3 > - standby 3, tli 1->2->4 > And data is inserted on each of them to make WAL fork properly. > Finally the script tries to rewind one node using another node as > source, and then tries to link this target node back to the source > node via streaming replication. > > I have tested all the one-one permutations possible in the structure > above (see commented portions at the bottom of my script), and all of > them worked. I have to say that from the testing prospective this > patch looks in great shape, and will greatly improve the use cases of > pg_rewind! > Great! Many thanks for your testing! I am planning to do as well a detailed code review rather soon. > Good, thanks. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company