On 23.12.2012 16:37, Fujii Masao wrote:
On Fri, Dec 21, 2012 at 1:48 AM, Fujii Masao<masao.fu...@gmail.com>  wrote:
On Sat, Dec 15, 2012 at 9:36 AM, Fujii Masao<masao.fu...@gmail.com>  wrote:
I found another "requested timeline does not contain minimum recovery point"
error scenario in HEAD:

1. Set up the master 'M', one standby 'S1', and one cascade standby 'S2'.
2. Shutdown the master 'M' and promote the standby 'S1', and wait for 'S2'
     to reconnect to 'S1'.
3. Set up new cascade standby 'S3' connecting to 'S2'.
     Then 'S3' fails to start the recovery because of the following error:

     FATAL:  requested timeline 2 does not contain minimum recovery point 0/3000000 on timeline 1
     LOG:  startup process (PID 33104) exited with exit code 1
     LOG:  aborting startup due to startup process failure

The result of pg_controldata of 'S3' is:

Latest checkpoint location:           0/3000088
Prior checkpoint location:            0/2000060
Latest checkpoint's REDO location:    0/3000088
Latest checkpoint's REDO WAL file:    000000020000000000000003
Latest checkpoint's TimeLineID:       2
<snip>
Min recovery ending location:         0/3000000
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0

The content of the timeline history file '00000002.history' is:

1       0/3000088       no recovery target specified

I could still reproduce this problem. Attached is a shell script
that reproduces it.

This problem happens when a new standby starts up from a backup
taken from another standby and its recovery starts from a shutdown
checkpoint record that causes a timeline switch. In this case,
the timeline of the minimum recovery point can differ from that of
the latest checkpoint (i.e., the shutdown checkpoint). But the following
check in StartupXLOG() wrongly assumes that they are always the same,
so the problem happens.

        /*
         * The min recovery point should be part of the requested timeline's
         * history, too.
         */
        if (!XLogRecPtrIsInvalid(ControlFile->minRecoveryPoint) &&
                tliOfPointInHistory(ControlFile->minRecoveryPoint - 1, expectedTLEs) !=
                        ControlFile->minRecoveryPointTLI)
                ereport(FATAL,
                                (errmsg("requested timeline %u does not contain minimum recovery point %X/%X on timeline %u",
                                                recoveryTargetTLI,
                                                (uint32) (ControlFile->minRecoveryPoint >> 32),
                                                (uint32) ControlFile->minRecoveryPoint,
                                                ControlFile->minRecoveryPointTLI)));

No, it doesn't assume that min recovery point is on the same timeline as the checkpoint record. This is another variant of the "timeline history files are not included in the backup" problem discussed on the other thread with subject "pg_basebackup from cascading standby after timeline switch". If you remove the min recovery point check above, the test case still fails, with a different error message:

LOG: unexpected timeline ID 1 in log segment 000000020000000000000003, offset 0

If you modify the test script to copy the 00000002.history file into data-standby3/pg_xlog after running pg_basebackup, the test case works. (We still need to fix it, of course.)
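
To illustrate why the missing history file trips the check, here is a simplified standalone model of the lookup. It is only a sketch, not the backend code: HistoryEntry, tli_of_point() and min_recovery_point_ok() are hypothetical names, 0 stands in for an invalid/unbounded pointer, and it assumes that a missing history file leaves the expected-timeline list with a single entry for the target timeline and no parents. The values are taken from the pg_controldata output and the 00000002.history file quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* WAL position as a flat 64-bit value */
typedef uint32_t TimeLineID;

/* Hypothetical, simplified history entry; 0 means "no bound" here. */
typedef struct
{
    TimeLineID  tli;
    XLogRecPtr  begin;          /* inclusive */
    XLogRecPtr  end;            /* exclusive */
} HistoryEntry;

/* With 00000002.history present: timeline 1 ends at 0/3000088 (newest first). */
const HistoryEntry with_history_file[] = {
    {2, 0x3000088, 0},
    {1, 0, 0x3000088},
};

/* Assumed state when the history file is missing: target timeline only. */
const HistoryEntry without_history_file[] = {
    {2, 0, 0},
};

/* Return the timeline whose range contains ptr, or 0 if none does. */
TimeLineID
tli_of_point(XLogRecPtr ptr, const HistoryEntry *history, size_t n)
{
    assert(history != NULL && n > 0);

    for (size_t i = 0; i < n; i++)
    {
        const HistoryEntry *e = &history[i];

        if ((e->begin == 0 || e->begin <= ptr) &&
            (e->end == 0 || ptr < e->end))
            return e->tli;
    }
    return 0;
}

/* Model of the StartupXLOG() check: 1 if it passes, 0 if it would FATAL. */
int
min_recovery_point_ok(XLogRecPtr minRecoveryPoint,
                      TimeLineID minRecoveryPointTLI,
                      const HistoryEntry *history, size_t n)
{
    if (minRecoveryPoint == 0)
        return 1;               /* no min recovery point, nothing to check */

    return tli_of_point(minRecoveryPoint - 1, history, n) ==
        minRecoveryPointTLI;
}
```

Calling min_recovery_point_ok(0x3000000, 1, ...) with with_history_file passes, because the point just before 0/3000000 resolves to timeline 1. With without_history_file the same point resolves to timeline 2, which does not match the min recovery point's timeline, so the check fails even though it makes no assumption about the checkpoint's timeline.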

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)