On 02.11.2010 07:15, Fujii Masao wrote:
On Mon, Nov 1, 2010 at 8:32 PM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com>  wrote:
Yeah, that's one approach. Another is to validate the TLI in the xlog page
header, it should always match the current timeline we're on. That would
feel more robust to me.

Yeah, that seems better.

We're a bit fuzzy about what TLI is written in the page header when the
timeline-changing checkpoint record is written, though. If the checkpoint
record fits in the previous page, the page will carry the old TLI, but if
the checkpoint record begins a new WAL page, the new page is initialized
with the new TLI. I think we should rearrange that so that the page header
will always carry the old TLI.

Or after rescanning the timeline history files, what about refetching the last
applied record and checking whether the TLI in the xlog page header is the
same as the previous TLI? IOW, what about using the header of the xlog page
containing the last applied record instead of that of the following checkpoint record?

I guess that would work too, but it seems problematic to move backwards during recovery.
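
The forward-only alternative discussed above (validating the TLI in each xlog page header against the timeline recovery is currently on) could look roughly like the sketch below. All type and function names here are hypothetical stand-ins, not the actual xlog.c structures, and it assumes the rule proposed above that the page holding the timeline-changing checkpoint record still carries the old TLI.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Simplified stand-in for a WAL page header (hypothetical, one field only). */
typedef struct SketchPageHeader
{
    TimeLineID page_tli;    /* timeline the page was written on */
} SketchPageHeader;

/* Hypothetical recovery state: the timeline currently being replayed. */
typedef struct SketchRecoveryState
{
    TimeLineID cur_tli;
} SketchRecoveryState;

/* A page is acceptable only if its TLI matches the timeline we are on. */
bool
sketch_page_tli_ok(const SketchRecoveryState *state,
                   const SketchPageHeader *hdr)
{
    return hdr->page_tli == state->cur_tli;
}

/*
 * Called after replaying the timeline-changing checkpoint record.
 * Assumption: that record's own page carried the old TLI, so the
 * expected TLI changes only for pages read after this call.
 */
void
sketch_switch_timeline(SketchRecoveryState *state, TimeLineID new_tli)
{
    state->cur_tli = new_tli;
}
```

With this shape the check never has to re-read an earlier record; it only compares each page as it is read going forward.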

Anyway ISTM we should also check that the min recovery point is not ahead
of the TLI switch location. So we need to fetch the record at the min recovery
point and validate the TLI of the xlog page header. Otherwise, the database
might get corrupted. This can happen, for example, when you remove all the
WAL files in the pg_xlog directory and restart the standby.

Yes, that's another problem. We don't know which timeline the min recovery point refers to. We should store the TLI along with minRecoveryPoint, so that we can at least check that we're on the right timeline when we reach minRecoveryPoint, and throw an error if we're not.
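
As a rough illustration of that idea, here is a minimal sketch assuming a hypothetical control-data struct with an extra TLI field next to the min recovery point. The names are invented for the example, and the WAL position is flattened to a single integer for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;
typedef uint64_t SketchLSN;    /* WAL position as one integer (simplification) */

/* Hypothetical control data: min recovery point plus the proposed TLI. */
typedef struct SketchControlData
{
    SketchLSN  min_recovery_point;
    TimeLineID min_recovery_point_tli;    /* the proposed addition */
} SketchControlData;

typedef enum SketchCheck
{
    SKETCH_NOT_REACHED,      /* still replaying towards the min recovery point */
    SKETCH_CONSISTENT,       /* reached it on the recorded timeline */
    SKETCH_WRONG_TIMELINE    /* reached it on a different timeline: error out */
} SketchCheck;

/* Compare replay progress and current timeline against the stored values. */
SketchCheck
sketch_check_min_recovery_point(const SketchControlData *ctl,
                                SketchLSN replayed_upto,
                                TimeLineID cur_tli)
{
    if (replayed_upto < ctl->min_recovery_point)
        return SKETCH_NOT_REACHED;
    if (cur_tli != ctl->min_recovery_point_tli)
        return SKETCH_WRONG_TIMELINE;
    return SKETCH_CONSISTENT;
}
```

A caller would treat SKETCH_WRONG_TIMELINE as a fatal recovery error rather than declaring the database consistent.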

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
