On 3/21/17 8:45 PM, Venkata B Nagothi wrote:
On Tue, Mar 21, 2017 at 8:46 AM, David Steele <da...@pgmasters.net

    Unfortunately, I don't think the first patch (recoveryStartPoint)
    will work as currently implemented.  The problem I see is that the
    new function recoveryStartsHere() depends on pg_control containing a
    checkpoint right at the end of the backup.  There's no guarantee
    that this is true, even if pg_control is copied last.  That means a
    time, lsn, or xid that occurs somewhere in the middle of the backup
    can be selected without complaint from this code depending on timing.


Yes, that is true.  The latest best position, checkpoint position, xid
and timestamp of the restored backup of the backup is shown up in the
pg_controldata, which means, that is the position from which the
recovery would start.

Backup recovery starts from the checkpoint in the backup_label, not from the checkpoint in pg_control. The original checkpoint that started the backup is generally overwritten in pg_control by the end of the backup.

Which in-turn means, WALs start getting replayed
from that position towards --> minimum recovery position (which is the
end backup, which again means applying WALs generated between start and
the end backup) all the way through to  --> recovery target position.

minRecoveryPoint is only used when recovering a backup that was made from a standby. For backups taken on the master, the backup end WAL record is used.

The best start position to check with would the position shown up in the
pg_control file, which is way much better compared to the current
postgresql behaviour.

I don't agree, for the reasons given previously.

    The tests pass (or rather fail as expected) because they are written
    using values before the start of the backup.  It's actually the end
    of the backup (where the database becomes consistent on recovery)
    that defines where PITR can start and this distinction definitely
    matters for very long backups.  A better test would be to start the
    backup, get a time/lsn/xid after writing some data, and then make
    sure that time/lsn/xid is flagged as invalid on recovery.

The current behaviour is that, if the XID shown-up by the pg_controldata
is say 1400 and the specified recovery_target_xid is say 200, then,
postgresql just completes the recovery and promotes at the immediate
consistent recovery target, which means, the DBA has to all the way
start the restoration and recovery process again to promote the database
at the intended position.

I'm not saying that the current behavior is good, only that the proposed behavior is incorrect as far as I can tell.

    It is also problematic to assume that transaction IDs commit in
    order. If checkPoint.latestCompletedXid contains 5 then a recovery
    to xid 4 may or may not be successful depending on the commit
    order.  However, it appears in this case the patch would disallow
    recovery to 4.

If the txid 4 is the recovery target xid, then, the appropriate backup
taken previous to txid 4 must be used or as an alternative recovery
target like recovery_target_timestamp must be used to proceed to the
respective recovery target xid.

I'm not sure I follow you here, but I do know that using the last committed xid says little about which xids precede or follow it.

    I think this logic would need to be baked into recoveryStopsAfter()
    and/or recoveryStopsBefore() in order to work.  It's not clear to me
    what that would look like, however, or if the xid check is even
    practical.


The above two functions would determine if the recovery process has to
stop at a particular position or not and the patch has nothing to do
with it.

To summarise, this patch only determines if the WAL replay has to start
at all. In other words, if the specified recovery target falls prior to
the restored database position, then, the WAL replay should not start,
which prevent the database from getting promoted at an unintended
recovery target.

I understand what you are trying to do. My point is that the information you are working from (whatever checkpoint happens to be in pg_control when recovery starts) is not sufficient for the task. This method is inaccurate and would disallow valid recovery scenarios.

I suggest doing a thorough read-through of StartupXLOG(), particularly the section where the backup_label is loaded.

Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to