Since Phase 1 is functioning and should soon be complete, we can now start thinking about Phase 2: full recovery to a point in time.
Previous thinking was that a command-line switch would specify recovery to a given point in time, rather than the default, which will be to recover all the way to the end of the (available) xlogs.
Recovering to a specific point in time forces us to consider the granularity of time. We could recover:
1. to the end of a full transaction log file
2. to the end of a full transaction
Transaction log files currently have timestamps, so (1) is straightforward, but probably not the best we can do. We would roll forward until the xlog file time > desired point in time.
To make (2) work, we would have to have a timestamp associated with each transaction. This could live in one of two places:
1. the transaction record in the clog
2. the log record in the xlog
We would then recover the xlog record by record, until we found a record with a timestamp > desired point in time.
Currently, neither of these places has a timestamp. Hmmmm. We can't use pg_control because we are assuming that it needs recovery...
I can't see any general way of adding a timestamp in less than 2 bytes. We don't need a timezone. The timestamp could be a number of seconds since the last checkpoint, since checkpoint frequency is already bounded by a GUC.

Although the code avoids a checkpoint if no updates have taken place, we wouldn't be too remiss to use a forced checkpoint every 32,000 seconds (about 9 hours), assuming that accuracy of the point in time is on the order of seconds. If we went to 0.1-second accuracy, we could force a checkpoint every 54 minutes or so. All of that seems too restrictive. If we went to milliseconds, then we could use a 4-byte value and force a checkpoint every 596 hours, or about 3.5 weeks. Thoughts?
Clog uses 2 bits per transaction, so even 2 extra bytes per transaction would make the clog 9 times larger than originally intended. This could well cause it to segment quicker, and I'm sure no one would be happy with that. So, let's not add anything to the clog.
The alternative is to make the last part of the XlogHeader record a timestamp value, increasing the size of each xlog write. It might be possible to make this part of the header optional, depending upon whether or not PITR is required, but my preference is against such dynamic coding.
So, I propose:
- appending 8 bytes of date/time data to the xlog file header record
- appending a 4-byte time offset to each xlog record
- altering the recovery logic to compare the calculated time of each xlog record (file header + offset) against the desired point in time, delivered to it by GUC
Input is sought from anybody with detailed NTP knowledge, since the working of NTP drift correction may have some subtle interplay with this proposal.
Also, while that code is being altered, some additional log records need
to be added when recovery of each new xlog starts, with timing, to allow
DBAs watching a recovery to calculate expected completion times for the
recovery, which is essential for long recovery situations.
I am also considering any changes that may be required to prepare the way for a future implementation of parallel redo recovery.
Best regards, Simon Riggs, 2ndQuadrant http://www.2ndquadrant.com
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend
Simon,
I have one question which might be important: if we use timestamps inside the WAL system to find out where to stop, what happens if somebody changes the system time (e.g. correcting the system clock by calling ntpdate)? Wouldn't that confuse the PITR system? How do you plan to handle that? Unfortunately, time is not something that can be used as a key (at least not from my point of view).
Just some lousy ideas early in the morning ...
Regards,
Hans
-- Cybertec Geschwinde u Schoenig Schoengrabern 134, A-2020 Hollabrunn, Austria Tel: +43/2952/30706 or +43/664/233 90 75 www.cybertec.at, www.postgresql.at, kernel.cybertec.at