Re: [HACKERS] Skip checkpoint on promoting from streaming replication

Kyotaro HORIGUCHI Thu, 21 Jun 2012 21:04:37 -0700

Hello,

> Is it guaranteed that all the files (e.g., the latest timeline history file)
> required for such crash recovery exist in pg_xlog? If yes, your
> approach might work well.


For promotion in particular, the required files are the history
file of the latest timeline, the WAL file containing the redo
location of the latest restartpoint, and every WAL file after
that one, each on the appropriate timeline.

In the current (9.2/9.3dev) implementation, as far as I know,
archive recovery and streaming replication create the WAL files
required during recovery as regular files in the slave's pg_xlog
directory. Only a restartpoint removes them, and only those older
than the segment on which the restartpoint takes place. If so,
all the required files mentioned above should be in the pg_xlog
directory. Is there something I've forgotten?
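
To make the file list above concrete, here is a minimal sketch in
Python (not PostgreSQL code). It assumes 16MB segments, the
segment numbering where each 8-hex-digit "log" field covers 256
segments, and a single timeline from the redo point onward. The
function names and the segment-number arguments are hypothetical.

```python
# Assumption: 16MB WAL segments, so 0x100000000 / 16MB = 256 segments
# share the same middle ("log") field of the file name.
SEGMENTS_PER_XLOGID = 0x100000000 // (16 * 1024 * 1024)


def wal_file_name(tli, segno):
    """Build the 24-hex-digit WAL file name: timeline, log, segment."""
    return "%08X%08X%08X" % (
        tli,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def history_file_name(tli):
    """Build the timeline history file name for timeline tli."""
    return "%08X.history" % tli


def required_files(tli, redo_segno, last_segno):
    """List the files needed to recover from the latest restartpoint.

    Simplification: every segment from the redo segment to the last
    one is assumed to be on the same timeline tli.
    """
    files = []
    if tli > 1:
        # The initial timeline (1) has no history file.
        files.append(history_file_name(tli))
    for segno in range(redo_segno, last_segno + 1):
        files.append(wal_file_name(tli, segno))
    return files
```

For example, required_files(2, 255, 257) names the history file of
timeline 2 followed by three consecutive segment files.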

However, it would be more robust if we could check that all the
required files are available on promotion. I can think of two
approaches that might accomplish that.

1. When reading, record the id of any WAL segment that is not in
   pg_xlog as a regular WAL file.

   For example, if we modify archive recovery so that it places
   its working WAL files outside pg_xlog, or gives them a special
   name under which a later crash recovery cannot find them,
   record the id of that segment. The shutdown checkpoint on
   promotion or at the end of recovery cannot be skipped if this
   recorded segment id is equal to or larger than the redo point
   of the latest checkpoint. This approach of course leaves fewer
   chances to skip the shutdown checkpoint than forcing all
   required files to be copied into pg_xlog would, but it still
   seems effective for the most common cases, say, promoting
   enough minutes after WAL streaming started for a restartpoint
   to have been taken on a WAL segment in pg_xlog.

   I hope this is promising.

   Temporary WAL files for streaming? It seems to me that this
   would make the shutdown checkpoint mandatory, since none of
   the WAL files from before promotion would be accessible at
   that point. On the other hand, somehow preserving the WAL
   after the latest restartpoint seems not to differ
   significantly from the current way in terms of disk
   consumption.

2. Check for all required WAL files on promotion or end of
   recovery.

   We could check that all required files exist on promotion by
   scanning them in a manner similar to recovery. But this
   requires either adding code that duplicates the existing code
   or the work of weaving a new function into the existing
   code. Furthermore, it seems likely to take some time on
   promotion (or at the end of recovery).

   The discussion about temporary WAL files would be the same as
   in 1.
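
The two approaches can be sketched roughly as follows, again in
Python rather than as PostgreSQL code. Both function names are
hypothetical. For approach 1, it is assumed that only the newest
segment read from outside pg_xlog is recorded, with None meaning
that no such segment was seen. For approach 2, the check only
tests that each file exists and does not scan the WAL records the
way recovery would.

```python
import os


def can_skip_checkpoint(recorded_segno, redo_segno):
    """Approach 1: decide from the recorded segment id alone.

    recorded_segno: newest segment that was NOT read from pg_xlog as
                    a regular WAL file, or None (assumption).
    redo_segno:     segment holding the redo point of the latest
                    checkpoint.
    The checkpoint cannot be skipped if the recorded id is equal to
    or larger than the redo segment.
    """
    if recorded_segno is None:
        return True
    return recorded_segno < redo_segno


def missing_files(pg_xlog_dir, required):
    """Approach 2: return the required file names absent from pg_xlog.

    required is a list of file names (history file and WAL segments).
    An empty result means every required file is present.
    """
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(pg_xlog_dir, name))
    ]
```

The first function is a constant-time comparison, while the second
touches every required file, which is where the extra time on
promotion would come from.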


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
