From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Simon Riggs > This > * reduces disk space requirements on master > * removes a minor bug in fast failover > * simplifies code
I welcome this patch. I was wondering why PostgreSQL retains the previous checkpoint. (Although unrelated to this, I've also been wondering why PostgreSQL flushes WAL to disk when writing a page in the shared buffer, because PostgreSQL doesn't use WAL for undo.) Here are my review comments. (1) - * a) we keep WAL for two checkpoint cycles, back to the "prev" checkpoint. + * a) we keep WAL for only one checkpoint cycle (prior to PG11 we kept + * WAL for two checkpoint cycles to allow us to recover from the + * secondary checkpoint if the first checkpoint failed, though we + * only did this on the master anyway, not on standby. Keeping just + * one checkpoint simplifies processing and reduces disk space in + * many smaller databases.) /* - * The last valid checkpoint record required for a streaming - * recovery exists in neither standby nor the primary. + * We used to attempt to go back to a secondary checkpoint + * record here, but only when not in standby_mode. We now + * just fail if we can't read the last checkpoint because + * this allows us to simplify processing around checkpoints. */ if (fast_promote) { - checkPointLoc = ControlFile->prevCheckPoint; + /* + * It appears to be a bug that we used to use prevCheckpoint here + */ + checkPointLoc = ControlFile->checkPoint; - XLByteToSeg(PriorRedoPtr, _logSegNo, wal_segment_size); + /* Trim from the last checkpoint, not the last - 1 */ + XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size); I suggest to remove references to the secondary checkpoint (the old behavior) from the comments. I'm afraid those could confuse new developers. (2) @@ -8110,10 +8100,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, ereport(LOG, (errmsg("invalid primary checkpoint link in control file"))); break; @@ -8135,10 +8121,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, ereport(LOG, (errmsg("invalid primary checkpoint record"))); break; @@ -8154,10 +8136,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, ereport(LOG, (errmsg("invalid resource manager ID in primary checkpoint record"))); break; etc, etc. "primary checkpoint" should be just "checkpoint", now that the primary/secondary concept will disappear. (3) Should we change the default value of max_wal_size from 1 GB to a smaller size? I vote for "no" for performance. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers