Re: [HACKERS] 16-bit page checksums for 9.2

Robert Haas Tue, 03 Jan 2012 17:50:24 -0800

On Fri, Dec 30, 2011 at 11:58 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On 12/29/11, Ants Aasma <ants.aa...@eesti.ee> wrote:
>> Unless I'm missing something, double-writes are needed for all writes,
>> not only the first page after a checkpoint. Consider this sequence of
>> events:
>>
>> 1. Checkpoint
>> 2. Double-write of page A (DW buffer write, sync, heap write)
>> 3. Sync of heap, releasing DW buffer for new writes.
>>  ... some time goes by
>> 4. Regular write of page A
>> 5. OS writes one part of page A
>> 6. Crash!
>>
>> Now recovery comes along, page A is broken in the heap with no
>> double-write buffer backup nor anything to recover it by in the WAL.
>
> Isn't 3 the very definition of a checkpoint, meaning that 4 is not
> really a regular write as it is the first one after a checkpoint?


I think you nailed it.

> But it doesn't seem safe to me to replace a page from the DW buffer and
> then apply WAL to that replaced page which preceded the age of the
> page in the buffer.

That's what LSNs are for.

If we write the page to the double-write buffer just once per
checkpoint, recovery can restore the double-written versions of the
pages and then begin WAL replay, which will restore all the subsequent
changes made to the page.  Recovery may also need to do additional
double-writes if it encounters pages for which we wrote WAL but
never flushed the buffer, because a crash during recovery can also
create torn pages.  When we reach a restartpoint, we fsync everything
down to disk and then nuke the double-write buffer.  Similarly, in
normal running, we can nuke the double-write buffer at checkpoint
time, once the fsyncs are complete.
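
To make the ordering concrete, here is a rough Python sketch of the
recovery sequence described above. It is only an illustration under
stated assumptions, not PostgreSQL code: Page, WalRecord, and recover
are hypothetical names, pages are modelled as small dictionaries, and
fsync is taken for granted rather than performed.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical in-memory stand-in for a heap page."""
    lsn: int = 0                      # LSN of the last WAL record applied
    data: dict = field(default_factory=dict)
    torn: bool = False                # True if a crash left a partial write


@dataclass
class WalRecord:
    """Hypothetical WAL record: set one key on one page."""
    lsn: int
    page_id: str
    key: str
    value: int


def recover(heap, dw_buffer, wal):
    """Restore double-written pages, then replay WAL guarded by page LSNs."""
    # 1. Overwrite heap pages with their double-written copies.
    #    This repairs any page that was torn by the crash.
    for page_id, copy in dw_buffer.items():
        heap[page_id] = Page(lsn=copy.lsn, data=dict(copy.data))

    # 2. Replay WAL in LSN order. A record whose LSN is not newer than
    #    the page LSN is already reflected in the page and is skipped,
    #    so a restored copy is never rolled back by older WAL.
    for rec in sorted(wal, key=lambda r: r.lsn):
        page = heap.setdefault(rec.page_id, Page())
        if rec.lsn <= page.lsn:
            continue
        page.data[rec.key] = rec.value
        page.lsn = rec.lsn

    # 3. Restartpoint: assume everything has been fsynced, so the
    #    double-write buffer is no longer needed and can be emptied.
    dw_buffer.clear()
    return heap
```

In this model, a torn page A with a double-written copy at LSN 20 and
WAL records at LSNs 10, 20, and 30 ends up with only the LSN 30 record
applied on top of the restored copy.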

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
