Re: [HACKERS] corrupt pages detected by enabling checksums

Jeff Davis Fri, 05 Apr 2013 16:40:05 -0700

On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote:
> Maybe we could scan forward to check whether a corrupted WAL record is
> followed by one or more valid ones with sensible LSNs. If it is,
> chances are high that we haven't actually hit the end of the WAL. In
> that case, we could either log a warning, or (better, probably) abort
> crash recovery.


+1.

> Corruption of fields which we require to scan past the record would
> cause false negatives, i.e. no trigger an error even though we do
> abort recovery mid-way through. There's a risk of false positives too,
> but they require quite specific orderings of writes and thus seem
> rather unlikely. (AFAICS, the OS would have to write some parts of
> record N followed by the whole of record N+1 and then crash to cause a
> false positive).

Does the xlp_pageaddr help solve this?

Also, we'd need to be a little careful when written-but-not-flushed WAL
data makes it to disk, which could cause a false positive and may be a
fairly common case.

Regards,
        Jeff Davis




-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] corrupt pages detected by enabling checksums

Reply via email to