Re: pgsql: Validate page level checksums in base backups

2018-04-04 Thread Michael Banck
Hi, On Wed, Apr 04, 2018 at 11:38:35AM +0200, Magnus Hagander wrote: > On Tue, Apr 3, 2018 at 10:48 PM, Michael Banck > wrote: > > > Hi, > > > > On Tue, Apr 03, 2018 at 08:48:08PM +0200, Magnus Hagander wrote: > > > On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane

Re: pgsql: Validate page level checksums in base backups

2018-04-04 Thread Magnus Hagander
On Tue, Apr 3, 2018 at 10:48 PM, Michael Banck wrote: > Hi, > > On Tue, Apr 03, 2018 at 08:48:08PM +0200, Magnus Hagander wrote: > > On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane wrote: > > I'd bet a good lunch that nondefault BLCKSZ would break it, as

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread David Steele
On 4/3/18 4:48 PM, Michael Banck wrote: Attached is a patch which does that hopefully: 1. creates two user tables, one large enough for at least 6 blocks (around 360kb), the other just one block. 2. stops the cluster before scribbling over its data and starts it afterwards. 3. uses the

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Michael Banck
Hi, On Tue, Apr 03, 2018 at 08:48:08PM +0200, Magnus Hagander wrote: > On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane wrote: > I'd bet a good lunch that nondefault BLCKSZ would break it, as well, > > since the way in which the corruption is induced is just guessing > > as to where

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Tom Lane
Magnus Hagander writes: > On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane wrote: >> It's scribbling on the source cluster's disk files and assuming that that >> translates one-for-one to what gets sent to the slave server --- but what >> if some of the blocks

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Magnus Hagander
On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane wrote: > Magnus Hagander writes: > > Yeah, there's clearly a second problem here. > > I think this test script is broken in many ways. > > It's scribbling on the source cluster's disk files and assuming that that

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Peter Geoghegan
On Tue, Apr 3, 2018 at 11:29 AM, Tom Lane wrote: > Also, scribbling on tables as sensitive as pg_class is just asking for > trouble IMO. I don't see anything in this test, for example, that > prevents autovacuum from running and causing a PANIC before the test > can complete.

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Tom Lane
Magnus Hagander writes: > Yeah, there's clearly a second problem here. I think this test script is broken in many ways. It's scribbling on the source cluster's disk files and assuming that that translates one-for-one to what gets sent to the slave server --- but what if

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Magnus Hagander
On Tue, Apr 3, 2018 at 7:13 PM, Andres Freund wrote: > On 2018-04-03 11:52:26 +0000, Magnus Hagander wrote: > > Validate page level checksums in base backups > > > > When base backups are run over the replication protocol (for example > > using pg_basebackup), verify the

Re: pgsql: Validate page level checksums in base backups

2018-04-03 Thread Andres Freund
On 2018-04-03 11:52:26 +0000, Magnus Hagander wrote: > Validate page level checksums in base backups > > When base backups are run over the replication protocol (for example > using pg_basebackup), verify the checksums of all data blocks if > checksums are enabled. If checksum failures are

pgsql: Validate page level checksums in base backups

2018-04-03 Thread Magnus Hagander
Validate page level checksums in base backups When base backups are run over the replication protocol (for example using pg_basebackup), verify the checksums of all data blocks if checksums are enabled. If checksum failures are encountered, log them as warnings but don't abort the backup. This