Re: Re: Re: Enabling checksums on a streaming replica
On Fri, Jun 28, 2019 at 07:43:30AM -0400, Brad Nicholson wrote: > So if all the checksums are being recalculated on the replica, this > approach should be relatively safe, should it not? Yep. > Assuming pg_checksums is doing the right thing (and it looks to me like it > should be). The constraint that a cluster needs to be cleanly shut down to be able to enable checksums with pg_checksums is the actual deal here. After that of course comes the WAL retention on the primary or in the WAL archives that a standby would need again to catch up while it was offline. -- Michael signature.asc Description: PGP signature
Re: Re: Re: Enabling checksums on a streaming replica
Michael Paquier wrote on 06/27/2019 11:17:10 PM: > From: Michael Paquier > To: Brad Nicholson > Cc: Grigory Smolkin , pgsql- > gene...@lists.postgresql.org > Date: 06/27/2019 11:17 PM > Subject: [EXTERNAL] Re: Re: Enabling checksums on a streaming replica > > On Wed, Jun 26, 2019 at 02:08:19PM -0400, Brad Nicholson wrote: > > In testing, it doesn't appear to matter. I've ensured that I've generated > > some full page writes (confirmed via pg_waldump), and those apply > > fine. > > Full pages writes are first written from shared buffers to WAL, where > their checksums does not actually apply. When the WAL records are > read an applied, a full page image is recovered in shared buffers. > The checksum of the page would get updated once the shared buffer page > used is evicted and written back to disk. > So if all the checksums are being recalculated on the replica, this approach should be relatively safe, should it not? Assuming pg_checksums is doing the right thing (and it looks to me like it should be). Brad.
Re: Re: Enabling checksums on a streaming replica
On Wed, Jun 26, 2019 at 02:08:19PM -0400, Brad Nicholson wrote: > In testing, it doesn't appear to matter. I've ensured that I've generated > some full page writes (confirmed via pg_waldump), and those apply > fine. Full pages writes are first written from shared buffers to WAL, where their checksums does not actually apply. When the WAL records are read an applied, a full page image is recovered in shared buffers. The checksum of the page would get updated once the shared buffer page used is evicted and written back to disk. > The one thing I'm not sure of, when verifying checksums via pg_checksums I > see blocks being skipped. I'm not sure what or why it is skipping blocks, > and if that indicates a problem or not. Some files can get skipped entirely, but all the blocks of a file are basically checked. Well, except for new pages of course which have no checksums to look at yet. -- Michael signature.asc Description: PGP signature
Re: Re: Enabling checksums on a streaming replica
Grigory Smolkin wrote on 06/26/2019 12:37:10 PM: > From: Grigory Smolkin > To: pgsql-general@lists.postgresql.org > Date: 06/26/2019 12:37 PM > Subject: [EXTERNAL] Re: Enabling checksums on a streaming replica > > Hello! > Very interesting idea, but what about full page writes, that comes > from master? > Can they be a problem? In testing, it doesn't appear to matter. I've ensured that I've generated some full page writes (confirmed via pg_waldump), and those apply fine. The one thing I'm not sure of, when verifying checksums via pg_checksums I see blocks being skipped. I'm not sure what or why it is skipping blocks, and if that indicates a problem or not. pg_checksums -c /data/10_repl/ Checksum scan completed Files scanned: 1530 Blocks scanned: 100782 Blocks skipped: 3 Bad checksums: 0 Data checksum version: 1 Brad
Re: Enabling checksums on a streaming replica
Hello! Very interesting idea, but what about full page writes, that comes from master? Can they be a problem? On 6/26/19 6:32 PM, Brad Nicholson wrote: I'm wondering about the validity of using the pg_checksums utility to enable checksum's on a streaming replication standby, and then promoting that standby as a way to enable checksums on existing clusters. I've tested the process out, and it "works" (by works I mean doesn't blow up or log any errors). But this seems far enough outside of supported territory that I'm curious what others think. The process is: - Primary with checksums off - create a streaming replica off that primary - stop the secondary - enable checksums on the secondary with the pg_checksums utility - start the replica - promote the replica I've thrown load at it while the it was replicating from checksums off -> checksums on, promoted it, and verified the checksums after with pg_checksums. Any thoughts on this approach? Thanks, Brad -- Grigory Smolkin Postgres Professional: http://www.postgrespro.com The Russian Postgres Company