Prevent pg_rewind destroying the data

Christopher Pereira Sun, 20 Dec 2020 13:13:08 -0800

Hi,

When pg_rewind is interrupted due to network errors, the cluster gets corrupted:

Running pg_rewind a second time returns "pg_rewind: fatal: target server must be shut down cleanly".

Trying to fix the cluster with "'/usr/pgsql-12/bin/postmaster' --single -F -D '/var/lib/pgsql/12/mydb' -c archive_mode=on -c archive_command=false" throws:


   LOG:  could not read from log segment 0000003B000000000000003E, offset 0: read 0 of 8192
   LOG:  invalid primary checkpoint record
   PANIC:  could not locate a valid checkpoint record

When a cluster fails over because of a network problem, chances are high that another network problem may occur while we run pg_rewind. It would be nice if pg_rewind wouldn't destroy the data and would leave the cluster in a state where retrying pg_rewind can succeed.

As a workaround, we are thinking of taking an LVM snapshot or doing a "cp --reflink" before running pg_rewind and restoring it if there is a failure, but it would be nice if pg_rewind were "non-destructive".
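
To make the "cp --reflink" variant of that workaround concrete, here is a rough sketch of what we have in mind. It is not tested against a real cluster. The helper names (copy_tree, rewind_with_backup) are made up for illustration, it assumes GNU cp, and it assumes the target server is already stopped and that the backup path is on the same filesystem as the data directory so that reflinks can be used.

```python
import shutil
import subprocess
from pathlib import Path


def copy_tree(src: Path, dst: Path) -> None:
    """Copy src to dst (dst must not exist yet), preserving file attributes."""
    # --reflink=auto shares blocks on filesystems that support it and
    # silently falls back to a regular copy otherwise (GNU coreutils cp).
    subprocess.run(["cp", "-a", "--reflink=auto", str(src), str(dst)], check=True)


def rewind_with_backup(pgdata: Path, backup: Path, rewind_cmd: list[str]) -> bool:
    """Run rewind_cmd against pgdata; restore the pre-rewind copy if it fails.

    Returns True if rewind_cmd exited with status 0, False if the data
    directory was restored from the backup.
    """
    if backup.exists():
        raise FileExistsError(f"backup path already exists: {backup}")

    # Take the copy first; if this step fails, pgdata has not been touched.
    copy_tree(pgdata, backup)

    result = subprocess.run(rewind_cmd)
    if result.returncode == 0:
        # The rewind finished, so the safety copy is no longer needed.
        shutil.rmtree(backup)
        return True

    # The rewind was interrupted: discard the half-rewound directory and
    # move the untouched copy back so the command can be retried later.
    shutil.rmtree(pgdata)
    shutil.move(str(backup), str(pgdata))
    return False
```

In our case rewind_cmd would be something like ["/usr/pgsql-12/bin/pg_rewind", "--target-pgdata=/var/lib/pgsql/12/mydb", "--source-server=..."], with the connection string filled in for the new primary. A False return means the data directory is back in its pre-rewind state and pg_rewind can be retried.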


Is this possible?
Am I missing something?

We are using PG 12.
