On Mon, Oct 30, 2023 at 6:46 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Thu, Sep 28, 2023 at 6:22 AM Jakub Wartak
> <jakub.war...@enterprisedb.com> wrote:
> > If that is still an area open for discussion: wouldn't it be better
> > to just specify an LSN, as that would allow resyncing a standby
> > across a major lag where the WAL to replay would be enormous? Given
> > a primary->standby setup where the standby got stuck on some LSN,
> > right now it would be:
> > 1) calculate the backup manifest of the desynced 10TB standby (how?
> >    using which tool?) - even if possible, that means reading 10TB of
> >    data instead of just supplying a number, doesn't it?
> > 2) back up the primary with such an incremental backup >= LSN
> > 3) copy the incremental backup to the standby
> > 4) apply it to the impaired standby
> > 5) restart the WAL replay
>
> As you may be able to tell from the flurry of posts and new patch
> sets, I'm trying hard to sort out the remaining open items that
> pertain to this patch set, and I'm now back to thinking about this
> one.
>
> TL;DR: I think the idea has some potential, but there are some
> pitfalls that I'm not sure how to address.
>
> I spent some time looking at how we currently use the data from the
> backup manifest. Currently, we do two things with it. First, when
> we're backing up each file, we check whether it's present in the
> backup manifest and, if not, we back it up in full. This actually
> feels fairly poor. If it makes any difference at all, then presumably
> the underlying algorithm is buggy and needs to be fixed. Maybe that
> should be ripped out altogether or turned into some kind of sanity
> check that causes a big explosion if it fails. Second, we check
> whether the WAL ranges reported by the client match up with the
> timeline history of the server (see PrepareForIncrementalBackup). This
> set of sanity checks seems fairly important to me, and I'd regret
> discarding them. I think there's some possibility that they might
> catch user error, like where somebody promotes multiple standbys and
> maybe they even get the same timeline on more than one of them, and
> then confusion might ensue. [..]
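(For anyone following along: the round trip under discussion looks
roughly like the sketch below. Flag spellings are as in the posted
patch set, so treat this as illustrative rather than final syntax.)

    # take a full backup; pg_basebackup writes backup_manifest by default
    pg_basebackup -c fast -D full
    # take an incremental backup relative to the full backup's manifest
    pg_basebackup -c fast -D incr --incremental=full/backup_manifest
    # reconstruct a synthetic full backup into a fresh output directory
    pg_combinebackup full incr -o restored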
> Another practical problem here is that, right now, pg_combinebackup
> doesn't have an in-place mode. It knows how to take a bunch of input
> backups and write out an output backup, but that output backup needs
> to go into a new, fresh directory (or directories plural, if there are
> user-defined tablespaces). I had previously considered adding such a
> mode, but the idea I had at the time wouldn't have worked for this
> case. I imagined that someone might want to run "pg_combinebackup
> --in-place full incr" and clobber the contents of the incr directory
> with the output, basically discarding the incremental backup you took
> in favor of a full backup that could have been taken at the same point
> in time. [..]

Thanks for answering! It all sounds like this
resync-standby-using-primary-incrbackup idea is not a fit for the
current pg_combinebackup, but rather for a new tool, hopefully in the
future. Such a tool could:
1) take the current LSN from the stuck standby
2) calculate the manifest on the lagged and offline standby (do we
   need to calculate the manifest checksum in that case? I cannot find
   code for it)
3) deliver the manifest to the primary via "UPLOAD_MANIFEST" and start
   fetching the differences
4) apply them using some form of copy-on-write, merging the old data
   and the incoming incremental-backup data into "$relfilenodeid.new",
   and then durable_unlink() the old file and
   durable_rename("$relfilenodeid.new", "$relfilenodeid")

Would that still be possible in theory? (It could use additional
safeguards, like renaming the control file when starting and renaming
it back just before finishing, to block startup if the resync hasn't
completed.) Also, judging by the comment near struct
IncrementalBackupInfo.manifest_files, it looks like even the checksums
are there more as a safeguard than as a core part of the
implementation(?). What I meant with the initial idea was not to
hinder the current effort, but to ask whether the current design would
stand in the way of such a cool addition in the future.

> One thing I also realized when thinking about this is that you could
> probably hose yourself with the patch set as it stands today by taking
> a full backup, downgrading to wal_level=minimal for a while, doing
> some WAL-skipping operations, upgrading to a higher WAL-level again,
> and then taking an incremental backup. I think the solution to that is
> probably for the WAL summarizer to refuse to run if wal_level=minimal.
> Then there would be a gap in the summary files which an incremental
> backup attempt would detect.

As per the earlier test [1], I already tried to simulate that in
incrbackuptests-0.1.tgz/test_across_wallevelminimal.sh, but it worked
(that test exercised the CTAS wal_level=minimal optimization, though:
a new relfilenode OID is used for the CTAS, so the file got included
in the incremental backup simply because it is a new file). I retested
with your v7 patch with asserts; same result.

When simulating with "BEGIN; TRUNCATE nightmare; COPY nightmare FROM
'/tmp/copy.out'; COMMIT;" on wal_level=minimal, it still recovers
correctly using the incremental backup, because the WAL contains:

rmgr: Storage, desc: CREATE base/5/36425
[..]
rmgr: XLOG, desc: FPI , blkref #0: rel 1663/5/36425 blk 0 FPW
[..]

That is, TRUNCATE assigns a new relfilenode each time, so such
relations will always be included in the backup in full, and the
wal_level=minimal optimizations kick in only for commands that assign
a new relfilenode. True/false?
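(A rough sketch of how I drove that check; the config snippet and the
WAL segment name are illustrative only:)

    # postgresql.conf for the WAL-skipping run
    # (wal_level=minimal requires max_wal_senders=0):
    #   wal_level = minimal
    #   max_wal_senders = 0
    psql -c "BEGIN; TRUNCATE nightmare; COPY nightmare FROM '/tmp/copy.out'; COMMIT;"
    # inspect the records for the new relfilenode (-R/--relation needs v15+)
    pg_waldump -p $PGDATA/pg_wal -R 1663/5/36425 000000010000000000000010

The relfilenode churn from TRUNCATE is also visible directly in
pg_class: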
postgres=# select oid, relfilenode, relname from pg_class where relname like 'night%' order by 1;
  oid  | relfilenode |       relname
-------+-------------+---------------------
 16384 |           0 | nightmare
 16390 |       36420 | nightmare_p0
 16398 |       36425 | nightmare_p1
 36411 |           0 | nightmare_pkey
 36413 |       36422 | nightmare_p0_pkey
 36415 |       36427 | nightmare_p1_pkey
 36417 |           0 | nightmare_brin_idx
 36418 |       36423 | nightmare_p0_ts_idx
 36419 |       36428 | nightmare_p1_ts_idx
(9 rows)

postgres=# truncate nightmare;
TRUNCATE TABLE
postgres=# select oid, relfilenode, relname from pg_class where relname like 'night%' order by 1;
  oid  | relfilenode |       relname
-------+-------------+---------------------
 16384 |           0 | nightmare
 16390 |       36434 | nightmare_p0
 16398 |       36439 | nightmare_p1
 36411 |           0 | nightmare_pkey
 36413 |       36436 | nightmare_p0_pkey
 36415 |       36441 | nightmare_p1_pkey
 36417 |           0 | nightmare_brin_idx
 36418 |       36437 | nightmare_p0_ts_idx
 36419 |       36442 | nightmare_p1_ts_idx
(9 rows)

-J.

[1] - https://www.postgresql.org/message-id/CAKZiRmzT%2BbX2ZYdORO32cADtfQ9DvyaOE8fsOEWZc2V5FkEWVg%40mail.gmail.com