Re: pg_combinebackup does not detect missing files

2024-05-18 Thread David Steele
On 5/18/24 21:06, Tomas Vondra wrote: On 5/17/24 14:20, Robert Haas wrote: On Fri, May 17, 2024 at 1:18 AM David Steele wrote: However, I think allowing the user to optionally validate the input would be a good feature. Running pg_verifybackup as a separate step is going to be a more

Re: pg_combinebackup does not detect missing files

2024-05-17 Thread David Steele
On 5/17/24 22:20, Robert Haas wrote: On Fri, May 17, 2024 at 1:18 AM David Steele wrote: However, I think allowing the user to optionally validate the input would be a good feature. Running pg_verifybackup as a separate step is going to be a more expensive then verifying/copying at the same

Re: pg_combinebackup does not detect missing files

2024-05-16 Thread David Steele
On 4/25/24 00:05, Robert Haas wrote: On Tue, Apr 23, 2024 at 7:23 PM David Steele wrote: I don't understand what you mean here. I thought we were in agreement that verifying contents would cost a lot more. The verification that we can actually do without much cost can only check for missing

Re: Add recovery to pg_control and remove backup_label

2024-05-16 Thread David Steele
On 4/16/24 18:55, Stefan Fercot wrote: Hi, On Sunday, March 10th, 2024 at 4:47 AM, David Steele wrote: I've had a new idea which may revive this patch. The basic idea is to keep backup_label but also return a copy of pg_control from pg_stop_backup(). This copy of pg_control would be safe from

Return pg_control from pg_backup_stop().

2024-05-16 Thread David Steele
register this in the July CF. Regards, -David [1] https://www.postgresql.org/message-id/2daf8adc-8db7-4204-a7f2-a7e94e2bf...@pgmasters.netFrom 531872dcdb09f5d2f89af44a3c04c9d2d6da89be Mon Sep 17 00:00:00 2001 From: David Steele Date: Fri, 17 May 2024 02:42:35 + Subject: Return pg_control from

Re: CREATE TABLE/ProcessUtility hook behavior change

2024-05-15 Thread David Steele
On 4/30/24 21:15, jian he wrote: On Tue, Apr 30, 2024 at 4:35 PM David Steele wrote: On 4/30/24 12:57, jian he wrote: On Tue, Apr 30, 2024 at 10:26 AM David Steele wrote: Since bb766cde cannot be readily applied to older commits in master I'm unable to continue bisecting to find the ALTER

Re: CREATE TABLE/ProcessUtility hook behavior change

2024-04-30 Thread David Steele
On 4/30/24 12:57, jian he wrote: On Tue, Apr 30, 2024 at 10:26 AM David Steele wrote: Since bb766cde cannot be readily applied to older commits in master I'm unable to continue bisecting to find the ALTER TABLE behavioral change. This seems to leave me with manual code inspection

CREATE TABLE/ProcessUtility hook behavior change

2024-04-29 Thread David Steele
Hackers, While testing pgAudit against PG17 I noticed the following behavioral change: CREATE TABLE public.test ( id INT, name TEXT, description TEXT, CONSTRAINT test_pkey PRIMARY KEY (id) ); NOTICE: AUDIT: SESSION,23,1,DDL,CREATE

Re: pg_combinebackup does not detect missing files

2024-04-23 Thread David Steele
On 4/22/24 23:53, Robert Haas wrote: On Sun, Apr 21, 2024 at 8:47 PM David Steele wrote: I figured that wouldn't be particularly meaningful, and that's pretty much the only kind of validation that's even theoretically possible without a bunch of extra overhead, since we compute checksums

Re: pg_combinebackup does not detect missing files

2024-04-21 Thread David Steele
On 4/20/24 01:47, Robert Haas wrote: On Thu, Apr 18, 2024 at 7:36 PM David Steele wrote: Sure -- pg_combinebackup would only need to verify the data that it uses. I'm not suggesting that it should do an exhaustive verify of every single backup in the chain. Though I can see how it sounded

Re: pg_combinebackup does not detect missing files

2024-04-18 Thread David Steele
On 4/19/24 00:50, Robert Haas wrote: On Wed, Apr 17, 2024 at 7:09 PM David Steele wrote: Fair enough. I accept that your reasoning is not random, but I'm still not very satisfied that the user needs to run a separate and rather expensive process to do the verification when pg_combinebackup

Re: pg_combinebackup fails on file named INCREMENTAL.*

2024-04-18 Thread David Steele
On 4/19/24 01:10, Robert Haas wrote: On Wed, Apr 17, 2024 at 7:56 PM David Steele wrote: Thanks! I've tested this and it works as advertised. Ideally I'd want an error on backup if there is a similar file in any data directories that would cause an error on combine, but I admit

Re: pg_combinebackup fails on file named INCREMENTAL.*

2024-04-17 Thread David Steele
On 4/18/24 00:14, Robert Haas wrote: On Tue, Apr 16, 2024 at 9:25 AM Robert Haas wrote: On Mon, Apr 15, 2024 at 10:12 PM David Steele wrote: Anyway, I think it should be fixed or documented as a caveat since it causes a hard failure on restore. Alright, I'll look into this. Here's

Re: pg_combinebackup does not detect missing files

2024-04-17 Thread David Steele
On 4/18/24 01:03, Robert Haas wrote: On Tue, Apr 16, 2024 at 7:25 PM David Steele wrote: But it will not go out of its way to perform checks that are unrelated to its documented purpose. And I don't think it should, especially if we have another tool that already does that. I'm not averse

Re: pg_combinebackup does not detect missing files

2024-04-16 Thread David Steele
On 4/16/24 23:50, Robert Haas wrote: On Wed, Apr 10, 2024 at 9:36 PM David Steele wrote: I've been playing around with the incremental backup feature trying to get a sense of how it can be practically used. One of the first things I always try is to delete random files and see what happens

Re: pg_combinebackup fails on file named INCREMENTAL.*

2024-04-15 Thread David Steele
On 4/16/24 06:33, Robert Haas wrote: On Sun, Apr 14, 2024 at 11:53 PM David Steele wrote: Since incremental backup is using INCREMENTAL as a keyword (rather than storing incremental info in the manifest) it is vulnerable to any file in PGDATA with the pattern INCREMENTAL.*. Yeah, that's true

pg_combinebackup fails on file named INCREMENTAL.*

2024-04-14 Thread David Steele
Hackers, Since incremental backup is using INCREMENTAL as a keyword (rather than storing incremental info in the manifest) it is vulnerable to any file in PGDATA with the pattern INCREMENTAL.*. For example: $ pg_basebackup -c fast -D test/backup/full -F plain $ touch

Re: post-freeze damage control

2024-04-14 Thread David Steele
On 4/13/24 21:02, Tomas Vondra wrote: On 4/13/24 01:23, David Steele wrote: Even for the summarizer, though, I do worry about the complexity of maintaining it over time. It seems like it would be very easy to introduce a bug and have it go unnoticed until it causes problems in the field. A lot

Re: post-freeze damage control

2024-04-12 Thread David Steele
On 4/12/24 22:27, Tomas Vondra wrote: On 4/12/24 08:42, David Steele wrote: On 4/11/24 20:26, Tomas Vondra wrote: On 4/11/24 03:52, David Steele wrote: On 4/11/24 10:23, Tom Kincaid wrote: The extensive Beta process we have can be used to build confidence we need in a feature that has

Re: post-freeze damage control

2024-04-12 Thread David Steele
On 4/12/24 22:12, Tomas Vondra wrote: On 4/11/24 23:48, David Steele wrote: On 4/11/24 20:26, Tomas Vondra wrote: >> FWIW that discussion also mentions stuff that I think the feature should not do. In particular, I don't think the ambition was (or should be) to make pg_basebackup into a

Re: Add notes to pg_combinebackup docs

2024-04-12 Thread David Steele
On 4/12/24 22:40, Tomas Vondra wrote: On 4/12/24 11:50, David Steele wrote: On 4/12/24 19:09, Magnus Hagander wrote: On Fri, Apr 12, 2024 at 12:14 AM David Steele > > But yeah, having to keep the backups as expanded directories is not > great, I'd love to

Re: Add notes to pg_combinebackup docs

2024-04-12 Thread David Steele
On 4/12/24 19:09, Magnus Hagander wrote: On Fri, Apr 12, 2024 at 12:14 AM David Steele OK, sure, but if the plan is to make it practical later doesn't that make the feature something to be avoided now? That could be said for any feature. When we shipped streaming replication

Re: post-freeze damage control

2024-04-12 Thread David Steele
On 4/11/24 20:26, Tomas Vondra wrote: On 4/11/24 03:52, David Steele wrote: On 4/11/24 10:23, Tom Kincaid wrote: The extensive Beta process we have can be used to build confidence we need in a feature that has extensive review and currently has no known issues or outstanding objections. I

Re: post-freeze damage control

2024-04-11 Thread David Steele
On 4/12/24 12:15, Robert Haas wrote: On Thu, Apr 11, 2024 at 5:48 PM David Steele wrote: But they'll try because it is a new pg_basebackup feature and they'll assume it is there to be used. Maybe it would be a good idea to make it clear in the documentation that significant tooling

Re: Add notes to pg_combinebackup docs

2024-04-11 Thread David Steele
On 4/11/24 20:51, Tomas Vondra wrote: On 4/11/24 02:01, David Steele wrote: I have a hard time seeing this feature as being very useful, especially for large databases, until pg_combinebackup works on tar (and compressed tar). Right now restoring an incremental requires at least twice

Re: post-freeze damage control

2024-04-11 Thread David Steele
On 4/11/24 20:26, Tomas Vondra wrote: On 4/11/24 03:52, David Steele wrote: On 4/11/24 10:23, Tom Kincaid wrote: The extensive Beta process we have can be used to build confidence we need in a feature that has extensive review and currently has no known issues or outstanding objections. I

Re: post-freeze damage control

2024-04-10 Thread David Steele
On 4/11/24 10:23, Tom Kincaid wrote: The extensive Beta process we have can be used to build confidence we need in a feature that has extensive review and currently has no known issues or outstanding objections. I did have objections, here [1] and here [2]. I think the complexity, space

pg_combinebackup does not detect missing files

2024-04-10 Thread David Steele
Hackers, I've been playing around with the incremental backup feature trying to get a sense of how it can be practically used. One of the first things I always try is to delete random files and see what happens. You can delete pretty much anything you want from the most recent incremental

Re: Add notes to pg_combinebackup docs

2024-04-10 Thread David Steele
On 4/9/24 19:44, Tomas Vondra wrote: On 4/9/24 09:59, Martín Marqués wrote: Hello, While doing some work/research on the new incremental backup feature some limitations were not listed in the docs. Mainly the fact that pg_combienbackup works with plain format and not tar. Right. The docs

Re: post-freeze damage control

2024-04-10 Thread David Steele
On 4/10/24 09:50, Michael Paquier wrote: On Wed, Apr 10, 2024 at 09:29:38AM +1000, David Steele wrote: Even so, only keeping WAL for the last backup is a dangerous move in any case. Lots of things can happen to a backup (other than bugs in the software) so keeping WAL back to the last full

Re: post-freeze damage control

2024-04-09 Thread David Steele
On 4/10/24 01:59, Andrey M. Borodin wrote: On 9 Apr 2024, at 18:45, Alvaro Herrera wrote: Maybe we should explicitly advise users to not delete that WAL from their archives, until pg_combinebackup is hammered a bit more. As a backup tool maintainer, I always reference to out-of-the box

Re: Possibility to disable `ALTER SYSTEM`

2024-03-20 Thread David Steele
On 3/20/24 22:30, Michael Banck wrote: On Tue, Mar 19, 2024 at 10:51:50AM -0400, Tom Lane wrote: Heikki Linnakangas writes: Perhaps we could make that even better with a GUC though. I propose a GUC called 'configuration_managed_externally = true / false". If you set it to true, we prevent

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread David Steele
On 3/15/24 18:32, Michael Paquier wrote: On Fri, Mar 15, 2024 at 06:23:15PM +1300, David Steele wrote: Well, this is what we recommend in the docs, i.e. using bin mode to save backup_label, so it seems OK to me. Indeed, I didn't notice that this is actually documented, so what I did took

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread David Steele
On 3/15/24 12:38, Michael Paquier wrote: On Fri, Mar 15, 2024 at 09:40:38AM +1300, David Steele wrote: Is the missing test in meson the reason we did not see test failures for Windows in CI? The test has to be listed in src/test/recovery/meson.build or the CI would ignore it. Right -- I

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread David Steele
On 3/14/24 20:00, Michael Paquier wrote: On Thu, Mar 14, 2024 at 09:12:52AM +1300, David Steele wrote: I think you are right that the start message is better since it can only appear once when the backup_label is found. The completed message could in theory appear after a restart, though

Re: Add basic tests for the low-level backup method.

2024-03-13 Thread David Steele
On 3/13/24 19:15, Michael Paquier wrote: On Wed, Mar 13, 2024 at 01:12:28PM +1300, David Steele wrote: Not sure what to look for here. There are no distinct messages for crash recovery. Perhaps there should be? The closest thing I can think of here would be "database system was not pro

Re: Add basic tests for the low-level backup method.

2024-03-12 Thread David Steele
On 2/29/24 16:55, Michael Paquier wrote: On Thu, Feb 29, 2024 at 10:30:52AM +1300, David Steele wrote: On 11/12/23 08:21, David Steele wrote: As promised in [1], attached are some basic tests for the low-level backup method. Added to the 2024-03 CF. There is already

Re: Add checkpoint/redo LSNs to recovery errors.

2024-03-09 Thread David Steele
On 2/29/24 16:42, Michael Paquier wrote: On Thu, Feb 29, 2024 at 10:53:15AM +1300, David Steele wrote: This patch adds checkpoint/redo LSNs to recovery error messages where they may be useful for debugging. I've scanned a bit xlogrecovery.c, and I have spotted a couple of that could gain more

Re: Add recovery to pg_control and remove backup_label

2024-03-09 Thread David Steele
On 1/29/24 12:28, David Steele wrote: On 1/28/24 19:11, Michael Paquier wrote: On Fri, Jan 26, 2024 at 06:27:30PM +0530, vignesh C wrote: Please post an updated version for the same. [1] - http://cfbot.cputube.org/patch_46_3511.log With the recent introduction of incremental backups

Add checkpoint/redo LSNs to recovery errors.

2024-02-28 Thread David Steele
9678d0cd928c58e4a3d038c8aa4c191f5bdddbe7 Mon Sep 17 00:00:00 2001 From: David Steele Date: Wed, 28 Feb 2024 21:45:39 + Subject: Add checkpoint/redo LSNs to recovery errors. When backup_label is present the LSNs are already output in a log message, but it still seems like a good idea to repeat them. When

Re: Add basic tests for the low-level backup method.

2024-02-28 Thread David Steele
On 11/12/23 08:21, David Steele wrote: As promised in [1], attached are some basic tests for the low-level backup method. Added to the 2024-03 CF. Regards, -David

Re: Use of backup_label not noted in log

2024-01-29 Thread David Steele
On 1/28/24 20:09, Michael Paquier wrote: On Fri, Jan 26, 2024 at 12:08:46PM +0900, Michael Paquier wrote: Well, I'm OK with this consensus on 1d35f705e if folks think this is useful enough for all the stable branches. I have done that down to REL_15_STABLE for now as this is able to apply

Re: Add recovery to pg_control and remove backup_label

2024-01-28 Thread David Steele
On 1/28/24 19:11, Michael Paquier wrote: On Fri, Jan 26, 2024 at 06:27:30PM +0530, vignesh C wrote: Please post an updated version for the same. [1] - http://cfbot.cputube.org/patch_46_3511.log With the recent introduction of incremental backups that depend on backup_label and the rather

Re: Use of backup_label not noted in log

2024-01-26 Thread David Steele
On 1/25/24 20:52, Michael Paquier wrote: On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 17:42, Tom Lane wrote: David Steele writes: Another thing to note here -- knowing the LSN is important but also knowing that backup recovery was attempted (i.e. backup_label exists) is really crucial. Knowing both just saves so much time in back and forth debugging. It appears

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 09:29, Michael Banck wrote: Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 04:12, Michael Paquier wrote: On Mon, Jan 22, 2024 at 04:36:27PM +0900, Michael Paquier wrote: + if (ControlFile->backupStartPoint != InvalidXLogRecPtr) Nit 1: I would use XLogRecPtrIsInvalid here. + ereport(LOG, + (errmsg("completed backup recovery with

Re: Use of backup_label not noted in log

2024-01-19 Thread David Steele
On 11/20/23 15:08, David Steele wrote: On 11/20/23 14:27, Andres Freund wrote: I've wondered whether it's worth also adding an explicit message just after ReachedEndOfBackup(), but it seems far less urgent due to the existing "consistent recovery state reached at %X/%X" message.

Re: Detecting some cases of missing backup_label

2023-12-21 Thread David Steele
On 12/21/23 07:37, Andres Freund wrote: On 2023-12-20 13:11:37 -0400, David Steele wrote: I've run this through a bunch of scenarios (in my head) with parallel backups and it does seem to hold up. I think we'd need to write the state file before XLOG_BACKUP_START just in case. Seems better

Re: Detecting some cases of missing backup_label

2023-12-20 Thread David Steele
On 12/18/23 10:39, Stephen Frost wrote: Greetings, * Stephen Frost (sfr...@snowman.net) wrote: * Andres Freund (and...@anarazel.de) wrote: I recently mentioned to Robert (and also Heikki earlier), that I think I see a way to detect an omitted backup_label in a relevant subset of the cases

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/21/23 16:00, Andres Freund wrote: Hi, On 2023-11-21 14:48:59 -0400, David Steele wrote: I'd not call 7.06->4.77 or 6.76->4.77 "virtually free". OK, but how does that look with compression With compression it's obviously somewhat different - but that part is

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/21/23 13:59, Andres Freund wrote: On 2023-11-21 13:41:15 -0400, David Steele wrote: On 11/20/23 16:41, Andres Freund wrote: On 2023-11-20 15:56:19 -0400, David Steele wrote: I understand this is an option -- but does it need to be? What is the benefit of excluding the manifest? It's

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/20/23 16:37, Andres Freund wrote: On 2023-11-20 11:11:13 -0500, Robert Haas wrote: I think we need more votes to make a change this big. I have a concern, which I think I've expressed before, that we keep whacking around the backup APIs, and that has a cost which is potentially larger

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/20/23 16:41, Andres Freund wrote: On 2023-11-20 15:56:19 -0400, David Steele wrote: I understand this is an option -- but does it need to be? What is the benefit of excluding the manifest? It's not free to create the manifest, particularly if checksums are enabled. It's virtually

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/21/23 12:41, Andres Freund wrote: On 2023-11-21 07:42:42 -0400, David Steele wrote: On 11/20/23 19:58, Andres Freund wrote: On 2023-11-21 08:52:08 +0900, Michael Paquier wrote: On Mon, Nov 20, 2023 at 12:37:46PM -0800, Andres Freund wrote: Given that, I wonder if what we should do

Re: Add recovery to pg_control and remove backup_label

2023-11-21 Thread David Steele
On 11/20/23 19:58, Andres Freund wrote: On 2023-11-21 08:52:08 +0900, Michael Paquier wrote: On Mon, Nov 20, 2023 at 12:37:46PM -0800, Andres Freund wrote: Given that, I wonder if what we should do is to just add a new field to pg_control that says "error out if backup_label does not exist",

Re: Use of backup_label not noted in log

2023-11-21 Thread David Steele
On 11/20/23 23:54, Michael Paquier wrote: On Mon, Nov 20, 2023 at 03:31:20PM -0400, David Steele wrote: I still wonder if we need "base backup" in the messages? That sort of implies (at least to me) you used pg_basebackup but that may not be the case. Or just s/base backup/backup/

Re: Add recovery to pg_control and remove backup_label

2023-11-20 Thread David Steele
On 11/20/23 15:47, Robert Haas wrote: On Mon, Nov 20, 2023 at 2:41 PM David Steele wrote: I can't see why a backup would continue to be valid without a manifest -- that's not very standard for backup software. If you have the critical info in backup_label, you can't afford to lose that, so why

Re: Add recovery to pg_control and remove backup_label

2023-11-20 Thread David Steele
On 11/20/23 14:44, Robert Haas wrote: On Mon, Nov 20, 2023 at 12:54 PM David Steele wrote: Another thing we could do is explicitly error if we see backup_label in PGDATA during recovery. That's just a few lines of code so would not be a big deal to maintain. This error would only be visible

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 15:03, Andres Freund wrote: On 2023-11-20 11:35:15 +0100, Laurenz Albe wrote: If we add a message for starting with "backup_label", shouldn't we also add a corresponding message for starting from a checkpoint found in the control file? If you see that in a problem report, you

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 14:27, Andres Freund wrote: Hi, On 2023-11-19 14:28:12 -0400, David Steele wrote: On 11/18/23 17:49, Andres Freund wrote: On 2023-11-18 10:01:42 -0800, Andres Freund wrote: Not enamored with the phrasing of the log messages, but here's a prototype: When starting up

Re: Add recovery to pg_control and remove backup_label

2023-11-20 Thread David Steele
On 11/20/23 12:11, Robert Haas wrote: On Sun, Nov 19, 2023 at 8:16 PM Michael Paquier wrote: (I am not exactly sure how, but we've lost pgsql-hackers on the way when you sent v5. Now added back in CC with the two latest patches you've proposed attached.) Here is a short summary of what

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 12:24, Robert Haas wrote: On Mon, Nov 20, 2023 at 5:35 AM Laurenz Albe wrote: I can accept that adding log messages to back branches is ok. Perhaps I am too nervous about things like that, because as an extension developer I have been bitten too often by ABI breaks in minor

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 06:35, Laurenz Albe wrote: If we add a message for starting with "backup_label", shouldn't we also add a corresponding message for starting from a checkpoint found in the control file? If you see that in a problem report, you immediately know what is going on. +1. It is easier to

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
[Resending since I accidentally replied off-list] On 11/18/23 17:49, Andres Freund wrote: On 2023-11-18 10:01:42 -0800, Andres Freund wrote: What about adding it to the "redo starts at" message, something like redo starts at 12/12345678, taken from control file or redo starts at

Re: Add recovery to pg_control and remove backup_label

2023-11-20 Thread David Steele
. This looks right to me. On Sun, Nov 19, 2023 at 02:14:32PM -0400, David Steele wrote: On 11/15/23 20:03, Michael Paquier wrote: As the label is only an informational field, the parsing added to pg_verifybackup is not really needed because it is used nowhere in the validation process, so keeping

Re: Use of backup_label not noted in log

2023-11-18 Thread David Steele
On 11/17/23 01:41, Laurenz Albe wrote: On Thu, 2023-11-16 at 20:18 -0800, Andres Freund wrote: I've often had to analyze what caused corruption in PG instances, where the symptoms match not having had backup_label in place when bringing on the node. However that's surprisingly hard - the only

Re: Use of backup_label not noted in log

2023-11-18 Thread David Steele
On 11/17/23 00:18, Andres Freund wrote: I've often had to analyze what caused corruption in PG instances, where the symptoms match not having had backup_label in place when bringing on the node. However that's surprisingly hard - the only log messages that indicate use of backup_label are at

Add basic tests for the low-level backup method.

2023-11-11 Thread David Steele
, but this at least supplies some basic tests and provides a place for future improvement. Regards, -David [1] https://www.postgresql.org/message-id/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b%40pgmasters.netFrom b1a10ab4ab912e6f174206fcb8ce67496f93df83 Mon Sep 17 00:00:00 2001 From: David Steele Date: Sat, 11

Re: Add recovery to pg_control and remove backup_label

2023-11-10 Thread David Steele
On 11/10/23 00:37, Michael Paquier wrote: On Tue, Nov 07, 2023 at 05:20:27PM +0900, Michael Paquier wrote: On Mon, Nov 06, 2023 at 05:39:02PM -0400, David Steele wrote: I've retested today, and miss the failure. I'll let you know if I see this again. I've done a few more dozen runs

Re: Add recovery to pg_control and remove backup_label

2023-11-06 Thread David Steele
On 11/6/23 02:35, Michael Paquier wrote: On Sun, Nov 05, 2023 at 01:45:39PM -0400, David Steele wrote: Rebased on 151ffcf6. I like this patch a lot. Even if the backup_label file is removed, we still have all the debug information from the backup history file, thanks to its LABEL, BACKUP

Re: Add recovery to pg_control and remove backup_label

2023-11-06 Thread David Steele
On 11/6/23 01:05, Michael Paquier wrote: On Fri, Oct 27, 2023 at 10:10:42AM -0400, David Steele wrote: We are still planning to address this issue in the back branches. FWIW, redesigning the backend code in charge of doing base backups in the back branches is out of scope. Based on a read

Re: Add recovery to pg_control and remove backup_label

2023-11-05 Thread David Steele
Rebased on 151ffcf6.diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 8cb24d6ae54..6be8fb902c5 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -935,19 +935,20 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true); ready to archive.

Re: Add recovery to pg_control and remove backup_label

2023-11-05 Thread David Steele
On 10/27/23 13:45, David G. Johnston wrote: Let me modify that to make it a bit more clear, I actually wouldn't care if pg_backup_end outputs an entire binary pg_control file as part of the SQL resultset. My proposal would be to, in addition, place in the temporary directory on the server,

Re: Add recovery to pg_control and remove backup_label

2023-10-27 Thread David Steele
On 10/26/23 17:27, David G. Johnston wrote: On Thu, Oct 26, 2023 at 2:02 PM David Steele <mailto:da...@pgmasters.net>> wrote: Are we planning on dealing with torn writes in the back branches in some way or are we just throwing in the towel and saying the old method is too error-prone

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-10-27 Thread David Steele
On 10/27/23 03:22, Michael Paquier wrote: On Mon, Oct 16, 2023 at 02:54:35PM +0900, Michael Paquier wrote: On Sat, Oct 14, 2023 at 03:45:33PM -0400, David Steele wrote: On 9/28/23 19:59, Michael Paquier wrote: Another idea I had was to force the creation of recovery.signal by pg_basebackup

Add recovery to pg_control and remove backup_label

2023-10-26 Thread David Steele
Hackers, This was originally proposed in [1] but that thread went through a number of different proposals so it seems better to start anew. The basic idea here is to simplify and harden recovery by getting rid of backup_label and storing recovery information directly in pg_control. Instead

Re: Remove dead code in pg_ctl.c

2023-10-25 Thread David Steele
On 10/25/23 17:30, Nathan Bossart wrote: On Wed, Oct 25, 2023 at 03:02:01PM -0500, Nathan Bossart wrote: On Wed, Oct 25, 2023 at 02:53:31PM -0400, David Steele wrote: It looks like this code was missed in 39969e2a when exclusive backup was removed. Indeed. I'll plan on committing

Remove dead code in pg_ctl.c

2023-10-25 Thread David Steele
Hackers, It looks like this code was missed in 39969e2a when exclusive backup was removed. Regards, -Daviddiff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c index 807d7023a99..4099d240e03 100644 --- a/src/bin/pg_ctl/pg_ctl.c +++ b/src/bin/pg_ctl/pg_ctl.c @@ -96,7 +96,6 @@ static

Re: Add trailing commas to enum definitions

2023-10-23 Thread David Steele
On 10/23/23 17:04, Tom Lane wrote: Nathan Bossart writes: From a long-term perspective, I think standardizing on the trailing comma style will actually improve git-blame because patches won't need to add a comma to the previous line when adding a value. Yeah, that's a good point. I had

Re: trying again to get incremental backup

2023-10-23 Thread David Steele
On 10/23/23 11:44, Robert Haas wrote: On Fri, Oct 20, 2023 at 11:30 AM David Steele wrote: I don't plan to stand in your way on this feature. I'm reviewing what patches I can out of courtesy and to be sure that nothing adjacent to your work is being affected. My apologies if my reviews

Re: trying again to get incremental backup

2023-10-20 Thread David Steele
On 10/19/23 16:00, Robert Haas wrote: On Thu, Oct 19, 2023 at 3:18 PM David Steele wrote: 0001 looks pretty good to me. The only thing I find a little troublesome is the repeated construction of file names with/without segment numbers in ResetUnloggedRelationsInDbspaceDir(), .e.g

Re: trying again to get incremental backup

2023-10-19 Thread David Steele
On 10/19/23 12:05, Robert Haas wrote: On Wed, Oct 4, 2023 at 4:08 PM Robert Haas wrote: Clearly there's a good amount of stuff to sort out here, but we've still got quite a bit of time left before feature freeze so I'd like to have a go at it. Please let me know your thoughts, if you have any.

Re: The danger of deleting backup_label

2023-10-19 Thread David Steele
On 10/19/23 10:56, Robert Haas wrote: On Thu, Oct 19, 2023 at 10:43 AM David Steele wrote: What I meant here (but said badly) is that in the case of snapshot backups, the backup_label and tablespace_map will likely need to be stored somewhere off the server since they can't be part

Re: The danger of deleting backup_label

2023-10-19 Thread David Steele
On 10/19/23 10:24, Robert Haas wrote: On Wed, Oct 18, 2023 at 7:15 PM David Steele wrote: (b) be stored someplace else, I don't think the additional fields *need* to be stored anywhere at all, at least not by us. We can provide them as output from pg_backup_stop() and the caller can do

Re: The danger of deleting backup_label

2023-10-18 Thread David Steele
On 10/18/23 08:39, Robert Haas wrote: On Tue, Oct 17, 2023 at 4:17 PM David Steele wrote: Given that the above can't be back patched, I'm thinking we don't need backup_label at all going forward. We just write the values we need for recovery into pg_control and return *that* from

Re: The danger of deleting backup_label

2023-10-18 Thread David Steele
On 10/17/23 22:13, Kyotaro Horiguchi wrote: At Tue, 17 Oct 2023 16:16:42 -0400, David Steele wrote in Given that the above can't be back patched, I'm thinking we don't need backup_label at all going forward. We just write the values we need for recovery into pg_control and return *that* from

Re: Rename backup_label to recovery_control

2023-10-18 Thread David Steele
On 10/18/23 03:07, Peter Eisentraut wrote: On 16.10.23 17:15, David Steele wrote: I also do wonder with recovery_control is really a better name. Maybe I just have backup_label too firmly stuck in my head, but is what that file does really best described as recovery control? I'm not so sure

Re: The danger of deleting backup_label

2023-10-17 Thread David Steele
On 10/14/23 11:30, David Steele wrote: On 10/12/23 10:19, David Steele wrote: On 10/11/23 18:10, Thomas Munro wrote: As Stephen mentioned[1], we could perhaps also complain if both backup label and control file exist, and then hint that the user should remove the *control file

Re: Improving Physical Backup/Restore within the Low Level API

2023-10-17 Thread David Steele
On 10/17/23 14:28, Robert Haas wrote: On Mon, Oct 16, 2023 at 5:21 PM David G. Johnston wrote: But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart. Others may

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 15:06, Robert Haas wrote: On Mon, Oct 16, 2023 at 1:00 PM David Steele wrote: After some agonizing (we hope) they decide to delete backup_label and, wow, it just works! So now they merrily go on their way with a corrupted cluster. They also remember for the next time that deleting

Re: Rename backup_label to recovery_control

2023-10-16 Thread David Steele
On 10/16/23 12:12, Robert Haas wrote: On Mon, Oct 16, 2023 at 12:06 PM Michael Banck wrote: Not sure what to do about this, but as people/companies start moving to 15, I am afraid we will get people complaining about this. I think having exclusive mode still be the default for

Re: Rename backup_label to recovery_control

2023-10-16 Thread David Steele
On 10/16/23 12:06, Michael Banck wrote: On Mon, Oct 16, 2023 at 11:15:53AM -0400, David Steele wrote: On 10/16/23 10:19, Robert Haas wrote: We got rid of exclusive backup mode. We replaced pg_start_backup with pg_backup_start. I do think this was an improvement. For example it allows us

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 12:25, Robert Haas wrote: On Mon, Oct 16, 2023 at 11:45 AM David Steele wrote: Hmmm, the reason to back patch this is that it would fix [1], which sure looks like a problem to me even if it is not a "bug". We can certainly require backup software to retry pg_con

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 10:55, Robert Haas wrote: On Sat, Oct 14, 2023 at 11:33 AM David Steele wrote: All of this is fixable in HEAD, but seems incredibly dangerous to back patch. Even so, I have attached the patch in case somebody sees an opportunity that I do not. I really do not think we should

Re: Rename backup_label to recovery_control

2023-10-16 Thread David Steele
On 10/16/23 10:19, Robert Haas wrote: On Sat, Oct 14, 2023 at 2:22 PM David Steele wrote: I was recently discussing the complexities of dealing with pg_control and backup_label with some hackers at PGConf NYC, when David Christensen commented that backup_label was not a very good name since

Re: Rename backup_label to recovery_control

2023-10-16 Thread David Steele
On 10/16/23 00:26, Kyotaro Horiguchi wrote: At Mon, 16 Oct 2023 13:16:42 +0900 (JST), Kyotaro Horiguchi wrote in Just an idea in a slightly different direction, but I'm wondering if we can simply merge the content of backup_label into control file. The file is 8192 bytes long, yet only 256

Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-10-14 Thread David Steele
On 9/28/23 19:59, Michael Paquier wrote: On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote: So overall, +1 for Michael's patch, though I have only read through it and not tested it yet. Reviews, thoughts and opinions are welcome. OK, I have now reviewed and tested the patch

Rename backup_label to recovery_control

2023-10-14 Thread David Steele
Hackers, I was recently discussing the complexities of dealing with pg_control and backup_label with some hackers at PGConf NYC, when David Christensen commented that backup_label was not a very good name since it gives the impression of being informational and therefore something the user

Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

2023-10-14 Thread David Steele
On 10/13/23 10:40, David Steele wrote: On 10/12/23 19:15, Michael Paquier wrote: On Thu, Oct 12, 2023 at 10:41:39AM -0400, David Steele wrote: After some more thought, I think we could massage the "pg_control in backup_label" method into something that could be back patched,

  1   2   3   4   5   6   7   8   9   >