Re: Use of backup_label not noted in log

2024-01-29 Thread Michael Paquier
On Mon, Jan 29, 2024 at 10:03:19AM -0400, David Steele wrote: > I took a pass at this on PG14 and things definitely look a lot different > back there. Not only is the timeline missing, but there are two sections of > code for ending a backup, one for standby backup and one for primary.

Re: Use of backup_label not noted in log

2024-01-29 Thread David Steele
On 1/28/24 20:09, Michael Paquier wrote: On Fri, Jan 26, 2024 at 12:08:46PM +0900, Michael Paquier wrote: Well, I'm OK with this consensus on 1d35f705e if folks think this is useful enough for all the stable branches. I have done that down to REL_15_STABLE for now as this is able to apply

Re: Use of backup_label not noted in log

2024-01-28 Thread Michael Paquier
On Fri, Jan 26, 2024 at 12:08:46PM +0900, Michael Paquier wrote: > Well, I'm OK with this consensus on 1d35f705e if folks think this is > useful enough for all the stable branches. I have done that down to REL_15_STABLE for now as this is able to apply cleanly there. Older branches have a lack

Re: Use of backup_label not noted in log

2024-01-26 Thread David Steele
On 1/25/24 20:52, Michael Paquier wrote: On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record It

Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 05:44:52PM -0400, David Steele wrote: > On 1/25/24 17:42, Tom Lane wrote: >> We're talking about 1d35f705e, right? That certainly looks harmless >> and potentially useful. I'm +1 for back-patching. > > That's the one. If we were modifying existing messages I would be

Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: > I would still advocate for a back patch here. It is frustrating to get logs > from users that just say: > > LOG: invalid checkpoint record > PANIC: could not locate a valid checkpoint record > > It would be very helpful to know

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 17:42, Tom Lane wrote: David Steele writes: Another thing to note here -- knowing the LSN is important but also knowing that backup recovery was attempted (i.e. backup_label exists) is really crucial. Knowing both just saves so much time in back and forth debugging. It appears

Re: Use of backup_label not noted in log

2024-01-25 Thread Tom Lane
David Steele writes: > Another thing to note here -- knowing the LSN is important but also > knowing that backup recovery was attempted (i.e. backup_label exists) is > really crucial. Knowing both just saves so much time in back and forth > debugging. > It appears the tally for back patching

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 09:29, Michael Banck wrote: Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: I would still advocate for a back patch here. It is frustrating to get logs from users that just say: LOG: invalid checkpoint record PANIC: could not locate a valid checkpoint record It

Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Banck
Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: > I would still advocate for a back patch here. It is frustrating to get logs > from users that just say: > > LOG: invalid checkpoint record > PANIC: could not locate a valid checkpoint record > > It would be very helpful to

Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele
On 1/25/24 04:12, Michael Paquier wrote: On Mon, Jan 22, 2024 at 04:36:27PM +0900, Michael Paquier wrote: + if (ControlFile->backupStartPoint != InvalidXLogRecPtr) Nit 1: I would use XLogRecPtrIsInvalid here. + ereport(LOG, + (errmsg("completed backup recovery with

Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Paquier
On Mon, Jan 22, 2024 at 04:36:27PM +0900, Michael Paquier wrote: > + if (ControlFile->backupStartPoint != InvalidXLogRecPtr) > > Nit 1: I would use XLogRecPtrIsInvalid here. > > + ereport(LOG, > + (errmsg("completed backup recovery with redo LSN %X/%X", > +

Re: Use of backup_label not noted in log

2024-01-21 Thread Michael Paquier
On Fri, Jan 19, 2024 at 09:32:26AM -0400, David Steele wrote: > Any status on this patch? If we do back patch it would be nice to see this > in the upcoming minor releases. I'm in favor of a back patch, as I think > this is minimally invasive and would be very useful for debugging recovery >

Re: Use of backup_label not noted in log

2024-01-19 Thread David Steele
On 11/20/23 15:08, David Steele wrote: On 11/20/23 14:27, Andres Freund wrote: I've wondered whether it's worth also adding an explicit message just after ReachedEndOfBackup(), but it seems far less urgent due to the existing "consistent recovery state reached at %X/%X" message. I think the

Re: Use of backup_label not noted in log

2023-11-21 Thread David Steele
On 11/20/23 23:54, Michael Paquier wrote: On Mon, Nov 20, 2023 at 03:31:20PM -0400, David Steele wrote: I still wonder if we need "base backup" in the messages? That sort of implies (at least to me) you used pg_basebackup but that may not be the case. Or just s/base backup/backup/? That's

Re: Use of backup_label not noted in log

2023-11-20 Thread Laurenz Albe
On Mon, 2023-11-20 at 11:03 -0800, Andres Freund wrote: > > If we add a message for starting with "backup_label", shouldn't > > we also add a corresponding message for starting from a checkpoint > > found in the control file?  If you see that in a problem report, > > you immediately know what is

Re: Use of backup_label not noted in log

2023-11-20 Thread Michael Paquier
On Mon, Nov 20, 2023 at 03:31:20PM -0400, David Steele wrote: > On 11/20/23 15:03, Andres Freund wrote: >> Besides the phrasing and the additional log message (I have no opinion about >> whether it should be backpatched or not), I used %u for TimelineID as >> appropriate, and added a comma before

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 15:03, Andres Freund wrote: On 2023-11-20 11:35:15 +0100, Laurenz Albe wrote: If we add a message for starting with "backup_label", shouldn't we also add a corresponding message for starting from a checkpoint found in the control file? If you see that in a problem report, you

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 14:27, Andres Freund wrote: Hi, On 2023-11-19 14:28:12 -0400, David Steele wrote: On 11/18/23 17:49, Andres Freund wrote: On 2023-11-18 10:01:42 -0800, Andres Freund wrote: Not enamored with the phrasing of the log messages, but here's a prototype: When starting up with

Re: Use of backup_label not noted in log

2023-11-20 Thread Andres Freund
Hi, On 2023-11-20 11:35:15 +0100, Laurenz Albe wrote: > On Mon, 2023-11-20 at 17:30 +0900, Michael Paquier wrote: > > + if (ControlFile->backupStartPoint != InvalidXLogRecPtr) > > + ereport(LOG, > > + (errmsg("continuing to start from base backup with redo > >

Re: Use of backup_label not noted in log

2023-11-20 Thread Andres Freund
Hi, On 2023-11-20 17:30:31 +0900, Michael Paquier wrote: > On Sat, Nov 18, 2023 at 01:49:15PM -0800, Andres Freund wrote: > > Note that the LSN in the "continuing" case is the one the backup started at, > > not where recovery will start. > > > > I've wondered whether it's worth also adding an

Re: Use of backup_label not noted in log

2023-11-20 Thread Andres Freund
Hi, On 2023-11-20 11:24:25 -0500, Robert Haas wrote: > I do also think it is worth considering how this proposal interacts > with the proposal to remove backup_label. If that proposal goes > through, then this proposal is obsolete, I believe. I think it's the opposite, if anything. Today you can

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 12:24, Robert Haas wrote: On Mon, Nov 20, 2023 at 5:35 AM Laurenz Albe wrote: I can accept that adding log messages to back branches is ok. Perhaps I am too nervous about things like that, because as an extension developer I have been bitten too often by ABI breaks in minor

Re: Use of backup_label not noted in log

2023-11-20 Thread Robert Haas
On Mon, Nov 20, 2023 at 5:35 AM Laurenz Albe wrote: > I can accept that adding log messages to back branches is ok. > Perhaps I am too nervous about things like that, because as an extension > developer I have been bitten too often by ABI breaks in minor releases > in the past. I think that

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
On 11/20/23 06:35, Laurenz Albe wrote: If we add a message for starting with "backup_label", shouldn't we also add a corresponding message for starting from a checkpoint found in the control file? If you see that in a problem report, you immediately know what is going on. +1. It is easier to

Re: Use of backup_label not noted in log

2023-11-20 Thread David Steele
[Resending since I accidentally replied off-list] On 11/18/23 17:49, Andres Freund wrote: On 2023-11-18 10:01:42 -0800, Andres Freund wrote: What about adding it to the "redo starts at" message, something like redo starts at 12/12345678, taken from control file or redo starts at

Re: Use of backup_label not noted in log

2023-11-20 Thread Laurenz Albe
I can accept that adding log messages to back branches is ok. Perhaps I am too nervous about things like that, because as an extension developer I have been bitten too often by ABI breaks in minor releases in the past. On Mon, 2023-11-20 at 17:30 +0900, Michael Paquier wrote: > + if

Re: Use of backup_label not noted in log

2023-11-20 Thread Michael Paquier
On Sat, Nov 18, 2023 at 01:49:15PM -0800, Andres Freund wrote: > Note that the LSN in the "continuing" case is the one the backup started at, > not where recovery will start. > > I've wondered whether it's worth also adding an explicit message just after > ReachedEndOfBackup(), but it seems far

Re: Use of backup_label not noted in log

2023-11-18 Thread Andres Freund
Hi, On 2023-11-18 10:01:42 -0800, Andres Freund wrote: > > What about adding it to the "redo starts at" message, something like > > > > redo starts at 12/12345678, taken from control file > > > > or > > > > redo starts at 12/12345678, taken from backup label > > I think it'd make sense to

Re: Use of backup_label not noted in log

2023-11-18 Thread Andres Freund
Hi, On 2023-11-18 09:30:01 -0400, David Steele wrote: > I know this isn't really a bug, but not being able to tell where recovery > information came from seems like a major omission in the logging. Yea. I was preparing to forecefully suggest that some monitoring tooling should verify that new

Re: Use of backup_label not noted in log

2023-11-18 Thread Andres Freund
Hi, On 2023-11-17 06:41:46 +0100, Laurenz Albe wrote: > On Thu, 2023-11-16 at 20:18 -0800, Andres Freund wrote: > > I've often had to analyze what caused corruption in PG instances, where the > > symptoms match not having had backup_label in place when bringing on the > > node. However that's

Re: Use of backup_label not noted in log

2023-11-18 Thread David Steele
On 11/17/23 01:41, Laurenz Albe wrote: On Thu, 2023-11-16 at 20:18 -0800, Andres Freund wrote: I've often had to analyze what caused corruption in PG instances, where the symptoms match not having had backup_label in place when bringing on the node. However that's surprisingly hard - the only

Re: Use of backup_label not noted in log

2023-11-18 Thread David Steele
On 11/17/23 00:18, Andres Freund wrote: I've often had to analyze what caused corruption in PG instances, where the symptoms match not having had backup_label in place when bringing on the node. However that's surprisingly hard - the only log messages that indicate use of backup_label are at

Re: Use of backup_label not noted in log

2023-11-16 Thread Laurenz Albe
On Thu, 2023-11-16 at 20:18 -0800, Andres Freund wrote: > I've often had to analyze what caused corruption in PG instances, where the > symptoms match not having had backup_label in place when bringing on the > node. However that's surprisingly hard - the only log messages that indicate > use of

Use of backup_label not noted in log

2023-11-16 Thread Andres Freund
Hi, I've often had to analyze what caused corruption in PG instances, where the symptoms match not having had backup_label in place when bringing on the node. However that's surprisingly hard - the only log messages that indicate use of backup_label are at DEBUG1. Given how crucial use of