Re: standby promotion can create unreadable WAL

2022-08-29 Thread Robert Haas
On Mon, Aug 29, 2022 at 12:21 AM Kyotaro Horiguchi wrote:
> Mmm. That seems wrong. So forget about that. The proposed patch looks
> fine to me.
Thanks for thinking it over. Committed and back-patched as far as v10, since that's the oldest supported release.
--
Robert Haas
EDB: http://www.enter

Re: standby promotion can create unreadable WAL

2022-08-29 Thread Dilip Kumar
On Fri, Aug 26, 2022 at 6:14 PM Dilip Kumar wrote:
>
> On Tue, Aug 23, 2022 at 12:06 AM Robert Haas wrote:
>
> > However, if anything
> > did try to look at file #4 it would get confused. Maybe that can
> > happen if this is a streaming standby, where we only write an
> > end-of-recovery record u

Re: standby promotion can create unreadable WAL

2022-08-28 Thread Kyotaro Horiguchi
At Mon, 29 Aug 2022 13:13:52 +0900 (JST), Kyotaro Horiguchi wrote in
> we return it to the caller. Is it worth to do a small refactoring
> like the attached? If no, I'm fine with the proposed patch including
> the added assertion.
Mmm. That seems wrong. So forget about that. The proposed pat

Re: standby promotion can create unreadable WAL

2022-08-28 Thread Kyotaro Horiguchi
At Sun, 28 Aug 2022 10:16:21 +0530, Dilip Kumar wrote in
> On Fri, Aug 26, 2022 at 7:53 PM Robert Haas wrote:
> > v2 attached.
>
> The patch LGTM, this patch will apply on master and v15. PFA patch
> for back branches.
StandbyMode is obviously wrong. On the other hand I thought that !Archiv

Re: standby promotion can create unreadable WAL

2022-08-27 Thread Dilip Kumar
On Fri, Aug 26, 2022 at 7:53 PM Robert Haas wrote:
>
> On Fri, Aug 26, 2022 at 10:06 AM Alvaro Herrera
> wrote:
> > There's a small typo in the comment: "When find that". I suppose that
> > was meant to be "When we find that". You end that para with "and thus
> > we should not do this", but th

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Imseih (AWS), Sami
>I think, however, that your fix is wrong and this one is right.
>Fundamentally, the server is either in normal running, or crash
>recovery, or archive recovery. Standby mode is just an optional
>behavior of archive recovery
Good point. Thanks for clearing my understanding. Thanks
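The state model in the quoted text above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not PostgreSQL source: `ServerState`, `RecoveryFlags`, and `classify_server_state` are hypothetical names, and the two flags only loosely mirror the `ArchiveRecoveryRequested` and `StandbyMode` variables mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the three mutually exclusive states named in the thread. */
typedef enum {
    SERVER_NORMAL_RUNNING,
    SERVER_CRASH_RECOVERY,
    SERVER_ARCHIVE_RECOVERY
} ServerState;

/* Hypothetical flag set; names are assumptions chosen for this sketch. */
typedef struct {
    bool in_recovery;                /* the server is replaying WAL */
    bool archive_recovery_requested; /* archive recovery was asked for */
    bool standby_mode;               /* optional behavior of archive recovery */
} RecoveryFlags;

ServerState classify_server_state(RecoveryFlags f)
{
    /* Standby mode only makes sense as an option of archive recovery. */
    assert(!f.standby_mode || f.archive_recovery_requested);

    if (!f.in_recovery)
        return SERVER_NORMAL_RUNNING;

    /* The standby flag plays no part in choosing the state. */
    return f.archive_recovery_requested ? SERVER_ARCHIVE_RECOVERY
                                        : SERVER_CRASH_RECOVERY;
}
```

In this model, an archive recovery that never enters standby mode still classifies as archive recovery, which is why a test on the standby flag alone would treat that case incorrectly.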

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Robert Haas
On Fri, Aug 26, 2022 at 11:59 AM Imseih (AWS), Sami wrote:
> >I agree. Testing StandbyMode here seems bogus. I thought initially
> >that the test should perhaps be for InArchiveRecovery rather than
> >ArchiveRecoveryRequested, but I see that the code which switches to a
> >new time

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Imseih (AWS), Sami
>I agree. Testing StandbyMode here seems bogus. I thought initially
>that the test should perhaps be for InArchiveRecovery rather than
>ArchiveRecoveryRequested, but I see that the code which switches to a
>new timeline cares about ArchiveRecoveryRequested, so I think that is
>t

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Robert Haas
On Fri, Aug 26, 2022 at 10:06 AM Alvaro Herrera wrote:
> There's a small typo in the comment: "When find that". I suppose that
> was meant to be "When we find that". You end that para with "and thus
> we should not do this", but that sounds like it wouldn't matter if we
> did. Maybe "and thus d

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Alvaro Herrera
On 2022-Aug-26, Robert Haas wrote:
> I agree. Testing StandbyMode here seems bogus. I thought initially
> that the test should perhaps be for InArchiveRecovery rather than
> ArchiveRecoveryRequested, but I see that the code which switches to a
> new timeline cares about ArchiveRecoveryRequested, s

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Robert Haas
On Fri, Aug 26, 2022 at 8:44 AM Dilip Kumar wrote:
> ArchiveRecoveryRequested is true. So in the below check[1] instead of
> (!StandbyMode), we can just put (! ArchiveRecoveryRequested), and then
> we don't need any other fix. Am I missing anything?
>
> [1]
> ReadRecord{
> ..record = XLogPrefetc

Re: standby promotion can create unreadable WAL

2022-08-26 Thread Dilip Kumar
On Tue, Aug 23, 2022 at 12:06 AM Robert Haas wrote:
> However, if anything
> did try to look at file #4 it would get confused. Maybe that can
> happen if this is a streaming standby, where we only write an
> end-of-recovery record upon promotion, rather than a checkpoint, or
> maybe if there are c

Re: standby promotion can create unreadable WAL

2022-08-24 Thread Robert Haas
On Wed, Aug 24, 2022 at 4:40 AM Kyotaro Horiguchi wrote:
> Me, too. There are two ways to deal with this, I think. One is start
> writing new records from abortedContRecPtr as if it were not
> exist. Another is copying WAL file up to missingContRecPtr. Since the
> first segment of the new timelin

Re: standby promotion can create unreadable WAL

2022-08-24 Thread Kyotaro Horiguchi
Nice find!
At Wed, 24 Aug 2022 11:09:44 +0530, Dilip Kumar wrote in
> On Tue, Aug 23, 2022 at 12:06 AM Robert Haas wrote:
>
> > Nothing that uses xlogreader is going to be able to bridge the gap
> > between file #4 and file #5. In this case it doesn't matter very much,
> > because we immediat

Re: standby promotion can create unreadable WAL

2022-08-23 Thread Dilip Kumar
On Tue, Aug 23, 2022 at 12:06 AM Robert Haas wrote:
> Nothing that uses xlogreader is going to be able to bridge the gap
> between file #4 and file #5. In this case it doesn't matter very much,
> because we immediately write a checkpoint record into file #5, so if
> we crash we won't try to repla

Re: standby promotion can create unreadable WAL

2022-08-23 Thread Robert Haas
On Mon, Aug 22, 2022 at 10:38 PM Nathan Bossart wrote:
> There was some previous discussion on this [0] [1].
>
> [0] https://postgr.es/m/2B4510B2-3D70-4990-BFE3-0FE64041C08A%40amazon.com
> [1]
> https://postgr.es/m/20220127.100738.1985658263632578184.horikyota.ntt%40gmail.com
Thanks. It seems li

Re: standby promotion can create unreadable WAL

2022-08-22 Thread Nathan Bossart
On Mon, Aug 22, 2022 at 02:36:36PM -0400, Robert Haas wrote:
> (Incidentally, there's also a bug in pg_waldump here: it's reporting
> the wrong LSN as the source of the error. 0/4FFF700 is not the record
> that's busted, as shown by the fact that it was successfully decoded
> and shown in the outpu

standby promotion can create unreadable WAL

2022-08-22 Thread Robert Haas
My colleague Dilip Kumar and I have discovered what I believe to be a bug in the recently-added "overwrite contrecord" stuff. I'm not sure whether or not this bug has any serious consequences. I think that there may be a scenario where it does, but I'm not sure about that. Suppose you have a prima