Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Fri, 21 Jun 2024 11:48:22 +0530, Amit Kapila wrote in > On Wed, Jun 19, 2024 at 10:44 AM Hayato Kuroda (Fujitsu) > wrote: > > > > Dear Horiguchi-san, > > > > Thanks for sharing the patch! I agree this approach (ensure WAL records are > > flushed) > > Is more proper than others. > > > > I have an unclear point. According to the comment atop GetInsertRecPtr(), > > it just > > returns the approximated value - the position of the last full WAL page [1]. > > If there is a continuation WAL record which across a page, will it return > > the > > halfway point of the WAL record (end of the first WAL page)? If so, the > > proposed > > fix seems not sufficient. We have to point out the exact the end of the > > record. > > > > You have a point but if this theory is correct why we are not able to > reproduce the issue after this patch? Also, how to get the WAL > location up to which we need to flush? Is XLogCtlData->logInsertResult > the one we are looking for? It is not exposed, but of course logInsertResult is a more straightforward source for the LSN. The reason the patch works well is the following bit of code. xlog.c:958, in XLogInsertRecord() > /* > * Update shared LogwrtRqst.Write, if we crossed page boundary. > */ > if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ) > { > SpinLockAcquire(&XLogCtl->info_lck); > /* advance global request to include new block(s) */ > if (XLogCtl->LogwrtRqst.Write < EndPos) > XLogCtl->LogwrtRqst.Write = EndPos; > SpinLockRelease(&XLogCtl->info_lck); > RefreshXLogWriteResult(LogwrtResult); > } The code, which has existed for a long time, ensures that GetInsertRecPtr() returns the accurate end of a record when it spans page boundaries. This would need to be written in the new comment if we use GetInsertRecPtr(). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
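[Editor's note] The xlog.c snippet quoted in the mail above can be modeled with a few lines of standalone C. This is a toy model with invented names (`model_insert_record`, `logwrt_rqst_write`), not the PostgreSQL sources: it only shows that the shared write request advances to a record's end position exactly when the record crosses an XLOG page boundary, which is why GetInsertRecPtr() sees the true end of a page-spanning record.

```c
/* Toy model of the page-boundary branch in XLogInsertRecord();
 * names are hypothetical, not the real PostgreSQL symbols. */
#include <assert.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192

typedef uint64_t XLogRecPtr;

/* Models XLogCtl->LogwrtRqst.Write (single-threaded, so no spinlock). */
static XLogRecPtr logwrt_rqst_write = 0;

static void
model_insert_record(XLogRecPtr start_pos, XLogRecPtr end_pos)
{
    /* Update the shared write request only if we crossed a page boundary. */
    if (start_pos / XLOG_BLCKSZ != end_pos / XLOG_BLCKSZ)
    {
        if (logwrt_rqst_write < end_pos)
            logwrt_rqst_write = end_pos;
    }
}
```

Under this model, a record contained in one page leaves the request untouched, while a page-spanning record pushes the request to its exact end.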
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Wed, Jun 19, 2024 at 10:44 AM Hayato Kuroda (Fujitsu) wrote: > > Dear Horiguchi-san, > > Thanks for sharing the patch! I agree this approach (ensure WAL records are > flushed) > Is more proper than others. > > I have an unclear point. According to the comment atop GetInsertRecPtr(), it > just > returns the approximated value - the position of the last full WAL page [1]. > If there is a continuation WAL record which across a page, will it return the > halfway point of the WAL record (end of the first WAL page)? If so, the > proposed > fix seems not sufficient. We have to point out the exact the end of the > record. > You have a point but if this theory is correct why we are not able to reproduce the issue after this patch? Also, how to get the WAL location up to which we need to flush? Is XLogCtlData->logInsertResult the one we are looking for? -- With Regards, Amit Kapila.
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Wed, Jun 19, 2024 at 05:14:50AM +, Hayato Kuroda (Fujitsu) wrote: > I have an unclear point. According to the comment atop GetInsertRecPtr(), it > just > returns the approximated value - the position of the last full WAL page [1]. > If there is a continuation WAL record which across a page, will it return the > halfway point of the WAL record (end of the first WAL page)? If so, the > proposed > fix seems not sufficient. We have to point out the exact the end of the > record. Yeah, that's the part of the patch I am confused with. How are we sure that this is the correct LSN to rely on? If that is the case, the patch does not offer an explanation about why it is better. WalSndWaitForWal() is called only in the context of the page callback for a logical WAL sender. Shouldn't we make the flush conditional on what's stored in XLogReaderState.missingContrecPtr? Aka, if we know that we're in the middle of the decoding of a continuation record, we should wait until we've dealt with it, no? In short, I would imagine that WalSndWaitForWal() should know more about what XLogReaderState is doing. But perhaps I'm missing something. -- Michael
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
FYI - I applied this latest patch and re-ran the original failing test script 10 times (i.e., 10 x 100 test iterations; it took 4+ hours). There were zero failures observed in my environment. == Kind Regards, Peter Smith. Fujitsu Australia
RE: 001_rep_changes.pl fails due to publisher stuck on shutdown
Dear Horiguchi-san, Thanks for sharing the patch! I agree this approach (ensure WAL records are flushed) is more proper than others. I have an unclear point. According to the comment atop GetInsertRecPtr(), it just returns the approximated value - the position of the last full WAL page [1]. If there is a continuation WAL record which crosses a page, will it return the halfway point of the WAL record (end of the first WAL page)? If so, the proposed fix seems not sufficient. We have to point out the exact end of the record. [1]: /* * GetInsertRecPtr -- Returns the current insert position. * * NOTE: The value *actually* returned is the position of the last full * xlog page. It lags behind the real insert position by at most 1 page. * For that, we don't need to scan through WAL insertion locks, and an * approximation is enough for the current usage of this function. */ XLogRecPtr GetInsertRecPtr(void) Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
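[Editor's note] The claim in the quoted comment ("position of the last full xlog page ... lags behind the real insert position by at most 1 page") can be illustrated with a sketch. This is model code under the comment's own description, not the real implementation (which reads the shared write request rather than rounding):

```c
/* Sketch of the approximation described by the comment atop
 * GetInsertRecPtr(); hypothetical name, not PostgreSQL code. */
#include <assert.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192

typedef uint64_t XLogRecPtr;

/* Return the position of the last full xlog page before the
 * true insert position, i.e. round down to a page boundary. */
static XLogRecPtr
model_get_insert_rec_ptr(XLogRecPtr true_insert_pos)
{
    return true_insert_pos - (true_insert_pos % XLOG_BLCKSZ);
}
```

The rounded value never exceeds the true position and lags it by strictly less than one page, which is the concern Kuroda-san raises: for a record ending mid-page, the rounded value falls short of the record's end.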
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Thu, 13 Jun 2024 09:29:03 +0530, Amit Kapila wrote in > Yeah, but the commit you quoted later reverted by commit 703f148e98 > and committed again as c6c3334364. Yeah, right.. > > aiming to prevent walsenders from > > generating competing WAL (by, for example, CREATE_REPLICATION_SLOT) > > records with the shutdown checkpoint. Thus, it seems that the > > walsender cannot see the shutdown record, > > > > This is true of logical walsender. The physical walsender do send > shutdown checkpoint record before getting terminated. Yes, I know. They differ in their blocking mechanisms. > > and a certain amount of > > bytes before it, as the walsender appears to have relied on the > > checkpoint flushing its record, rather than on XLogBackgroundFlush(). > > > > If we approve of the walsender being terminated before the shutdown > > checkpoint, we need to "fix" the comment, then provide a function to > > ensure the synchronization of WAL records. > > > > Which comment do you want to fix? Yeah. The part you seem to think I was trying to fix is actually fine. Instead, I have revised the comment on the modified section to make its intention clearer. > > I'll consider this direction for a while. > > > > Okay, thanks. The attached patch is it. It's only for the master. I decided not to create a new function because the simple code has only one caller. I haven't seen the test script fail with this fix. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 663bdeaf8d4d2f5a192dd3ecb6d2817d9b1311f1 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Tue, 18 Jun 2024 16:36:43 +0900 Subject: [PATCH] Ensure WAL is written out when terminating a logical walsender In commit c6c3334364, XLogBackgroundFlush() was assumed to flush all written WAL to the end, but this function may not flush the last incomplete page in a single call. 
In such cases, if the last written page ends with a continuation record, the latter part will not be flushed, causing shutdown to wait indefinitely for that part. This commit ensures that the written records are fully flushed to the end, guaranteeing expected behavior. --- src/backend/replication/walsender.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index c623b07cf0..5aa0f58238 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1858,12 +1858,13 @@ WalSndWaitForWal(XLogRecPtr loc) ProcessRepliesIfAny(); /* - * If we're shutting down, trigger pending WAL to be written out, - * otherwise we'd possibly end up waiting for WAL that never gets - * written, because walwriter has shut down already. + * If we're shutting down, all WAL-writing processes are gone, leaving + * only checkpointer to perform the shutdown checkpoint. Ensure that + * any pending WAL is written out here; otherwise, we'd possibly end up + * waiting for WAL that never gets written. */ - if (got_STOPPING) - XLogBackgroundFlush(); + if (got_STOPPING && !RecoveryInProgress()) + XLogFlush(GetInsertRecPtr()); /* * To avoid the scenario where standbys need to catch up to a newer -- 2.43.0
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Wed, Jun 12, 2024 at 6:43 AM Kyotaro Horiguchi wrote: > > At Tue, 11 Jun 2024 14:27:28 +0530, Amit Kapila > wrote in > > On Tue, Jun 11, 2024 at 12:34 PM Kyotaro Horiguchi > > wrote: > > > > > > At Tue, 11 Jun 2024 11:32:12 +0530, Amit Kapila > > > wrote in > > > > Sorry, it is not clear to me why we failed to flush the last > > > > continuation record in logical walsender? I see that we try to flush > > > > the WAL after receiving got_STOPPING in WalSndWaitForWal(), why is > > > > that not sufficient? > > > > > > It seems that, it uses XLogBackgroundFlush(), which does not guarantee > > > flushing WAL until the end. > > > > > > > What would it take to ensure the same? I am trying to explore this > > path because currently logical WALSender sends any outstanding logs up > > to the shutdown checkpoint record (i.e., the latest record) and waits > > for them to be replicated to the standby before exit. Please take a > > look at the comments where we call WalSndDone(). The fix you are > > proposing will break that guarantee. > > Shutdown checkpoint is performed after the walsender completed > termination since 086221cf6b, > Yeah, but the commit you quoted was later reverted by commit 703f148e98 and committed again as c6c3334364. > aiming to prevent walsenders from > generating competing WAL (by, for example, CREATE_REPLICATION_SLOT) > records with the shutdown checkpoint. Thus, it seems that the > walsender cannot see the shutdown record, > This is true of the logical walsender. The physical walsender does send the shutdown checkpoint record before getting terminated. > and a certain amount of > bytes before it, as the walsender appears to have relied on the > checkpoint flushing its record, rather than on XLogBackgroundFlush(). > > If we approve of the walsender being terminated before the shutdown > checkpoint, we need to "fix" the comment, then provide a function to > ensure the synchronization of WAL records. > Which comment do you want to fix? 
> I'll consider this direction for a while. > Okay, thanks. -- With Regards, Amit Kapila.
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Tue, 11 Jun 2024 14:27:28 +0530, Amit Kapila wrote in > On Tue, Jun 11, 2024 at 12:34 PM Kyotaro Horiguchi > wrote: > > > > At Tue, 11 Jun 2024 11:32:12 +0530, Amit Kapila > > wrote in > > > Sorry, it is not clear to me why we failed to flush the last > > > continuation record in logical walsender? I see that we try to flush > > > the WAL after receiving got_STOPPING in WalSndWaitForWal(), why is > > > that not sufficient? > > > > It seems that, it uses XLogBackgroundFlush(), which does not guarantee > > flushing WAL until the end. > > > > What would it take to ensure the same? I am trying to explore this > path because currently logical WALSender sends any outstanding logs up > to the shutdown checkpoint record (i.e., the latest record) and waits > for them to be replicated to the standby before exit. Please take a > look at the comments where we call WalSndDone(). The fix you are > proposing will break that guarantee. Shutdown checkpoint is performed after the walsender completed termination since 086221cf6b, aiming to prevent walsenders from generating competing WAL (by, for example, CREATE_REPLICATION_SLOT) records with the shutdown checkpoint. Thus, it seems that the walsender cannot see the shutdown record, and a certain amount of bytes before it, as the walsender appears to have relied on the checkpoint flushing its record, rather than on XLogBackgroundFlush(). If we approve of the walsender being terminated before the shutdown checkpoint, we need to "fix" the comment, then provide a function to ensure the synchronization of WAL records. I'll consider this direction for a while. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Tue, Jun 11, 2024 at 12:34 PM Kyotaro Horiguchi wrote: > > At Tue, 11 Jun 2024 11:32:12 +0530, Amit Kapila > wrote in > > Sorry, it is not clear to me why we failed to flush the last > > continuation record in logical walsender? I see that we try to flush > > the WAL after receiving got_STOPPING in WalSndWaitForWal(), why is > > that not sufficient? > > It seems that, it uses XLogBackgroundFlush(), which does not guarantee > flushing WAL until the end. > What would it take to ensure the same? I am trying to explore this path because currently logical WALSender sends any outstanding logs up to the shutdown checkpoint record (i.e., the latest record) and waits for them to be replicated to the standby before exit. Please take a look at the comments where we call WalSndDone(). The fix you are proposing will break that guarantee. -- With Regards, Amit Kapila.
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Tue, 11 Jun 2024 09:27:20 +0900, Michael Paquier wrote in > On Thu, Jun 06, 2024 at 03:19:20PM +0900, Kyotaro Horiguchi wrote: > > So, I believe the attached small patch fixes the behavior. I haven't > > come up with a good test script for this issue. Something like > > 026_overwrite_contrecord.pl might work, but this situation seems a bit > > more complex than what it handles. > > Hmm. Indeed you will not be able to reuse the same trick with the end > of a segment. Still you should be able to get a rather stable test by > using the same tricks as 039_end_of_wal.pl to spawn a record across > multiple pages, no? With the trick, we could write a page-spanning record, but I'm not sure we can control the behavior of XLogBackgroundFlush(). As Amit suggested, we have the option to create a variant of the function that guarantees flushing WAL until the end. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Tue, 11 Jun 2024 11:32:12 +0530, Amit Kapila wrote in > Sorry, it is not clear to me why we failed to flush the last > continuation record in logical walsender? I see that we try to flush > the WAL after receiving got_STOPPING in WalSndWaitForWal(), why is > that not sufficient? It seems that it uses XLogBackgroundFlush(), which does not guarantee flushing WAL until the end. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
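[Editor's note] Why a page-granular flush strands the tail of a continuation record can be shown with a toy model (invented names; a deliberate simplification of XLogBackgroundFlush(), which in the common path backs off to the last completed page boundary):

```c
/* Toy model: a background flush that only advances to the last
 * completed page, leaving bytes on the current partial page unflushed. */
#include <assert.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192

typedef uint64_t XLogRecPtr;

/* Flush up to the last full page before insert_end. */
static XLogRecPtr
model_background_flush(XLogRecPtr insert_end)
{
    return insert_end - (insert_end % XLOG_BLCKSZ);
}
```

If the last record ends 500 bytes into the second page, such a flush stops at the page boundary, so the continuation part past the boundary never reaches disk and a reader waiting for it (here, the logical walsender waiting on flushPtr) waits forever.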
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Thu, Jun 6, 2024 at 11:49 AM Kyotaro Horiguchi wrote: > > At Thu, 6 Jun 2024 12:49:45 +1000, Peter Smith wrote > in > > Hi, I have reproduced this multiple times now. > > > > I confirmed the initial post/steps from Alexander. i.e. The test > > script provided [1] gets itself into a state where function > > ReadPageInternal (called by XLogDecodeNextRecord and commented "Wait > > for the next page to become available") constantly returns > > XLREAD_FAIL. Ultimately the test times out because WalSndLoop() loops > > forever, since it never calls WalSndDone() to exit the walsender > > process. > > Thanks for the repro; I believe I understand what's happening here. > > During server shutdown, the latter half of the last continuation > record may fail to be flushed. This is similar to what is described in > the commit message of commit ff9f111bce. While shutting down, > WalSndLoop() waits for XLogSendLogical() to consume WAL up to > flushPtr, but in this case, the last record cannot complete without > the continuation part starting from flushPtr, which is > missing. Sorry, it is not clear to me why we failed to flush the last continuation record in logical walsender? I see that we try to flush the WAL after receiving got_STOPPING in WalSndWaitForWal(), why is that not sufficient? -- With Regards, Amit Kapila.
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Thu, Jun 06, 2024 at 03:19:20PM +0900, Kyotaro Horiguchi wrote: > During server shutdown, the latter half of the last continuation > record may fail to be flushed. This is similar to what is described in > the commit message of commit ff9f111bce. While shutting down, > WalSndLoop() waits for XLogSendLogical() to consume WAL up to > flushPtr, but in this case, the last record cannot complete without > the continuation part starting from flushPtr, which is > missing. However, in such cases, xlogreader.missingContrecPtr is set > to the beginning of the missing part, but something similar to -/* If EndRecPtr is still past our flushPtr, it means we caught up. */ -if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr) +/* + * If EndRecPtr is still past our flushPtr, it means we caught up. When + * the server is shutting down, the latter part of a continuation record + * may be missing. If got_STOPPING is true, assume we are caught up if the + * last record is missing its continuation part at flushPtr. + */ +if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr || +(got_STOPPING && + logical_decoding_ctx->reader->missingContrecPtr == flushPtr)) FWIW, I don't have a better idea than what you are proposing here. We just cannot receive more data past the page boundary in a shutdown sequence in this context, so checking after the missingContrecPtr is a good compromise to ensure that we don't remain stuck when shutting down a logical WAL sender. I'm surprised that we did not hear about that more often on the lists, or perhaps we did but just discarded it? This is going to take some time to check across all the branches down to v12 that this is stable, because this area of the code tends to change slightly every year.. Well, that's my job. > So, I believe the attached small patch fixes the behavior. I haven't > come up with a good test script for this issue. 
Something like > 026_overwrite_contrecord.pl might work, but this situation seems a bit > more complex than what it handles. Hmm. Indeed you will not be able to reuse the same trick with the end of a segment. Still you should be able to get a rather stable test by using the same tricks as 039_end_of_wal.pl to spawn a record across multiple pages, no? -- Michael
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Mon, 10 Jun 2024 at 15:10, Shlok Kyal wrote: > > On Thu, 6 Jun 2024 at 11:49, Kyotaro Horiguchi > wrote: > > > > At Thu, 6 Jun 2024 12:49:45 +1000, Peter Smith > > wrote in > > > Hi, I have reproduced this multiple times now. > > > > > > I confirmed the initial post/steps from Alexander. i.e. The test > > > script provided [1] gets itself into a state where function > > > ReadPageInternal (called by XLogDecodeNextRecord and commented "Wait > > > for the next page to become available") constantly returns > > > XLREAD_FAIL. Ultimately the test times out because WalSndLoop() loops > > > forever, since it never calls WalSndDone() to exit the walsender > > > process. > > > > Thanks for the repro; I believe I understand what's happening here. > > > > During server shutdown, the latter half of the last continuation > > record may fail to be flushed. This is similar to what is described in > > the commit message of commit ff9f111bce. While shutting down, > > WalSndLoop() waits for XLogSendLogical() to consume WAL up to > > flushPtr, but in this case, the last record cannot complete without > > the continuation part starting from flushPtr, which is > > missing. However, in such cases, xlogreader.missingContrecPtr is set > > to the beginning of the missing part, but something similar to > > > > So, I believe the attached small patch fixes the behavior. I haven't > > come up with a good test script for this issue. Something like > > 026_overwrite_contrecord.pl might work, but this situation seems a bit > > more complex than what it handles. > > > > Versions back to 10 should suffer from the same issue and the same > > patch will be applicable without significant changes. > > I tested the changes for PG 12 to master as we do not support prior versions. > The patch applied successfully for master and PG 16. I ran the test > provided in [1] multiple times and it ran successfully each time. > The patch did not apply on PG 15. 
I did a similar change for PG 15 and > created a patch. I ran the test multiple times and it was successful > every time. > The patch did not apply on PG 14 to PG 12. I did a similar change in > each branch. But the tests did not pass in each branch. I, by mistake, applied wrong changes in PG 14 to PG 12. I tested again for all versions and the test ran successfully for all of them till PG12. I have also attached the patch which applies for PG14 to PG12. Sorry for the inconvenience. Thanks and Regards, Shlok Kyal v2-0001-Fix-infinite-loop-in-walsender-during-publisher-s.patch Description: Binary data v2-0001-Fix-infinite-loop-in-walsender-during-publisher-s-PG-15.patch Description: Binary data v2-0001-Fix-infinite-loop-in-walsender-during-publisher-s-PG-14.patch Description: Binary data
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Thu, 6 Jun 2024 at 11:49, Kyotaro Horiguchi wrote: > > At Thu, 6 Jun 2024 12:49:45 +1000, Peter Smith wrote > in > > Hi, I have reproduced this multiple times now. > > > > I confirmed the initial post/steps from Alexander. i.e. The test > > script provided [1] gets itself into a state where function > > ReadPageInternal (called by XLogDecodeNextRecord and commented "Wait > > for the next page to become available") constantly returns > > XLREAD_FAIL. Ultimately the test times out because WalSndLoop() loops > > forever, since it never calls WalSndDone() to exit the walsender > > process. > > Thanks for the repro; I believe I understand what's happening here. > > During server shutdown, the latter half of the last continuation > record may fail to be flushed. This is similar to what is described in > the commit message of commit ff9f111bce. While shutting down, > WalSndLoop() waits for XLogSendLogical() to consume WAL up to > flushPtr, but in this case, the last record cannot complete without > the continuation part starting from flushPtr, which is > missing. However, in such cases, xlogreader.missingContrecPtr is set > to the beginning of the missing part, but something similar to > > So, I believe the attached small patch fixes the behavior. I haven't > come up with a good test script for this issue. Something like > 026_overwrite_contrecord.pl might work, but this situation seems a bit > more complex than what it handles. > > Versions back to 10 should suffer from the same issue and the same > patch will be applicable without significant changes. I tested the changes for PG 12 to master as we do not support prior versions. The patch applied successfully for master and PG 16. I ran the test provided in [1] multiple times and it ran successfully each time. The patch did not apply on PG 15. I did a similar change for PG 15 and created a patch. I ran the test multiple times and it was successful every time. The patch did not apply on PG 14 to PG 12. 
I did a similar change in each branch. But the tests did not pass in each branch. I have attached a patch which applies successfully on the PG 15 branch. [1]: https://www.postgresql.org/message-id/f15d665f-4cd1-4894-037c-afdbe3692...@gmail.com Thanks and Regards, Shlok Kyal v2-0001-Fix-infinite-loop-in-walsender-during-publisher-s.patch Description: Binary data v2-0001-Fix-infinite-loop-in-walsender-during-publisher-s-PG-15.patch Description: Binary data
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
At Thu, 6 Jun 2024 12:49:45 +1000, Peter Smith wrote in > Hi, I have reproduced this multiple times now. > > I confirmed the initial post/steps from Alexander. i.e. The test > script provided [1] gets itself into a state where function > ReadPageInternal (called by XLogDecodeNextRecord and commented "Wait > for the next page to become available") constantly returns > XLREAD_FAIL. Ultimately the test times out because WalSndLoop() loops > forever, since it never calls WalSndDone() to exit the walsender > process. Thanks for the repro; I believe I understand what's happening here. During server shutdown, the latter half of the last continuation record may fail to be flushed. This is similar to what is described in the commit message of commit ff9f111bce. While shutting down, WalSndLoop() waits for XLogSendLogical() to consume WAL up to flushPtr, but in this case, the last record cannot complete without the continuation part starting from flushPtr, which is missing. However, in such cases, xlogreader.missingContrecPtr is set to the beginning of the missing part, but something similar to So, I believe the attached small patch fixes the behavior. I haven't come up with a good test script for this issue. Something like 026_overwrite_contrecord.pl might work, but this situation seems a bit more complex than what it handles. Versions back to 10 should suffer from the same issue and the same patch will be applicable without significant changes. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 99cad7bd53a94b4b90937fb1eb2f37f2ebcadf6a Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 6 Jun 2024 14:56:53 +0900 Subject: [PATCH] Fix infinite loop in walsender during publisher shutdown When a publisher server is shutting down, there can be a case where the last WAL record at that point is a continuation record with its latter part not yet flushed. In such cases, the walsender attempts to read this unflushed part and ends up in an infinite loop. 
To prevent this situation, modify the logical WAL sender to consider itself caught up in this case. The records that are not fully flushed at this point are generally not significant, so simply ignoring them should not cause any issues. --- src/backend/replication/walsender.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index c623b07cf0..635396c138 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -3426,8 +3426,15 @@ XLogSendLogical(void) flushPtr = GetFlushRecPtr(NULL); } - /* If EndRecPtr is still past our flushPtr, it means we caught up. */ - if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr) + /* + * If EndRecPtr is still past our flushPtr, it means we caught up. When + * the server is shutting down, the latter part of a continuation record + * may be missing. If got_STOPPING is true, assume we are caught up if the + * last record is missing its continuation part at flushPtr. + */ + if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr || + (got_STOPPING && + logical_decoding_ctx->reader->missingContrecPtr == flushPtr)) WalSndCaughtUp = true; /* -- 2.43.0
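[Editor's note] The caught-up condition the patch above installs in XLogSendLogical() can be restated as a self-contained predicate. This is a model with hypothetical names and plain integer types, not the PostgreSQL code:

```c
/* Model of the patched caught-up test: besides the usual
 * EndRecPtr >= flushPtr check, treat the sender as caught up while
 * stopping if the missing continuation part starts exactly at flushPtr. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

static bool
model_caught_up(XLogRecPtr end_rec_ptr, XLogRecPtr flush_ptr,
                bool got_stopping, XLogRecPtr missing_contrec_ptr)
{
    return end_rec_ptr >= flush_ptr ||
        (got_stopping && missing_contrec_ptr == flush_ptr);
}
```

The second disjunct only fires during shutdown, so in normal operation the sender still waits for the full record before declaring itself caught up.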
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Wed, May 29, 2024 at 9:00 PM Alexander Lakhin wrote: > > Hello hackers, > > As a recent buildfarm test failure [1] shows: > [14:33:02.374](0.333s) ok 23 - update works with dropped subscriber column > ### Stopping node "publisher" using mode fast > # Running: pg_ctl -D > /home/bf/bf-build/adder/HEAD/pgsql.build/testrun/subscription/001_rep_changes/data/t_001_rep_changes_publisher_data/pgdata > -m fast stop > waiting for server to shut down.. ... ... ... .. failed > pg_ctl: server does not shut down > # pg_ctl stop failed: 256 > # Postmaster PID for node "publisher" is 549 > [14:39:04.375](362.001s) Bail out! pg_ctl stop failed > > 001_rep_changes_publisher.log > 2024-05-16 14:33:02.907 UTC [2238704][client backend][4/22:0] LOG: statement: > DELETE FROM tab_rep > 2024-05-16 14:33:02.925 UTC [2238704][client backend][:0] LOG: disconnection: > session time: 0:00:00.078 user=bf > database=postgres host=[local] > 2024-05-16 14:33:02.939 UTC [549][postmaster][:0] LOG: received fast > shutdown request > 2024-05-16 14:33:03.000 UTC [549][postmaster][:0] LOG: aborting any > active transactions > 2024-05-16 14:33:03.049 UTC [549][postmaster][:0] LOG: background worker > "logical replication launcher" (PID > 2223110) exited with exit code 1 > 2024-05-16 14:33:03.062 UTC [901][checkpointer][:0] LOG: shutting down > 2024-05-16 14:39:04.377 UTC [549][postmaster][:0] LOG: received > immediate shutdown request > 2024-05-16 14:39:04.382 UTC [549][postmaster][:0] LOG: database system > is shut down > > the publisher node may hang on stopping. 
> > I reproduced the failure (with aggressive autovacuum) locally and > discovered that it happens because: > 1) checkpointer calls WalSndInitStopping() (which sends > PROCSIG_WALSND_INIT_STOPPING to walsender), and then spins inside > WalSndWaitStopping() indefinitely, because: > 2) walsender receives the signal, sets got_STOPPING = true, but can't exit > WalSndLoop(): > 3) it never sets got_SIGUSR2 (to get into WalSndDone()) in > XLogSendLogical(): > 4) it never sets WalSndCaughtUp to true: > 5) logical_decoding_ctx->reader->EndRecPtr can't reach flushPtr in > XLogSendLogical(): > 6) EndRecPtr doesn't advance in XLogNextRecord(): > 7) XLogDecodeNextRecord() fails to decode a record that crosses a page > boundary: > 8) ReadPageInternal() (commented "Wait for the next page to become > available") constantly returns XLREAD_FAIL: > 9) state->routine.page_read()/logical_read_xlog_page() constantly returns > -1: > 10) flushptr = WalSndWaitForWal() stops advancing, because > got_STOPPING == true (see 2). > > That is, walsender doesn't let itself catch up, if it gets the stop > signal when it's lagging behind and decoding a record requires reading > the next wal page. > > Please look at the reproducing test (based on 001_rep_changes.pl) attached. > It fails for me as below: > # 17 > Bailout called. Further testing stopped: pg_ctl stop failed > FAILED--Further testing stopped: pg_ctl stop failed > make: *** [Makefile:21: check] Ошибка 255 > > [1] > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder=2024-05-16%2014%3A22%3A38 > [2] > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop=2024-04-24%2014%3A38%3A35 > (apparently the same) > Hi Alexander, FYI, by injecting a lot of logging, I’ve confirmed your findings that for the failing scenario, the ‘got_SIGUSR2’ flag never gets set to true, meaning the WalSndLoop() cannot finish. 
Furthermore, I agree with your step 8 finding that when it fails the ReadPageInternal function call (the one in XLogDecodeNextRecord with the comment "Wait for the next page to become available") constantly returns -1. I will continue digging next week... == Kind Regards, Peter Smith. Fujitsu Australia
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Thu, May 30, 2024 at 2:09 AM vignesh C wrote: > > On Wed, 29 May 2024 at 16:30, Alexander Lakhin wrote: > > > > Hello hackers, > > > > As a recent buildfarm test failure [1] shows: > > [14:33:02.374](0.333s) ok 23 - update works with dropped subscriber column > > ### Stopping node "publisher" using mode fast > > # Running: pg_ctl -D > > /home/bf/bf-build/adder/HEAD/pgsql.build/testrun/subscription/001_rep_changes/data/t_001_rep_changes_publisher_data/pgdata > > -m fast stop > > waiting for server to shut down.. ... ... ... .. failed > > pg_ctl: server does not shut down > > # pg_ctl stop failed: 256 > > # Postmaster PID for node "publisher" is 549 > > [14:39:04.375](362.001s) Bail out! pg_ctl stop failed > > > > 001_rep_changes_publisher.log > > 2024-05-16 14:33:02.907 UTC [2238704][client backend][4/22:0] LOG: > > statement: DELETE FROM tab_rep > > 2024-05-16 14:33:02.925 UTC [2238704][client backend][:0] LOG: > > disconnection: session time: 0:00:00.078 user=bf > > database=postgres host=[local] > > 2024-05-16 14:33:02.939 UTC [549][postmaster][:0] LOG: received fast > > shutdown request > > 2024-05-16 14:33:03.000 UTC [549][postmaster][:0] LOG: aborting any > > active transactions > > 2024-05-16 14:33:03.049 UTC [549][postmaster][:0] LOG: background > > worker "logical replication launcher" (PID > > 2223110) exited with exit code 1 > > 2024-05-16 14:33:03.062 UTC [901][checkpointer][:0] LOG: shutting down > > 2024-05-16 14:39:04.377 UTC [549][postmaster][:0] LOG: received > > immediate shutdown request > > 2024-05-16 14:39:04.382 UTC [549][postmaster][:0] LOG: database system > > is shut down > > > > the publisher node may hang on stopping. 
> > > > I reproduced the failure (with aggressive autovacuum) locally and > > discovered that it happens because: > > 1) checkpointer calls WalSndInitStopping() (which sends > > PROCSIG_WALSND_INIT_STOPPING to walsender), and then spins inside > > WalSndWaitStopping() indefinitely, because: > > 2) walsender receives the signal, sets got_STOPPING = true, but can't exit > > WalSndLoop(): > > 3) it never sets got_SIGUSR2 (to get into WalSndDone()) in > > XLogSendLogical(): > > 4) it never sets WalSndCaughtUp to true: > > 5) logical_decoding_ctx->reader->EndRecPtr can't reach flushPtr in > > XLogSendLogical(): > > 6) EndRecPtr doesn't advance in XLogNextRecord(): > > 7) XLogDecodeNextRecord() fails to decode a record that crosses a page > > boundary: > > 8) ReadPageInternal() (commented "Wait for the next page to become > > available") constantly returns XLREAD_FAIL: > > 9) state->routine.page_read()/logical_read_xlog_page() constantly returns > > -1: > > 10) flushptr = WalSndWaitForWal() stops advancing, because > > got_STOPPING == true (see 2). > > > > That is, walsender doesn't let itself catch up if it gets the stop > > signal when it's lagging behind and decoding a record requires reading > > the next WAL page. > > > > Please look at the reproducing test (based on 001_rep_changes.pl) attached. > > It fails for me as below: > > # 17 > > Bailout called. Further testing stopped: pg_ctl stop failed > > FAILED--Further testing stopped: pg_ctl stop failed > > make: *** [Makefile:21: check] Error 255 > > Thank you, Alexander, for sharing the script. I was able to reproduce > the issue using the provided script. Furthermore, while investigating > its origins, I discovered that this problem persists across all > branches up to PG10 (the script needs slight adjustments to run on > older versions). It's worth noting that this issue isn't a result of > recent version changes.
> Hi, FWIW using the provided script I was also able to reproduce the problem on HEAD, but for me it was rarer -- the script passed all 100 iterations 3 times; it eventually failed on the 4th run, at the 75th iteration. == Kind Regards, Peter Smith. Fujitsu Australia
Re: 001_rep_changes.pl fails due to publisher stuck on shutdown
On Wed, 29 May 2024 at 16:30, Alexander Lakhin wrote: > > Hello hackers, > > As a recent buildfarm test failure [1] shows: > [14:33:02.374](0.333s) ok 23 - update works with dropped subscriber column > ### Stopping node "publisher" using mode fast > # Running: pg_ctl -D > /home/bf/bf-build/adder/HEAD/pgsql.build/testrun/subscription/001_rep_changes/data/t_001_rep_changes_publisher_data/pgdata > -m fast stop > waiting for server to shut down.. ... ... ... .. failed > pg_ctl: server does not shut down > # pg_ctl stop failed: 256 > # Postmaster PID for node "publisher" is 549 > [14:39:04.375](362.001s) Bail out! pg_ctl stop failed > > 001_rep_changes_publisher.log > 2024-05-16 14:33:02.907 UTC [2238704][client backend][4/22:0] LOG: statement: > DELETE FROM tab_rep > 2024-05-16 14:33:02.925 UTC [2238704][client backend][:0] LOG: disconnection: > session time: 0:00:00.078 user=bf > database=postgres host=[local] > 2024-05-16 14:33:02.939 UTC [549][postmaster][:0] LOG: received fast > shutdown request > 2024-05-16 14:33:03.000 UTC [549][postmaster][:0] LOG: aborting any > active transactions > 2024-05-16 14:33:03.049 UTC [549][postmaster][:0] LOG: background worker > "logical replication launcher" (PID > 2223110) exited with exit code 1 > 2024-05-16 14:33:03.062 UTC [901][checkpointer][:0] LOG: shutting down > 2024-05-16 14:39:04.377 UTC [549][postmaster][:0] LOG: received > immediate shutdown request > 2024-05-16 14:39:04.382 UTC [549][postmaster][:0] LOG: database system > is shut down > > the publisher node may hang on stopping. 
> > I reproduced the failure (with aggressive autovacuum) locally and > discovered that it happens because: > 1) checkpointer calls WalSndInitStopping() (which sends > PROCSIG_WALSND_INIT_STOPPING to walsender), and then spins inside > WalSndWaitStopping() indefinitely, because: > 2) walsender receives the signal, sets got_STOPPING = true, but can't exit > WalSndLoop(): > 3) it never sets got_SIGUSR2 (to get into WalSndDone()) in > XLogSendLogical(): > 4) it never sets WalSndCaughtUp to true: > 5) logical_decoding_ctx->reader->EndRecPtr can't reach flushPtr in > XLogSendLogical(): > 6) EndRecPtr doesn't advance in XLogNextRecord(): > 7) XLogDecodeNextRecord() fails to decode a record that crosses a page > boundary: > 8) ReadPageInternal() (commented "Wait for the next page to become > available") constantly returns XLREAD_FAIL: > 9) state->routine.page_read()/logical_read_xlog_page() constantly returns > -1: > 10) flushptr = WalSndWaitForWal() stops advancing, because > got_STOPPING == true (see 2). > > That is, walsender doesn't let itself catch up if it gets the stop > signal when it's lagging behind and decoding a record requires reading > the next WAL page. > > Please look at the reproducing test (based on 001_rep_changes.pl) attached. > It fails for me as below: > # 17 > Bailout called. Further testing stopped: pg_ctl stop failed > FAILED--Further testing stopped: pg_ctl stop failed > make: *** [Makefile:21: check] Error 255 Thank you, Alexander, for sharing the script. I was able to reproduce the issue using the provided script. Furthermore, while investigating its origins, I discovered that this problem persists across all branches up to PG10 (the script needs slight adjustments to run on older versions). It's worth noting that this issue isn't a result of recent version changes. Regards, Vignesh
001_rep_changes.pl fails due to publisher stuck on shutdown
Hello hackers, As a recent buildfarm test failure [1] shows: [14:33:02.374](0.333s) ok 23 - update works with dropped subscriber column ### Stopping node "publisher" using mode fast # Running: pg_ctl -D /home/bf/bf-build/adder/HEAD/pgsql.build/testrun/subscription/001_rep_changes/data/t_001_rep_changes_publisher_data/pgdata -m fast stop waiting for server to shut down.. ... ... ... .. failed pg_ctl: server does not shut down # pg_ctl stop failed: 256 # Postmaster PID for node "publisher" is 549 [14:39:04.375](362.001s) Bail out! pg_ctl stop failed 001_rep_changes_publisher.log 2024-05-16 14:33:02.907 UTC [2238704][client backend][4/22:0] LOG: statement: DELETE FROM tab_rep 2024-05-16 14:33:02.925 UTC [2238704][client backend][:0] LOG: disconnection: session time: 0:00:00.078 user=bf database=postgres host=[local] 2024-05-16 14:33:02.939 UTC [549][postmaster][:0] LOG: received fast shutdown request 2024-05-16 14:33:03.000 UTC [549][postmaster][:0] LOG: aborting any active transactions 2024-05-16 14:33:03.049 UTC [549][postmaster][:0] LOG: background worker "logical replication launcher" (PID 2223110) exited with exit code 1 2024-05-16 14:33:03.062 UTC [901][checkpointer][:0] LOG: shutting down 2024-05-16 14:39:04.377 UTC [549][postmaster][:0] LOG: received immediate shutdown request 2024-05-16 14:39:04.382 UTC [549][postmaster][:0] LOG: database system is shut down the publisher node may hang on stopping. 
I reproduced the failure (with aggressive autovacuum) locally and discovered that it happens because: 1) checkpointer calls WalSndInitStopping() (which sends PROCSIG_WALSND_INIT_STOPPING to walsender), and then spins inside WalSndWaitStopping() indefinitely, because: 2) walsender receives the signal, sets got_STOPPING = true, but can't exit WalSndLoop(): 3) it never sets got_SIGUSR2 (to get into WalSndDone()) in XLogSendLogical(): 4) it never sets WalSndCaughtUp to true: 5) logical_decoding_ctx->reader->EndRecPtr can't reach flushPtr in XLogSendLogical(): 6) EndRecPtr doesn't advance in XLogNextRecord(): 7) XLogDecodeNextRecord() fails to decode a record that crosses a page boundary: 8) ReadPageInternal() (commented "Wait for the next page to become available") constantly returns XLREAD_FAIL: 9) state->routine.page_read()/logical_read_xlog_page() constantly returns -1: 10) flushptr = WalSndWaitForWal() stops advancing, because got_STOPPING == true (see 2). That is, walsender doesn't let itself catch up if it gets the stop signal when it's lagging behind and decoding a record requires reading the next WAL page. Please look at the reproducing test (based on 001_rep_changes.pl) attached. It fails for me as below: # 17 Bailout called. Further testing stopped: pg_ctl stop failed FAILED--Further testing stopped: pg_ctl stop failed make: *** [Makefile:21: check] Error 255 [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-05-16%2014%3A22%3A38 [2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2024-04-24%2014%3A38%3A35 (apparently the same) Best regards, Alexander 099_walsender_stop.pl Description: Perl program