Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-25 Thread vignesh C
On Tue, 24 Jun 2025 at 00:20, Alexander Korotkov wrote: > > On Mon, Jun 23, 2025 at 4:33 PM Amit Kapila wrote: > > On Mon, Jun 23, 2025 at 6:01 PM Alexander Korotkov > > wrote: > > > > > > On Mon, Jun 23, 2025 at 3:00 PM Jelte Fennema-Nio > > > wrote: > > > > On Mon, 23 Jun 2025 at 12:24, Ale

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Jelte Fennema-Nio
On Mon, 23 Jun 2025 at 20:50, Alexander Korotkov wrote: > I decided to remove the test while we're investigating the issue. It > might take a bit longer for us to fix, but that wouldn't distort > others' work. Sounds good. I reset the backoff of all jobs in the CFBot database, so that the commit

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Alexander Korotkov
On Mon, Jun 23, 2025 at 4:33 PM Amit Kapila wrote: > On Mon, Jun 23, 2025 at 6:01 PM Alexander Korotkov > wrote: > > > > On Mon, Jun 23, 2025 at 3:00 PM Jelte Fennema-Nio > > wrote: > > > On Mon, 23 Jun 2025 at 12:24, Alexander Korotkov > > > wrote: > > > > On Mon, Jun 23, 2025 at 3:29 AM Mi

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Amit Kapila
On Mon, Jun 23, 2025 at 6:01 PM Alexander Korotkov wrote: > > On Mon, Jun 23, 2025 at 3:00 PM Jelte Fennema-Nio wrote: > > On Mon, 23 Jun 2025 at 12:24, Alexander Korotkov > > wrote: > > > On Mon, Jun 23, 2025 at 3:29 AM Michael Paquier > > > wrote: > > > > > Yeah, that's what I think too. T

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread vignesh C
On Sun, 22 Jun 2025 at 05:46, Alexander Korotkov wrote: > > On Sat, Jun 21, 2025 at 2:42 AM Tom Lane wrote: > > > > Alexander Korotkov writes: > > > And I see the following variable values. > > > > > (lldb) p/x targetPagePtr > > > (XLogRecPtr) 0x29004000 > > > (lldb) p/x RecPtr > > > (XL

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Alexander Korotkov
On Mon, Jun 23, 2025 at 3:00 PM Jelte Fennema-Nio wrote: > On Mon, 23 Jun 2025 at 12:24, Alexander Korotkov wrote: > > On Mon, Jun 23, 2025 at 3:29 AM Michael Paquier wrote: > > > > Yeah, that's what I think too. The unintentional omission of a > > > > pre-shutdown delay in the 046 test has exp

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Jelte Fennema-Nio
On Mon, 23 Jun 2025 at 12:24, Alexander Korotkov wrote: > > On Mon, Jun 23, 2025 at 3:29 AM Michael Paquier wrote: > > > Yeah, that's what I think too. The unintentional omission of a > > > pre-shutdown delay in the 046 test has exposed some pre-existing > > > fragility in pg_logical_slot_get_ch

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-23 Thread Alexander Korotkov
On Mon, Jun 23, 2025 at 3:29 AM Michael Paquier wrote: > > Yeah, that's what I think too. The unintentional omission of a > > pre-shutdown delay in the 046 test has exposed some pre-existing > > fragility in pg_logical_slot_get_changes(). So I'm not in favor > > of changing 046 till we understan

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-22 Thread Michael Paquier
On Sat, Jun 21, 2025 at 08:56:50PM -0400, Tom Lane wrote: > Hmm. My theory about what's happening is that we are writing a WAL > record that spans across a page boundary, and the asynchronous > immediate-stop request comes in and kills that operation, so that > the first half of the record is on d

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-21 Thread Tom Lane
Alexander Korotkov writes: > On Sat, Jun 21, 2025 at 2:42 AM Tom Lane wrote: >> But "Wait for the next page to become available" seems awfully >> trusting that there will be another page. Should this be >> using the no-wait code path? > Thank you for the help. It seems to me that problem is de

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-21 Thread Alexander Korotkov
On Sat, Jun 21, 2025 at 2:42 AM Tom Lane wrote: > > Alexander Korotkov writes: > > And I see the following variable values. > > > (lldb) p/x targetPagePtr > > (XLogRecPtr) 0x29004000 > > (lldb) p/x RecPtr > > (XLogRecPtr) 0x29002138 > > > I hardly understand how is this possible g

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Tom Lane
Alexander Korotkov writes: > And I see the following variable values. > (lldb) p/x targetPagePtr > (XLogRecPtr) 0x29004000 > (lldb) p/x RecPtr > (XLogRecPtr) 0x29002138 > I hardly understand how is this possible given it was compiled with "-O0". > I'm planning to continue investi

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Tom Lane
Alexander Korotkov writes: > I think this indicates unfinished intention to wait for checkpoint > completion. But I think both cases (checkpoint finished and > unfinished) should work correctly. So, I believe there is a backend > problem. I'm trying to reproduce this locally. Sorry for the > c

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Alexander Korotkov
On Sat, Jun 21, 2025 at 1:40 AM Alexander Korotkov wrote: > On Sat, Jun 21, 2025 at 1:25 AM Tom Lane wrote: > > I wrote: > > > But in the buildfarm failures I don't see any 'checkpoint complete' > > > before the shutdown. > > > > Ooops, I lied: we have at least one case where the checkpoint does

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Alexander Korotkov
On Sat, Jun 21, 2025 at 1:25 AM Tom Lane wrote: > I wrote: > > But in the buildfarm failures I don't see any 'checkpoint complete' > > before the shutdown. > > Ooops, I lied: we have at least one case where the checkpoint does > finish but then it hangs up anyway: > > https://buildfarm.postgresql.

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Tom Lane
I wrote: > But in the buildfarm failures I don't see any 'checkpoint complete' > before the shutdown. Ooops, I lied: we have at least one case where the checkpoint does finish but then it hangs up anyway: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=melonworm&dt=2025-06-20%2019%3

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Tom Lane
Melanie Plageman writes: > Quite a few animals have started failing since this commit (for example > [1]) . I haven't looked into why, but I suspect something is wrong. It looks to me like it's being triggered by this questionable bit in 046_checkpoint_logical_slot.pl: # Continue the checkpoint

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Alexander Korotkov
On Fri, Jun 20, 2025, 19:10 Melanie Plageman wrote: > > On Thu, Jun 19, 2025 at 7:31 PM Alexander Korotkov < > akorot...@postgresql.org> wrote: > >> Improve runtime and output of tests for replication slots checkpointing. >> >> The TAP tests that verify logical and physical replication slot behav

Re: pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-20 Thread Melanie Plageman
On Thu, Jun 19, 2025 at 7:31 PM Alexander Korotkov wrote: > Improve runtime and output of tests for replication slots checkpointing. > > The TAP tests that verify logical and physical replication slot behavior > during checkpoints (046_checkpoint_logical_slot.pl and > 047_checkpoint_physical_slot

pgsql: Improve runtime and output of tests for replication slots checkp

2025-06-19 Thread Alexander Korotkov
Improve runtime and output of tests for replication slots checkpointing. The TAP tests that verify logical and physical replication slot behavior during checkpoints (046_checkpoint_logical_slot.pl and 047_checkpoint_physical_slot.pl) inserted two batches of 2 million rows each, generating approxim
