On 5/5/23 2:28 PM, Amit Kapila wrote:
On Fri, May 5, 2023 at 5:36 PM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:

It seems due to some reason the current wal file is not switched due
to some reason.

Oh wait, here is a NON failing one: https://cirrus-ci.com/task/5086849685782528 
(I modified the
.cirrus.yml so that we can download the "testrun.zip" file even if the test is 
not failing).

So, in this testrun.zip we can see, that the test is ok:

$ grep -i retain 
./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
[10:06:08.789](0.000s) ok 19 - invalidated logical slots do not lead to 
retaining WAL

and that the WAL file we expect to be removed is:

$ grep "WAL file is" 
./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
[10:06:08.789](0.925s) # BDT WAL file is 
/Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/000000010000000000000003

This WAL file has been removed by the standby:

$ grep -i 000000010000000000000003 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_standby.log
 | grep -i recy
2023-05-05 10:06:08.787 UTC [17521][checkpointer] DEBUG:  00000: recycled write-ahead log 
file "000000010000000000000003"

But on the primary, it has been recycled way after that time:

$ grep -i 000000010000000000000003 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
 | grep -i recy
2023-05-05 10:06:13.370 UTC [16785][checkpointer] DEBUG:  00000: recycled write-ahead log 
file "000000010000000000000003"

As, the checkpoint on the primary after the WAL file switch only recycled (001 
and 002):

$ grep -i recycled 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
2023-05-05 10:05:57.196 UTC [16785][checkpointer] LOG:  00000: checkpoint 
complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; 
write=0.001 s, sync=0.001 s, total=0.027 s; sync files=0, longest=0.000 s, 
average=0.000 s; distance=11219 kB, estimate=11219 kB; lsn=0/2000060, redo 
lsn=0/2000028
2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  00000: recycled write-ahead log 
file "000000010000000000000001"
2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  00000: recycled write-ahead log 
file "000000010000000000000002"
2023-05-05 10:06:08.138 UTC [16785][checkpointer] LOG:  00000: checkpoint 
complete: wrote 20 buffers (15.6%); 0 WAL file(s) added, 0 removed, 2 recycled; 
write=0.001 s, sync=0.001 s, total=0.003 s; sync files=0, longest=0.000 s, 
average=0.000 s; distance=32768 kB, estimate=32768 kB; lsn=0/40000D0, redo 
lsn=0/4000098


So, even on a successful test, we can see that the WAL file we expect to be 
removed on the standby has not been recycled on the primary before the test.

I think we need to add more DEBUG info to find that
out. Can you please try to print 'RedoRecPtr', '_logSegNo', and
recptr?


Yes, will do.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Reply via email to