Hi,

I recently encountered a situation in the field in which the message “could not truncate directory "pg_serial": apparent wraparound” was logged even though there was no danger of wraparound. This was on a brand new cluster, and the message appeared in the logs within a few minutes.
Reading up on the history of this error message, it appears that work was done to improve SLRU truncation and the associated wraparound log messages [1]. The attached repro, run against master, shows that this message can still be logged incorrectly.

The repro runs updates with 90 threads in serializable mode and kicks off a “long running” select on the same table, also in serializable mode. As soon as the long running select commits, the next checkpoint fails to truncate the SLRU and logs the error message.

Besides the confusing log message, there may also be a risk of pg_serial becoming unnecessarily bloated and depleting disk space.

Is this a bug?

[1] https://www.postgresql.org/message-id/flat/20190202083822.GC32531%40gust.leadboat.com

Regards,

Sami Imseih
Amazon Web Services (AWS)
### create the pgbench script
echo "\set idv random(1, 10000)" > /tmp/pgbench.sql
echo "BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;" >> /tmp/pgbench.sql
echo "UPDATE tab1 SET id2 = id2 WHERE id = :idv;" >> /tmp/pgbench.sql
echo "COMMIT;" >> /tmp/pgbench.sql

### create the driver script
cat <<EOT > /tmp/repro.sh
psql<<EOF
begin transaction isolation level serializable;
select count(*) from tab1;
-- start a pgbench in the background
\! nohup pgbench -c90 -f/tmp/pgbench.sql -T 120 &
select pg_sleep(100);
COMMIT;
CHECKPOINT;
select pg_sleep(30);
EOF
EOT
chmod a+x /tmp/repro.sh

### create the table
psql<<EOF
drop table if exists tab1;
create table tab1 (id int primary key, id2 int);
insert into tab1 select n, 0 from generate_series(1, 10000) as n;
EOF

### run the repro
/tmp/repro.sh

### afterwards, cat the logfile
....
2023-08-23 00:28:43.499 UTC [2372685] LOG: could not truncate directory "pg_serial": apparent wraparound
...
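
To check a run without reading the whole logfile, something like the following could be used. This is only a rough sketch and not part of the attached repro: the function names are made up for illustration, and the logfile path and data directory are assumptions that must be supplied by whoever runs it. The first function counts occurrences of the message in a given logfile; the second counts the segment files currently sitting in pg_serial, which gives a crude view of the bloat concern mentioned above.

```shell
# Hypothetical helpers, not part of the attached repro.

# Count log lines that carry the pg_serial truncation message.
# $1 = path to the server logfile (assumed; depends on the logging setup)
count_wraparound_msgs() {
    grep -cF 'could not truncate directory "pg_serial": apparent wraparound' "$1"
}

# Count the regular files currently present in pg_serial.
# $1 = path to the data directory (assumed; e.g. the value of PGDATA)
count_serial_segments() {
    find "$1/pg_serial" -type f | wc -l | tr -d ' '
}
```

For example, `count_wraparound_msgs /path/to/logfile` after the repro finishes, and `count_serial_segments "$PGDATA"` before and after the CHECKPOINT to see whether any segments were removed.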