Re: A failure in 031_recovery_conflict.pl on Debian/s390x

Christoph Berg Tue, 08 Aug 2023 07:02:26 -0700

Re: Andres Freund
> Hm, that could just be a "harmless" race. Does it still happen if you apply
> the attached patch in addition?


Putting that patch on top of v8 made it pass 294 times before exiting
like this:

[08:52:34.134](0.032s) ok 1 - buffer pin conflict: cursor with conflicting pin 
established
Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary
done
timed out waiting for match: (?^:User was holding shared buffer pin for too 
long) at t/031_recovery_conflict.pl line 318.

But admittedly, this build machine is quite sluggish at times, likely
due to neighboring VMs on the same host. (Perhaps that even explains
the behavior from before this patch.) I'll still attach the logs since
I frankly can't judge what errors are acceptable here and which
aren't.

Christoph

2023-08-08 08:52:33.312 UTC [1734403] LOG:  starting PostgreSQL 17devel (Debian 
17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 
64-bit
2023-08-08 08:52:33.312 UTC [1734403] LOG:  listening on Unix socket 
"/tmp/UNCayI_17k/.s.PGSQL.63827"
2023-08-08 08:52:33.318 UTC [1734406] LOG:  database system was shut down at 
2023-08-08 08:52:33 UTC
2023-08-08 08:52:33.350 UTC [1734403] LOG:  database system is ready to accept 
connections
2023-08-08 08:52:33.477 UTC [1734411] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLESPACE test_recovery_conflict_tblspc LOCATION ''
2023-08-08 08:52:33.484 UTC [1734413] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW data_directory_mode
2023-08-08 08:52:33.484 UTC [1734413] 031_recovery_conflict.pl STATEMENT:  SHOW 
data_directory_mode
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW wal_segment_size
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl STATEMENT:  SHOW 
wal_segment_size
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl LOG:  received 
replication command: IDENTIFY_SYSTEM
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl STATEMENT:  
IDENTIFY_SYSTEM
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl LOG:  received 
replication command: BASE_BACKUP ( LABEL 'pg_basebackup base backup',  
PROGRESS,  CHECKPOINT 'fast',  WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 08:52:33.485 UTC [1734413] 031_recovery_conflict.pl STATEMENT:  
BASE_BACKUP ( LABEL 'pg_basebackup base backup',  PROGRESS,  CHECKPOINT 'fast', 
 WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 08:52:33.490 UTC [1734404] LOG:  checkpoint starting: immediate 
force wait
2023-08-08 08:52:33.500 UTC [1734404] LOG:  checkpoint complete: wrote 7 
buffers (5.5%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, 
sync=0.001 s, total=0.011 s; sync files=0, longest=0.000 s, average=0.000 s; 
distance=11350 kB, estimate=11350 kB; lsn=0/2000060, redo lsn=0/2000028
2023-08-08 08:52:33.507 UTC [1734414] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW data_directory_mode
2023-08-08 08:52:33.507 UTC [1734414] 031_recovery_conflict.pl STATEMENT:  SHOW 
data_directory_mode
2023-08-08 08:52:33.508 UTC [1734414] 031_recovery_conflict.pl LOG:  received 
replication command: CREATE_REPLICATION_SLOT "pg_basebackup_1734414" TEMPORARY 
PHYSICAL ( RESERVE_WAL)
2023-08-08 08:52:33.508 UTC [1734414] 031_recovery_conflict.pl STATEMENT:  
CREATE_REPLICATION_SLOT "pg_basebackup_1734414" TEMPORARY PHYSICAL ( 
RESERVE_WAL)
2023-08-08 08:52:33.513 UTC [1734414] 031_recovery_conflict.pl LOG:  received 
replication command: IDENTIFY_SYSTEM
2023-08-08 08:52:33.513 UTC [1734414] 031_recovery_conflict.pl STATEMENT:  
IDENTIFY_SYSTEM
2023-08-08 08:52:33.518 UTC [1734414] 031_recovery_conflict.pl LOG:  received 
replication command: START_REPLICATION SLOT "pg_basebackup_1734414" 0/2000000 
TIMELINE 1
2023-08-08 08:52:33.518 UTC [1734414] 031_recovery_conflict.pl STATEMENT:  
START_REPLICATION SLOT "pg_basebackup_1734414" 0/2000000 TIMELINE 1
2023-08-08 08:52:33.596 UTC [1734413] 031_recovery_conflict.pl LOG:  temporary 
file: path "base/pgsql_tmp/pgsql_tmp1734413.0", size 137324
2023-08-08 08:52:33.596 UTC [1734413] 031_recovery_conflict.pl STATEMENT:  
BASE_BACKUP ( LABEL 'pg_basebackup base backup',  PROGRESS,  CHECKPOINT 'fast', 
 WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 08:52:33.741 UTC [1734425] standby LOG:  received replication 
command: IDENTIFY_SYSTEM
2023-08-08 08:52:33.741 UTC [1734425] standby STATEMENT:  IDENTIFY_SYSTEM
2023-08-08 08:52:33.766 UTC [1734425] standby LOG:  received replication 
command: START_REPLICATION 0/3000000 TIMELINE 1
2023-08-08 08:52:33.766 UTC [1734425] standby STATEMENT:  START_REPLICATION 
0/3000000 TIMELINE 1
2023-08-08 08:52:33.823 UTC [1734427] 031_recovery_conflict.pl LOG:  statement: 
CREATE DATABASE test_db
2023-08-08 08:52:33.878 UTC [1734429] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLE test_recovery_conflict_table1(a int, b int);
2023-08-08 08:52:33.887 UTC [1734429] 031_recovery_conflict.pl LOG:  statement: 
INSERT INTO test_recovery_conflict_table1 SELECT i % 3, 0 FROM 
generate_series(1,20) i;
2023-08-08 08:52:33.898 UTC [1734429] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLE test_recovery_conflict_table2(a int, b int);
2023-08-08 08:52:33.936 UTC [1734431] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 08:52:33.962 UTC [1734433] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/342B100' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:52:34.013 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 08:52:34.018 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
INSERT INTO test_recovery_conflict_table1 VALUES (1,0);
2023-08-08 08:52:34.024 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
ROLLBACK;
2023-08-08 08:52:34.029 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 08:52:34.034 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
LOCK test_recovery_conflict_table1;
2023-08-08 08:52:34.040 UTC [1734437] 031_recovery_conflict.pl LOG:  statement: 
COMMIT;
2023-08-08 08:52:34.072 UTC [1734439] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 08:52:34.097 UTC [1734441] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/342B1C0' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:52:34.149 UTC [1734443] 031_recovery_conflict.pl LOG:  statement: 
VACUUM test_recovery_conflict_table1;
2023-08-08 08:52:34.192 UTC [1734445] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 08:52:34.211 UTC [1734447] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:52:34.348 UTC [1734449] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:52:34.479 UTC [1734451] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:52:34.617 UTC [1734453] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 08:55:38.427 UTC [1734403] LOG:  received immediate shutdown request
2023-08-08 08:55:38.437 UTC [1734403] LOG:  database system is shut down

2023-08-08 08:52:33.722 UTC [1734420] LOG:  starting PostgreSQL 17devel (Debian 
17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 
64-bit
2023-08-08 08:52:33.722 UTC [1734420] LOG:  listening on Unix socket 
"/tmp/UNCayI_17k/.s.PGSQL.63828"
2023-08-08 08:52:33.728 UTC [1734423] LOG:  database system was interrupted; 
last known up at 2023-08-08 08:52:33 UTC
2023-08-08 08:52:33.728 UTC [1734423] LOG:  entering standby mode
2023-08-08 08:52:33.729 UTC [1734423] LOG:  redo starts at 0/2000028
2023-08-08 08:52:33.729 UTC [1734423] WARNING:  unexpected directory entry 
"16384" found in pg_tblspc/
2023-08-08 08:52:33.729 UTC [1734423] DETAIL:  All directory entries in 
pg_tblspc/ should be symbolic links.
2023-08-08 08:52:33.729 UTC [1734423] HINT:  Remove those directories, or set 
allow_in_place_tablespaces to ON transiently to let recovery complete.
2023-08-08 08:52:33.729 UTC [1734423] LOG:  consistent recovery state reached 
at 0/2000100
2023-08-08 08:52:33.733 UTC [1734420] LOG:  database system is ready to accept 
read-only connections
2023-08-08 08:52:33.768 UTC [1734424] LOG:  started streaming WAL from primary 
at 0/3000000 on timeline 1
2023-08-08 08:52:34.108 UTC [1734435] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 08:52:34.113 UTC [1734435] 031_recovery_conflict.pl LOG:  statement: 
DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM 
test_recovery_conflict_table1;
2023-08-08 08:52:34.119 UTC [1734435] 031_recovery_conflict.pl LOG:  statement: 
FETCH FORWARD FROM test_recovery_conflict_cursor;
2023-08-08 08:52:34.128 UTC [1734435] 031_recovery_conflict.pl LOG:  statement: 
;
2023-08-08 08:55:38.434 UTC [1734424] FATAL:  could not receive data from WAL 
stream: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
2023-08-08 08:55:38.434 UTC [1734423] LOG:  invalid record length at 0/347E6A8: 
expected at least 24, got 0
2023-08-08 08:55:38.440 UTC [1734495] FATAL:  could not connect to the primary 
server: connection to server on socket "/tmp/UNCayI_17k/.s.PGSQL.63827" failed: 
No such file or directory
                Is the server running locally and accepting connections on that 
socket?
2023-08-08 08:55:38.441 UTC [1734423] LOG:  waiting for WAL to become available 
at 0/347E6C0
2023-08-08 08:55:38.536 UTC [1734420] LOG:  received immediate shutdown request
2023-08-08 08:55:38.543 UTC [1734420] LOG:  database system is shut down

# Checking port 63827
# Found port 63827
Name: primary
Data directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
Backup directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup
Archive directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/archives
Connection string: port=63827 host=/tmp/UNCayI_17k
Log file: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log
# Running: initdb -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -A trust -N
The files belonging to this database system will be owned by user "myon".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    libc
  LC_COLLATE:  en_US.utf8
  LC_CTYPE:    en_US.utf8
  LC_MESSAGES: C
  LC_MONETARY: en_US.utf8
  LC_NUMERIC:  en_US.utf8
  LC_TIME:     en_US.utf8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok

Sync to disk skipped.
The data directory might become corrupt if the operating system crashes.

Success. You can now start the database server using:

    pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -l logfile start

# Running: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/../../../src/test/regress/pg_regress
 --config-auth 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
### Starting node "primary"
# Running: pg_ctl -w -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -l 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log
 -o --cluster-name=primary start
waiting for server to start.... done
server started
# Postmaster PID for node "primary" is 1734403
# Taking pg_basebackup my_backup from node "primary"
# Running: pg_basebackup -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup/my_backup
 -h /tmp/UNCayI_17k -p 63827 --checkpoint fast --no-sync
# Backup finished
# Checking port 63828
# Found port 63828
Name: standby
Data directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
Backup directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/backup
Archive directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/archives
Connection string: port=63828 host=/tmp/UNCayI_17k
Log file: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
# Initializing node "standby" from backup "my_backup" of node "primary"
### Enabling streaming replication for node "standby"
### Starting node "standby"
# Running: pg_ctl -w -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
 -l 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
 -o --cluster-name=standby start
waiting for server to start.... done
server started
# Postmaster PID for node "standby" is 1734420
Waiting for replication conn standby's replay_lsn to pass 0/342B100 on primary
done
Waiting for replication conn standby's replay_lsn to pass 0/342B1C0 on primary
done
[08:52:34.101](1.605s) # issuing query via background psql: 
#     BEGIN;
#     DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM 
test_recovery_conflict_table1;
#     FETCH FORWARD FROM test_recovery_conflict_cursor;
[08:52:34.134](0.032s) ok 1 - buffer pin conflict: cursor with conflicting pin 
established
Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary
done
timed out waiting for match: (?^:User was holding shared buffer pin for too 
long) at t/031_recovery_conflict.pl line 318.
# Postmaster PID for node "primary" is 1734403
### Stopping node "primary" using mode immediate
# Running: pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -m immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "primary"
# Postmaster PID for node "standby" is 1734420
### Stopping node "standby" using mode immediate
# Running: pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
 -m immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "standby"
[08:55:38.639](184.505s) # Tests were run but no plan was declared and 
done_testing() was not seen.
[08:55:38.639](0.000s) # Looks like your test exited with 255 just after 1.

Re: A failure in 031_recovery_conflict.pl on Debian/s390x

Reply via email to