Here are the performance test results and analysis with the recent patches.
Test Setup:
- Created Pub and Sub nodes with logical replication and the configurations below.
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on (only on Sub node).
- Pub and Sub had different pgbench tables with initial data of scale=100.
-------------------------------
Case-0: Collected data on pgHead
-------------------------------
- Ran pgbench (read-write) on both the publisher and the subscriber
with 30 clients for a duration of 15 minutes, collecting data over 3
runs.
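Each run used a pgbench invocation of the following shape on both
nodes (per the attached run script):
pgbench -p $port -U postgres postgres -c 30 -j 30 -T 900 -P 30
where $port is the node's port and -T 900 corresponds to the
15-minute duration.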
Results:
Run# pub_TPS sub_TPS
1 30551.63471 29476.81709
2 30112.31203 28933.75013
3 29599.40383 28379.4977
Median 30112.31203 28933.75013
-------------------------------
Case-1: Long-run (15-minute) tests with retain_conflict_info=ON
-------------------------------
- Code: pgHead + v19 patches.
- At Sub, set autovacuum=false.
- Ran pgbench (read-write) on both the publisher and the subscriber
with 30 clients for a duration of 15 minutes, collecting data over 3
runs.
Results:
Run# pub_TPS sub_TPS
1 30326.57637 4890.410972
2 30412.85115 4787.192754
3 30860.13879 4864.549117
Median 30412.85115 4864.549117
regression 1% -83%
- The 15-minute pgbench run showed a much larger reduction in the
Sub's TPS; as the run time increased, the TPS at the Sub node dropped
further.
-------------------------------
Case-2: Re-ran case-1 with autovacuum enabled and running every 30 seconds.
-------------------------------
- Code: pgHead + v19 patches.
- At Sub, set autovacuum=true.
- Also measured the update frequencies of slot.xmin and the worker's
oldest_nonremovable_xid (see the sampling sketch below).
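The slot.xmin side can be sampled with a query like the one below on
the Sub node (just a sketch; the worker's oldest_nonremovable_xid is
internal to the patches and was counted separately):
./psql -U postgres -p $port_sub -c "SELECT slot_name, xmin FROM pg_replication_slots;"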
Results:
Run#        pub_TPS      sub_TPS      #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
1           31080.30944  4573.547293  0                   1
regression  3%           -84%
- Autovacuum did not help improve the Sub's TPS.
- The slot's xmin was not advanced.
~~~~
Observations and RCA for the TPS reduction in the above tests:
- The launcher was not able to advance slot.xmin during the 15-minute
pgbench run, leading to increased dead tuple accumulation on the
subscriber node.
- The launcher failed to advance slot.xmin because the apply worker
could not set its oldest_nonremovable_xid early and frequently
enough, due to the following two reasons:
1) For large pgbench tables (scale=100), the tablesync takes time
to complete, forcing the apply worker to wait before updating its
oldest_nonremovable_xid.
2) With 30 clients generating operations at a pace that a single
apply worker cannot match, the worker fails to catch up with the
rapidly increasing remote_lsn, lagging behind the Publisher's LSN
throughout the 15-minute run.
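This lag can be observed with the standard views (a sketch, not the
exact measurement used here): compare the publisher's current WAL
position against how far the apply worker has confirmed:
-- on the publisher
SELECT pg_current_wal_lsn();
-- on the subscriber
SELECT received_lsn, latest_end_lsn FROM pg_stat_subscription WHERE subname = 'sub';
A pg_wal_lsn_diff(pg_current_wal_lsn(), latest_end_lsn) that keeps
growing over the run indicates the worker is falling further behind.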
Considering the above reasons, for better performance measurements,
we collected data with tablesync off and a varying number of clients
on the publisher node. The test below used the v21 patch set, which
also includes improvement patches (006 and 007) for more frequent
slot.xmin updates.
-------------------------------
Case-3: Created the subscription with copy_data=false, so no
tablesync is in the picture.
-------------------------------
Test setup:
- Code: pgHead + v21 patches.
- Created Pub and Sub nodes with logical replication and the configurations below.
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on (only on Sub node).
- The Pub and Sub had different pgbench tables with initial data of scale=100.
- Ran pgbench (read-write) on both the pub and the sub for a duration
of 15 minutes, using 30 clients on the Subscriber while varying the
number of clients on the Publisher.
- In addition to TPS, the update frequencies of slot.xmin and the
worker's oldest_nonremovable_xid were also measured.
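The subscription for this case was created as below (per the attached
setup script, with copy_data additionally turned off):
CREATE SUBSCRIPTION sub CONNECTION 'port=6633 user=postgres' PUBLICATION pub WITH (copy_data = false, retain_conflict_info = on);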
Observations:
- As the number of clients on the publisher increased, the
publisher's TPS improved, but the subscriber's TPS dropped
significantly.
- The frequency of slot.xmin updates also declined with more clients
on the publisher, indicating that the apply worker updated its
oldest_nonremovable_xid less frequently as the read-write operations
on the publisher increased.
Results:
#Pub-clients  pubTPS       pubTPS_increment  subTPS       subTPS_reduction  #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
1             1364.487898  0                 35000.06738  0                 6976                6977
2             2706.100445  98%               32297.81408  -8%               5838                5839
4             5079.522778  272%              8581.034791  -75%              268                 269
30            31308.18524  2195%             5324.328696  -85%              4                   5
Note: In the above result table -
- "pubTPS_increment" is the % improvement in the Pub's TPS compared
to its TPS in the initial run with #Pub-clients=1, and
- "subTPS_reduction" is the % decrease in the Sub's TPS compared to
its TPS in the initial run with #Pub-clients=1.
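For example, with #Pub-clients=2: pubTPS_increment = (2706.100445 -
1364.487898) / 1364.487898 ~ +98%, and subTPS_reduction =
(32297.81408 - 35000.06738) / 35000.06738 ~ -8%.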
~~~~
Conclusion:
There is some improvement in the slot.xmin update frequency with
tablesync off and the additional patches that update the slot's xmin
more aggressively.
However, the key point is that with a large number of clients
generating write operations, the apply worker lags behind by a large
margin, so the slot's xmin stops being updated as the test run time
increases. This is also visible in case-3: with only 1 client on the
publisher there is no degradation on the subscriber, and as the
number of clients increases, the degradation increases as well.
Based on this test analysis, I think we need some way/option to
invalidate slots that lag by a threshold margin, as mentioned at [1].
This should solve the performance degradation and bloat problems.
~~~~
(Attached are the test scripts used for the above tests.)
[1]
https://www.postgresql.org/message-id/CAA4eK1Jyo4odkVsnSeAWPh8Wgpw12EbS9q8s_eN14LtcFNXCSA%40mail.gmail.com
--
Thanks,
Nisha
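[Attachment 1: node setup script, v21_case1_setup.sh]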
#!/bin/bash
##################
### Definition ###
##################
port_pub=6633
port_sub=6634
## prefix
PUB_PREFIX="$HOME/project/pg1/postgres/inst/bin"
## scale factor
SCALE=100
## pgbench init command
INIT_COMMAND="pgbench -i -U postgres postgres -s $SCALE"
SOURCE=$1
################
### clean up ###
################
./pg_ctl stop -D data_pub -w
./pg_ctl stop -D data_sub -w
rm -rf data* *log
#######################
### setup publisher ###
#######################
./initdb -D data_pub -U postgres
cat << EOF >> data_pub/postgresql.conf
port=$port_pub
# autovacuum = false
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
wal_level = logical
EOF
./pg_ctl -D data_pub start -w -l pub.log
${PUB_PREFIX}/$INIT_COMMAND -p $port_pub
./psql -U postgres -p $port_pub -c "CREATE PUBLICATION pub FOR ALL TABLES;"
#######################
### setup subscriber ###
#######################
./initdb -D data_sub -U postgres
cat << EOF >> data_sub/postgresql.conf
port=$port_sub
# autovacuum = false
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on
#log_min_messages = DEBUG1
EOF
./pg_ctl -D data_sub start -w -l sub.log
./$INIT_COMMAND -p $port_sub
(
echo "CREATE TABLE pgbench_pub_history (tid int,bid int,aid bigint,delta int,mtime timestamp,filler char(22));"
echo "CREATE TABLE pgbench_pub_tellers (tid int not null primary key,bid int,tbalance int,filler char(84));"
echo "CREATE TABLE pgbench_pub_accounts (aid bigint not null primary key,bid int,abalance int,filler char(84));"
echo "CREATE TABLE pgbench_pub_branches (bid int not null primary key,bbalance int,filler char(88));"
) | ./psql -p $port_sub -U postgres
if [ "$SOURCE" = "head" ]
then
./psql -U postgres -p $port_sub -c "CREATE SUBSCRIPTION sub CONNECTION 'port=6633 user=postgres' PUBLICATION pub;"
else
./psql -U postgres -p $port_sub -c "CREATE SUBSCRIPTION sub CONNECTION 'port=6633 user=postgres' PUBLICATION pub WITH (retain_conflict_info = on);"
fi
# Wait until all the table sync is done
SYNC_DONE="f"
while [ "$SYNC_DONE" = "f" ]
do
# Sleep a bit to avoid running the query too often
sleep 1s
# Check the pg_subscription_rel catalog; the query returns 't' once no
# table is left unsynced. It is ported from wait_for_subscription_sync()
# defined in Cluster.pm.
SYNC_DONE=`./psql -qtA -U postgres -p $port_sub -c "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');"`
# Print the result for debugging purposes
echo $SYNC_DONE
done
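[Attachment 2: run script driving the pgbench measurements]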
#!/bin/bash
##################
### Definition ###
##################
#export PATH="$HOME/project/pg1/postgres/inst/bin:$PATH"
port_pub=6633
port_sub=6634
## prefix
PUB_PREFIX="$HOME/project/pg1/postgres/inst/bin"
## Used source
SOURCE=v21_tswait
## Number of runs
NUMRUN=1
## Measurement duration
DURATION=900
## Number of clients during a run
NUMCLIENTS=30
###########################
### measure performance ###
###########################
for i in `seq ${NUMRUN}`
do
# Prepare a clean environment for each measurement
#./v2_setup_n.sh $SOURCE
sh $HOME/project/update_deleted/perf_test_v21/v21_case1_setup.sh $SOURCE
echo "=================="
echo "${SOURCE}_${i}.dat"
echo "=================="
# Do actual measurements
${PUB_PREFIX}/pgbench -p $port_pub -U postgres postgres -c $NUMCLIENTS -j $NUMCLIENTS -T $DURATION -P 30 > pub_${SOURCE}_${i}.dat &
./pgbench -p $port_sub -U postgres postgres -c $NUMCLIENTS -j $NUMCLIENTS -T $DURATION -P 30 > sub_${SOURCE}_${i}.dat
echo "=================="
sleep 10s
done