Re: speed up a logical replica setup

Shubham Khanna Fri, 16 Feb 2024 01:20:31 -0800

On Thu, Feb 15, 2024 at 4:53 PM Hayato Kuroda (Fujitsu)
<kuroda.hay...@fujitsu.com> wrote:
>
> Dear Euler,
>
> > Policy)
> >
> > Basically, we do not prohibit to connect to primary/standby.
> > primary_slot_name may be changed during the conversion and
> > tuples may be inserted on target just after the promotion, but it seems no 
> > issues.
> >
> > API)
> >
> > -D (data directory) and -d (databases) are definitively needed.
> >
> > Regarding the -P (a connection string for source), we can require it for 
> > now.
> > But note that it may cause an inconsistency if the pointed not by -P is 
> > different
> > from the node pointde by primary_conninfo.
> >
> > As for the connection string for the target server, we can choose two ways:
> > a)
> > accept native connection string as -S. This can reuse the same parsing
> > mechanism as -P,
> > but there is a room that non-local server is specified.
> >
> > b)
> > accept username/port as -U/-p
> > (Since the policy is like above, listen_addresses would not be overwritten. 
> > Also,
> > the port just specify the listening port).
> > This can avoid connecting to non-local, but more options may be needed.
> > (E.g., Is socket directory needed? What about password?)
> >
> > Other discussing point, reported issue)
> >
> > Points raised by me [1] are not solved yet.
> >
> > * What if the target version is PG16-?
> > * What if the found executables have diffent version with 
> > pg_createsubscriber?
> > * What if the target is sending WAL to another server?
> >    I.e., there are clusters like `node1->node2-.node3`, and the target is 
> > node2.
> > * Can we really cleanup the standby in case of failure?
> >    Shouldn't we suggest to remove the target once?
> > * Can we move outputs to stdout?
>
> Based on the discussion, I updated the patch set. Feel free to pick them and 
> include.
> Removing -P patch was removed, but removing -S still remained.
>
> Also, while testing the patch set, I found some issues.
>
> 1.
> Cfbot got angry [1]. This is because WIFEXITED and others are defined in 
> <sys/wait.h>,
> but the inclusion was removed per comment. Added the inclusion again.
>
> 2.
> As Shubham pointed out [3], when we convert an intermediate node of cascading 
> replication,
> the last node would stuck. This is because a walreciever process requires 
> nodes have the same
> system identifier (in WalReceiverMain), but it would be changed by 
> pg_createsubscriebr.
>
> 3.
> Moreover, when we convert a last node of cascade, it won't work well. Because 
> we cannot create
> publications on the standby node.
>
> 4.
> If the standby server was initialized as PG16-, this command would fail.
> Because the API of pg_logical_create_replication_slot() were changed.
>
> 5.
> Also, used pg_ctl commands must have same versions with the instance.
> I think we should require all the executables and servers must be a same 
> major version.
>
> Based on them, below part describes attached ones:
>
> V20-0001: same as Euler's patch, v17-0001.
> V20-0002: Update docs per recent changes. Same as v19-0002
> V20-0003: Modify the alignment of codes. Same as v19-0003
> V20-0004: Change an argument of get_base_conninfo. Same as v19-0004
> === experimental patches ===
> V20-0005: Add testcases. Same as v19-0004
> V20-0006: Update a comment above global variables. Same as v19-0005
> V20-0007: Address comments from Vignesh. Some parts you don't like
>           are reverted.
> V20-0008: Fix error message in get_bin_directory(). Same as v19-0008
> V20-0009: Remove -S option. Refactored from v16-0007
> V20-0010: Add check versions of executables and the target, per above and [4]
> V20-0011: Detect a disconnection while waiting the recovery, per [4]
> V20-0012: Avoid running pg_createsubscriber for cascade physical replication, 
> per above.
>
> [1]: https://cirrus-ci.com/task/4619792833839104
> [2]: 
> https://www.postgresql.org/message-id/CALDaNm1r9ZOwZamYsh6MHzb%3D_XvhjC_5XnTAsVecANvU9FOz6w%40mail.gmail.com
> [3]: 
> https://www.postgresql.org/message-id/CAHv8RjJcUY23ieJc5xqg6-QeGr1Ppp4Jwbu7Mq29eqCBTDWfUw%40mail.gmail.com
> [4]: 
> https://www.postgresql.org/message-id/TYCPR01MB1207713BEC5C379A05D65E342F54B2%40TYCPR01MB12077.jpnprd01.prod.outlook.com


I found a couple of issues, while verifying the cascaded replication
with the following scenarios:
Scenario 1) Create cascade replication like node1->node2->node3
without using replication slots (attached
cascade_3node_setup_without_slots has the script for this):
Then I ran pg_createsubscriber by specifying primary as node1 and
standby as node3, this scenario runs successfully. I was not sure if
this should be supported or not?
Scenario 2) Create cascade replication like node1->node2->node3 using
replication slots (attached cascade_3node_setup_with_slots has the
script for this):
Here, slot name was used as slot1 for node1 to node2 and slot2 for
node2 to node3. Then I ran pg_createsubscriber by specifying primary
as node1 and standby as node3. In this case pg_createsubscriber fails
with the following error:
pg_createsubscriber: error: could not obtain replication slot
information: got 0 rows, expected 1 row
[Inferior 1 (process 2623483) exited with code 01]

This is failing because slot name slot2 is used between node2->node3
but pg_createsubscriber is checked for slot1, the slot which is used
for replication between node1->node2.
Thoughts?

Thanks and Regards,
Shubham Khanna.

#!/bin/bash

#
# This script tests a case of cascading replication.
#
# Creating system,:
#   node1-(physical replication)->node2-(physical replication)->node3
#

# Stop instances
sudo pkill -9 postgres
pg_ctl stop -D node1
pg_ctl stop -D node2
pg_ctl stop -D node3

# Remove old files
rm -rf node1
rm -rf node2
rm -rf node3
rm log_node2 log_node1 log_node3

# Initialize primary
initdb -D node1

echo "wal_level = logical" >> node1/postgresql.conf
echo "max_replication_slots=10" >> node1/postgresql.conf

pg_ctl -D node1 -l log_node1 start

psql -d postgres -c "CREATE DATABASE db1";
psql -d db1 -c "CHECKPOINT";

sleep 3

# Initialize standby
pg_basebackup -h 127.0.0.1 -X stream -v -R -D node2
echo "Port=9000" >> node2/postgresql.conf

pg_ctl -D node2 -l log_node2 start

# Initialize another standby
pg_basebackup -h 127.0.0.1 -X stream -p 9000 -v -R -D node3
echo "Port=9001" >> node3/postgresql.conf
pg_ctl -D node3 -l log_node3 start

# Insert a tuple to primary
psql -d db1 -c "CREATE TABLE c1(a int)";
psql -d db1 -c "Insert into c1 Values(2)";
sleep 3

# And verify it can be seen on another standby
psql -d db1 -p 9001 -c "Select * from c1";


pg_createsubscriber -D node3/ -P "host=localhost port=5432 dbname=postgres" -d postgres -d db1 -p 9001 -r -v
pg_ctl -D node3 -l log_node3 start
psql -d db1 -c "INSERT INTO c1 VALUES(3)";
psql -d db1 -p 9001 -c "SELECT * FROM c1";

#!/bin/bash

#
# This script tests a case of cascading replication.
#
# Creating system,:
#   node1-(physical replication)->node2-(physical replication)->node3
#

# Stop instances
sudo pkill -9 postgres
pg_ctl stop -D node1
pg_ctl stop -D node2
pg_ctl stop -D node3

# Remove old files
rm -rf node1
rm -rf node2
rm -rf node3
rm log_node2 log_node1 log_node3

# Initialize primary
initdb -D node1

echo "wal_level = logical" >> node1/postgresql.conf
echo "max_replication_slots=10" >> node1/postgresql.conf

pg_ctl -D node1 -l log_node1 start

psql -d postgres -c "CREATE DATABASE db1";
psql -d db1 -c "CHECKPOINT";

sleep 3

# Initialize standby
pg_basebackup -h 127.0.0.1 -X stream -S slot1 -C -v -R -D node2
echo "Port=9000" >> node2/postgresql.conf

pg_ctl -D node2 -l log_node2 start

# Initialize another standby
pg_basebackup -h 127.0.0.1 -X stream -p 9000 -S slot2 -C -v -R -D node3
echo "Port=9001" >> node3/postgresql.conf
pg_ctl -D node3 -l log_node3 start

# Insert a tuple to primary
psql -d db1 -c "CREATE TABLE c1(a int)";
psql -d db1 -c "Insert into c1 Values(2)";
sleep 3

# And verify it can be seen on another standby
psql -d db1 -p 9001 -c "Select * from c1";


pg_createsubscriber -D node3/ -P "host=localhost port=5432 dbname=postgres" -d postgres -d db1 -p 9001 -r -v
pg_ctl -D node3 -l log_node3 start
psql -d db1 -c "INSERT INTO c1 VALUES(3)";
psql -d db1 -p 9001 -c "SELECT * FROM c1";

Re: speed up a logical replica setup

Reply via email to